From 50d7f7d035217d2316bd8c306571e08788d62a36 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Wed, 22 Apr 2026 16:29:13 +0100 Subject: [PATCH 01/20] bloblang(v2): Add V2 language specification Adds the design specification for Bloblang V2 under internal/bloblang2/spec/, split across thirteen numbered chapters (overview, type system, expressions, control flow, maps, imports, execution model, error handling, special features, grammar, common patterns, implementation guide, standard library) plus a top-level PROPOSAL.md and README.md. The spec is the source of truth for the Go and TypeScript runtimes and the V1 -> V2 migrator that follow. --- internal/bloblang2/spec/01_overview.md | 78 + internal/bloblang2/spec/02_type_system.md | 261 +++ internal/bloblang2/spec/03_expressions.md | 502 ++++++ internal/bloblang2/spec/04_control_flow.md | 373 +++++ internal/bloblang2/spec/05_maps.md | 206 +++ internal/bloblang2/spec/06_imports.md | 111 ++ internal/bloblang2/spec/07_execution_model.md | 338 ++++ internal/bloblang2/spec/08_error_handling.md | 254 +++ .../bloblang2/spec/09_special_features.md | 307 ++++ internal/bloblang2/spec/10_grammar.md | 248 +++ internal/bloblang2/spec/11_common_patterns.md | 140 ++ .../bloblang2/spec/12_implementation_guide.md | 131 ++ .../bloblang2/spec/13_standard_library.md | 1434 +++++++++++++++++ internal/bloblang2/spec/PROPOSAL.md | 188 +++ internal/bloblang2/spec/README.md | 105 ++ 15 files changed, 4676 insertions(+) create mode 100644 internal/bloblang2/spec/01_overview.md create mode 100644 internal/bloblang2/spec/02_type_system.md create mode 100644 internal/bloblang2/spec/03_expressions.md create mode 100644 internal/bloblang2/spec/04_control_flow.md create mode 100644 internal/bloblang2/spec/05_maps.md create mode 100644 internal/bloblang2/spec/06_imports.md create mode 100644 internal/bloblang2/spec/07_execution_model.md create mode 100644 internal/bloblang2/spec/08_error_handling.md create mode 100644 
internal/bloblang2/spec/09_special_features.md create mode 100644 internal/bloblang2/spec/10_grammar.md create mode 100644 internal/bloblang2/spec/11_common_patterns.md create mode 100644 internal/bloblang2/spec/12_implementation_guide.md create mode 100644 internal/bloblang2/spec/13_standard_library.md create mode 100644 internal/bloblang2/spec/PROPOSAL.md create mode 100644 internal/bloblang2/spec/README.md diff --git a/internal/bloblang2/spec/01_overview.md b/internal/bloblang2/spec/01_overview.md new file mode 100644 index 000000000..448e88ff3 --- /dev/null +++ b/internal/bloblang2/spec/01_overview.md @@ -0,0 +1,78 @@ +# 1. Overview & Lexical Structure + +**Bloblang V2** is a domain-specific mapping language for stream processing with explicit context management and predictable behavior. + +## 1.1 Design Principles + +1. **Radical Explicitness** - No implicit context shifting, all references explicit +2. **One Clear Way** - Single obvious approach for each operation +3. **Consistent Syntax** - Symmetrical keywords (`input`/`output`), consistent prefixes +4. 
**Fail Loudly** - Errors are explicit, not silent + +## 1.2 Quick Start + +```bloblang +# Basic assignment +output.user_id = input.user.id +output.email = input.user.email.lowercase() + +# Null-safe navigation +output.city = input.user?.address?.city.or("Unknown") + +# Functional pipeline +output.active_users = input.users + .filter(user -> user.active) + .map(user -> user.name) + .sort() + +# Pattern matching +output.category = match input.score as s { + s >= 80 => "high", + s >= 50 => "medium", + _ => "low", +} + +# Named transformation (isolated function) +map normalize_user(data) { + { + "id": data.user_id, + "name": data.full_name + } +} +output.user = normalize_user(input.user_data) +``` + +## 1.3 Lexical Structure + +**Keywords:** `input`, `output`, `if`, `else`, `match`, `as`, `map`, `import`, `true`, `false`, `null`, `_` + +**Reserved function names:** `deleted`, `throw`, `void` — these parse as regular function calls but have special semantics (see Sections 8.4, 9.2, 12.3, and 13.1). Like keywords, reserved function names cannot be used as identifiers — they cannot be variable names, map names, or parameter names. They remain valid as field names (e.g., `input.deleted`, `output.throw`, `input.void`) since field access uses the broader `word` pattern. + +`_` has context-dependent roles: it serves as the wildcard in match cases (Section 4.2) and as a **discard parameter** in map and lambda parameter lists (Sections 3.4, 5.1). + +**Operators:** `.`, `?.`, `@`, `::`, `=`, `+`, `-`, `*`, `/`, `%`, `!`, `>`, `>=`, `==`, `!=`, `<`, `<=`, `&&`, `||`, `=>`, `->` (`?.` applies to field access, indexing, and method calls) + +**Delimiters:** `(`, `)`, `{`, `}`, `[`, `]`, `?[`, `,`, `:` + +**Variables:** `$name` (declaration and reference) + +**Metadata:** `input@.key` (read), `output@.key` (write) + +**Literals:** +- Numbers: `42`, `3.14` (negative numbers use unary minus: `-10`). Integer literals are int64; float literals are float64. 
Float literals require digits on both sides of the decimal point — `.5` and `5.` are invalid, write `0.5` and `5.0` instead. Exponent notation (e.g., `1e3`) is not supported in literals — use `.parse_json()` or explicit arithmetic instead. Literals that exceed the range of their type are a compile-time error. **Note:** `-10` is not a single token — it is unary minus applied to `10`. Since method calls bind tighter than unary minus, `-10.string()` parses as `-(10.string())` which is an error. Use `(-10).string()` instead. +- Strings: `"hello"`, `"escape\n"`, `"\u{1F600}"`, or `` `raw multiline` `` +- Booleans: `true`, `false` +- Null: `null` +- Arrays: `[1, 2, 3]`, `["a", input.field, uuid_v4()]` +- Objects: `{"name": "value", "count": 42}` + +**Comments:** `#` to end-of-line + +**Identifiers:** `[a-zA-Z_][a-zA-Z0-9_]*` excluding keywords and reserved function names (notably `_` alone is not a valid identifier). Used for variable names, map names, and parameter names — these cannot be keywords or reserved function names (`deleted`, `throw`, `void`). The exception is `_`, which is permitted as a discard parameter in map and lambda parameter lists (Sections 3.4, 5.1). + +**Field names:** Field names after `.` and `?.` accept any word (`[a-zA-Z_][a-zA-Z0-9_]*` including keywords) — `input.map`, `output.if`, `data.match` are all valid. Use `."quoted"` for names with special characters or spaces: +```bloblang +input.map # Valid: keyword as field name +input."field with spaces" # Quoting needed: spaces +output."special.field" # Quoting needed: contains dot +``` diff --git a/internal/bloblang2/spec/02_type_system.md b/internal/bloblang2/spec/02_type_system.md new file mode 100644 index 000000000..cb7b62b2e --- /dev/null +++ b/internal/bloblang2/spec/02_type_system.md @@ -0,0 +1,261 @@ +# 2. Type System & Coercion + +Bloblang V2 is **dynamically typed** - types determined at runtime. 
+ +## 2.1 Runtime Types + +| Type | Description | Examples | +|------|-------------|----------| +| `string` | UTF-8 text (operations are codepoint-based) | `"hello"`, `""` | +| `int32` | 32-bit signed integer | `42.int32()` | +| `int64` | 64-bit signed integer (default for integer literals) | `42`, `-10` (unary minus) | +| `uint32` | 32-bit unsigned integer | `42.uint32()` | +| `uint64` | 64-bit unsigned integer | `42.uint64()`, `"18446744073709551615".uint64()` | +| `float32` | 32-bit IEEE 754 float | `3.14.float32()` | +| `float64` | 64-bit IEEE 754 float (default for float literals) | `3.14`, `-10.5` (unary minus) | +| `bool` | Boolean | `true`, `false` | +| `null` | Null value | `null` | +| `bytes` | Byte array (operations are byte-based; no implicit JSON serialization — see Section 13.11) | `"hello".bytes()` | +| `array` | Ordered collection | `[1, "two", true]` | +| `object` | Key-value map | `{"key": "value"}` | +| `timestamp` | Point in time with nanosecond precision | `now()`, `"2024-03-01".ts_parse("%Y-%m-%d")` | + +**Large uint64 values:** Integer literals are always int64, so values exceeding int64 range (> 9223372036854775807) cannot be written as bare literals. To create large uint64 values, parse from a string: `"18446744073709551615".uint64()`. Writing the value as a bare literal (e.g., `18446744073709551615.uint64()`) is a compile error because the literal exceeds int64 range before the conversion method is applied. + +**Important:** String operations (indexing, `.length()`, etc.) work on **Unicode codepoints**, not grapheme clusters. This means complex emoji and combining characters may span multiple codepoints. Byte operations work on individual bytes in the UTF-8 encoding. + +**No Unicode normalization:** Strings are compared codepoint-by-codepoint without normalization. 
Different Unicode representations of the same visual character (e.g., precomposed `é` U+00E9 vs decomposed `e` U+0065 + `◌́` U+0301) are **not equal** and may have different `.length()` values. This matches the behavior of Go, Rust, and most systems languages. If input data may contain mixed normalization forms, use an explicit normalization step before comparison. + +**Void:** Void is not a runtime type — it is the absence of a value, produced implicitly (an if-expression without `else` when the condition is false, or a match expression without `_` when no case matches — see Section 4.1) or explicitly via the `void()` builtin (Section 13.1). Void cannot be stored in variables, passed as arguments, used in expressions, or included in collection literals (all are errors). It only exists transiently to signal "no value was produced," and is only meaningful in assignments where it causes the assignment to be skipped (a no-op). The exceptions are `.or()`, which rescues void by returning its argument (Section 8.3), and `.catch()`, which passes void through unchanged (Section 8.2). All other method calls on void are errors — e.g., `.type()` on void is not possible. See Section 4.1 for full void semantics. 
+ +## 2.2 Type Introspection + +```bloblang +output.type = input.value.type() # Returns type name as string + +# Type checking +output.is_array = input.items.type() == "array" +output.is_null = input.maybe.type() == "null" +``` + +## 2.3 Type Coercion + +**The `+` Operator:** +- Both strings: string concatenation +- Both bytes: byte concatenation +- Both numeric: addition (with promotion, see below) +- Any cross-family mix (string + number, bytes + string, etc.): **error** + +```bloblang +output.sum = 5 + 3 # 8 (int64) +output.concat = "hello" + " world" # "hello world" (string) +output.joined = "ab".bytes() + "cd".bytes() # byte concatenation +output.bad = 5 + "3" # ERROR: cannot add int64 and string +output.bad2 = "hello" + "world".bytes() # ERROR: cannot add string and bytes +output.ok = 5.string() + "3" # "53" (explicit conversion) +``` + +**Other Operators:** +- Arithmetic (`-`, `*`, `/`, `%`): Require numeric types (null errors), with promotion +- Comparison (`>`, `<`, `>=`, `<=`): Require comparable same types (null errors), with numeric promotion. Comparable types are: numeric types (with promotion), timestamps, strings (lexicographic by Unicode codepoint), and bytes (lexicographic by byte value) +- Equality (`==`, `!=`): Numeric types use promotion then compare by value; non-numeric types require same type and value; cross-family is always `false` (see below) +- Logical (`!`, `&&`, `||`): Require booleans + +### Numeric Type Promotion + +When arithmetic, comparison, or equality operators are applied to operands of different numeric types, both operands are promoted to a common type before the operation. Non-numeric types (string, bool, null, etc.) are never promoted — mixing them with numbers is always an error (for arithmetic/comparison) or always `false` (for equality). 
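A minimal Go sketch of what "checked" promotion could look like, assuming the 2^53 and 2^63-1 limits given in the promotion rules of this section. The function names `promoteInt64ToFloat64` and `promoteUint64ToInt64` are hypothetical and are not taken from the actual runtime.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// maxExactFloatInt is 2^53, the largest integer magnitude the promotion
// rules allow when an integer is promoted to float64.
const maxExactFloatInt = int64(1) << 53

// promoteInt64ToFloat64 (hypothetical helper) converts an int64 to float64,
// failing instead of rounding when the magnitude exceeds 2^53.
func promoteInt64ToFloat64(i int64) (float64, error) {
	if i > maxExactFloatInt || i < -maxExactFloatInt {
		return 0, errors.New("int64 value exceeds float64 exact range")
	}
	return float64(i), nil
}

// promoteUint64ToInt64 (hypothetical helper) converts a uint64 to int64 for
// signed + unsigned arithmetic, failing when the value is above 2^63-1.
func promoteUint64ToInt64(u uint64) (int64, error) {
	if u > math.MaxInt64 {
		return 0, errors.New("uint64 value exceeds int64 range")
	}
	return int64(u), nil
}

func main() {
	fmt.Println(promoteInt64ToFloat64(5))                   // fits exactly
	fmt.Println(promoteInt64ToFloat64(9007199254740993))    // 2^53 + 1: error
	fmt.Println(promoteUint64ToInt64(9999999999999999999)) // above 2^63-1: error
}
```

The point of the sketch is that the check runs on the operand value before the operation, so a failed promotion surfaces as an error rather than as a silently rounded result.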
+
+**Promotion rules (applied in order):**
+
+| Operand types | Promoted to | Example / error condition |
+|---------------|-------------|---------------------------|
+| Same type | No promotion | `int64 + int64 → int64` |
+| Same signedness, different width | Wider type | `int32 + int64 → int64`, `float32 + float64 → float64` |
+| Signed + unsigned integer | int64 | Error if uint64 value > 2^63-1 (cannot fit in int64) |
+| Any integer + any float | float64 | Error if integer magnitude > 2^53 (cannot be represented exactly) |
+
+**Promotion is checked, not silent.** All widening promotions that are lossless (e.g., int32 → int64, float32 → float64) always succeed. Promotions that may lose data are validated at runtime and **throw an error** if the value cannot be represented exactly in the target type. This applies to both operands: if either operand cannot be safely promoted, the operation errors.
+
+**Division always produces a float:**
+
+| Operand types | Result type |
+|---------------|-------------|
+| float32 / float32 | float32 |
+| All other combinations | float64 |
+
+There is no integer division operator. To get an integer result, convert explicitly: `(7 / 2).int64()` truncates toward zero (result: `3`), `(7 / 2).floor().int64()` floors (result: `3`). These differ for negative operands: `(-7 / 2).int64()` is `-3` (truncation), `(-7 / 2).floor().int64()` is `-4` (floor).
+
+**Modulo follows standard promotion rules** (not the division rule). The result type is determined by the promoted operand type.
For float operands, modulo uses **truncated division remainder** semantics (equivalent to C `fmod`), where the result has the same sign as the dividend: + +| Operand types | Result type | Example | +|---------------|-------------|---------| +| int64 % int64 | int64 | `7 % 2 → 1` | +| int32 % int64 | int64 | Promoted to int64 | +| Any integer % any float | float64 | `7 % 2.0 → 1.0` | +| float64 % float64 | float64 | `7.5 % 2.0 → 1.5` | + +```bloblang +# Same type: no promotion +output.a = 5 + 3 # 8 (int64) +output.b = 5.0 + 3.0 # 8.0 (float64) + +# Different width: promote to wider (always lossless) +output.c = 5.int32() + 10 # 15 (int64: int32 promoted to int64) + +# Signed + unsigned: promote to int64 (checked) +output.d = 5.int32() + 10.uint32() # 15 (int64: both fit) +output.bad = 5 + "9999999999999999999".uint64() # ERROR: uint64 value exceeds int64 range + +# Integer + float: promote to float64 (checked) +output.e = 5 + 3.0 # 8.0 (float64: 5 fits exactly) +output.bad = 9007199254740993 + 1.0 # ERROR: int64 value exceeds float64 exact range (> 2^53) + +# Division: always float +output.f = 7 / 2 # 3.5 (float64) +output.g = 20 / 4 / 2 # 2.5 (float64) +output.h = 10.0 / 3.0 # 3.333... (float64) + +# Modulo: follows standard promotion (not the division rule) +output.i = 7 % 2 # 1 (int64) +output.j = 7.0 % 2.0 # 1.0 (float64, fmod) +output.k = 7.5 % 2.0 # 1.5 (float64, fmod) +output.l = 7 % 2.0 # 1.0 (float64: int64 promoted to float64, fmod) + +# Division by zero: always an error +output.bad = 7 / 0 # ERROR: division by zero +output.bad = 7.0 / 0.0 # ERROR: division by zero + +# Modulo by zero: always an error +output.bad = 7 % 0 # ERROR: modulo by zero +``` + +**Integer overflow:** Integer arithmetic that overflows the result type is always a runtime error. This applies to all integer types (int32, int64, uint32, uint64) and all arithmetic operators (`+`, `-`, `*`, `%`). Implementations must detect overflow and throw an error rather than wrapping or saturating. 
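The overflow rule above can be sketched in Go, where integer arithmetic wraps silently and the check therefore has to be written out. This is only an illustration under that assumption; `checkedAddInt64` and `checkedSubInt32` are hypothetical names, not functions from the runtime.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

var errOverflow = errors.New("integer overflow")

// checkedAddInt64 (hypothetical helper) adds two int64 values and reports
// an error instead of wrapping. The bounds are tested before adding so the
// check itself cannot overflow.
func checkedAddInt64(a, b int64) (int64, error) {
	if b > 0 && a > math.MaxInt64-b {
		return 0, errOverflow
	}
	if b < 0 && a < math.MinInt64-b {
		return 0, errOverflow
	}
	return a + b, nil
}

// checkedSubInt32 (hypothetical helper) subtracts in a wider type and then
// verifies that the result still fits in int32.
func checkedSubInt32(a, b int32) (int32, error) {
	r := int64(a) - int64(b)
	if r < math.MinInt32 || r > math.MaxInt32 {
		return 0, errOverflow
	}
	return int32(r), nil
}

func main() {
	fmt.Println(checkedAddInt64(5, 3))
	fmt.Println(checkedAddInt64(math.MaxInt64, 1))
	fmt.Println(checkedSubInt32(math.MinInt32, 1))
}
```

Widening before the operation works for the 32-bit types; the 64-bit types need the pre-check form because there is no wider native integer to compute in.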
+ +```bloblang +output.bad = 9223372036854775807 + 1 # ERROR: int64 overflow +output.bad = (-2147483648).int32() - 1.int32() # ERROR: int32 overflow +output.ok = 9223372036854775807.uint64() + 1.uint64() # 9223372036854775808 (uint64, no overflow) +``` + +**Special float values (NaN, Infinity):** Division by zero is always an error — it does not produce Infinity or NaN. However, NaN and Infinity values may enter the system through input data. When they do, Bloblang follows IEEE 754 semantics: +- `NaN == NaN` is `false` (NaN is not equal to anything, including itself) +- `NaN != NaN` is `true` +- `NaN > x`, `NaN < x`, `NaN >= x`, `NaN <= x` are all `false` for any `x` +- Arithmetic with NaN produces NaN +- Infinity compares normally (`Infinity > 1.0` is `true`, `Infinity == Infinity` is `true`) +- Negative zero: `-0.0 == 0.0` is `true`, `-0.0 < 0.0` is `false` (they are equal per IEEE 754). `.string()` normalizes to `"0.0"` (not `"-0.0"`). + +**NaN in other contexts:** `.sort()` uses total ordering for NaN — NaN sorts after all other numeric values, not IEEE 754 comparison (Section 13.6). `.unique()` treats all NaN values as equal, consistent with sort's total ordering (Section 13.6). `.bool()` on NaN is an error — NaN is neither zero nor non-zero (Section 13.2). + +**Equality Semantics:** + +For non-numeric types, both type and value must match for equality to return `true`. Strings compare codepoint-by-codepoint, bytes compare byte-by-byte, arrays compare element-by-element, and objects compare by key-value pairs regardless of key order. Different non-numeric types always return `false` (not an error). + +For numeric types, the same promotion rules used for arithmetic apply before comparison. Both operands are promoted to a common numeric type, then compared by value. This means `5 == 5.0` is `true` (int64 promoted to float64, values match). + +Cross-family comparisons (numeric vs non-numeric) always return `false`. 
+ +```bloblang +# Numeric equality: promotion rules applied +5 == 5.0 # true (int64 promoted to float64, same value) +5.int32() == 5 # true (int32 promoted to int64, same value) +5.int32() == 5.float64() # true (both promoted to float64, same value) +5 == 6.0 # false (promoted, different value) + +# Non-numeric types: type and value must match +"hello" == "hello" # true +true == true # true +null == null # true +"a" == "b" # false + +# Cross-family: always false +5 == "5" # false (numeric vs string) +true == 1 # false (bool vs numeric) +null == 0 # false (null vs numeric) + +# Collections: structural equality (value-based, numeric promotion applies within) +[1, 2] == [1, 2] # true (same contents) +[1, 2] == [2, 1] # false (different order — arrays are ordered) +{"a": 1} == {"a": 1} # true (same keys and values) +{"a": 1, "b": 2} == {"b": 2, "a": 1} # true (key order irrelevant for objects) +``` + +**Object key ordering:** Object key ordering is **not preserved**. Programs must not depend on iteration order in `.iter()`, JSON serialization order, or any other context where keys are enumerated. Object equality compares keys and values regardless of order. + +**Timestamp semantics:** Timestamps represent a point in time with nanosecond precision. In addition to the instant, each timestamp carries a **stored zone** that controls how it is rendered as a string. Comparison, subtraction, and `.ts_unix*()` methods operate on the instant (the stored zone is not consulted); `.ts_format()` and `.string()` use the stored zone to produce the displayed clock and offset (Section 13.9). They support: + +- **Equality and comparison:** Timestamps can be compared with `==`, `!=`, `<`, `>`, `<=`, `>=`. Earlier times are less than later times. Equality ignores the stored zone — two timestamps representing the same instant are equal even if their stored zones differ. +- **Arithmetic:** `timestamp - timestamp` returns an int64 (duration in nanoseconds). 
All other arithmetic involving timestamps is an error — including `timestamp + timestamp`, `timestamp + number`, `timestamp - number`, `number - timestamp`, and any use of `*`, `/`, or `%` with a timestamp operand. Use `.ts_add(nanos)` to offset a timestamp by a duration, or `.ts_unix()` and related methods for numeric conversions. **Note:** int64 nanoseconds can represent approximately ±292 years; subtracting timestamps further apart than this is an integer overflow error (Section 2.3). +- **Methods:** `.ts_format()`, `.ts_add()`, `.ts_unix()`, `.ts_unix_milli()`, `.ts_unix_micro()`, `.ts_unix_nano()`, `.type()`, `.string()`. +- **Construction from numeric:** `.ts_from_unix()` on any numeric type (integers for second precision; floats for sub-second precision, limited by float64's ~15-17 significant digits). For exact sub-second precision, use `.ts_from_unix_milli()`, `.ts_from_unix_micro()`, or `.ts_from_unix_nano()` on int64 values. See Section 13.9. +- **Serialization:** When serialized to JSON, timestamps are formatted as RFC 3339 strings using the stored zone. When converted with `.string()`, the result is also RFC 3339. Trailing fractional zeros are trimmed (e.g., `.500000000` becomes `.5`; whole-second timestamps omit the fractional part entirely). See Section 13.9 for which construction paths produce UTC-zoned versus local-zoned timestamps. 
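As an illustration of the instant versus stored-zone distinction, Go's standard `time` package behaves in a closely related way, so a short sketch may help. The helper names below are hypothetical. One difference worth noting: Go's `Time.Sub` saturates at the minimum or maximum duration instead of failing, so a runtime built on it would presumably need its own overflow check to satisfy the rule above.

```go
package main

import (
	"fmt"
	"time"
)

// exampleTimestamps (hypothetical helper) returns noon UTC on 2024-03-01,
// the same instant stored with a +01:00 zone, and an instant 500ms later.
func exampleTimestamps() (utc, plusOne, later time.Time) {
	utc = time.Date(2024, time.March, 1, 12, 0, 0, 0, time.UTC)
	plusOne = utc.In(time.FixedZone("", 3600))
	later = utc.Add(500 * time.Millisecond)
	return utc, plusOne, later
}

// nanosBetween mirrors `timestamp - timestamp`: it looks only at the
// instants and returns the difference in nanoseconds.
func nanosBetween(a, b time.Time) int64 {
	return a.Sub(b).Nanoseconds()
}

// renderRFC3339 mirrors `.string()`: RFC 3339 in the stored zone, with
// trailing fractional zeros trimmed.
func renderRFC3339(t time.Time) string {
	return t.Format(time.RFC3339Nano)
}

func main() {
	utc, plusOne, later := exampleTimestamps()
	fmt.Println(utc.Equal(plusOne))       // true: same instant, zone ignored
	fmt.Println(nanosBetween(later, utc)) // 500000000
	fmt.Println(renderRFC3339(utc))       // 2024-03-01T12:00:00Z
	fmt.Println(renderRFC3339(plusOne))   // 2024-03-01T13:00:00+01:00
	fmt.Println(renderRFC3339(later))     // 2024-03-01T12:00:00.5Z
}
```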
+
+```bloblang
+$a = now()
+$b = now()
+$a < $b # true (earlier < later)
+$a == $a # true
+$b - $a # int64: nanoseconds between the two timestamps
+$a + 1 # ERROR: cannot add timestamp and int64
+$a.string() # RFC 3339 using $a's stored zone (e.g., "2024-03-01T14:32:05+01:00"
+            # for a stored zone at UTC+1, or "2024-03-01T12:00:00Z" if stored UTC)
+```
+
+## 2.4 Null Handling
+
+**Null errors immediately in most operations:**
+```bloblang
+# ❌ Errors
+null + 5 # ERROR: arithmetic requires numbers
+null.uppercase() # ERROR: method doesn't support null
+null > 5 # ERROR: ordering requires comparable types
+
+# ✅ Equality comparisons work with null
+null == null # true (same type and value)
+null != null # false
+null == 5 # false (different types: null vs int64)
+null != 5 # true
+
+# ✅ Null-safe navigation prevents errors
+input.user?.name # null if user is null (no error)
+input.items?[0] # null if items is null (no error)
+
+# ✅ .or() provides defaults for null, void, or deleted()
+input.value.or("default") # "default" if value is null, void, or deleted()
+```
+
+## 2.5 Type Conversions
+
+**Required** conversion methods (these are the only way to create non-default numeric types since literals are always int64 or float64):
+- `.string()` - Convert to string
+- `.int32()` - Convert to int32
+- `.int64()` - Convert to int64
+- `.uint32()` - Convert to uint32
+- `.uint64()` - Convert to uint64
+- `.float32()` - Convert to float32
+- `.float64()` - Convert to float64
+- `.bool()` - Convert to boolean
+- `.bytes()` - Convert to byte array
+- `.char()` - Convert integer codepoint to single-character string
+- `.type()` - Get type name
+
+```bloblang
+output.str = input.count.string() # "42"
+output.i32 = "42".int32() # 42 (int32)
+output.i64 = "42".int64() # 42 (int64)
+output.u32 = "255".uint32() # 255 (uint32)
+output.u64 = "1000".uint64() # 1000 (uint64)
+output.f32 = "3.14".float32() # 3.14 (float32)
+output.f64 = "3.14".float64() # 3.14 (float64)
+output.bool = "true".bool() # true
+output.bytes = "hello".bytes() # byte array
+output.ch = "hello"[0].char() # "h"
+```
+
+**Type promotion in arithmetic:** Mixed numeric types are automatically promoted (see Section 2.3). Non-numeric types always require explicit conversion:
+
+```bloblang
+output.sum = 5 + 10 # int64 + int64 = int64
+output.sum = 5 + 10.0 # int64 + float64 = float64 (promoted)
+output.sum = 5 + "10" # ERROR: cannot add int64 and string
+output.sum = 5 + "10".int64() # int64 + int64 = int64 (explicit conversion)
+```
diff --git a/internal/bloblang2/spec/03_expressions.md b/internal/bloblang2/spec/03_expressions.md
new file mode 100644
index 000000000..1291977a4
--- /dev/null
+++ b/internal/bloblang2/spec/03_expressions.md
@@ -0,0 +1,502 @@
+# 3. Expressions & Statements
+
+## 3.1 Path Expressions
+
+Access nested data: `input.user.email`, `output.result.id`
+
+**Path roots:**
+- **Top-level (in assignments):** `input`, `output`, or `$variable` only
+- **In expressions within maps/lambdas:** Parameters available as bare identifiers (e.g., `data.field` where `data` is a parameter). Parameters are **read-only** and can only appear in expressions, never as assignment targets.
+- **Match with `as`:** Bound variable available as bare identifier in expressions (e.g., `match input as x { x.field ... }`)
+
+**Metadata:** `input@.key`, `output@.key`
+
+**Important:** Bare identifiers (parameters and match bindings) are read-only and can only be used in expressions on the right-hand side. They cannot be assigned to.
+
+**Name resolution:** Every bare identifier in an expression must resolve to a bound name: a map parameter, lambda parameter, match `as` binding, map name, or standard library function name. Namespace-qualified references (`namespace::name`) also resolve to maps from imported modules. An unresolved bare identifier is a **compile-time error**.
This catches typos like `inpt.field` (instead of `input.field`) at compile time rather than allowing them to parse and fail later. Resolution priority (innermost wins): parameters > maps > standard library functions. User-defined maps shadow standard library functions of the same name. + +**Map and function name references:** When a bare identifier or qualified name resolves to a map or standard library function, it is only valid in two contexts: (1) as a call with parentheses — `double(x)`, `math::double(x)` — or (2) as an argument to a higher-order method — `.map(double)`, `.filter(math::is_positive)`. Using a map or function name as a general-purpose expression (e.g., `output.x = double`, `$fn = uuid_v4`) is a **compile-time error**. See Section 5.5 for details. + +**Field names:** Keywords are valid as field names without quoting — `input.map`, `output.if`, `data.match` all work. Use `."quoted"` for fields with special characters, spaces, or names starting with digits: +```bloblang +input.map # Valid: keyword as field name +input."field with spaces" # Quoting needed: spaces +output."special-chars" # Quoting needed: contains hyphen +data."123" # Quoting needed: starts with digit +``` + +### Indexing + +```bloblang +input.items[0] # Array: first element +input.items[-1] # Array: last element +input.items[-2] # Array: second-to-last element +input["field"] # Object: dynamic field access +input[$var] # Object: dynamic field access with variable +input.name[0] # String: first codepoint as int64 (Unicode codepoint value) +input.name[-1] # String: last codepoint as int64 +input.data[0] # Bytes: first byte as int64 (0-255) +input.data[-1] # Bytes: last byte as int64 +``` + +**Negative indexing:** For arrays, strings, and bytes, negative indices count from the end: `-1` is last, `-2` is second-to-last, etc. Out-of-bounds negative indices throw errors. + +**Semantics:** +- **Objects:** Indexed by string, returns field value (dynamic field access). 
Non-string indices are an error (no implicit conversion).
+- **Arrays:** Indexed by number, returns element at position. The index value must be a whole number — float values like `2.0` are accepted but `1.5` is a runtime error. Non-numeric indices are an error.
+- **Strings:** Indexed by number (codepoint position), returns int64 (Unicode codepoint value). The same whole-number requirement applies. Negative indices count from the end. Use `.char()` to convert back to a string.
+- **Bytes:** Indexed by number (byte position), returns int64 (byte value 0-255). The same whole-number requirement applies. Negative indices count from the end.
+- **All other types** (bool, numeric, null, timestamp): indexing is a runtime error.
+
+**String indexing is codepoint-based, not grapheme-based:**
+```bloblang
+# Simple characters (1 codepoint each)
+"hello"[0] # 104 (int64: codepoint for 'h')
+"café"[3] # 233 (int64: codepoint for 'é')
+
+# Emoji (1 codepoint)
+"😀"[0] # 128512 (int64: codepoint for 😀)
+
+# Complex graphemes (multiple codepoints)
+"👋🏽"[0] # 128075 (int64: base emoji 👋)
+"👋🏽"[1] # 127997 (int64: skin tone modifier 🏽, U+1F3FD)
+
+# Family emoji with ZWJ (zero-width joiners)
+"👨‍👩‍👧‍👦"[0] # 128104 (int64: man 👨)
+"👨‍👩‍👧‍👦"[1] # 8205 (int64: zero-width joiner)
+
+# Round-trip: index to codepoint, .char() back to string
+"hello"[0].char() # "h"
+"café"[3].char() # "é"
+```
+
+**All string operations are codepoint-based:**
+```bloblang
+"hello".length() # 5 (codepoints)
+"👋🏽".length() # 2 (codepoints: base emoji + skin tone modifier)
+"café".length() # 4 (codepoints)
+```
+
+**Byte operations are byte-based:**
+```bloblang
+"hello".bytes()[0] # 104 (int64: byte value of 'h')
+"hello".bytes().length() # 5 (bytes)
+"👋".bytes().length() # 4 (UTF-8 encoding uses 4 bytes)
+```
+
+Out-of-bounds indexing throws an error. Use `.catch(err -> ...)` for safety.
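The string indexing rules above (codepoint positions, negative indices counting from the end, an error when out of bounds) can be sketched in Go by decoding the string into runes first. `codepointAt` is a hypothetical helper name, and for brevity the sketch takes an `int` index rather than modelling the whole-number check on float indices.

```go
package main

import "fmt"

// codepointAt (hypothetical helper) returns the Unicode codepoint at the
// given codepoint position. Negative positions count from the end, and
// positions outside the string produce an error.
func codepointAt(s string, idx int) (int64, error) {
	runes := []rune(s) // decode UTF-8 into codepoints
	if idx < 0 {
		idx += len(runes)
	}
	if idx < 0 || idx >= len(runes) {
		return 0, fmt.Errorf("index out of bounds for string of %d codepoints", len(runes))
	}
	return int64(runes[idx]), nil
}

func main() {
	fmt.Println(codepointAt("hello", 0))    // 104 <nil>
	fmt.Println(codepointAt("caf\u00e9", 3)) // 233 <nil>
	fmt.Println(codepointAt("hello", -1))   // 111 <nil>
	fmt.Println(len("\U0001F44B"))          // 4: byte length of the UTF-8 encoding
}
```

Decoding to a rune slice costs a pass over the whole string per lookup; it keeps the sketch short, but an implementation that indexes strings often would likely want something cheaper.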
+ +### Null-Safe Navigation + +```bloblang +input.user?.address?.city # null if any part is null +input.items?[0]?.name # null-safe indexing + +# Mix with .or() for defaults +input.contact?.email.or("no-email@example.com") +``` + +**Null-safe method calls:** `?.` also works before method calls. If the receiver is null, the method is not called and `null` is returned. Arguments are not evaluated. +```bloblang +input.value?.uppercase() # null if value is null (method not called) +input.user?.name?.trim() # chains null-safe field access and null-safe method call +``` + +**Note:** `?.`, `?[]`, and `?.method()` only short-circuit on `null` values. Type errors (e.g., accessing a field on a string, or calling a string method on a number) still throw errors: +```bloblang +null?.uppercase() # null (short-circuited: value is null) +5?.uppercase() # ERROR: uppercase requires string (value is not null, wrong type) +"hello"?.nonfield?.trim() # ERROR: cannot access field on string (not null, wrong type) +``` + +## 3.2 Operators + +**Precedence** (high to low): +1. Field access, indexing, method calls: `.`, `?.`, `[]`, `?[]`, `.method()`, `?.method()` +2. Unary: `!`, `-` +3. Multiplicative: `*`, `/`, `%` +4. Additive: `+`, `-` +5. Comparison: `>`, `>=`, `<`, `<=` +6. Equality: `==`, `!=` +7. Logical AND: `&&` +8. Logical OR: `||` + +**Associativity:** +- **Left-associative:** Arithmetic (`+`, `-`, `*`, `/`, `%`), Logical (`&&`, `||`) +- **Non-associative:** Comparison (`>`, `>=`, `<`, `<=`), Equality (`==`, `!=`) + +**Lambda arrow (`->`):** The `->` token is not a binary operator and does not participate in the precedence hierarchy. A lambda is recognized by its distinct prefix — `identifier ->` or `(params) ->` — which the parser can identify before any precedence comparison. The arrow then consumes the entire right-hand side as the lambda body. For example, `x -> x + 1` parses as `x -> (x + 1)`. 
+ +Lambdas are **not** general expressions — they appear only as arguments to function and method calls (positional or named). Using a lambda in any other expression position — assignment RHS (`$fn = x -> x`), collection literal element (`[x -> x]`), operator operand (`5 + x -> x * 2`), etc. — is a **parse error**. See Section 10 for the grammar. + +```bloblang +# Precedence examples +output.calc = input.a + input.b * 2 # * before + +output.check = input.x > 10 && input.y < 20 # > before && +output.neg = -input.value # Unary minus +output.not = !input.flag # Logical not + +# Precedence trap: method calls bind tighter than unary minus +# -10.string() # ERROR: parses as -(10.string()) = -("10") +output.neg_str = (-10).string() # OK: "-10" + +# Associativity examples (left-associative) +output.result = 10 - 5 - 2 # (10 - 5) - 2 = 3 +output.result = 20 / 4 / 2 # (20 / 4) / 2 = 2.5 +output.result = a && b && c # (a && b) && c + +# Non-associative (must use parentheses) +output.invalid = a < b < c # ERROR: cannot chain comparisons +output.valid = a < b && b < c # OK: explicit logical combination +``` + +**Non-associative operator enforcement:** Chaining non-associative operators (e.g., `a < b < c`, `a == b == c`) is a parse error. Implementations must detect and reject such expressions during parsing rather than allowing them to fail at runtime. 
+ +## 3.3 Functions & Methods + +**Functions** (standalone): +```bloblang +output.id = uuid_v4() +output.time = now() +output.roll = random_int(1, 6) +``` + +**Named arguments:** Functions and user maps support named arguments: +```bloblang +# Positional (order matters) +output.result = some_function(arg1, arg2, arg3) + +# Named (order doesn't matter) +output.result = some_function(param1: arg1, param2: arg2, param3: arg3) +output.result = some_function(param3: arg3, param1: arg1, param2: arg2) + +# Cannot mix positional and named +output.result = some_function(arg1, param2: arg2) # ERROR + +# Duplicate named arguments are a compile-time error +output.result = some_function(param1: arg1, param1: arg2) # ERROR +``` + +**Methods** (chained): +```bloblang +output.upper = input.text.uppercase() +output.len = input.items.length() +output.parsed = input.date.ts_parse("%Y-%m-%d") +output.parsed = input.date.ts_parse(format: "%Y-%m-%d") # Named +``` + +**Method Chaining:** +```bloblang +output.result = input.text + .trim() + .lowercase() + .replace_all(" ", "-") +``` + +**Method resolution:** Method names are resolved at compile time against the set of known methods (standard library + implementation extensions). Calling an unknown method is a **compile-time error**. Type compatibility between the receiver and the method is checked at **runtime** (since types are dynamic). + +```bloblang +input.value.nonexistent() # Compile-time error: unknown method +input.value.uppercase() # OK at compile time; runtime error if value is not a string +``` + +**Type requirements:** Methods work on specific types. Calling a method on an incompatible type (including null) results in a runtime error. 
Use null-safe operators to skip methods when values might be null: +```bloblang +input.value?.uppercase() # Skip method if value is null +input.value.uppercase() # ERROR if value is null (uppercase requires string) +input.value.type() # Works on any type including null +``` + +## 3.4 Lambda Expressions + +Lambdas are inline syntax for method and function call arguments — they are not values that can be stored in variables, embedded in collection literals, or used as operator operands. The grammar (Section 10) permits a lambda only in the `arg_value` position of positional or named arguments; any other position is a **parse error**. For example, `$fn = x -> x * 2`, `[x -> x]`, `{"a": x -> x}`, and `5 + x -> x * 2` are all parse errors. For reusable transforms, use named maps (Section 5). + +Whether a call actually accepts a lambda at a given argument position is a separate semantic check. Most standard library methods that accept a lambda document it explicitly (e.g., `.map()`, `.filter()`, `.sort_by()`). Passing a lambda to a method or function whose signature does not accept one (e.g., `.or(x -> x)`, `some_map(x -> x)`) is a **compile error**. + +Lambda parameters are **read-only** and available as bare identifiers within the lambda body. + +**Single parameter:** +```bloblang +input.items.map(item -> item.value * 2) # 'item' is read-only parameter +input.items.filter(x -> x > 10) +input.data.map_values(v -> v.uppercase()) +``` + +**Multiple parameters:** +```bloblang +input.items.fold(0, (acc, item) -> acc + item.value) # Parameter list is parenthesized +``` + +**Multi-statement body:** +```bloblang +input.items.map(item -> { + $base = item.price * item.quantity + $tax = $base * 0.1 + $base + $tax +}) +``` + +Lambda blocks must end with an expression (the return value). Statement-only blocks are invalid.
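
A minimal sketch of the rule that a lambda block must end with an expression, assuming hypothetical `price` and `quantity` fields on each element of `input.items`:

```bloblang
# Valid: the final line is an expression, so it becomes the lambda's return value
output.totals = input.items.map(item -> {
  $net = item.price * item.quantity
  $net
})

# Invalid: the block contains only a statement and returns nothing
# output.totals = input.items.map(item -> {
#   $net = item.price * item.quantity
# })
```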
+ +**Discard parameters (`_`):** +```bloblang +# Discard unused parameters with _ +input.data.map_entries((_, v) -> v * 2) # Discard key, use value +input.items.fold(0, (_, elem) -> elem.value) # Discard accumulator + +# Multiple discards allowed +input.data.map_entries((_, _) -> "constant") +``` + +`_` as a parameter means "this argument is required by the call signature but unused." It is not bound — referencing `_` in the body is a compile error (it remains a keyword). Multiple `_` parameters are allowed in the same parameter list. + +**Default parameters:** Parameters with defaults must come after all required parameters. Default values must be literals (`42`, `"hello"`, `true`, `false`, `null`). Discard parameters (`_`) cannot have defaults. Default parameters follow the same rules as in maps (Section 5.1). + +**Parameter shadowing:** Lambda parameter names shadow any map names with the same name within the lambda body. The parameter always wins. Imported namespaces are not affected since they use `::` syntax. +```bloblang +map double(x) { x * 2 } +input.items.map(double -> double * 2) # double is the parameter, not the map +``` + +**Purity:** Lambdas cannot assign to `output` or `output@` (no side effects). Because lambda bodies are expression contexts (Section 3.8), any assignment to a variable name from an outer scope creates a new shadow binding in the lambda scope, leaving the outer variable unchanged. + +**Context inheritance:** Lambdas inherit the read permissions of their enclosing context. A top-level lambda can read `input` and `output`; a lambda inside a map body cannot (maps are isolated — Section 5.3). See Section 7.5 for the full scoping rules. + +## 3.5 Conditional Expressions + +If and match can be used as expressions (returning a value) or as statements (containing assignments). See Section 4 for full semantics including void behavior, match forms, and the expression/statement distinction. 
+ +```bloblang +# If expression +output.category = if input.score >= 80 { "high" } else { "low" } + +# If expression with else-if +output.tier = if input.score >= 90 { "gold" } else if input.score >= 50 { "silver" } else { "bronze" } + +# Match: equality, boolean with 'as', boolean without expression +output.sound = match input.animal { "cat" => "meow", _ => "unknown" } +output.tier = match input.score as s { s >= 100 => "gold", _ => "other" } +output.grade = match { input.score >= 90 => "A", _ => "F" } +``` + +Conditional expressions cannot assign to `output` or `output@`. + +## 3.6 Literals + +**Strings:** + +Regular strings use double quotes with backslash escape sequences: +```bloblang +"hello world" +"line one\nline two" # \n newline +"tab\there" # \t tab +"quote: \"hi\"" # \" escaped quote +"backslash: \\" # \\ literal backslash +``` + +Escape sequences: `\\`, `\"`, `\n`, `\t`, `\r`, `\uXXXX` (4-digit Unicode codepoint, BMP only), `\u{X...}` (1–6 hex digit Unicode codepoint, any plane). Examples: `\u0041` for 'A', `\u{1F600}` for '😀', `\u{41}` for 'A'. + +Raw strings use backticks. No escape processing — content is used as-is: +```bloblang +`This is a raw string. +It can contain "quotes" without escaping. 
+Backslashes are literal: C:\path\to\file +Newlines are preserved as-is.` +``` + +Raw string rules: +- Content between backticks is taken strictly verbatim (no escape sequences, no stripping) +- All characters between the backticks are included, including any leading or trailing newlines +- Cannot contain a literal backtick character (use a regular double-quoted string instead — backticks do not need escaping in regular strings) + +**Arrays:** (trailing commas are permitted) +```bloblang +[1, 2, 3] +["a", input.field, uuid_v4()] +``` + +**Objects:** (trailing commas are permitted) +```bloblang +{"name": "Alice", "age": 30} +{"id": input.id, "timestamp": now()} +{"field with spaces": "value"} + +# Keys can be expressions (must evaluate to string) +{$key: $value} # OK if $key is string, ERROR otherwise +{"prefix_" + input.type: input.value} # OK: concatenation yields string +{input.field_name: input.field_value} # OK if field_name is string, ERROR otherwise +{input.count.string(): input.value} # Explicit conversion to string +``` + +**Key type requirement:** Object keys must be strings. If a key expression evaluates to a non-string type (number, boolean, null, etc.), a runtime error occurs. Use `.string()` for explicit conversion. 
+ +## 3.7 Statements + +**Assignment:** +```bloblang +output.field = expression +output.user.id = input.id # Creates nested structure +output."special.field" = value # Quoted field (dot required) +output."field with spaces" = value # Spaces in field name +``` + +**Auto-creation of intermediate structures:** Assigning to a nested path automatically creates intermediate objects (and arrays when using index syntax) as needed: +```bloblang +# output starts as {} +output.user.address.city = "London" +# output is now {"user": {"address": {"city": "London"}}} + +# Array auto-creation with index syntax +output.items[0].name = "first" +# output.items created as array, output.items[0] created as object + +# Dynamic index: auto-creation type determined by index type at runtime +$key = "name" +output.data[$key] = "Alice" # $key is string → output.data created as object +$idx = 0 +output.list[$idx] = "first" # $idx is int → output.list created as array + +# Collision with non-object/non-array value is an error +output.user = "Alice" +output.user.name = "Alice" # ERROR: output.user is a string, not an object +``` + +**Array index gaps:** Assigning to an index beyond the current length of an array fills intermediate indices with `null`: +```bloblang +$arr = [10, 20] +$arr[5] = 30 # [10, 20, null, null, null, 30] +output.items[2] = "x" # output.items is [null, null, "x"] +``` + +**Variable Declaration:** +```bloblang +$user_id = input.user.id +$name = input.name.uppercase() +``` + +Variables are **mutable** and can be reassigned: +```bloblang +$count = 0 +$count = $count + 1 +$count = $count * 2 +``` + +**Variable path assignment:** Variables support field and index assignment with the same semantics as `output`, including auto-creation of intermediate structures: +```bloblang +$user = {"name": "Alice"} +$user.name = "Bob" # Deep mutation: {"name": "Bob"} +$user.address.city = "London" # Auto-creates intermediates +$user.tags[0] = "admin" # Index assignment +$user.address = 
deleted() # Removes the field from the object inside $user + +$val = "hello" +$val.field = "x" # ERROR: cannot assign field on string +``` + +**Path assignment to an undeclared variable is a declaration.** If the variable does not yet exist in the current scope, the first path assignment introduces it and auto-creates its root value using the same type-inference rules as `output` auto-creation (Section 3.7): a field component (`.field`) creates an object, a numeric index (`[0]`, `[$idx]` where `$idx` is numeric) creates an array, and a string-valued dynamic index creates an object. This is a new declaration in the current scope — subsequent statements in the same scope refer to the same variable. + +```bloblang +# Undeclared $user — path assignment auto-creates $user as {} +$user.name = "Alice" # $user is {"name": "Alice"} +$user.address.city = "London" # $user is {"name": "Alice", "address": {"city": "London"}} +output.u = $user + +# Undeclared $arr — numeric index auto-creates $arr as [] +$arr[0] = "first" # $arr is ["first"] +$arr[2] = "third" # $arr is ["first", null, "third"] + +# Type collision still errors when the variable already exists with an incompatible type +$val = "hello" +$val.field = "x" # ERROR: cannot assign field on string +``` + +Assigning a value to a variable always creates a logical copy, regardless of source (`input`, `output`, or another variable). Mutations to the variable never affect the original, and vice versa: +```bloblang +$data = input.record +$data.status = "processed" # Mutates $data only; input unchanged + +$snap = output.user +output.user.name = "changed" # $snap unaffected +``` + +Variable path assignment (`$var.field = expr`) is available in all contexts — both statement contexts (top-level, if/match statement bodies) and expression contexts (if/match expressions, lambda bodies, map bodies). In expression contexts, only variable assignments are allowed (no `output` assignments). 
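
Because lambda bodies are expression contexts, variable path assignment can be used to build up a result inside one. The following is a sketch only, assuming hypothetical `id` and `name` fields on each element of `input.users`:

```bloblang
output.users = input.users.map(u -> {
  $summary = {"id": u.id}             # Declared in the lambda's own scope
  $summary.name = u.name.uppercase()  # Variable path assignment: allowed here
  # output.count = 1                  # Not allowed: no output assignments in a lambda body
  $summary                            # Final expression is the return value
})
```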
+ +Variables are **block-scoped** with shadowing support (inner blocks can declare new variables with the same name). + +**Metadata Assignment:** +```bloblang +output@.kafka_topic = "processed" +output@.kafka_key = input.id +``` + +**Deletion:** +```bloblang +output.password = deleted() # Remove field +output = deleted() # Drop message, exit mapping +``` + +## 3.8 Variable Scope & Shadowing + +Variables are block-scoped with different rules for **statement** and **expression** contexts: + +**In expression contexts** (if/match expressions, lambda bodies, map bodies): assigning to a variable name that exists in an outer scope **shadows** it — a new variable is created in the inner scope, and the outer variable is unchanged. This preserves the functional, side-effect-free nature of expressions. + +```bloblang +$x = 1 + +output.result = if true { + $x = 3 # Shadowing: NEW variable in inner scope + $x # 3 +} + +output.outer = $x # Still 1 (inner $x doesn't affect outer) +``` + +**In statement contexts** (if/match statements at top-level): assigning to a variable that exists in an outer scope **modifies** it. New variables declared inside a statement block are **block-scoped** — they are not visible in the outer scope. + +```bloblang +$count = 0 +if input.flag { + $count = 1 # Modifies outer $count (already exists) + $temp = "found" # Block-scoped: only visible inside this block +} +output.count = $count # 1 if flag was true, 0 if false +output.temp = $temp # Compile-time error: $temp does not exist +``` + +To use a variable after a conditional, pre-declare it: +```bloblang +$temp = null +if input.flag { + $temp = "found" # Modifies outer $temp (already exists) +} +output.temp = $temp # OK: null or "found" +``` + +**Reassignment at the same scope level:** Assigning to a variable that was declared in the *same* scope is always reassignment (mutation), not shadowing — this applies in both statement and expression contexts. 
Shadowing only occurs when an inner scope references a variable from an outer scope. +```bloblang +$x = 1 +$x = 2 # Reassignment: same variable, now has value 2 +output.a = $x # 2 + +output.b = if true { + $a = 1 + $a = 2 # Reassignment within the same expression body (not shadowing) + $a # 2 +} +``` + +**Rationale:** Bloblang is mostly functional, but if/match statements are an intentional imperative escape hatch — they can assign to `output` and modify existing outer variables. New variable declarations are always block-scoped in both statement and expression contexts. The key difference: in statement contexts, assigning to an *existing* outer variable modifies it; in expression contexts, it shadows (creates a new inner variable). Neither context leaks new variables to the outer scope. + +## 3.9 Statements vs Expressions + +**Statements** (cause side effects): +- Assignments: `output.field = value`, `output@.key = value` +- Variable declarations: `$var = value` +- If/match statements (with multiple assignments) + +**Expressions** (return values): +- All operators, function calls, method chains +- If/match expressions (return single value) +- Lambdas + +**Rule:** Expressions cannot contain assignments to `output` or `output@`. diff --git a/internal/bloblang2/spec/04_control_flow.md b/internal/bloblang2/spec/04_control_flow.md new file mode 100644 index 000000000..dfb7cefdd --- /dev/null +++ b/internal/bloblang2/spec/04_control_flow.md @@ -0,0 +1,373 @@ +# 4. 
Control Flow + +## 4.1 If Expressions vs Statements + +**If Expression** (returns value, used in assignment): +```bloblang +output.result = if condition { value } else { other_value } + +# Without else: assignment doesn't execute if condition false +output.category = if input.score > 80 { "high" } + +# Else-if chains +output.tier = if input.score >= 90 { + "gold" +} else if input.score >= 50 { + "silver" +} else { + "bronze" +} + +# Else-if without final else: void if no branch matches +output.tier = if input.score >= 90 { "gold" } else if input.score >= 50 { "silver" } +# void if score < 50 — assignment skipped +``` + +**If Statement** (standalone, contains output assignments): +```bloblang +if input.type == "user" { + output.role = "member" + output.permissions = ["read"] +} + +# Else-if statement chains +if input.type == "admin" { + output.role = "admin" + output.permissions = ["read", "write", "delete"] +} else if input.type == "mod" { + output.role = "moderator" + output.permissions = ["read", "write"] +} else { + output.role = "user" + output.permissions = ["read"] +} +``` + +**Distinction:** +- **Expression:** Used in assignment context, contains pure expressions (no `output`/`output@` assignments) +- **Statement:** Standalone, contains `output`/`output@` assignments, **cannot end with expression** (parse error). Empty statement bodies (e.g., `if condition { }`) are valid and are no-ops. 
+ +**Parsing disambiguation:** The syntactic context determines which form: +- **If statement:** Top-level in mapping, or inside another statement body (where `output` assignments are allowed) +- **If expression:** Inside assignment RHS, variable declarations, lambda bodies, map bodies, or expression contexts + +```bloblang +# Statement context (top-level) +if input.type == "user" { + output.role = "member" # Statement: assigns to output +} + +# Expression context (assignment RHS) +output.value = if input.flag { + input.value # Expression: returns value +} + +# Expression context (variable declaration) +$result = if input.score > 80 { "high" } else { "low" } + +# Expression context (lambda body) +input.items.map(x -> if x > 0 { x * 2 } else { 0 }) + +# ERROR: Statement body cannot end with expression +if input.flag { + $x = 10 + $x + 5 # Parse error: trailing expression in statement context +} +``` + +**If expressions without `else`:** When the condition is false, the expression produces **void** — the absence of a value. No value is produced at all. Void is only meaningful in assignments (where it causes a no-op); in most other contexts it is an error (see summary table below). + +**Void in assignments:** The assignment does not execute. The target field is neither created nor modified. + +**Case 1: No prior assignment** +```bloblang +output.category = if input.score > 80 { "high" } +# If score <= 80: void, assignment skipped, field doesn't exist +# Reading output.category returns null (field is absent) +# JSON output: field omitted entirely +``` + +**Case 2: Has prior assignment** +```bloblang +output.status = "pending" +output.status = if false { "override" } # Void: assignment skipped +# output.status keeps its existing value: "pending" +# Reading output.status returns "pending" (not null!) 
+# JSON output: {"status": "pending"} +``` + +**Case 3: Explicit null vs non-existent** +```bloblang +output.field1 = null # Field exists with null value +output.field2 = if false { "value" } # Void: field doesn't exist (no prior assignment) +# field1 reads as null, field2 reads as null - but differ structurally +# JSON output: {"field1": null} (field2 omitted) +``` + +**Void in collection literals (array and object):** Void is an **error** in collection literals. Use `deleted()` to conditionally omit elements/fields, or add an `else` branch to provide a value in all cases. +```bloblang +# Arrays +output.items = [1, if false { 2 }, 3] # ERROR: void in array literal +output.items = [1, if false { 2 } else { deleted() }, 3] # OK: [1, 3] +output.items = [1, if false { 2 } else { 0 }, 3] # OK: [1, 0, 3] + +# Objects +output.user = { + "id": input.id, + "email": if input.verified { input.email } # ERROR: void in object literal +} +output.user = { + "id": input.id, + "email": if input.verified { input.email } else { deleted() } # OK: field omitted if not verified +} +``` + +**Void vs `deleted()`:** These are different concepts. Void means "no value was produced" — nothing happens. `deleted()` is an active deletion marker that removes existing fields and elements (see Section 9.2). Void is only meaningful in assignments (where it causes a no-op); in all other contexts it is an error. The distinction in assignments: +```bloblang +output.status = "pending" +output.status = if false { "override" } # Void: keeps "pending" (no-op) +output.status = deleted() # Deleted: removes the field entirely +``` + +**Void in variable declarations:** A variable declaration (the first assignment to a name in a given scope) **cannot** have a void-producing expression as its right-hand side. If void reaches a variable declaration at runtime, it is a **runtime error**. This ensures every declared variable always has a value — there is no "uninitialized variable" state. 
+```bloblang +$x = if input.flag { 42 } # RUNTIME ERROR if input.flag is false (void) +$x = match input.x { "a" => 1 } # RUNTIME ERROR if no case matches (void) +$x = my_map(input) # RUNTIME ERROR if the map produces void + +$x = (if input.flag { 42 }).or(0) # OK: .or() rescues void, always produces a value +$x = if input.flag { 42 } else { 0 } # OK: else branch ensures a value +``` + +**Void in variable reassignment:** If a variable already exists and is reassigned a void expression, the assignment is skipped and the variable retains its prior value. +```bloblang +$x = 10 +$x = if false { 42 } # Void: assignment skipped, $x keeps its value +output.result = $x # 10 +``` + +**Void as a function/map argument:** Passing void as an argument is invalid and causes a runtime error (similar to `deleted()`). +```bloblang +map double(val) { val * 2 } +output.result = double(if false { 42 }) # ERROR: void argument +``` + +**Void in expression context:** If an operator encounters void as an operand, it causes an error. +```bloblang +output.result = (if false { 42 }) + 1 # ERROR: void in expression +output.flag = !(if false { true }) # ERROR: void in expression +``` + +**Void as a lambda return value:** Void propagates transparently out of a lambda — the lambda itself does not error. The consuming context then determines what happens: + +- **`map`**: void is an error — the lambda must return a value for every element. Use an explicit `else` branch to keep elements unchanged, or return `deleted()` to remove them. Extension methods may also support `deleted()` as a lambda return value. +- **`filter`**: requires a boolean — void is an error. +- Other methods that require a specific type will error if they receive void. 
+ +```bloblang +# map: void is an error, must always return a value +input.items.map(x -> if x > 0 { x * 2 } else { x }) # Positive doubled, others kept +input.items.map(x -> if x > 0 { x * 2 }) # ERROR when x <= 0: void +input.items.map(x -> if x > 0 { x } else { deleted() }) # Non-positive elements removed + +# filter requires a boolean: receiving void is an error +input.items.filter(x -> if x > 0 { true }) # ERROR when x <= 0: filter received void, not bool +input.items.filter(x -> if x > 0 { true } else { false }) # OK: always returns bool +``` + +**Void in match arms:** Match arms are transparent — void produced by a case arm flows out of the match expression and behaves exactly as it would from any other expression. In an assignment context, void causes the assignment to be skipped; in other contexts (collection literals, expressions, etc.) void is an error. +```bloblang +output.result = match input.x { + "a" => if false { "value" }, # Void: assignment skipped, prior value (if any) preserved + _ => "default", +} +``` + +**Sources of void:** Void is produced by an if-expression without a final `else` when no condition is true (including `else if` chains without a final `else`), by a match expression without `_` when no case matches (Section 4.2), by certain standard library methods when no result exists (e.g., `.find()` when no element matches — see Section 13.6), and by the `void()` builtin (Section 13.1) when the author wants to produce void explicitly. 
In all cases, void follows the same rules: + +**Summary of void behavior by context:** + +| Context | Behavior | +|---------|----------| +| Output field assignment (`output.x = void`) | Assignment skipped; prior value (if any) preserved | +| Root output assignment (`output = void`) | Assignment skipped; prior value preserved (`{}` if no prior assignment) | +| Variable declaration (`$x = void`) | Runtime error | +| Variable reassignment (`$x = void`, `$x` exists) | Assignment skipped; prior value preserved | +| Collection literal (`[1, void, 3]`) | Error | +| Object literal (`{"a": void}`) | Error | +| Function/map argument (`f(void)`) | Error | +| `map` lambda return | Error (value required) | +| `filter` lambda return | Error (boolean required) | +| Other lambda return | Propagates to consuming context | +| `.catch()` receiver (`void.catch(...)`) | Void passes through (catch not triggered — void is not an error) | +| `.or()` receiver (`void.or(x)`) | Returns `x` (void rescued) | +| Other method call (`void.type()`) | Error | +| Expression operand (`void + 1`) | Error | + +**Note:** `.or()` also rescues `deleted()` — see Section 8.3. + +## 4.2 Match Expressions vs Statements + +**Match Expression** (returns value): +```bloblang +output.sound = match input.animal { + "cat" => "meow", + "dog" => "woof", + _ => "unknown", +} +``` + +**Exhaustiveness:** Match expressions and statements are **not required** to be exhaustive. If no case matches, the match produces **void** — exactly like an if-expression without `else`. 
The void behavior follows the same rules as Section 4.1: + +- **Match expression** (in assignment): void causes the assignment to be skipped (no-op) +- **Match statement**: no-op (no side effects, execution continues) +- **Match in collection literal**: void is an error (use `_` or `deleted()`) + +```bloblang +# Not exhaustive - void if animal is "bird" (assignment skipped) +output.sound = match input.animal { + "cat" => "meow", + "dog" => "woof", +} + +# Exhaustive - always produces a value +output.sound = match input.animal { + "cat" => "meow", + "dog" => "woof", + _ => "unknown", # Catch all other values +} +``` + +**Match Statement** (multiple assignments): +```bloblang +match input.type() { + "object" => { + output = input.map_values(v -> transform(v)) + }, + "array" => { + output = input.map(elem -> transform(elem)) + }, + _ => { + output = input + }, +} +``` + +**Parsing disambiguation:** Like `if`, the syntactic context determines statement vs expression form. Match statements are only valid at top-level or inside other statement bodies. + +**Case body syntax:** Expression match cases allow either a bare expression or a braced body (`"cat" => "meow"` or `"cat" => { $x = "me"; $x + "ow" }`). Statement match cases always require braces (`"cat" => { output.sound = "meow" }`) because braces are needed to delimit the statement body. Empty statement case bodies (e.g., `"ignore" => { }`) are valid and are no-ops. + +### Three Match Forms + +**1. Equality match (`match expr { value => ... }`):** The matched expression is evaluated **once**, then each case value is compared against it using equality (`==`). Cases are evaluated in order; the first case that matches is selected and subsequent case expressions are not evaluated. Case expressions are ordinary expressions with the same scope access as the surrounding context (variables, `input`, `output`, etc. as appropriate). 
If a case expression evaluates to a **boolean**, an error is thrown — this catches the common mistake of writing conditions in equality match instead of using `as`. Use `if`/`else` to match against boolean values directly. Boolean literals (`true`, `false`) as case expressions **must** be rejected at compile time. Cases involving dynamic values that happen to be boolean at runtime are runtime errors. + +```bloblang +output.sound = match input.animal { + "cat" => "meow", + "dog" => "woof", + _ => "unknown", +} + +# Equivalent to: +output.sound = match input.animal as a { + a == "cat" => "meow", + a == "dog" => "woof", + _ => "unknown", +} + +# Boolean case values are an error in equality match: +output.tier = match input.score { + input.score >= 100 => "gold", # ERROR: case evaluated to boolean in equality match +} +# Fix: use 'as' for boolean conditions +output.tier = match input.score as s { + s >= 100 => "gold", + _ => "other", +} + +# Note: this also means you cannot equality-match on boolean values, +# since the case literals true/false are themselves booleans: +output.label = match input.flag { + true => "yes", # ERROR: case evaluated to boolean in equality match + false => "no", +} +# Fix: use if/else for boolean values +output.label = if input.flag { "yes" } else { "no" } +``` + +**Rationale for the boolean restriction:** In equality match, a case like `input.score >= 100` is almost always a mistake — the user meant to use `as` for boolean conditions, not compare the matched value against `true`/`false`. Rejecting boolean cases catches this common error. The trade-off is that you cannot equality-match on boolean values (`match input.flag { true => ..., false => ... }`). This is intentional: `if`/`else` handles the boolean case more clearly, and multi-way dispatch on a value that could be `true`, `false`, or a non-boolean is better expressed with `match ... as` or `if`/`else if`/`else` chains. 
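
One way to sketch the multi-way dispatch mentioned above is `match ... as` with explicit comparisons. This assumes that `==` between values of different types evaluates to `false` rather than raising an error (the equality rules are defined in the type system chapter, not here), and `input.flag` is a placeholder field:

```bloblang
output.label = match input.flag as f {
  f == true  => "yes",
  f == false => "no",
  _          => "not a boolean",
}
```

Each case is a comparison expression rather than a boolean literal, so the compile-time rejection of literal `true`/`false` cases in equality match does not apply.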
+ +**Compile-time vs runtime detection:** Boolean literals (`true`, `false`) as case expressions are caught at compile time as a convenience — their type is statically known. Dynamic expressions whose type is not known until runtime (e.g., `match x { $var => ... }` where `$var` happens to be boolean) produce the same error at runtime. In both cases, the fix is the same: use `match ... as` for boolean conditions, or `if`/`else` for boolean dispatch. The split in error timing reflects what the compiler can prove statically, not a semantic difference. + +**The runtime boolean-case check is lazy.** It only applies to cases that are actually evaluated. Cases in an equality match are evaluated in order, and the first one that matches selects the arm; subsequent case expressions are not evaluated and therefore do not trigger the boolean-case error. A later case that would have been boolean is only a runtime error if execution reaches it. The compile-time rejection of boolean *literals* is unaffected — literal `true`/`false` cases are rejected regardless of whether they would be reached at runtime. + +```bloblang +$b = true +output.v = match "x" { + "x" => "hit", # matches; subsequent cases are not evaluated + $b => "boom", # never evaluated — no error thrown at runtime +} +# Result: output.v == "hit" + +output.w = match "y" { + "x" => "hit", # does not match + $b => "boom", # evaluated — runtime error (boolean case in equality match) +} +``` + +**2. Boolean match with `as` (`match expr as x { bool => ... }`):** The matched expression is evaluated **once** and bound to the variable. The `as` binding is available in case conditions, result expressions, and statement bodies (for match statements). It is block-scoped to the match — it cannot be referenced after the match closes. Each case must be a **boolean expression** (evaluated in order, first `true` wins). If a case evaluates to a non-boolean value, an error is thrown. 
The wildcard `_` is exempt from this requirement — it always matches unconditionally. + +```bloblang +output.tier = match input.score as s { + s >= 100 => "gold", + s >= 50 => "silver", + _ => "bronze", +} +``` + +Use `as` when you need range checks or complex conditions against the matched value. + +**3. Boolean match (`match { bool => ... }`):** No matched expression. Each case must be a **boolean expression**. Cases are evaluated in order, and the first one that yields `true` is selected. If a case evaluates to a non-boolean value, an error is thrown. The wildcard `_` is exempt — it always matches unconditionally. + +```bloblang +output.category = match { + input.score >= 90 => "A", + input.score >= 80 => "B", + input.score >= 70 => "C", + _ => "F", +} +``` + +**Key distinction:** Without `as`, case values are compared by equality against the matched expression (and boolean case values are an error). With `as`, case expressions must be booleans. + +**Wildcard `_`:** In all three match forms, `_` is an unconditional catch-all — it always matches regardless of context. In equality match it matches any value; in boolean match forms it is not evaluated as a boolean expression but simply matches unconditionally. `_` is a syntactic form, not an expression — it can only appear as a match case pattern, not in arbitrary expression positions. + +**Non-exhaustive match:** If no case matches and there is no `_` catch-all, match *expressions* produce void (see Section 4.1) and match *statements* are no-ops (no assignments are executed). The `_` wildcard is the only catch-all mechanism — there is no `else` keyword for match. 
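
The two non-exhaustive outcomes can be seen side by side in the sketch below. It relies on `.or()` rescuing void, as listed in the summary table in Section 4.1. The field names and the topic value are placeholders.

```bloblang
# Match expression: void when nothing matches, rescued here by .or()
output.sound = (match input.animal {
  "cat" => "meow",
  "dog" => "woof",
}).or("unknown")

# Match statement: a no-op when nothing matches, so output@ is left untouched
match input.kind {
  "audit" => { output@.kafka_topic = "audit" },
}
```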
+ +## 4.3 Block-Scoped Variables + +```bloblang +output.processed = if input.has_discount { + $rate = input.discount_rate.or(0.10) + $base = input.price + $base * (1 - $rate) +} else { + input.price +} + +output.formatted = match input.currency { + "USD" => { + $symbol = "$" + $amount = input.amount.round(2) + $symbol + $amount.string() + }, + "EUR" => { + $amount = input.amount.round(2) + $amount.string() + " EUR" + }, + _ => { + $amount = input.amount.round(2) + input.currency + " " + $amount.string() + }, +} +``` diff --git a/internal/bloblang2/spec/05_maps.md b/internal/bloblang2/spec/05_maps.md new file mode 100644 index 000000000..2044ec980 --- /dev/null +++ b/internal/bloblang2/spec/05_maps.md @@ -0,0 +1,206 @@ +# 5. Maps (User-Defined Functions) + +Isolated, reusable transformations called as functions. + +## 5.1 Syntax + +```bloblang +# Zero parameters (useful for common structures/macros) +map default_headers() { + {"content_type": "application/json", "version": "2.0"} +} + +# Single parameter +map name(parameter) { + # optional variable declarations + # final expression (return value) + expression +} + +# Multiple parameters +map calculate(x, y, z) { + x + y * z +} + +# Default parameters (must come after required parameters) +map format_price(amount, currency = "USD", decimals = 2) { + currency + " " + amount.round(decimals).string() +} + +# Discard parameters (match a call signature, ignore unused args) +map handle_event(_, _, payload) { + payload.uppercase() +} + +# Invocation +output.headers = default_headers() +output.result = name(input.data) +output.calc = calculate(1, 2, 3) + +# Invocation - named arguments (for maps with parameters) +output.result = name(parameter: input.data) +output.calc = calculate(x: 1, y: 2, z: 3) + +# Using defaults — positional (trailing optional args omitted) +output.price = format_price(99.99) # "USD 99.99" +output.price = format_price(99.99, "EUR") # "EUR 99.99" +output.price = format_price(99.99, "EUR", 0) # "EUR 100" 
+ +# Using defaults — named (missing optional args use defaults) +output.price = format_price(amount: 99.99) # "USD 99.99" +output.price = format_price(amount: 99.99, decimals: 0) # "USD 100" +output.price = format_price(amount: 99.99, currency: "EUR") # "EUR 99.99" +``` + +Maps are **isolated functions**: they take zero or more parameters, optionally declare variables, and return a value. They cannot reference `input` or `output`. + +**Argument styles:** Functions can be called with positional or named arguments, but not both in the same call. + +**Default parameters:** Parameters may have default values (`param = literal`). Parameters with defaults must come after all required parameters. Default values must be literals (`42`, `"hello"`, `true`, `false`, `null`) — expressions, function calls, and references to other parameters are not allowed in defaults. **Note:** Since there is no timestamp literal syntax, timestamp defaults are not possible. Use `null` with `.or()` as a workaround: `map query(start = null) { $s = start.or(now()); ... }`. + +**Dynamic defaults pattern:** When a parameter's default depends on other parameters or on computed values, use `null` as a sentinel default and compute the actual value in the map body. This is the standard pattern for truly optional parameters in user-defined maps: + +```bloblang +map connect(host, port = null) { + $p = port.or(if host.has_prefix("https") { 443 } else { 80 }) + host + ":" + $p.string() +} + +connect("https://example.com") # "https://example.com:443" +connect("http://example.com") # "http://example.com:80" +connect("http://example.com", 8080) # "http://example.com:8080" +``` + +This pattern cannot distinguish "caller passed null explicitly" from "caller omitted the argument." If null is a meaningful value for a parameter, use a different sentinel or restructure the map. 
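+
+The argument-style and default-parameter rules above can be sketched as follows. This is an illustrative fragment only: the `scale` and `bad_default` maps and the output field names are hypothetical, and the exact error wording is left to the implementation.
+
+```bloblang
+# Hypothetical map: one required parameter, two literal defaults
+map scale(value, factor = 2, offset = 0) {
+  value * factor + offset
+}
+
+# Valid: all positional, or all named
+output.a = scale(10, 3)
+output.b = scale(value: 10, offset: 1)
+
+# Invalid: positional and named arguments mixed in one call
+output.c = scale(10, factor: 3)
+
+# Invalid: a default must be a literal, not an expression or another parameter
+map bad_default(x, y = x + 1) {
+  x + y
+}
+```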
+ +## 5.2 Examples + +**Single parameter:** +```bloblang +map extract_user(data) { + { + "id": data.user_id, + "name": data.full_name, + "email": data.email + } +} + +output.customer = extract_user(input.customer_data) +output.customer = extract_user(data: input.customer_data) # Named +``` + +**Multiple parameters:** +```bloblang +map format_price(amount, currency, decimals) { + currency + " " + amount.round(decimals).string() +} + +# Positional +output.price = format_price(99.99, "USD", 2) + +# Named +output.price = format_price(amount: 99.99, currency: "USD", decimals: 2) +``` + +**With variables:** +```bloblang +map calculate_total(subtotal, tax_rate) { + $tax = subtotal * tax_rate + subtotal + $tax +} + +output.total = calculate_total(100, 0.1) +output.total = calculate_total(subtotal: 100, tax_rate: 0.1) +``` + +**Recursion:** +```bloblang +map walk_tree(node) { + match node.type() { + "object" => node.map_values(v -> walk_tree(v)), + "array" => node.map(elem -> walk_tree(elem)), + "string" => node.uppercase(), + _ => node, + } +} + +output = walk_tree(input) +``` + +**Recursion limits:** Maximum recursion depth is implementation-defined. Implementations **must** support at least 1000 levels of recursion depth to ensure basic portability. Exceeding the recursion limit throws a runtime error that stops execution immediately and **cannot be caught** with `.catch()`. Mutual recursion (map A calls map B which calls map A) is valid — maps are hoisted (Section 7.7) — and shares the same depth limit. + +## 5.3 Parameter Semantics + +- **Maps are isolated** — they can only access their parameters and variables declared within the map body. They cannot access `input`, `output`, or top-level `$variables`. The result is determined entirely by the parameter values. 
+- Parameters are **read-only** — they cannot be reassigned or used as assignment targets +- Parameters are available as bare identifiers within the map body (e.g., `data.field`) +- Variables declared within maps (using `$`) can be reassigned +- **Discard parameters (`_`):** `_` can be used as a parameter name to accept and ignore an argument. It is not bound — referencing `_` in the body is a compile error. Multiple `_` parameters are allowed in the same parameter list. Discard parameters cannot have defaults. +- Call with positional arguments (match order) or named arguments (match names) +- **Cannot mix** positional and named arguments in the same call +- **`_` restricts to positional calls:** Maps with any `_` parameters can only be called positionally. Named calls to such maps are a compile error since `_` has no name to target. +- **Arity:** Positional calls must provide at least the required parameter count and at most the total parameter count. Named calls must provide all required parameters; missing parameters with defaults use their defaults. Extra or unknown arguments are errors. Arity mismatches are compile-time errors when detectable, runtime errors otherwise. +- **Parameter shadowing:** Parameter names shadow any map names with the same name within the map body. The parameter always wins. Imported namespaces are not affected since they use `::` syntax — `namespace::func()` is always unambiguous regardless of parameter names. 
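+
+The arity and discard-parameter rules listed above might look like this in practice. This is a sketch under the stated rules, not normative text: `clip_label` and the output field names are hypothetical, while `handle_event` is repeated from Section 5.1.
+
+```bloblang
+# Hypothetical map: one required parameter, one default
+map clip_label(text, suffix = "...") {
+  text + suffix
+}
+
+map handle_event(_, _, payload) {
+  payload.uppercase()
+}
+
+# Valid: between 1 (required) and 2 (total) positional arguments
+output.ok = clip_label("hello")
+
+# Invalid: required parameter `text` is missing
+output.e1 = clip_label(suffix: "!")
+
+# Invalid: more arguments than parameters
+output.e2 = clip_label("hello", "!", 3)
+
+# Invalid: unknown argument name
+output.e3 = clip_label(text: "hello", ending: "!")
+
+# Invalid: maps with `_` parameters accept positional calls only
+output.e4 = handle_event(payload: "click")
+
+# Valid: the first two arguments are accepted and ignored
+output.ok2 = handle_event(1, 2, "click")
+```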
+ +```bloblang +map example(data) { + $copy = data # ✅ Valid: variable declaration + data.field # ✅ Valid: read from parameter (final expression) +} + +map invalid(data) { + data = input.x # ❌ Invalid: cannot assign to parameter +} + +map also_invalid(data) { + $val = $top_level_var # ❌ Invalid: cannot access top-level variables + data.field +} + +# Parameter and namespace coexist — :: makes namespace calls unambiguous +import "./math.blobl" as math +map transform(math) { + math::add(math, 2) # math:: is the namespace call, math is the parameter +} +``` + +## 5.4 Scope Restrictions + +To use external context inside a map, pass it as a parameter: + +```bloblang +# ❌ Cannot access input inside a map +map invalid(items) { + items.map(x -> x * input.multiplier) # ERROR: cannot access input +} + +# ✅ Pass external context as a parameter instead +map scale(items, multiplier) { + items.map(x -> x * multiplier) +} +output.result = scale(input.items, input.multiplier) +``` + +## 5.5 Maps as Method Arguments + +Map names, namespace-qualified references, and standard library function names can be passed directly to higher-order methods like `.map()`, `.filter()`, and `.sort_by()`. The compiler resolves the name to the definition at compile time — this is syntactic sugar for an inline lambda that calls the map/function, not a runtime value. + +```bloblang +map double(x) { x * 2 } + +# Pass map directly to higher-order methods +output.doubled = input.items.map(double) # Same as: map(x -> double(x)) + +# Namespace-qualified references also work +import "./math.blobl" as math +output.results = input.items.map(math::double) # Same as: map(x -> math::double(x)) +``` + +These names are **compile-time references**, not runtime values. 
They cannot be stored in variables or used as general-purpose expressions: +```bloblang +$fn = double # ERROR: cannot store a map reference in a variable +$fn = math::double # ERROR: cannot store a namespace reference in a variable +output.x = double # ERROR: bare map name is not a valid expression here +``` + +**Void from map bodies:** If a map body's final expression is an if-without-else or match-without-`_`, the map can produce void when the condition is false or no case matches. Void from a map call follows the same propagation rules as void from any other expression (Section 4.1) — it will be a runtime error in most calling contexts (variable declarations, collection literals, function arguments, etc.). To avoid this, always include an `else` branch or `_` case in a map body's final expression. diff --git a/internal/bloblang2/spec/06_imports.md b/internal/bloblang2/spec/06_imports.md new file mode 100644 index 000000000..0d304bfbc --- /dev/null +++ b/internal/bloblang2/spec/06_imports.md @@ -0,0 +1,111 @@ +# 6. Imports & Modules + +## 6.1 Namespace Imports + +```bloblang +import "path" as namespace +``` + +All maps from the file available via namespace. + +## 6.2 Example + +```bloblang +# user_transforms.blobl +map extract_user(data) { + { + "id": data.user_id, + "name": data.full_name + } +} + +map format_name(data) { + data.first_name + " " + data.last_name +} + +# main.blobl +import "./user_transforms.blobl" as users + +output.user = users::extract_user(input.user_data) +output.display_name = users::format_name(input.user) +``` + +## 6.3 Path Resolution + +**Relative paths:** Relative to importing file's directory +```bloblang +import "./sibling.blobl" as sibling +import "../parent/file.blobl" as parent +``` + +**Absolute paths:** Used as-is +```bloblang +import "/etc/benthos/common.blobl" as common +``` + +## 6.4 Visibility & File Constraints + +**All top-level maps are exported automatically.** Maps are accessible through the namespace. 
+ +**Imported files may only contain map declarations and import statements.** Top-level statements (assignments, variable declarations, if/match statements) are a compile-time error in imported files. Since map bodies cannot access top-level variables, `input`, or `output` (Section 5.3), there is no useful purpose for top-level statements in library files. + +```bloblang +# utils.blobl - valid imported file +import "./helpers.blobl" as helpers # ✅ Imports allowed +map transform(data) { data.value * 2 } # ✅ Map declarations allowed + +# invalid_utils.blobl - would fail when imported +$internal = 42 # ❌ Compile error: statement in imported file +output.side_effect = "hello" # ❌ Compile error: statement in imported file +map transform(data) { data.value * 2 } + +# main.blobl +import "./utils.blobl" as utils +output.result = utils::transform(input) # ✅ Works: maps are exported +``` + +## 6.5 Error Handling + +- **File not found:** Error at import +- **Duplicate namespace:** Error if the same namespace name is used twice +- **Circular imports:** Detected at compile time and reported as an error +- **Statements in imported file:** Compile-time error if an imported file contains top-level statements +- **Map not found:** Error when calling a non-existent map + +**Circular import detection:** Import cycles are not allowed. If file A imports B (directly or transitively through other files), then B cannot import A. + +```bloblang +# a.blobl +import "./b.blobl" as b +map foo(x) { b::bar(x) } + +# b.blobl +import "./a.blobl" as a # ERROR: Circular import (A->B->A) +map bar(x) { a::foo(x) } +``` + +This restriction prevents mutual recursion across files. Implementations must detect cycles at compile time before execution. + +## 6.6 Recursion + +Maps can call themselves without a namespace prefix: +```bloblang +map walk(node) { + match node.type() { + "object" => node.map_values(v -> walk(v)), + _ => node, + } +} +``` + +**Mutual recursion** within the same file is also supported. 
Since map declarations are hoisted (Section 7.7), two maps can call each other regardless of declaration order: +```bloblang +map is_even(n) { + if n == 0 { true } else { is_odd(n - 1) } +} +map is_odd(n) { + if n == 0 { false } else { is_even(n - 1) } +} +``` + +**Note:** Mutual recursion across files is not possible — circular imports are prohibited (Section 6.5). Same-file mutual recursion is subject to the same recursion depth limit as self-recursion (Section 5.2). diff --git a/internal/bloblang2/spec/07_execution_model.md b/internal/bloblang2/spec/07_execution_model.md new file mode 100644 index 000000000..64f0b66ce --- /dev/null +++ b/internal/bloblang2/spec/07_execution_model.md @@ -0,0 +1,338 @@ +# 7. Execution Model + +## 7.1 Immutable Input, Mutable Output + +**Input (document + metadata) is always immutable:** +```bloblang +output.invitees = input.invitees.filter(i -> i.mood >= 0.5) +output.rejected = input.invitees.filter(i -> i.mood < 0.5) +# input.invitees unchanged - both see original + +# Input metadata also immutable +output.original_topic = input@.kafka_topic +output@.kafka_topic = "processed" +output.still_original = input@.kafka_topic # Still original value +``` + +**Output (document + metadata) built incrementally:** +```bloblang +output.user.id = input.id # Creates output.user.id +output.user.name = input.name # Adds output.user.name +output@.kafka_topic = "processed" # Adds output metadata +``` + +**Initial state:** `output` starts as empty object `{}`. `output@` (metadata) starts as empty object `{}`. A mapping with no statements produces `{}` with empty metadata. Reading `output` itself (without a path) before any assignment returns `{}` — e.g., `output.type()` returns `"object"`. + +**Input type:** `input` holds the incoming message document, which can be any type — object, array, string, bytes, number, bool, or null. Most commonly it is an object (parsed from JSON), but raw/unstructured messages arrive as bytes or string. 
The type of `input` is determined by the runtime environment, not by Bloblang. Use `.type()` to check, and methods like `.parse_json()` or `.string()` to convert. + +**Reading non-existent fields:** Accessing a field that doesn't exist returns `null` rather than erroring: +```bloblang +# output is initially {} +output.field # Returns null (field doesn't exist) + +# After assignment +output.field = "value" +output.field # Returns "value" +``` + +**Unset vs Null distinction:** +- **Non-existent field:** Field is not present in the object structure; reading it returns `null` +- **Explicit null:** Field exists in the object with `null` as its value: `output.field = null` +- **Practical impact:** In JSON output, non-existent fields are omitted; fields with `null` values are serialized as `"field": null` + +```bloblang +output.exists_null = null # Field present: {"exists_null": null} +output.not_created = if false { "x" } # Field absent: {} +# Both return null when read, but differ structurally +``` + +**Root output assignment:** Assigning to bare `output` (without a path) replaces the entire output document with the assigned value, regardless of its type and regardless of any prior assignments. The new value completely replaces the old one — previous field assignments are discarded. `output` can hold any type (object, array, string, number, etc.): +```bloblang +output.x = 1 +output.y = 2 +output = "foo" # Replaces {"x": 1, "y": 2} with "foo" +output.field # ERROR: cannot access field of string + +output = "foo" +output = {"a": 1} # Replaces "foo" with {"a": 1} +output.b = 2 # {"a": 1, "b": 2} +``` + +## 7.2 Copy-and-Modify Pattern + +```bloblang +# Copy document +output = input +output.password = deleted() +output.processed_at = now() + +# Copy metadata +output@ = input@ +output@.kafka_topic = "new-topic" +``` + +`output = input` performs a **logical copy** (copy-on-write) of the entire document — `output` is fully independent of `input`. 
Subsequent mutations to `output` never affect `input`. This is the same COW semantics used for variable assignment (Section 3.7) and metadata copy (`output@ = input@`). + +## 7.3 Contexts + +**Top-level mapping contexts:** + +**Input Context:** +- `input.field` - Document field (immutable) +- `input@.key` - Metadata key (immutable) +- Always refers to original input message + +**Output Context:** +- `output.field` - Document field (mutable) +- `output@.key` - Metadata key (mutable) +- Built incrementally during execution + +**Variables:** +- `$variable` - Block-scoped, mutable +- Can shadow variables from outer scopes + +**Map body contexts:** +- Parameter: Bare identifier (e.g., `data.field`) +- Variables: `$variable` (local to map) +- **No access** to `input` or `output` (isolated functions) + +## 7.4 Metadata + +Messages have metadata separate from document payload. Metadata (`output@`) is always an object (key-value map) — unlike `output`, which can hold any type. This distinction affects deletion behavior (see Section 9.2). 
+ +**Access:** +```bloblang +# Read input metadata (immutable) +output.topic = input@.kafka_topic +output.partition = input@.kafka_partition + +# Write output metadata (mutable) +output@.kafka_topic = "processed-topic" +output@.kafka_key = input.id +output@.content_type = "application/json" + +# Dynamic metadata access (string index) +$key = "kafka_topic" +output.topic = input@[$key] +output@[$key] = "new-topic" + +# Delete metadata key +output@.kafka_key = deleted() + +# Clear all metadata +output@ = {} # Removes all keys + +# Cannot delete metadata object itself +output@ = deleted() # ERROR: cannot delete metadata object + +# Metadata root must be an object +output@ = "string" # ERROR: metadata must be an object +output@ = [1, 2, 3] # ERROR: metadata must be an object +output@ = 42 # ERROR: metadata must be an object +``` + +**Types:** +Metadata values can be any serializable type (string, number, bool, null, bytes, array, object, timestamp). +```bloblang +output@.retry_count = 5 +output@.tags = ["urgent", "customer-service"] +output@.routing = {"region": "us-west", "priority": 10} +output@.created_at = now() +``` + +**Nested metadata paths:** Metadata paths support the same auto-creation semantics as output paths (Section 3.7). Assigning to a nested metadata path auto-creates intermediate objects: +```bloblang +output@.routing.region = "us-west" # Auto-creates output@.routing as {} +output@.routing.priority = 10 # output@.routing is {"region": "us-west", "priority": 10} +``` + +**Note:** While the language allows any metadata type, message systems (Kafka, AMQP, etc.) often only support string metadata. In practice, implementations serialize non-string values to JSON strings when interfacing with such systems. For example, `output@.tags = ["a", "b"]` would be stored as the string `'["a","b"]'` in Kafka metadata. 
Bytes values in metadata cause an error during serialization; use `.encode("base64")` or `.encode("hex")` before storing them in metadata that will be serialized. + +**Reading metadata as a whole:** `input@` and `output@` without a path component evaluate to the entire metadata object. The result is always an object (`.type()` returns `"object"`), even when empty. +```bloblang +output.all_meta = input@ # Read all input metadata as an object +output.meta_type = input@.type() # "object" +output.meta_count = input@.length() # Number of metadata keys +output.keys = input@.keys() # Array of metadata key names +``` + +**Copy all metadata:** +```bloblang +output@ = input@ # Logical copy (COW) of all metadata +output@.kafka_topic = "new-topic" # Override a specific key +``` + +`output@ = input@` performs a **logical copy** (copy-on-write): output metadata is fully independent of input metadata. Modifying output metadata (including nested values like `output@.tags[0]`) never affects input metadata. This is the same COW semantics used for document copy and variable assignment. + +Undefined metadata keys return `null`. 
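+
+Because an undefined metadata key reads as `null`, it can be given a fallback with `.or()` (Section 8.3). A minimal sketch, in which the key names and fallback strings are hypothetical, and the second statement assumes that null-safe access behaves on metadata paths as it does on document paths:
+
+```bloblang
+# Missing key: the read yields null, so .or() supplies the fallback
+output.topic = input@.kafka_topic.or("unknown-topic")
+
+# Nested read: ?. skips the field access when `routing` is absent (null)
+output.region = input@.routing?.region.or("default-region")
+```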
+ +## 7.5 Scoping Rules + +**Context Access Permissions:** + +| Context | Read `input` | Read `output` | Write `output`/`output@` | Read enclosing params | Read/Write `$var` | +|---------|--------------|---------------|--------------------------|-----------------------|-------------------| +| Top-level mapping | ✅ | ✅ | ✅ | n/a | ✅ | +| Map body | ❌ | ❌ | ❌ | ✅ (own params) | ✅ (locally declared only) | +| Expression context (top-level) | ✅ | ✅ | ❌ | n/a | ✅ | +| Expression context (inside map) | ❌ | ❌ | ❌ | ✅ (enclosing map's params + own lambda params) | ✅ (enclosing map's local variables) | +| Match `as` binding (top-level) | ✅ | ✅ | ❌ | n/a | ✅ | +| Match `as` binding (inside map) | ❌ | ❌ | ❌ | ✅ (enclosing map's params) | ✅ (enclosing map's local variables) | + +**Key principle:** Map bodies cannot access `input`, `output`, or top-level `$variables` — the only data available inside a map is its parameters and variables declared within the map body. Inline lambdas and expressions at the top-level can read (but not write) `input` and `output`. Lambdas inherit the read permissions of their enclosing context (Section 3.4), which is why a lambda nested inside a map can still read that map's parameters and local variables (but cannot reach `input` / `output` / top-level `$var`). 
+ +**Examples:** +```bloblang +# Top-level: full access +output.x = input.y # ✅ Read input, write output + +# Top-level inline lambda: can read input/output +output.items = input.data.map(x -> { + $multiplier = input.config.multiplier # ✅ Can read input + $base = output.base_value # ✅ Can read output + x * $multiplier +}) + +# Map body: no input/output access +map transform(data) { + $temp = input.value # ❌ ERROR: cannot access input + data.field * 2 # ✅ OK: use parameters +} + +# Lambda inside map: also no input/output access +map process(items) { + items.map(x -> { + $val = input.config # ❌ ERROR: cannot access input + x * 2 # ✅ OK: use parameters and variables + }) +} +``` + +**Top-level scope:** +- Variables accessible throughout mapping +- Maps accessible globally (or via namespace if imported) + +**Block scope:** +- New variable declarations in `if`, `match`, lambda, and map bodies are block-scoped +- Only accessible within declaring block and nested blocks +- In expression contexts: assigning to an existing outer variable name creates a new inner variable (shadow) +- In statement contexts: assigning to an existing outer variable modifies it + +```bloblang +$global = 10 + +output.result = if input.flag { + $local = 20 # Only in this block + $global + $local # Can access both +} + +# $local not accessible here +output.final = $global # Still 10 +``` + +**Expression contexts (shadowing):** +```bloblang +$value = 10 + +output.inner = if input.flag { + $value = 20 # NEW variable, shadows outer (expression context) + $value # Returns 20 +} + +output.outer = $value # Still 10 (outer variable unchanged) +``` + +**Statement contexts (mutation):** +```bloblang +$value = 10 + +if input.flag { + $value = 20 # Modifies outer $value (statement context) + $new = "hello" # Block-scoped: NOT visible outside +} + +output.result = $value # 20 if flag was true, 10 if false +output.new = $new # Compile-time error: $new does not exist +``` + +## 7.6 Variable Reassignment + 
+Variables can be reassigned in the same scope: +```bloblang +$value = 10 +$value = 20 # OK: reassignment +output.x = $value # 20 + +# Reassignment vs shadowing +output.y = if true { + $value = 30 # Shadowing: new variable in inner scope + $value # 30 +} +output.z = $value # Still 20 (inner scope doesn't affect outer) +``` + +**Variable path assignment:** Variables support field and index assignment with the same semantics as `output`. Assigning to a nested path within a variable mutates the variable's value in place, with auto-creation of intermediate structures: + +```bloblang +$record = {"name": "Alice", "scores": [10, 20]} +$record.name = "Bob" # {"name": "Bob", "scores": [10, 20]} +$record.scores[2] = 30 # {"name": "Bob", "scores": [10, 20, 30]} +$record.scores[5] = 99 # Gaps filled with null: [10, 20, 30, null, null, 99] +$record.address.city = "London" # Auto-creates: {"name": "Bob", ..., "address": {"city": "London"}} +$record.name = deleted() # Removes field: {"scores": [...], "address": {...}} +``` + +**Copy-on-write:** Assigning a value to a variable always creates a logical copy, regardless of the source. Subsequent mutations to the variable never affect the original, and subsequent mutations to the original never affect the variable: + +```bloblang +# From input (immutable source) +$data = input.user +$data.status = "processed" # Mutates $data only; input.user unchanged + +# From output (mutable source) +$snapshot = output.user +output.user.name = "changed" # Mutates output only; $snapshot unchanged +$snapshot.age = 30 # Mutates $snapshot only; output unchanged + +# Between variables +$a = {"x": 1} +$b = $a +$b.x = 2 # $b is {"x": 2}, $a is still {"x": 1} +``` + +Variable path assignment (`$var.field = expr`) is available in all contexts. In expression contexts (if/match expressions, lambdas, map bodies), only variable assignments are allowed (no `output` assignments). 
In statement contexts (top-level, if/match statements), both variable and `output` assignments are allowed. + +## 7.7 Evaluation Order + +Statements execute sequentially, top-to-bottom. +Variables must be declared before being read. A variable is declared either by direct assignment (`$x = expr`) or by path assignment to an undeclared name (`$x.field = expr`, `$x[0] = expr`), which auto-creates the root value per Section 3.7's type-inference rules. Reading an undeclared variable (`output.y = $never`) is a compile-time error. +**Map declarations are hoisted** — maps can be called before their declaration in the file. All maps are resolved before execution begins, so declaration order does not matter. Duplicate map names within the same file are a compile-time error. + +```bloblang +# Map used before its declaration — valid +output.result = transform(input.data) + +map transform(data) { + data.value * 2 +} +``` + +Later statements can reference earlier `output` fields: + +```bloblang +output.price = input.price +output.tax = output.price * 0.1 # Uses earlier output +output.total = output.price + output.tax +``` + +**Note:** Reading an `output` field that has not yet been assigned returns `null` (consistent with Section 7.1 — non-existent fields return null). This means reordering statements can silently change behavior: +```bloblang +output.a = output.b # null — output.b not yet assigned +output.b = 42 +``` + +This includes inline lambdas in method arguments — a lambda reading `output` sees its value at the time the enclosing statement executes: +```bloblang +output.multiplier = input.rate +output.items = input.data.map(x -> x * output.multiplier) # OK: multiplier already assigned +``` diff --git a/internal/bloblang2/spec/08_error_handling.md b/internal/bloblang2/spec/08_error_handling.md new file mode 100644 index 000000000..30cf78d35 --- /dev/null +++ b/internal/bloblang2/spec/08_error_handling.md @@ -0,0 +1,254 @@ +# 8. 
Error Handling + +## 8.1 Error Propagation + +Errors propagate through expressions: +```bloblang +output.parsed = input.date.ts_parse("%Y-%m-%d") +# Throws error if parsing fails +``` + +Common error sources: +- Type mismatches (e.g., `5 + "text"`) +- Failed method calls (e.g., parsing, out-of-bounds access) +- Explicit `throw()` calls + +## 8.2 Catch Method + +Handle errors with `.catch()`. The method takes a lambda with a single parameter — the error object — and is called only when the expression to its left produces an error. If the expression succeeds, `.catch()` returns its value unchanged. If the lambda itself errors, that error propagates and can be caught by a subsequent `.catch()`. + +**Scope:** `.catch()` catches any error produced by its receiver expression — the entire expression that the grammar parses as the left-hand side of the `.catch()` method call. Errors propagate through postfix chains: if any postfix operation (method call, field access, or index access) errors, all subsequent postfix operations are skipped and the error flows to the next `.catch()`. + +```bloblang +# Catches errors from trim_suffix or ts_parse (either one) +input.date.trim_suffix("TS:").ts_parse("%Y-%m-%d").catch(err -> null) + +# Field access and indexing are also skipped on error +# If ts_parse errors, .year (field access) is skipped and the error reaches .catch() +input.date.ts_parse("%Y-%m-%d").year.string().catch(err -> "unknown") + +# Parentheses define the boundary — catches errors from the addition and .string() +(input.a + input.b).string().catch(err -> "0") + +# Catches errors from .map() (e.g., lambda errors), not from inside individual elements +input.items.map(x -> x.value / x.count).catch(err -> []) +``` + +All runtime errors are catchable with `.catch()` — the sole exception is exceeding the recursion limit (Section 5.2), which halts execution immediately. 
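+
+The rule that an error raised inside a `.catch()` handler can be caught by a later `.catch()` might be used to add context to an error before handling it. A hedged sketch: the message prefix and the shape of the fallback object are illustrative choices, and `err.what` is the message field of the error object described later in this section.
+
+```bloblang
+output.parsed = input.date
+  .ts_parse("%Y-%m-%d")
+  .catch(err -> throw("invalid date: " + err.what))  # Handler re-throws with added context
+  .catch(err -> {"error": err.what})                 # Second .catch() receives the re-thrown error
+```
+
+If `ts_parse` succeeds, neither handler runs and the parsed timestamp is assigned unchanged.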
+ +**Void and `deleted()` pass through `.catch()` unchanged.** Neither is an error — void is the absence of a value, and `deleted()` is a deletion marker. `.catch()` only activates on errors, so both flow through transparently. If either then encounters a method that requires a value, *that* produces an error which can be caught by a subsequent `.catch()`: +```bloblang +(if false { 1 }).catch(err -> 0) # void (catch not triggered, no error occurred) +(if false { 1 }).string().catch(err -> "boo!") # "boo!" (.string() errors on void, catch triggers) +(if false { 1 }).or(0) # 0 (to rescue void, use .or() — not .catch()) +``` + +**The error object** is a plain object (`{"what": "..."}`) with a single field: +- `.what` — a string containing the error message + +The error is structured as an object (rather than a plain string) to allow future extension with additional fields (e.g., error codes, source locations) without breaking existing handlers. + +```bloblang +# Inspect the error +output.parsed = input.date.ts_parse("%Y-%m-%d").catch(err -> { + $msg = "parse failed: " + err.what + throw($msg) +}) + +# Ignore the error, provide fallback value +output.parsed = input.date.ts_parse("%Y-%m-%d").catch(err -> null) + +# Chain multiple attempts +output.parsed = input.date + .ts_parse("%Y-%m-%d") # Try format 1 + .catch(err -> input.date.ts_parse("%Y/%m/%d")) # If format 1 fails, try format 2 + .catch(err -> null) # If format 2 also fails, use null +``` + +## 8.3 Or Method + +Provide default for null, void, or deleted values. `.or()` uses **short-circuit evaluation**: the argument expression is only evaluated if the receiver is null, void, or `deleted()`. If the receiver has a value, the argument is never evaluated and the receiver value is returned directly. + +`.or()` and `.catch()` are the only methods that can be called on void or `deleted()` — all other method calls on void or `deleted()` are errors. 
`.catch()` passes void and `deleted()` through unchanged (they are not errors), while `.or()` actively rescues them by returning its argument. This makes `.or()` useful for providing defaults in deeply nested expressions involving if-without-else, non-exhaustive match, or expressions that may yield `deleted()`: + +```bloblang +output.name = input.user.name.or("Anonymous") +output.count = input.items?.length().or(0) + +# Short-circuit: throw() is only evaluated if name is null +output.name = input.name.or(throw("name is required")) + +# Rescues void from if-without-else +output.label = (if input.premium { "VIP" }).or("standard") + +# Rescues void from non-exhaustive match +output.sound = (match input.animal { "cat" => "meow", "dog" => "woof" }).or("unknown") + +# Rescues deleted() — useful when calling maps that may return deleted() +output.field = some_map(input.value).or("placeholder") + +# .or() can itself return deleted() — deletion rules then apply in the calling context +output.field = input.name.or(deleted()) +# If name is null: .or() returns deleted(), field is removed from output +# If name has a value: .or() returns the value, field is assigned normally +``` + +## 8.4 Throw Function + +Throw custom errors. `throw()` requires exactly one string argument: +```bloblang +output.value = if input.value != null { + input.value +} else { + throw("Value is required") +} +``` + +Non-string literal arguments are a compile-time error; dynamic arguments that evaluate to a non-string type at runtime are a runtime error: +```bloblang +throw(42) # COMPILE ERROR: throw() requires a string argument +throw(null) # COMPILE ERROR: throw() requires a string argument +throw() # COMPILE ERROR: throw() requires exactly one string argument +throw($var) # Runtime error if $var is not a string +``` + +**Error propagation:** `throw()` produces an error that propagates like any other error. 
It can be caught with `.catch()`: +```bloblang +# Caught: provides fallback value +output.result = throw("bad value").catch(err -> "fallback") # "fallback" + +# Caught with error inspection +output.result = throw("bad value").catch(err -> { + $default = "fallback" + $default # err.what == "bad value" +}) + +# Caught in expression context +output.name = input.name.or(throw("name is required")).catch(err -> "Anonymous") + +# Uncaught: halts the mapping +output.result = throw("fatal error") # No .catch(), stops execution +``` + +When a `throw()` error is **not caught** by `.catch()`, it halts the entire mapping and no subsequent statements execute. + +**Conditional throw in statement context:** To validate input and halt the mapping when a condition fails, use `throw()` on the right-hand side of an assignment inside an if statement. The error propagates past the assignment and halts the mapping — the assignment itself never completes: +```bloblang +if input.amount < 0 { + output = throw("amount must be non-negative") +} +# Execution continues here only if amount >= 0 +output.amount = input.amount +``` + +## 8.5 Null-Safe vs Error-Safe + +**Null-safe operators** (`?.`, `?[]`, `?.method()`): Handle `null`, not errors +```bloblang +input.user?.name # null if user is null, error if user is non-object + +# ?. only short-circuits on null, not type mismatches +null?.name # OK: returns null +input.user?.name # OK: returns null if user is null, or user.name if user is object +"string"?.name # ERROR: cannot access field on string (not null, wrong type) +5?.name # ERROR: cannot access field on int64 (not null, wrong type) +``` + +**`.catch(lambda)`**: Handles errors, not `null` +```bloblang +input.date.ts_parse("format").catch(err -> null) # null if parse fails +``` + +**`.or()`**: Handles `null`, `void`, and `deleted()`, not errors. Short-circuits: argument only evaluated if receiver is null, void, or deleted. 
If the receiver is an error, the error propagates through `.or()` uncaught. +```bloblang +input.name.or("default") # "default" if name is null +(if false { "hello" }).or("world") # "world" (void rescued) +(match input.x { "a" => 1 }).or(0) # 0 if no case matched (void rescued) +some_map(input.value).or("fallback") # "fallback" if map returned deleted() +(5 / 0).or("default") # ERROR propagates: .or() does not catch errors +``` + +**Combine for both:** +```bloblang +input.user?.age.or(0).catch(err -> -1) +# null-safe → default for null → fallback for errors +``` + +## 8.6 Composing `.or()` and `.catch()` + +`.or()` and `.catch()` handle disjoint failure modes — `.or()` rescues null, void, and `deleted()`, while `.catch()` rescues errors. When both are needed, either ordering produces the same result **as long as the `.or()` default never errors and the `.catch()` handler never returns null/void/deleted**: + +```bloblang +# These two are equivalent when defaults are simple literals: +input.user?.age.or(0).catch(err -> -1) +input.user?.age.catch(err -> -1).or(0) +``` + +When the default or handler is more complex, ordering matters because the output of one feeds into the other: + +```bloblang +# .or() default errors → .catch() catches it +input.age.or(compute_default()).catch(err -> -1) +# If age is null: .or() evaluates compute_default(). +# If that errors, .catch() catches it → -1. + +# .catch() first → .or() never sees the error +input.age.catch(err -> -1).or(compute_default()) +# If age is null: .catch() passes null through, .or() evaluates compute_default(). +# If compute_default() errors here, nothing catches it. +``` + +```bloblang +# .catch() handler returns null → .or() rescues it +input.age.catch(err -> null).or(0) +# If age is an error: .catch() → null, .or() rescues null → 0. + +# .or() first → null from .catch() is not rescued +input.age.or(0).catch(err -> null) +# If age is an error: .or() passes error through, .catch() → null. 
+# null is the final result (no further .or() to rescue it). +``` + +**Rule of thumb:** For simple literal defaults, ordering doesn't matter. If the default or handler is a non-trivial expression, put the one whose argument you want protected by the other method first. + +## 8.7 Method Chaining with Null + +**Method type requirements:** Methods work on specific types, and calling a method on an incompatible type (including null) results in an error. Some methods like `.type()` accept any type including null, while data transformation methods typically require specific types. + +```bloblang +# Method requires specific type (string) +input.value.uppercase() # ERROR if value is null (or any non-string type) + +# Use null-safe operator to skip method call +input.value?.uppercase() # Returns null if value is null (method not called) + +# Method accepts any type including null +input.value.type() # Returns "null" if value is null (method called) + +# Chaining with null-safe operators +input.user?.address?.city.or("Unknown") # Combine null-safe navigation with defaults +``` + +**When a method returns null:** The null propagates to the next operation: +```bloblang +input.items[0]?.uppercase() # OK: returns null if first element is null (null-safe skips uppercase) +input.items[0].uppercase() # ERROR if first element is null (uppercase requires string) + +# Out-of-bounds errors — use .catch() for fallback +input.items[0].catch(err -> "").uppercase() # OK: provides default on empty array +``` + +## 8.8 Validation Methods + +```bloblang +# type() - check type +# Type checking - check for any signed integer type +output.valid = if [ "int32", "int64" ].contains(input.value.type()) { + input.value +} else { + throw("Value must be a signed integer") +} + +# not_null() - assert non-null +output.name = input.name.not_null("name is required") +``` diff --git a/internal/bloblang2/spec/09_special_features.md b/internal/bloblang2/spec/09_special_features.md new file mode 100644 index 
000000000..2d019516e --- /dev/null +++ b/internal/bloblang2/spec/09_special_features.md @@ -0,0 +1,307 @@ +# 9. Special Features + +## 9.1 Dynamic Field Names + +Use string indexing for dynamic field access on objects: + +```bloblang +# Dynamic field read +$field_name = "user_id" +output.value = input[$field_name] + +# Dynamic field write +$key = "dynamic_field" +output[$key] = "value" + +# With literals +output.first = input["user_id"] +output["computed_" + input.type] = input.value + +# Null-safe dynamic access +output.value = input?.user?[$field_name] +``` + +## 9.2 Message Filtering & Deletion + +**The `deleted()` function** returns a special deletion marker that instructs assignments to remove the target. + +### Deletion Semantics + +```bloblang +# Delete output field +output.field = deleted() # Field removed from output + +# Drop entire message (immediately exits the mapping) +output = deleted() # Message dropped, no further statements execute + +# Delete metadata key +output@.key = deleted() # Specific key removed + +# Clear all metadata (replace with empty object) +output@ = {} # All metadata keys removed + +# Replace all metadata +output@ = {"key": "value"} # Replaces all metadata with this object + +# Cannot delete metadata — it is always an object +output@ = deleted() # ERROR: cannot delete metadata object +``` + +**`output = deleted()` — immediate message drop:** + +`output = deleted()` drops the entire message (document + metadata) from the stream and **immediately exits the mapping**. No subsequent statements execute. This is a terminal operation — there is no way to "restore" a deleted output. It is specifically the *assignment* of `deleted()` to `output` (the root document) that triggers the exit — merely evaluating `deleted()` in an expression does not exit the mapping. 
+ +```bloblang +output = deleted() +output.field = "value" # Never executes — mapping already exited +output@.kafka_topic = "topic" # Never executes — mapping already exited +``` + +To conditionally drop messages, use an if expression or match: +```bloblang +# Conditional drop — if spam, message is dropped and mapping exits +output = if input.spam { deleted() } else { input } + +# Using match +output = match input.type { + "spam" => deleted(), + _ => input, +} +``` + +**`output@` cannot be deleted:** + +`output@` is always an object (a key-value map of metadata). It cannot be deleted — `output@ = deleted()` is an error. To clear all metadata keys, assign an empty object: `output@ = {}`. You can also replace all metadata with an object literal or copy from input: `output@ = input@`. + +```bloblang +# Metadata: cannot be deleted, only cleared or replaced +output@ = deleted() # ERROR: cannot delete metadata object +output@ = {} # OK: clears all metadata keys +output@.key = "value" # OK: assigning key to the metadata object +``` + +**Variable assignment:** Assigning `deleted()` to a variable is a runtime error — variables cannot be deleted (variables must always hold a value — see Section 4.1). +```bloblang +$val = deleted() # ERROR: cannot assign deleted() to a variable +``` + +### deleted() in Expressions + +**Any expression can yield `deleted()`**, including maps, lambdas, if expressions, and match expressions. When `deleted()` flows through expressions and is assigned to a field or included in a collection, it causes removal. When it flows to a root output assignment (`output = deleted()`), it drops the message and exits the mapping. 
+ +**In array operations:** +```bloblang +# Array literal - deleted elements removed +output.items = [1, deleted(), 3] # Result: [1, 3] + +# map - deleted elements filtered out +output.positive = input.numbers.map(x -> if x > 0 { x } else { deleted() }) +# Input: [-1, 2, -3, 4] → Output: [2, 4] +``` + +**In object literals:** +```bloblang +# Using deleted() explicitly +output.user = { + "id": input.id, + "email": if input.email_verified { input.email } else { deleted() }, + "phone": input.phone +} +# If email not verified, field "email" is removed from object +``` + +**`deleted()` vs void:** These are different concepts. `deleted()` is an active deletion marker — it expresses intentional removal. Void means "no value was produced" — it typically indicates a missing code path. Void is produced implicitly (an `if` without `else`, a non-exhaustive `match`, `.find()` with no result), or explicitly via the `void()` builtin. The distinction is enforced: `deleted()` is accepted in contexts where removal makes sense (collection literals, `.map()` lambdas), while void is an error in those same contexts, catching the likely bug of a missing `else` branch or `_` case. +```bloblang +# In collection literals +output.items = [1, deleted(), 3] # [1, 3] — deleted: intentional removal +output.items = [1, if false { 2 }, 3] # ERROR: void — missing else branch + +# In .map() lambdas +input.items.map(x -> if x > 0 { x } else { deleted() }) # Negatives removed (intentional) +input.items.map(x -> if x > 0 { x }) # ERROR: void when x <= 0 (missing else) +``` +See Section 4.1 for full void semantics. 
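+
+**Converting void into a deliberate deletion:** The following is a sketch, assuming the `.or()` rules in Section 8.3 and the `.map()` deletion rules described in this section compose as written; `input.items` is an illustrative field name. Because `.or()` rescues void and may itself return `deleted()`, it can turn a missing `else` branch into an explicit removal:
+```bloblang
+# The if expression is void when x <= 0; .or() rescues the void and returns
+# deleted(), which .map() accepts as "omit this element".
+output.positive = input.items.map(x -> (if x > 0 { x }).or(deleted()))
+
+# Same intent, written with an explicit else branch
+output.positive = input.items.map(x -> if x > 0 { x } else { deleted() })
+```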
+ +**Nested structures (deletion at each level):** +```bloblang +# Nested arrays +output.matrix = [[1, deleted(), 3], [4, 5]] +# Result: [[1, 3], [4, 5]] + +# Deeply nested - each deleted() is removed from the literal it appears in +output.nested = [deleted(), [deleted(), 3]] +# Result: [[3]] (inner literal resolves to [3] first, then the outer first element is removed) + +# Nested objects +output.user = { + "name": "Alice", + "contact": { + "email": if input.verified { input.email } else { deleted() }, + "phone": input.phone + } +} +# If not verified: {"name": "Alice", "contact": {"phone": "..."}} + +# Arrays of objects +output.users = [ + {"name": "Alice", "email": if true { "a@example.com" } else { deleted() }}, + {"name": "Bob", "email": if false { "b@example.com" } else { deleted() }} +] +# Result: [{"name": "Alice", "email": "a@example.com"}, {"name": "Bob"}] +``` + +`deleted()` is evaluated independently at each nesting level during literal construction — each `deleted()` call acts locally in the literal where it appears, removing the element or field at that position. There is no "recursive propagation"; inner literals are constructed first, and their `deleted()` markers are resolved before the outer literal sees the result. + +**In conditional expressions and match arms:** Match arms are transparent — `deleted()` produced by a case arm flows out of the match expression and behaves exactly as it would from any other expression. 
+```bloblang +# Field actively removed when expression yields deleted() +output.category = if input.spam { deleted() } else { input.category } +# If spam, output.category field doesn't exist (not even with null value) + +# Message drop — deleted() flows to root assignment, exits mapping +output = if input.spam { deleted() } else { input } +# If spam, message is dropped and mapping exits immediately + +# deleted() from a match arm flows out identically +output.result = match input.x { + "b" => deleted(), # Field actively removed if x == "b" + _ => input.x, +} + +# Message drop via match +output = match input.type { + "spam" => deleted(), # Message dropped, mapping exits + _ => input, +} +``` + +**In maps and lambdas:** +```bloblang +map filter_negative(val) { + if val < 0 { deleted() } else { val } +} +output.result = filter_negative(input.value) # Field deleted if value < 0 +``` + +**Operations on `deleted()` are errors (except `.or()` and `.catch()`):** +```bloblang +deleted() + 5 # ERROR: cannot perform arithmetic on deleted +deleted() == deleted() # ERROR: cannot compare deleted values +deleted().type() # ERROR: cannot call methods on deleted value +deleted()?.field # ERROR: ?. only short-circuits on null, not deleted +deleted().or("fallback") # OK: returns "fallback" (.or() rescues deleted) +deleted().catch(err -> "x") # OK: deleted passes through (.catch() not triggered) +``` + +These operations result in **runtime errors**. The exceptions are `.or()`, which rescues `deleted()` the same way it rescues null and void (Section 8.3), and `.catch()`, which passes `deleted()` through unchanged (Section 8.2). **Method chain propagation:** When `deleted()` hits an unsupported method, the method produces an error. That error then propagates through subsequent methods (skipping them) until caught by `.catch()`, following normal error propagation rules (Section 8.2). 
For example, `deleted().uppercase().catch(err -> "recovered")` errors at `.uppercase()`, then `.catch()` catches the error and returns `"recovered"`. + +**When deleted() Causes Errors vs Deletion:** + +`deleted()` behaves differently depending on context: + +**Triggers deletion (no error):** +- Field assignment: `output.field = deleted()` — removes the field +- Root output assignment: `output = deleted()` — drops the message and exits the mapping +- Metadata key assignment: `output@.key = deleted()` — removes the key +- Variable field assignment: `$var.field = deleted()` — removes the field from the variable's value +- Array index assignment: `$arr[0] = deleted()`, `output.items[0] = deleted()` — removes the element at the given index and shifts remaining elements down (consistent with `.map()` + `deleted()` and `.without_index()`) +- Array index assignment with further path: `output.items[0].name = deleted()` removes the `name` field from the object at index 0 — this is a field deletion, not an array element deletion. The `deleted()` applies to the final path component (the field), not the array element. +- Collection literals: `[1, deleted(), 3]` → `[1, 3]`, `{"a": deleted()}` → `{}` +- Return values from expressions used in assignments: `output.x = if spam { deleted() } else { value }` +- Lambda return value in the methods that explicitly support it — `.map()`, `.map_values()`, `.map_keys()`, `.map_entries()` (element/entry omitted from result), and `.catch()` (result flows to calling context with normal semantics). 
All other methods treat `deleted()` lambda returns as runtime errors (see Section 13 preamble) + +**Causes runtime error:** +- Variable assignment: `$var = deleted()` (cannot assign deleted to a variable — variables must always hold a value; see Section 4.1) +- Metadata root assignment: `output@ = deleted()` (cannot delete metadata object) +- Binary operators: `deleted() + 5`, `deleted() == deleted()`, `deleted() && true` +- Method calls (except `.or()` and `.catch()`): `deleted().type()`, `deleted().uppercase()` +- Used as function arguments: `some_function(deleted())`. This includes `deleted()` that flows indirectly through expressions — e.g., `some_map(match input.x { "remove" => deleted(), _ => input.x })` is a runtime error if the match arm produces `deleted()` +- Lambda return values in methods that do not support deletion (e.g., `filter`, `sort`). See individual method documentation in Section 13 for which methods support `deleted()` as a lambda return value. + +The distinction: `deleted()` is a special marker that triggers deletion when flowing into a field/metadata/index assignment or collection, but cannot be used as a normal value in computations. Assigning `deleted()` to a variable (`$var = deleted()`) is an error; however, assigning `deleted()` to a field *within* a variable (`$var.field = deleted()`) removes that field, and assigning `deleted()` to an array index (`$arr[0] = deleted()`) removes the element and shifts remaining elements down. The only exceptions to the method restriction are `.or()`, which rescues `deleted()` and returns its argument (Section 8.3), and `.catch()`, which passes `deleted()` through unchanged (Section 8.2). When `deleted()` flows to the root output assignment, it drops the entire message and immediately exits the mapping. + +**Array index deletion edge cases:** Assigning `deleted()` to an array index follows the same index semantics as `.without_index()` (Section 13.6). Negative indices count from the end (`$arr[-1] = deleted()` removes the last element). 
Out-of-bounds indices — both positive and negative — are a runtime error. The target must be an array; assigning `deleted()` to an index on a non-array value (e.g., a string or object) is a runtime error. These rules apply uniformly to both `output` paths and variable paths: +```bloblang +$arr = [10, 20, 30] +$arr[-1] = deleted() # [10, 20] (last element removed) +$arr[99] = deleted() # ERROR: index out of bounds +output.items[0] = deleted() # Removes first element, shifts remaining down +``` + +**Deletion through a non-existent path:** When the target path does not fully exist, deletion follows the same auto-creation rules as assignment (Section 3.7). Missing object intermediates are auto-created as empty objects; missing array intermediates are auto-created as empty arrays (based on index type); the final deletion is then applied to the auto-created structure. For object field deletion the final step is a no-op if the field is absent (deletion is idempotent). For array index deletion the final step still requires the index to be in bounds — deleting from an auto-created empty array is a runtime error, consistent with the array edge-case rules above. + +```bloblang +# output is {} +output.a.b.c = deleted() # Auto-creates output.a = {} and output.a.b = {}; + # deleting "c" from {} is a no-op. 
+ # Result: output = {"a": {"b": {}}} + +# output is {"a": {"b": {"c": 1, "d": 2}}} +output.a.b.c = deleted() # output = {"a": {"b": {"d": 2}}} + +# Variable paths — same auto-creation + idempotent-delete behavior +$v = {} +$v.x.y = deleted() # $v = {"x": {}} (auto-created, nothing to delete) + +# Array index deletion still errors on out-of-bounds, including freshly +# auto-created empty arrays +output.xs[0] = deleted() # output.xs auto-created as []; deleting [0] is + # a runtime error (out of bounds) + +# Type collision on an intermediate is still a runtime error (Section 3.7) +output.a = "string" +output.a.b = deleted() # ERROR: output.a is a string, not an object +``` + +Metadata path deletion (`output@.a.b = deleted()`) follows the same rules — intermediate metadata objects are auto-created, and the final delete is a no-op if the key is absent. + +**Routing messages instead of dropping them:** + +To route failed/spam messages to a dead letter topic (rather than dropping them), the output document must exist: +```bloblang +# Route spam to dead letter with document intact +output = input # Keep document (or create error document) +output@.reason = "spam_detected" # Metadata for routing +output@.kafka_topic = "dead_letter" # Route to dead letter topic +``` + +## 9.3 Non-Structured Data + +Handle raw strings/bytes: +```bloblang +# If input is raw string +output.parsed = input.parse_json() + +# If input is raw bytes +output.decoded = input.string() +``` + +## 9.4 Conditional Literals + +Build dynamic structures: +```bloblang +output.user = { + "id": input.id, + "name": input.name, + "email": if input.email_verified { + input.email + } else { + null + } +} + +# Conditional array elements - use deleted() to omit elements +output.items = [ + input.a, + if input.b != null { input.b } else { deleted() }, # Omitted if b is null + input.c +] +# If b is null: [input.a, input.c] + +# Void is an error in collection literals — always use deleted() or an else branch 
+output.items = [ + input.a, + if input.b != null { input.b }, # ERROR: void in array literal when b is null + input.c +] +``` diff --git a/internal/bloblang2/spec/10_grammar.md b/internal/bloblang2/spec/10_grammar.md new file mode 100644 index 000000000..833cc4983 --- /dev/null +++ b/internal/bloblang2/spec/10_grammar.md @@ -0,0 +1,248 @@ +# 10. Grammar Reference + +**Statement separation:** Statements are separated by newlines. Multiple statements on a single line are not allowed — each statement must begin on its own line. Newlines inside parentheses `()` and brackets `[]` are treated as whitespace and do not produce separator tokens, allowing argument lists, array literals, and grouped expressions to span multiple lines freely. Newlines inside braces `{}` are still significant — they separate statements within block bodies (if/match/map/lambda). In object literals (also delimited by `{}`), entries are comma-separated, so newlines between entries are ignored. + +**Postfix continuation:** Newlines are also suppressed when the next line begins with a postfix operator token — `.`, `?.`, `[`, `?[`, or `else`. This allows method chains, indexing, and if/else chains to span multiple lines without requiring same-line braces: +```bloblang +output.result = input.text + .trim() + .lowercase() + .replace_all(" ", "-") + +output.tier = if input.score >= 90 { "gold" } +else if input.score >= 50 { "silver" } +else { "bronze" } +``` +This is safe because none of these tokens can begin a valid statement (`statement := assignment | if_stmt | match_stmt`), so there is no ambiguity between continuation and a new statement. + +**Operator continuation:** Newlines are also suppressed when the preceding non-whitespace token cannot end a complete expression — specifically: binary operators (`+`, `-`, `*`, `/`, `%`, `==`, `!=`, `>`, `>=`, `<`, `<=`, `&&`, `||`), unary operators (`!`, `-`), assignment (`=`), arrows (`=>`, `->`), and colons (`:`). 
This allows multi-line expressions without requiring parentheses: +```bloblang +output.total = input.price + + input.tax + + input.shipping + +output.valid = input.count > 0 && + input.count < 100 + +output.result = + input.items + .filter(x -> x > 0) + .map(x -> x * 2) +``` +This is safe because none of these tokens can be the final token of a valid expression — they always require a right-hand operand. A dangling operator caused by a typo (e.g., accidental `+` at end of line) will produce a clear parse error when the next line cannot be parsed as a continuation. + +``` +# --- Lexical: newline handling --- +# NL represents one or more newline characters that act as statement separators. +# Inside parentheses () and brackets [] — newlines are treated as whitespace +# and do not produce NL tokens. The lexer tracks () and [] nesting depth; NL +# tokens are suppressed when inside these delimiters. +# Inside braces {} — newlines still produce NL tokens (needed to separate +# statements in block bodies). Object literals use comma separation, so NL +# tokens between entries are simply consumed as optional whitespace by the +# comma-separated list productions. +# Postfix continuation — NL tokens are suppressed when the next non-whitespace +# token is a postfix operator or continuation keyword: '.', '?.', '[', '?[', +# or 'else'. This enables multi-line method chains, indexing, and if/else +# chains without requiring same-line braces. +# Operator continuation — NL tokens are suppressed when the preceding +# non-whitespace token cannot end a complete expression: binary operators +# (+, -, *, /, %, ==, !=, >, >=, <, <=, &&, ||), unary operators (!, -), +# assignment (=), arrows (=>, ->), and colons (:). This enables multi-line +# binary expressions and assignments without requiring parentheses. +# Blank lines (consecutive newlines) and trailing newlines are allowed and +# collapsed — NL means "one or more newline boundaries." 
+# A program may optionally begin or end with NL (leading/trailing blank lines). + +program := NL? (top_level_statement (NL top_level_statement)*)? NL? +top_level_statement := statement | map_decl | import_stmt +statement := assignment | if_stmt | match_stmt + +# Statement contexts (top-level, if/match statement bodies): can assign to output, metadata, or variables +assignment := assign_target '=' expression +assign_target := 'output' metadata_accessor? path_component* | var_ref path_component* + +# Expression contexts (map bodies, lambda blocks, if/match expressions): can only assign to variables +var_assignment := var_ref path_component* '=' expression + +map_decl := 'map' identifier '(' [param_list] ')' '{' NL? (var_assignment NL)* expression NL? '}' +param_list := param (',' param)* +param := identifier | identifier '=' literal | '_' +import_stmt := 'import' string_literal 'as' identifier + +expression := postfix_expr | binary_expr | unary_expr +control_expr := if_expr | match_expr + +# Lambdas are NOT general expressions. They only appear as call arguments +# (positional or named) — see `arg_value` below. A lambda in any other +# expression position (assignment RHS, collection literal element, operator +# operand, paren_expr, etc.) is a parse error. + +# Postfix expressions: a primary followed by zero or more field access, indexing, or method calls. +# This unified production allows chaining any postfix operation on any expression result: +# input.items.filter(x -> x > 0)[0] — index a method result +# extract_user(input.data).name — field access on a call result +# "a,b,c".split(",")[0] — index a method on a literal +postfix_expr := primary_expr postfix_op* +primary_expr := literal | context_root | call_expr | control_expr | paren_expr +context_root := ('output' | 'input') metadata_accessor? | var_ref | qualified_name | identifier + +postfix_op := path_component | method_call +path_component := '.' field_name | '?.' 
field_name | '[' expression ']' | '?[' expression ']' +# Note: '.' field_name and '.' word '(' ... ')' both start with '.'. Disambiguation: +# after '.' word, if '(' follows it is a method call; otherwise it is field access. +method_call := '.' word '(' [arg_list] ')' | '?.' word '(' [arg_list] ')' + +metadata_accessor := '@' +field_name := word | string_literal +var_ref := '$' identifier + +call_expr := (identifier | qualified_name | reserved_name) '(' [arg_list] ')' +qualified_name := identifier '::' identifier + +if_expr := 'if' expression '{' NL? expr_body NL? '}' + (NL? 'else' 'if' expression '{' NL? expr_body NL? '}')* + (NL? 'else' '{' NL? expr_body NL? '}')? +if_stmt := 'if' expression '{' NL? stmt_body NL? '}' + (NL? 'else' 'if' expression '{' NL? stmt_body NL? '}')* + (NL? 'else' '{' NL? stmt_body NL? '}')? +expr_body := (var_assignment NL)* expression +stmt_body := (statement (NL statement)*)? + +match_expr := 'match' expression ('as' identifier)? '{' NL? expr_match_case (',' NL? expr_match_case)* ','? NL? '}' + | 'match' '{' NL? expr_match_case (',' NL? expr_match_case)* ','? NL? '}' +match_stmt := 'match' expression ('as' identifier)? '{' NL? stmt_match_case (',' NL? stmt_match_case)* ','? NL? '}' + | 'match' '{' NL? stmt_match_case (',' NL? stmt_match_case)* ','? NL? '}' +expr_match_case := (expression | '_') '=>' (expression | '{' NL? expr_body NL? '}') +stmt_match_case := (expression | '_') '=>' '{' NL? stmt_body NL? '}' + +# Note: The grammar uses the same case production for all three match forms +# (equality, boolean with 'as', boolean without expression). The distinction +# between these forms is semantic, not syntactic. 
A semantic pass must enforce: +# - With 'as': case expressions must evaluate to boolean (Section 4.2) +# - Without 'as' (equality form): case expressions that evaluate to boolean +# are a runtime error (Section 4.2) +# - Without expression: case expressions must evaluate to boolean (Section 4.2) + +# Note: this is a simplified flat production. Operator precedence, associativity, +# and non-associativity rules are defined in Section 3.2 and must be applied by +# the parser. In particular, chaining non-associative operators (e.g., a < b < c) +# is a parse error. +binary_expr := expression binary_op expression +binary_op := '+' | '-' | '*' | '/' | '%' | + '==' | '!=' | '>' | '>=' | '<' | '<=' | '&&' | '||' +unary_expr := unary_op expression +unary_op := '!' | '-' + +lambda_expr := lambda_params '->' (expression | lambda_block) +lambda_params := identifier | '_' | '(' param (',' param)* ')' +lambda_block := '{' NL? (var_assignment NL)* expression NL? '}' +paren_expr := '(' expression ')' + +literal := float_literal | int_literal | string_literal | boolean | null | array | object +int_literal := [0-9]+ # Must fit int64; overflow is a compile-time error +float_literal := [0-9]+ '.' [0-9]+ +boolean := 'true' | 'false' +null := 'null' +string_literal := '"' string_char* '"' | '`' raw_char* '`' +string_char := [^"\\\n] | escape_seq +escape_seq := '\\' ( '"' | '\\' | 'n' | 't' | 'r' | 'u' hex hex hex hex | 'u{' hex+ '}' ) + # \uXXXX: exactly 4 hex digits (BMP only, U+0000–U+FFFF) + # \u{X...}: the grammar accepts 1+ hex digits; a semantic pass must reject + # any value > U+10FFFF or any surrogate codepoint (U+D800–U+DFFF) as a + # compile error. Surrogate codepoints are also invalid in the \uXXXX form. +hex := [0-9a-fA-F] +raw_char := [^`] +array := '[' [expression (',' expression)* ','?] ']' +object := '{' NL? [key_value (',' NL? key_value)* ','?] NL? 
'}' +key_value := expression ':' expression + +arg_list := positional_args | named_args +positional_args := arg_value (',' arg_value)* ','? +named_args := identifier ':' arg_value (',' identifier ':' arg_value)* ','? +arg_value := expression | lambda_expr # Lambdas are only valid here +word := [a-zA-Z_][a-zA-Z0-9_]* # Raw lexical pattern (includes keywords and reserved names) +identifier := word - keyword - reserved_name # Excludes keywords and reserved names; used for variable/map/param names +keyword := 'input' | 'output' | 'if' | 'else' | 'match' | 'as' | 'map' | 'import' | 'true' | 'false' | 'null' | '_' +reserved_name := 'deleted' | 'throw' | 'void' # Reserved function names (Section 1.3); cannot be used as identifiers +``` + +## Key Points + +- **Disambiguation of `match` with `{`:** After `match`, if the next token is `{`, it is always the match body (boolean match without expression), never an object literal as the matched expression. This eliminates the ambiguity between `match { cases... }` (boolean match without expression) and `match {...} { cases... }` (an object literal as the matched expression). In practice, matching on a literal is dead code — the matched expression should always be dynamic. If parenthesization is ever needed for clarity, `match (expr) { ... }` works for any expression. +- **Disambiguation of `call_expr` vs `context_root`:** Both productions can start with `identifier` (or `qualified_name`). The parser must use one token of lookahead after the identifier (or after the second identifier in `qualified_name`) to check for `(`: if present, it is a `call_expr`; otherwise, it is a `context_root`. Reserved names (`deleted`, `throw`, `void`) always require `(` — they appear in `call_expr` but not `context_root`, so they can only be called, not used as bare values. This is standard LL(1) lookahead — the grammar is unambiguous but implementers should be aware of the need for it. +- **Unified postfix chains:** The `postfix_expr` production unifies field access, indexing, and method calls into a single chain. 
Any expression result can be followed by any combination of `.field`, `[index]`, and `.method()` operations. This means `func().field`, `expr.method()[0]`, and `literal["key"]` are all valid. +- **Top-level only:** Map declarations (`map_decl`) and imports (`import_stmt`) can only appear at the top level of a program, not inside statement bodies. Control flow statements (`if_stmt`, `match_stmt`) can be nested. +- **Variables:** `$var` for declaration and reference. The grammar has two assignment productions that reflect context restrictions: `assignment` (used in statement contexts: top-level, if/match statement bodies) can target `output`, `output@`, or `$variables`; `var_assignment` (used in expression contexts: map bodies, lambda blocks, if/match expressions) can only target `$variables`. Both support path assignment (`$var.field = expr`, `$var[0] = expr`) with the same field access, indexing, and auto-creation semantics as `output`. +- **Metadata:** `input@.key` (read), `output@.key` (write). Root metadata assignment (`output@ = expr`) requires the value to be an object at runtime (error otherwise); `output@ = deleted()` is also an error since metadata cannot be deleted (Section 9.2). +- **Context-dependent paths:** + - **Top-level assignments:** Targets must be `output` (with optional `@` for metadata) or `$variable`. `input` is read-only and cannot be assigned to. + - **Map/lambda bodies:** Can use bare identifiers for parameters **in expressions only** (e.g., `data.field` where `data` is a parameter). Parameters are read-only and cannot be assigned to. + - **Match with `as`:** Creates a read-only binding in expressions (e.g., `match input.x as val { val.field ... }`) + - **Name resolution:** The grammar's `context_root` accepts `identifier` and `qualified_name` in all expression contexts. 
A separate semantic pass must verify that every bare identifier resolves to a bound name — a map parameter, lambda parameter, match `as` binding, map name, or standard library function name. Namespace-qualified references (`namespace::name`) resolve to maps from imported modules. Unresolved identifiers are a compile-time error (Section 3.1). Resolution priority (innermost wins): parameters > maps > standard library functions. User-defined maps shadow standard library functions of the same name. When an identifier resolves to a map or standard library function name, it is only valid with parentheses (a call) or as an argument to a higher-order method (Section 5.5). Bare map/function names in other expression positions are a compile-time error. +- **Field and method names:** Field names after `.` and `?.` use `word` (any `[a-zA-Z_][a-zA-Z0-9_]*` token, including keywords). Keywords are valid as field names without quoting: `input.map`, `output.if`. Use `."string"` for names with special characters or spaces: `input."field name"`. Method names in `method_call` also use `word`, so standard library methods like `.map()` work despite `map` being a keyword. Declarations (variables, maps, parameters) use `identifier`, which excludes keywords. +- **Object literals:** Keys are expressions that **must** evaluate to strings at runtime (error if not): `{"key": value}` or `{$var: value}`. Use `.string()` for explicit type conversion. +- **Indexing:** `[expr]` on objects (string index), arrays (numeric index, must be whole number), strings (codepoint position, returns int64), bytes (byte position, returns int64). Negative indices supported for arrays, strings, and bytes. Indexing is a `postfix_op` and can follow any expression — including function calls, method chains, and literals. +- **Null-safe:** `?.` and `?[` short-circuit to `null` for field access and indexing; `?.method()` short-circuits to `null` for method calls. 
+- **Map calls:** `name(arg)` or `namespace::name(arg)` (positional or named arguments). Map names and namespace-qualified references can be passed directly as arguments to higher-order methods (e.g., `.map(double)`) — the compiler resolves them at compile time (Section 5.5). They are not runtime values and cannot be stored in variables. +- **Named arguments:** `func(a: 1, b: 2)` - cannot mix with positional arguments. Duplicate named arguments are a compile-time error. +- **Default parameters:** `map foo(x, y = 10) { ... }` or `(x, y = 10) -> expr`. Parameters with defaults must come after required parameters. Default values must be literals (`42`, `"hello"`, `true`, `false`, `null`). Discard parameters (`_`) cannot have defaults. +- **Discard parameters:** `_` is allowed as a parameter in maps and lambdas. It accepts an argument but does not bind it — `_` cannot be referenced in the body. Multiple `_` parameters are allowed. Maps or lambdas with `_` parameters can only be called positionally (named calls are a compile error). +- **Arity:** Positional calls must provide at least the required parameter count and at most the total count. Named calls must provide all required parameters; missing parameters with defaults use their defaults. Extra or unknown arguments are errors. Arity mismatches are compile-time errors when detectable, runtime errors otherwise. +- **Lambdas:** Single param `x -> expr`, multi-param `(a, b) -> expr`, with defaults `(a, b = 0) -> expr`, discard `_ -> expr` or `(_, b) -> expr`, block `x -> { ... }`. Lambda parameters are available as bare identifiers within the lambda body. `_` parameters are not bound and cannot be referenced. **Position restriction:** lambdas only appear as the `arg_value` of a positional or named argument — they are not general expressions. Any other position (assignment RHS, collection literal, operator operand, paren_expr, etc.) is a parse error. 
Whether a specific callee accepts a lambda at a given argument position is enforced by a semantic pass against the callee's signature. +- **Side effects:** + - Expressions cannot assign to `output` or `output@` + - Lambda blocks: Variable assignments + final expression (no `output` side effects) + - Map bodies: Same as lambda blocks — isolated functions that return values + - Maps cannot reference `input`, `output`, or top-level `$variables` (only their parameters) +- **Control flow forms:** + - `if_expr` / `match_expr`: Used in assignments, contain `expr_body` (no `output` assignments) + - `if_stmt` / `match_stmt`: Standalone statements, contain `stmt_body` (may assign to `output`) + - `expr_body`: Variable assignments + final expression (no `output` side effects) + - `stmt_body`: Zero or more statements (no trailing expression). Empty bodies are valid (no-op). +- **Void:** Not represented in the grammar — void is a semantic concept (absence of a value), not a syntactic form. It arises implicitly from if-expressions without `else` when the condition is false, from match expressions without `_` when no case matches, and from stdlib calls with no result (e.g. `.find()` on an array where no element matches); it can also be produced explicitly by the `void()` builtin (Section 13.1). Void is a purely runtime semantic: if void reaches a variable declaration (the first assignment to a name in a scope), it is a **runtime error** (Section 4.1). This ensures every declared variable always has a value. +- **Type coercion:** `+` requires same type family (no cross-family implicit conversion). Numeric types are promoted using promotion rules; non-numeric types require exact type match. +- **Operator precedence and associativity:** The `binary_expr` production is a simplified flat rule. Implementations must apply the precedence, associativity, and non-associativity rules from Section 3.2. 
Precedence (high to low): postfix operations (field access, indexing, method calls) > unary > multiplicative > additive > comparison > equality > logical AND > logical OR. Arithmetic and logical operators are left-associative; comparison and equality operators are non-associative (chaining is a parse error). +- **`{}` disambiguation:** In contexts where `{}` could be either an empty object literal or an empty block (e.g., inside a map body or lambda block), it is parsed as an empty object literal. Blocks (`expr_body`, `lambda_block`) require at least one expression, so an empty `{}` cannot be a valid block. + +## Context Examples + +**Top-level assignments:** +```bloblang +output.result = input.value # ✅ Valid: explicit context roots +$var = input.x # ✅ Valid: variable declaration +output.y = $var # ✅ Valid: variable in expression + +result = input.value # ❌ Invalid: bare identifier not allowed at top-level +``` + +**Map body (bare identifiers only in expressions, never assignments):** +```bloblang +map transform(data) { + $temp = data.field # ✅ Valid: 'data' in expression (RHS) + $temp.extra = "added" # ✅ Valid: variable path assignment + data.field * 2 # ✅ Valid: 'data' in expression +} + +map invalid(data) { + data = input.x # ❌ Invalid: cannot assign to parameter + data.field = 10 # ❌ Invalid: cannot assign through parameter + output.x = data.field # ❌ Invalid: cannot assign to output in map body +} +``` + +**Lambda body (bare identifiers only in expressions):** +```bloblang +input.items.map(item -> item.value * 2) # ✅ Valid: 'item' in expression +input.items.filter(x -> x.active) # ✅ Valid: 'x' in expression +``` + +**Match with `as` (creates read-only binding):** +```bloblang +output.result = match input.user as u { + u.type == "admin" => u.name, # ✅ Valid: 'u' in expression + _ => "guest", +} +``` + +**Key rule:** Bare identifiers (parameters and match `as` bindings) are **read-only** and can only appear in expressions, never as assignment targets. 
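The one-token lookahead described under Key Points for `call_expr` vs `context_root` can be sketched in Go roughly as follows. This is a minimal illustration, not the reference parser: the `classify` function, the `reserved` table, and the use of plain strings as tokens are all assumptions made for the example.

```go
package main

import "fmt"

// reserved lists the names that may only appear in call position
// (hypothetical table mirroring the reserved_name production).
var reserved = map[string]bool{"deleted": true, "throw": true, "void": true}

// classify looks at the name starting at toks[pos] (either `ident` or
// `ident :: ident`) and uses one token of lookahead past the name to pick
// a production. Tokens are plain strings here purely for illustration.
func classify(toks []string, pos int) (string, error) {
	end := pos + 1 // index of the first token after a simple identifier
	if end+1 < len(toks) && toks[end] == "::" {
		end += 2 // skip `::` and the second identifier of a qualified_name
	}
	if end < len(toks) && toks[end] == "(" {
		return "call_expr", nil
	}
	if reserved[toks[pos]] {
		// Reserved names are never valid as bare values.
		return "", fmt.Errorf("%s must be called: expected '('", toks[pos])
	}
	return "context_root", nil
}

func main() {
	for _, src := range [][]string{
		{"double", "(", "x", ")"},
		{"data", ".", "field"},
		{"math", "::", "double", "(", "x", ")"},
		{"deleted"},
	} {
		kind, err := classify(src, 0)
		fmt.Println(src, kind, err)
	}
}
```

A real parser would work on typed tokens and would still need the separate semantic pass to check that a `context_root` identifier resolves to a bound name.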
diff --git a/internal/bloblang2/spec/11_common_patterns.md b/internal/bloblang2/spec/11_common_patterns.md new file mode 100644 index 000000000..2bd2ce1d2 --- /dev/null +++ b/internal/bloblang2/spec/11_common_patterns.md @@ -0,0 +1,140 @@ +# 11. Common Patterns + +## Copy-and-Modify + +```bloblang +output = input +output.password = deleted() +output.updated_at = now() + +output@ = input@ +output@.kafka_topic = "processed" +``` + +## Null-Safe Access + +```bloblang +output.city = input.user?.address?.city +output.email = input.contact?.email.or("no-email@example.com") +output.first = input.users?[0]?.name +output.product = input.order?.items?[0]?.product?.name.or("Unknown") +``` + +## Error-Safe Parsing + +```bloblang +output.parsed = input.date + .ts_parse("%Y-%m-%d") + .catch(err -> input.date.ts_parse("%Y/%m/%d")) + .catch(err -> null) +``` + +## Array Transformation + +```bloblang +# Filter, map, sort +output.results = input.items + .filter(item -> item.active) + .map(item -> item.name.uppercase()) + .sort() + +# Object transformation +output.uppercased = input.data.map_values(v -> v.trim().uppercase()) +``` + +## Indexing Patterns + +```bloblang +# Arrays — null-safe handles missing field, catch handles out-of-bounds +output.first = input.items?[0].catch(err -> null) +output.last = input.items?[-1].catch(err -> null) + +# Strings (codepoint position, returns int64) +output.first_codepoint = input.name[0] # int64 Unicode codepoint + +# Dynamic indexing +output.selected = input.options[input.index].catch(err -> "invalid") +``` + +## Metadata Routing + +```bloblang +output@ = input@ +output@.kafka_topic = if input.priority == "high" { + "urgent-topic" +} else { + "normal-topic" +} + +# Enrich from document +output@.kafka_key = input.user_id +output@.content_type = "application/json" +``` + +## Recursive Tree Walking + +```bloblang +map walk(node) { + match node.type() { + "object" => node.map_values(v -> walk(v)), + "array" => node.map(elem -> walk(elem)), + 
"string" => node.uppercase(), + _ => node, + } +} +output = walk(input) +``` + +## Unary Minus with Methods + +```bloblang +# Method calls bind tighter than unary minus +# -10.string() # ERROR: parses as -(10.string()) = -("10") +(-10).string() # OK: "-10" +(-3.14).abs() # OK: 3.14 +``` + +## Boolean Dispatch + +```bloblang +# match equality form cannot match boolean values — use if/else instead +output.label = if input.flag { "yes" } else { "no" } + +# Else-if chains for multi-way branching +output.tier = if input.score >= 90 { + "gold" +} else if input.score >= 50 { + "silver" +} else { + "bronze" +} + +# For multi-way boolean dispatch, use match-without-expression or match-with-as +output.status = match { + input.enabled && input.verified => "active", + input.enabled => "pending", + _ => "disabled", +} +``` + +## Complex Conditional Transformations + +```bloblang +output.user = if input.user_type == "premium" { + $discount = 0.20 + (if input.loyalty_years > 5 { 0.05 } else { 0 }) + { + "id": input.user_id, + "tier": "premium", + "discount_rate": $discount + } +} else { + {"id": input.user_id, "tier": "basic", "discount_rate": 0} +} + +output.timestamp = match input.date_format { + "iso8601" => input.date.ts_parse("%Y-%m-%dT%H:%M:%S%z").ts_unix(), + "unix" => input.date.int64(), + _ => input.date.ts_parse("%Y-%m-%d").ts_unix(), +} +``` + diff --git a/internal/bloblang2/spec/12_implementation_guide.md b/internal/bloblang2/spec/12_implementation_guide.md new file mode 100644 index 000000000..70e9976f3 --- /dev/null +++ b/internal/bloblang2/spec/12_implementation_guide.md @@ -0,0 +1,131 @@ +# 12. Implementation Guide + +## 12.1 Standard Library + +See **[Section 13: Standard Library](13_standard_library.md)** for the complete reference of all required functions and methods. Implementations may provide additional functions and methods beyond the standard library. + +## 12.2 Optional Optimizations + +Implementations may optimize without changing observable behavior. 
Results must be identical with or without optimization. + +### Lazy Evaluation (Iterators) + +**Strategy:** Methods may return internal iterators instead of materializing arrays immediately. + +**Lazy methods (from standard library):** `.filter()`, `.map()`. If an implementation adds extension methods beyond the standard library (e.g., `.flat_map()`, `.take()`, `.drop()`, `.take_while()`, `.skip_while()`), these should also be made lazy. + +**Terminal methods (from standard library):** Any method that needs the full array (e.g., `.sort()`, `.reverse()`, `.length()`, `.fold()`, `.collect()`). See Section 13 for the complete method reference. + +**Materialization points:** +- Variable assignment: `$var = iterator` → array +- Output assignment: `output.x = iterator` → array +- Indexing: `iterator[0]` → array +- Terminal methods + +**Example:** +```bloblang +# Direct chain (stays lazy) +output.active_values = input.items + .filter(x -> x.active) + .map(x -> x.value) +# Single pass, no intermediate array + +# Variable breaks chain (materializes) +$filtered = input.items.filter(x -> x.active) # Materializes +output.values = $filtered.map(x -> x.value) # Second pass +``` + +**Benefit:** 10-100x faster for large datasets, no intermediate allocations. + +**`deleted()` in lazy `.map()`:** A lazy `.map()` must handle `deleted()` returns by omitting the element from the result, matching eager behavior (Section 9.2). In fused pipelines like `.filter().map()`, a `deleted()` return from `.map()` acts as an additional filter — both mechanisms must compose correctly. The observable result must be identical to eager evaluation. + +**Guarantee:** Variables always hold arrays (never iterators), fully reusable. 
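As a rough Go sketch of the lazy strategy above, assuming a pull-based iterator: the names `Iter`, `FromSlice`, `Filter`, `Map`, `Materialize`, and `deletedMarker` are hypothetical and chosen only for illustration. It shows `.filter()` and `.map()` composing without an intermediate array, a `deleted()` result from the map lambda being omitted as in eager evaluation, and materialization happening only at the end.

```go
package main

import "fmt"

// Iter is a hypothetical pull-based iterator: each call returns the next
// element, or ok=false once the sequence is exhausted.
type Iter func() (v any, ok bool)

// deletedMarker stands in for the spec's deleted() sentinel (assumption).
type deletedMarker struct{}

// FromSlice wraps an already-materialized array.
func FromSlice(xs []any) Iter {
	i := 0
	return func() (any, bool) {
		if i >= len(xs) {
			return nil, false
		}
		v := xs[i]
		i++
		return v, true
	}
}

// Filter lazily keeps elements for which pred returns true.
func Filter(src Iter, pred func(any) bool) Iter {
	return func() (any, bool) {
		for {
			v, ok := src()
			if !ok {
				return nil, false
			}
			if pred(v) {
				return v, true
			}
		}
	}
}

// Map lazily applies fn; a deletedMarker result omits the element,
// matching eager .map() behavior.
func Map(src Iter, fn func(any) any) Iter {
	return func() (any, bool) {
		for {
			v, ok := src()
			if !ok {
				return nil, false
			}
			out := fn(v)
			if _, del := out.(deletedMarker); del {
				continue
			}
			return out, true
		}
	}
}

// Materialize drains the iterator into an array; this is what a variable
// or output assignment would trigger.
func Materialize(src Iter) []any {
	out := []any{}
	for v, ok := src(); ok; v, ok = src() {
		out = append(out, v)
	}
	return out
}

func main() {
	items := []any{1, 2, 3, 4, 5, 6}
	evens := Filter(FromSlice(items), func(v any) bool { return v.(int)%2 == 0 })
	scaled := Map(evens, func(v any) any {
		if v.(int) == 4 {
			return deletedMarker{} // behaves like returning deleted() from the lambda
		}
		return v.(int) * 10
	})
	fmt.Println(Materialize(scaled)) // prints [20 60] after a single pass over items
}
```

Because `Materialize` always returns a fresh slice, a variable assigned from it holds an array rather than a one-shot iterator, which is what keeps the optimization invisible to user code.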
+ +### Pipeline Fusion + +Combine multiple operations into a single loop: +```bloblang +# User code +output.results = input.items + .filter(x -> x.active) + .map(x -> x.value * 2) + .filter(x -> x > 0) + +# Implementation may fuse into: +# for item in items: +# if item.active and item.value * 2 > 0: +# yield item.value * 2 +``` + +### Early Termination + +**Note:** `.any()` and `.all()` short-circuiting is a **required semantic**, not an optional optimization — see Section 13.6. Implementations must not evaluate elements beyond the determined result. + +### Constant Folding + +Evaluate constant expressions at parse time: +```bloblang +output.value = 2 + 2 # May compile to: output.value = 4 +``` + +### Dead Code Elimination + +Remove unreachable code: +```bloblang +if true { + output.a = "always" +} else { + output.b = "never" # May be eliminated +} +``` + +### String Builder + +Optimize repeated concatenation: +```bloblang +"a" + "b" + "c" + "d" # May use a string builder instead of intermediate strings +``` + +## 12.3 Intrinsic Methods + +`.catch()` and `.or()` are parsed as regular method calls (via the `method_call` postfix operation in `postfix_expr`) but require special handling by the runtime. They **cannot** be implemented as ordinary methods: + +- **`.catch(err -> expr)`** — Must intercept errors from the left-hand expression chain. Normal methods are skipped when the receiver is an error; `.catch()` is the opposite — it activates only on errors and passes through successful values unchanged. See Section 8.2. +- **`.or(default)`** — Must use short-circuit evaluation. Normal methods eagerly evaluate all arguments; `.or()` must *not* evaluate its argument unless the receiver is null, void, or `deleted()`. Additionally, `.or()` and `.catch()` are the only methods that can be called on void or `deleted()` — all other methods error on void and `deleted()` receivers. 
`.catch()` passes void and `deleted()` through unchanged (they are not errors); `.or()` actively rescues them. This matters when the argument has side effects or throws (e.g., `.or(throw("required"))`). See Section 8.3. + +Similarly, `throw()`, `deleted()`, and `void()` are parsed as regular function calls but their return values require special downstream tracking: + +- **`throw(message)`** — Produces a runtime error with the given message. Must be recognized so that short-circuit evaluation in `.or(throw("required"))` works correctly (the argument is only evaluated when `.or()` activates). See Section 8.4. +- **`deleted()`** — Produces a special deletion marker, not a normal value. Must be tracked through assignments and collection literals to trigger field removal, element omission, or message dropping. See Section 9.2. +- **`void()`** — Produces the void sentinel, the same value an `if` without `else` (or a non-exhaustive match) produces implicitly. Must flow through the same runtime paths as any other void-producing expression: assignment sites skip, collection literals and variable declarations error, `.or()` rescues, `.catch()` passes through. See Sections 4.1 and 13.1. + +Implementations should recognize these intrinsic methods and functions during compilation/interpretation and emit specialized instructions rather than routing them through the general dispatch path. 
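A minimal Go sketch of why `.or()` and `.catch()` need dedicated handling, under stated assumptions: `Result`, `Kind`, `Or`, `Catch`, and `Throw` are hypothetical names, and the argument of `.or()` is modeled as a thunk so that it is evaluated only when the receiver is null, void, or `deleted()`. Passing an error receiver through `.or()` unchanged is inferred from the rule that methods are skipped on error receivers.

```go
package main

import "fmt"

// Kind tags the outcome of evaluating an expression (hypothetical model).
type Kind int

const (
	KindValue Kind = iota
	KindNull
	KindVoid
	KindDeleted
	KindError
)

// Result carries either a payload or an error message.
type Result struct {
	Kind Kind
	Val  any    // payload for KindValue
	Err  string // message for KindError
}

// Throw models throw(message): it always yields an error result.
func Throw(msg string) Result { return Result{Kind: KindError, Err: msg} }

// Or models .or(default). The argument is a thunk so it is NOT evaluated
// unless the receiver is null, void, or deleted. Errors pass through.
func Or(recv Result, arg func() Result) Result {
	switch recv.Kind {
	case KindNull, KindVoid, KindDeleted:
		return arg()
	default:
		return recv
	}
}

// Catch models .catch(err -> expr). It activates only on errors; values,
// null, void, and deleted all pass through unchanged.
func Catch(recv Result, handler func(msg string) Result) Result {
	if recv.Kind == KindError {
		return handler(recv.Err)
	}
	return recv
}

func main() {
	evaluated := false
	required := func() Result {
		evaluated = true
		return Throw("required")
	}

	// Receiver has a value: the throw argument must not run.
	present := Or(Result{Kind: KindValue, Val: "alice"}, required)
	fmt.Println(present.Val, evaluated)

	// Receiver is null: the argument runs and its error can be caught.
	missing := Or(Result{Kind: KindNull}, required)
	rescued := Catch(missing, func(msg string) Result {
		return Result{Kind: KindValue, Val: "fallback after: " + msg}
	})
	fmt.Println(rescued.Val, evaluated)
}
```

An ordinary method dispatcher that evaluates all arguments before the call would run `throw("required")` even when the receiver is present, which is the behavior the spec rules out.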
+ +## 12.4 Error Messages + +Provide clear error messages with context: +``` +mapping.blobl:15:22: Type mismatch: cannot add int64 and string + output.result = 5 + "3" + ^^^ +``` + +Include: +- File name and location +- Clear description +- Suggested fix when possible + +## 12.5 Performance Expectations + +**Lazy evaluation benefits:** +- Filter + Map + Take (10K items): 10-15x faster +- Long pipeline (1M items): 3-5x faster, 99% less memory +- Early termination (find first in 1M): 100-1000x faster + +## 12.6 Testing Requirements + +- Results must match eager evaluation exactly +- Variable materialization must be transparent +- Iterator consumption must not leak to user code +- All examples in spec must execute correctly +- Float-to-string conversions (`.string()`, `.format_json()`) may vary across implementations due to different shortest-representation algorithms (Section 13.2). Conformance tests involving float string output should compare parsed numeric values rather than exact string representations diff --git a/internal/bloblang2/spec/13_standard_library.md b/internal/bloblang2/spec/13_standard_library.md new file mode 100644 index 000000000..aa541e6f9 --- /dev/null +++ b/internal/bloblang2/spec/13_standard_library.md @@ -0,0 +1,1434 @@ +# 13. Standard Library + +All implementations must provide these functions and methods. This is the complete required standard library — implementations may offer additional functions and methods beyond this list. + +**Namespace sharing:** Standard library functions share the same namespace as user-defined maps. User-defined maps shadow standard library functions of the same name. Resolution priority: parameters > maps > standard library functions. Like map names, standard library function names without parentheses can be passed as arguments to higher-order methods (e.g., `.sort_by(abs)`) but cannot be used as general-purpose expressions or stored in variables (Section 5.5). 
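The resolution priority described above (parameters, then user maps, then standard library functions) could be implemented along these lines. This is only a sketch: the `Scope` and `Binding` types are hypothetical, and match `as` bindings are assumed to be stored alongside parameters.

```go
package main

import "fmt"

// Binding describes what a bare identifier resolved to.
type Binding string

const (
	Param      Binding = "parameter"
	UserMap    Binding = "map"
	StdlibFunc Binding = "stdlib function"
	Unresolved Binding = "unresolved"
)

// Scope is a hypothetical flattened view of the names visible at one point
// in a mapping. Params is assumed to also hold match `as` bindings.
type Scope struct {
	Params map[string]bool
	Maps   map[string]bool
	Stdlib map[string]bool
}

// Resolve applies the priority order: parameters > maps > stdlib functions.
func (s Scope) Resolve(name string) Binding {
	switch {
	case s.Params[name]:
		return Param
	case s.Maps[name]:
		return UserMap // a user map shadows a stdlib function of the same name
	case s.Stdlib[name]:
		return StdlibFunc
	}
	return Unresolved // reported as a compile-time error by the caller
}

// BareUseAllowed reports whether name may appear without parentheses in a
// general expression position. Map and function names are only valid when
// called or passed to a higher-order method, so only parameters qualify.
func (s Scope) BareUseAllowed(name string) bool {
	return s.Resolve(name) == Param
}

func main() {
	s := Scope{
		Params: map[string]bool{"data": true},
		Maps:   map[string]bool{"abs": true, "double": true},
		Stdlib: map[string]bool{"abs": true, "now": true},
	}
	for _, name := range []string{"data", "abs", "now", "missing"} {
		fmt.Println(name, "->", s.Resolve(name), "bare use allowed:", s.BareUseAllowed(name))
	}
}
```

Here the user-defined `abs` map wins over the standard library `abs`, and `missing` resolves to nothing, which a compiler would surface as an unresolved-identifier error.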
+ +**Named arguments and arity:** All standard library functions and methods support named arguments using the parameter names shown in their signatures. For example, `random_int(min: 1, max: 100)` and `.replace_all(old: "x", new: "y")` are valid. The same rules apply as for user maps (Section 5.3): positional and named arguments cannot be mixed in the same call, duplicate named arguments are a compile-time error, and extra or mismatched arguments are errors. Parameters with default values may be omitted (Section 5.1). Parameters marked with `?` in their signatures are truly optional — they may be omitted entirely, and the method's documented behavior applies when they are absent. + +**Parameter type promotion:** When a method or function signature documents a specific numeric type (e.g., `int64`), any numeric type is accepted. Integer parameters accept any numeric value that is a whole number — float values like `2.0` are accepted but `1.5` is an error (consistent with indexing rules in Section 3.1). Float parameters accept any numeric type (integers are promoted to float using the standard promotion rules in Section 2.3). In all cases, checked promotion applies — values that cannot be represented exactly in the target type are a runtime error. + +**Lambda return values — void and `deleted()`:** Unless a method's documentation explicitly states otherwise, void and `deleted()` as lambda return values are runtime errors. The bullets below describe what the listed methods do when their *lambda* returns void, `deleted()`, or an error — receiver behaviour is covered separately in each method's documentation. (As a rule, only `.or()` and `.catch()` accept void or `deleted()` as a *receiver*; every other method errors on such a receiver, `.into()` included.) + +- **Omit from collection:** `.map()`, `.map_values()`, `.map_keys()`, `.map_entries()` — a lambda returning `deleted()` omits the element/entry from the result. 
+- **Lambda result flows to caller:** `.catch()`, `.into()` — the lambda's return value flows to the calling context with normal semantics. `deleted()`, void, and errors propagate unchanged and are handled where the method-call expression is consumed (field removal, assignment skipping, error propagation, etc.). + +**Regular expressions:** All regex parameters use [RE2 syntax](https://github.com/google/re2/wiki/Syntax). RE2 guarantees linear-time matching (no catastrophic backtracking). Notable exclusions from RE2: backreferences and lookahead/lookbehind assertions are not supported. + +--- + +## 13.1 Functions + +### `uuid_v4()` + +Generate a random UUID v4 string. + +- **Parameters:** none +- **Returns:** string +- **Example:** `uuid_v4()` → `"a3e7f1b0-1234-4abc-9def-567890abcdef"` + +### `now()` + +Return the current timestamp. Each call returns a fresh timestamp — consecutive calls may return different values, including within `.map()` lambdas. The returned timestamp carries the **process's local zone** as its stored zone (Section 2.3). This matters only for rendering via `.ts_format()` / `.string()` without an explicit zone directive — instant-based operations (`.ts_unix*()`, comparison, subtraction) are unaffected by the stored zone. For deterministic, cross-platform output, always include `%z` in the format string (the default format does) or use `timestamp(...)` to construct a UTC-zoned timestamp explicitly. + +- **Parameters:** none +- **Returns:** timestamp (stored zone: process local) +- **Example:** `now().ts_unix()` → `1709500000` + +### `random_int(min, max)` + +Return a random int64 in the inclusive range [min, max]. Error if min > max. + +- **Parameters:** `min` (int64), `max` (int64) +- **Returns:** int64 +- **Example:** `random_int(1, 100)` → `42` + +### `range(start, stop, step?)` + +Generate an array of integers from `start` (inclusive) to `stop` (exclusive) with the given step. 
When `step` is omitted, the direction is inferred: `1` if `start <= stop`, `-1` if `start > stop`. Error if step is zero. Error if an explicit step contradicts the direction (positive step with start > stop, or negative step with start < stop). If start == stop, returns an empty array regardless of step. + +- **Parameters:** `start` (int64), `stop` (int64), `step` (int64, optional — inferred from direction when omitted) +- **Returns:** array of int64 +- **Examples:** + ```bloblang + range(0, 5) # [0, 1, 2, 3, 4] (step inferred as 1) + range(5, 0) # [5, 4, 3, 2, 1] (step inferred as -1) + range(0, 5, 1) # [0, 1, 2, 3, 4] (explicit step) + range(0, 10, 3) # [0, 3, 6, 9] + range(5, 0, -2) # [5, 3, 1] + range(0, 0) # [] + range(0, 5, -1) # ERROR: step direction contradicts start/stop + range(5, 0, 1) # ERROR: step direction contradicts start/stop + range(0, 5, 0) # ERROR: step cannot be zero + ``` + +### `timestamp(year, month, day, hour = 0, minute = 0, second = 0, nano = 0, timezone = "UTC")` + +Construct a timestamp from individual components. 
+ +- **Parameters:** + - `year` (int64) + - `month` (int64, 1–12) + - `day` (int64, 1–31) + - `hour` (int64, 0–23, default `0`) + - `minute` (int64, 0–59, default `0`) + - `second` (int64, 0–59, default `0`) + - `nano` (int64, 0–999999999, default `0`) + - `timezone` (string, IANA timezone name or `"UTC"`, default `"UTC"`) +- **Returns:** timestamp (stored zone is the `timezone` argument) +- **Errors:** if any component is out of range, or if the timezone is not recognized +- **Examples:** + ```bloblang + timestamp(2024, 3, 1) # 2024-03-01T00:00:00Z + timestamp(2024, 3, 1, 12, 30, 0) # 2024-03-01T12:30:00Z + timestamp(2024, 3, 1, 12, 30, 0, 0, "America/New_York") # 2024-03-01T12:30:00-05:00 + timestamp(year: 2024, month: 3, day: 1) # 2024-03-01T00:00:00Z + timestamp(year: 2024, month: 12, day: 25, hour: 8) # 2024-12-25T08:00:00Z + ``` +- **Use this over `.ts_from_unix*()` or `now()` when deterministic cross-platform output is required** — the `timezone` argument makes the stored zone explicit. + +### `second()` + +Return the number of nanoseconds in one second (`1000000000`). This is a convenience constant for use with `.ts_add()` and other duration arithmetic. + +- **Parameters:** none +- **Returns:** int64 (`1000000000`) +- **Examples:** + ```bloblang + now().ts_add(second()) # 1 second later + now().ts_add(second() * -30) # 30 seconds ago + ``` + +### `minute()` + +Return the number of nanoseconds in one minute (`60000000000`). Convenience constant equivalent to `second() * 60`. + +- **Parameters:** none +- **Returns:** int64 (`60000000000`) +- **Examples:** + ```bloblang + now().ts_add(minute()) # 1 minute later + now().ts_add(minute() * 5) # 5 minutes later + ``` + +### `hour()` + +Return the number of nanoseconds in one hour (`3600000000000`). Convenience constant equivalent to `second() * 3600`. 
+ +- **Parameters:** none +- **Returns:** int64 (`3600000000000`) +- **Examples:** + ```bloblang + now().ts_add(hour()) # 1 hour later + now().ts_add(hour() * -2) # 2 hours ago + ``` + +### `day()` + +Return the number of nanoseconds in one day (`86400000000000`). Convenience constant equivalent to `second() * 86400`. Note: this is exactly 24 hours — it does not account for daylight saving time transitions or leap seconds. + +- **Parameters:** none +- **Returns:** int64 (`86400000000000`) +- **Examples:** + ```bloblang + now().ts_add(day()) # 1 day later + now().ts_add(day() * 7) # 1 week later + ``` + +### `throw(message)` + +Throw a custom error. The error propagates and can be caught with `.catch()`. If uncaught, it halts the mapping. + +- **Parameters:** `message` (string, required). Non-string literal arguments are a compile-time error; dynamic arguments that evaluate to a non-string type at runtime are a runtime error. +- **Returns:** never (always produces an error) +- **Example:** `throw("value is required")` +- **See:** Section 8.4 + +### `deleted()` + +Return a deletion marker. When assigned to a field or metadata key, removes it. When included in a collection literal, omits the element/field. When assigned to the root output (`output = deleted()`), drops the entire message and immediately exits the mapping. Assigning `deleted()` to a variable (`$var = deleted()`) is a runtime error. + +- **Parameters:** none +- **Returns:** deletion marker (not a runtime type) +- **See:** Section 9.2 + +### `void()` + +Return the void sentinel — the "no value was produced" signal normally emitted by an `if` without `else` or a non-exhaustive `match`. Provides an explicit spelling for the void value, useful when branching logic needs to "skip the assignment" in some cases without introducing a dummy `if false { x }` construct. 
+ +Same rules as any other void: assigning `void()` to a field skips the assignment (leaves the target unchanged); `void()` inside a collection literal, as a variable declaration RHS, or as an operand to most methods/operators is a runtime error (use `deleted()` to omit from collections). + +- **Parameters:** none +- **Returns:** void (not a runtime type) +- **See:** Section 4.1 (void semantics), Section 9.2 (deleted vs void) + +```bloblang +# Skip an assignment conditionally +output.status = if input.override_status { input.override_status } else { void() } +# Equivalent to the if-without-else form but explicit + +# Use inside a map result when no output is desired for some inputs +map classify(val) { + if val > 0 { "positive" } + else if val < 0 { "negative" } + else { void() } # zero inputs produce no classification +} +``` + +--- + +## 13.2 Type Conversion Methods + +These are the only way to create non-default numeric types, since literals are always int64 or float64. + +### `.string()` + +Convert a value to its string representation. + +- **Receiver:** any type +- **Returns:** string +- **Conversion rules:** + - Integer types: decimal representation (`42` → `"42"`, `-10` → `"-10"`) + - Float types: any shortest decimal representation that round-trips exactly back to the same float value, always including either a decimal point or exponent to distinguish the result from an integer string. When a non-exponent form and an exponent form have the same length, prefer the non-exponent form. `0.0` → `"0.0"`, `3.14` → `"3.14"`. For values where exponent notation is shorter, the exponent form is used — e.g., `1000000.0` may produce `"1e+06"`, `"1e6"`, or another equivalent shortest form (the exact exponent format is implementation-dependent). Negative zero normalizes to `"0.0"` (not `"-0.0"`). NaN produces `"NaN"`, Infinity produces `"Infinity"`, negative Infinity produces `"-Infinity"`. 
**Cross-implementation note:** Different shortest-representation algorithms (Ryu, Grisu3, etc.) may produce different valid outputs for the same float value. Conformance tests should compare parsed numeric values rather than exact string representations. + - Bool: `"true"` or `"false"` + - Null: `"null"` + - Timestamp: RFC 3339 format with shortest-precision fractional seconds — trailing zeros are removed and the fractional part (including `.`) is omitted entirely when zero. Examples: `"2024-03-01T12:00:00Z"`, `"2024-03-01T12:00:00.123Z"`. UTC is represented as `Z`. This matches `.ts_format()` with default arguments. + - Bytes: UTF-8 decode (error if bytes are not valid UTF-8) + - Array, object: compact JSON (equivalent to `.format_json()` with default parameters — object keys sorted lexicographically by Unicode codepoint value) + - Lambda: error + - **Containers with bytes:** If an array or object contains a bytes value (at any nesting depth), `.string()` errors — bytes have no implicit serialization format. Convert bytes explicitly before including them in structures that will be stringified (e.g., use `.encode("base64")` or `.string()` on individual bytes values before embedding them in arrays or objects). +- **Examples:** + ```bloblang + 42.string() # "42" (int64 → string) + 3.14.string() # "3.14" (float64 → string) + 0.0.string() # "0.0" (float64 — always includes decimal point) + true.string() # "true" + null.string() # "null" + [1, 2].string() # "[1,2]" + {"a": 1}.string() # "{\"a\":1}" + ``` + +### `.int32()` + +Convert a value to int32. Errors if the value cannot be represented as int32 (out of range or non-numeric string). Float values are **truncated** toward zero (fractional part discarded). Errors if the truncated value is out of int32 range. 
+ +- **Receiver:** numeric types, string +- **Returns:** int32 +- **Examples:** + ```bloblang + "42".int32() # 42 (int32) + 3.7.int32() # 3 (int32: truncated toward zero) + (-3.7).int32() # -3 (int32: truncated toward zero) + ``` + +### `.int64()` + +Convert a value to int64. Errors if the value cannot be represented as int64. Float values are **truncated** toward zero. + +- **Receiver:** numeric types, string +- **Returns:** int64 +- **Examples:** + ```bloblang + "42".int64() # 42 (int64) + 3.9.int64() # 3 (int64: truncated toward zero) + ``` + +### `.uint32()` + +Convert a value to uint32. Errors if the value is negative or out of range. Float values are **truncated** toward zero. + +- **Receiver:** numeric types, string +- **Returns:** uint32 +- **Example:** `"255".uint32()` → `255` (uint32) + +### `.uint64()` + +Convert a value to uint64. Errors if the value is negative or out of range. Float values are **truncated** toward zero. + +- **Receiver:** numeric types, string +- **Returns:** uint64 +- **Example:** `"1000".uint64()` → `1000` (uint64) + +### `.float32()` + +Convert a value to float32. Precision loss may occur for large values. Unlike implicit numeric promotion in arithmetic (Section 2.3), explicit conversion methods are unchecked — the caller is opting in to potential precision loss. + +- **Receiver:** numeric types, string +- **Returns:** float32 +- **Example:** `"3.14".float32()` → `3.14` (float32) + +### `.float64()` + +Convert a value to float64. Precision loss may occur for large integers. Unlike implicit numeric promotion in arithmetic (Section 2.3), explicit conversion methods are unchecked — the caller is opting in to potential precision loss. + +- **Receiver:** numeric types, string +- **Returns:** float64 +- **Example:** `"3.14".float64()` → `3.14` (float64) + +### `.bool()` + +Convert a value to boolean. 
+ +- **Receiver:** bool (identity — returned as-is), string (`"true"`, `"false"`), numeric (0 = false, non-zero = true) +- **Returns:** bool +- **Special float values:** Negative zero (`(-0.0).bool()`) is `false` (it is equal to zero per IEEE 754). Infinity and negative Infinity are `true` (non-zero). NaN is an error (neither zero nor non-zero). +- **Design note:** Numeric-to-boolean conversion is an explicit opt-in via `.bool()` — it does not happen implicitly. Logical operators (`&&`, `||`, `!`) still require boolean operands; `5 && true` is an error. This differs from V1, where numbers were silently accepted as booleans in logical expressions. +- **Examples:** + ```bloblang + true.bool() # true (identity) + "true".bool() # true + "false".bool() # false + 1.bool() # true + 0.bool() # false + ``` + +### `.char()` + +Convert an integer (Unicode codepoint) to a single-character string. This is the inverse of string indexing (`"hello"[0]` → `104`). + +- **Receiver:** any integer type (int64, int32, uint32, uint64) +- **Returns:** string (single codepoint) +- **Errors:** if the value is not a valid Unicode codepoint +- **Examples:** + ```bloblang + 104.char() # "h" + 233.char() # "é" + 128512.char() # "😀" + "hello"[0].char() # "h" (round-trip from string indexing) + ``` + +### `.bytes()` + +Convert a value to a byte array. For strings, returns the UTF-8 encoding. For all other supported types, equivalent to `.string().bytes()` (UTF-8 encoding of the string representation). + +- **Receiver:** any type +- **Returns:** bytes +- **Conversion rules:** + - String: UTF-8 encoding + - Bytes: returned as-is + - All other types (numeric, bool, null, timestamp, array, object): UTF-8 encoding of `.string()` result. Since this goes through `.string()`, containers (arrays/objects) with nested bytes values will error — convert bytes values explicitly (e.g., `.encode("base64")`) before calling `.bytes()` on a container. 
+ - Lambda: error +- **Examples:** + ```bloblang + "hello".bytes() # byte array (5 bytes) + "hello".bytes().bytes() # byte array (unchanged) + 42.bytes() # byte array of "42" (2 bytes) + true.bytes() # byte array of "true" (4 bytes) + ``` + +--- + +## 13.3 Type Introspection + +### `.type()` + +Return the type name of a value as a string. Works on any type including null. + +- **Receiver:** any type (including null) +- **Returns:** string — one of `"string"`, `"int32"`, `"int64"`, `"uint32"`, `"uint64"`, `"float32"`, `"float64"`, `"bool"`, `"null"`, `"bytes"`, `"timestamp"`, `"array"`, `"object"` +- **Examples:** + ```bloblang + "hello".type() # "string" + 42.type() # "int64" + 3.14.type() # "float64" + null.type() # "null" + now().type() # "timestamp" + [1, 2].type() # "array" + {"a": 1}.type() # "object" + ``` + +--- + +## 13.4 Sequence Methods + +These methods work on multiple sequence-like types: strings (codepoint-based), arrays (element-based), and bytes (byte-based). Each method is documented once; per-type behavior is noted where it differs. + +### `.length()` + +Return the length of a sequence, or the number of keys in an object. + +- **Receiver:** string, array, bytes, object +- **Returns:** int64 +- **Semantics:** strings count codepoints, arrays count elements, bytes count bytes, objects count keys +- **Examples:** + ```bloblang + "hello".length() # 5 (codepoints) + [1, 2, 3].length() # 3 (elements) + "hello".bytes().length() # 5 (bytes) + {"a": 1, "b": 2}.length() # 2 (keys) + ``` + +### `.contains(target)` + +Check if a sequence contains the given target. 
+ +- **Receiver:** string, array, bytes +- **Parameters:** `target` — string (for string receiver), any (for array receiver), bytes (for bytes receiver) +- **Returns:** bool +- **Semantics:** + - **string:** searches for a substring + - **array:** searches for an element by equality + - **bytes:** searches for a byte subsequence (target must be bytes) +- **Examples:** + ```bloblang + "hello world".contains("world") # true + [1, 2, 3].contains(2) # true + "hello".bytes().contains("ll".bytes()) # true + ``` +- **Note:** For object key checking, see `.has_key()` (Section 13.7). + +### `.index_of(target)` + +Return the index of the first occurrence of the target, or -1 if not found. + +- **Receiver:** string, array, bytes +- **Parameters:** `target` — string (for string receiver), any (for array receiver), bytes (for bytes receiver) +- **Returns:** int64 +- **Semantics:** + - **string:** returns codepoint index of first occurrence of substring + - **array:** returns element index of first match by equality + - **bytes:** returns byte index of first occurrence of byte subsequence +- **Examples:** + ```bloblang + "hello world".index_of("world") # 6 + [10, 20, 30].index_of(20) # 1 + "hello".bytes().index_of("ll".bytes()) # 2 + ``` + +### `.slice(low, high?)` + +Extract a subsequence. `low` is inclusive, `high` is exclusive. When `high` is omitted, the slice extends to the end of the sequence. Negative indices count from the end. Indices are clamped to the length — out-of-bounds indices do not error. If `low >= high` after clamping, returns an empty value of the same type.
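**Implementation sketch (non-normative):** The index rules for `.slice()` (negative indices, clamping, empty result when `low >= high`) can be modelled in a few lines of Python. The helper name `blob_slice` is hypothetical and not part of this spec. Treating a negative index that still falls below zero after adjustment as `0` is an assumption drawn from the clamping rule; the spec text does not spell that case out.

```python
def blob_slice(seq, low, high=None):
    """Sketch of .slice(low, high?) for str (codepoints), list, or bytes."""
    n = len(seq)
    if high is None:
        high = n  # omitted high extends to the end of the sequence

    def normalize(i):
        if i < 0:
            i += n  # negative indices count from the end
        # Clamp into [0, n]. Clamping i < -n to 0 is an assumption.
        return max(0, min(i, n))

    lo, hi = normalize(low), normalize(high)
    if lo >= hi:
        return seq[:0]  # empty value of the same type as the receiver
    return seq[lo:hi]
```

Python's `len()` on `str` counts codepoints, so the string case lines up with the codepoint-based semantics described for this method.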
+ +- **Receiver:** string, array, bytes +- **Parameters:** `low` (int64), `high` (int64, optional — defaults to sequence length when omitted) +- **Returns:** same type as receiver +- **Semantics:** strings slice by codepoint, arrays by element, bytes by byte +- **Examples:** + ```bloblang + "hello world".slice(0, 5) # "hello" + "hello world".slice(6) # "world" (high omitted — to end) + "hello world".slice(-5, -1) # "worl" + [1, 2, 3, 4, 5].slice(1, 3) # [2, 3] + [1, 2, 3, 4, 5].slice(2) # [3, 4, 5] (high omitted — to end) + "hello".bytes().slice(0, 3) # bytes("hel") + "hello".slice(0, 100) # "hello" (high clamped to length) + [1, 2, 3].slice(3, 1) # [] (low >= high) + ``` + +### `.reverse()` + +Reverse a sequence. + +- **Receiver:** string, array, bytes +- **Returns:** same type as receiver +- **Semantics:** strings reverse by codepoint, arrays by element, bytes by byte +- **Examples:** + ```bloblang + "hello".reverse() # "olleh" + [1, 2, 3].reverse() # [3, 2, 1] + "hello".bytes().reverse() # bytes("olleh") + ``` + +--- + +## 13.5 String Methods + +### `.uppercase()` + +Convert a string to uppercase. + +- **Receiver:** string +- **Returns:** string +- **Example:** `"hello".uppercase()` → `"HELLO"` + +### `.lowercase()` + +Convert a string to lowercase. + +- **Receiver:** string +- **Returns:** string +- **Example:** `"HELLO".lowercase()` → `"hello"` + +### `.trim()` + +Remove leading and trailing Unicode whitespace — characters with the Unicode `White_Space` property (space, `\t`, `\n`, `\r`, `\f`, `\v`, non-breaking space U+00A0, and other Unicode space characters). + +- **Receiver:** string +- **Returns:** string +- **Example:** `" hello ".trim()` → `"hello"` + +### `.trim_prefix(prefix)` + +Remove the given prefix from the start of the string. If the string does not start with the prefix, it is returned unchanged. 
+ +- **Receiver:** string +- **Parameters:** `prefix` (string) +- **Returns:** string +- **Examples:** + ```bloblang + "hello world".trim_prefix("hello ") # "world" + "hello world".trim_prefix("xyz") # "hello world" + ``` + +### `.trim_suffix(suffix)` + +Remove the given suffix from the end of the string. If the string does not end with the suffix, it is returned unchanged. + +- **Receiver:** string +- **Parameters:** `suffix` (string) +- **Returns:** string +- **Examples:** + ```bloblang + "hello world".trim_suffix(" world") # "hello" + "hello world".trim_suffix("xyz") # "hello world" + ``` + +### `.has_prefix(prefix)` + +Check if a string starts with the given prefix. + +- **Receiver:** string +- **Parameters:** `prefix` (string) +- **Returns:** bool +- **Example:** `"hello world".has_prefix("hello")` → `true` + +### `.has_suffix(suffix)` + +Check if a string ends with the given suffix. + +- **Receiver:** string +- **Parameters:** `suffix` (string) +- **Returns:** bool +- **Example:** `"hello world".has_suffix("world")` → `true` + +### `.split(delimiter)` + +Split a string by a delimiter. + +- **Receiver:** string +- **Parameters:** `delimiter` (string) +- **Returns:** array of strings +- **Examples:** + ```bloblang + "a,b,c".split(",") # ["a", "b", "c"] + "hello".split("") # ["h", "e", "l", "l", "o"] + "👋🏽".split("") # ["👋", "🏽"] (splits by codepoint, not grapheme) + "".split("") # [] (no codepoints) + "".split(",") # [""] (empty string with non-empty delimiter) + ``` + +### `.replace_all(old, new)` + +Replace all occurrences of a substring. + +- **Receiver:** string +- **Parameters:** `old` (string), `new` (string) +- **Returns:** string +- **Example:** `"hello world".replace_all("o", "0")` → `"hell0 w0rld"` + +### `.repeat(count)` + +Return the string repeated `count` times. Error if count is negative. 
+ +- **Receiver:** string +- **Parameters:** `count` (int64) +- **Returns:** string +- **Examples:** + ```bloblang + "ab".repeat(3) # "ababab" + "x".repeat(0) # "" + ``` + +### `.re_match(pattern)` + +Test if a string matches a regular expression. Returns true if the pattern matches any part of the string. + +- **Receiver:** string +- **Parameters:** `pattern` (string — RE2 regular expression) +- **Returns:** bool +- **Examples:** + ```bloblang + "hello123".re_match("[0-9]+") # true + "hello".re_match("^[a-z]+$") # true + "hello".re_match("^[0-9]+$") # false + ``` + +### `.re_find_all(pattern)` + +Return all non-overlapping matches of a regular expression. + +- **Receiver:** string +- **Parameters:** `pattern` (string — RE2 regular expression) +- **Returns:** array of strings +- **Examples:** + ```bloblang + "foo123bar456".re_find_all("[0-9]+") # ["123", "456"] + "hello".re_find_all("[0-9]+") # [] + ``` + +### `.re_replace_all(pattern, replacement)` + +Replace all matches of a regular expression with a replacement string. The replacement string supports RE2 expansion syntax: `$0` for the full match, `$1`/`$2` for numbered capture groups, and `${name}` for named capture groups. Use `$$` for a literal `$`. + +- **Receiver:** string +- **Parameters:** `pattern` (string — RE2 regular expression), `replacement` (string — with RE2 expansion) +- **Returns:** string +- **Examples:** + ```bloblang + "foo 123 bar 456".re_replace_all("[0-9]+", "N") # "foo N bar N" + "John Smith".re_replace_all("(\\w+) (\\w+)", "$2, $1") # "Smith, John" + "2024-03-01".re_replace_all("(?P<y>\\d{4})-(?P<m>\\d{2})-(?P<d>\\d{2})", "${d}/${m}/${y}") + # "01/03/2024" + ``` + +--- + +## 13.6 Array Methods + +### `.filter(fn)` + +Return a new array containing only elements for which the lambda returns `true`. The lambda must return a boolean — non-boolean return values (including void) are an error.
+ +- **Receiver:** array +- **Parameters:** `fn` — lambda (one parameter: element → bool) +- **Returns:** array +- **Examples:** + ```bloblang + [1, 2, 3, 4].filter(x -> x > 2) # [3, 4] + [1, -2, 3].filter(x -> x > 0) # [1, 3] + ``` + +### `.map(fn)` + +Transform each element of an array. Returns a new array. + +- The lambda must return a value for every element — void is an error +- If the lambda returns `deleted()`, the element is omitted from the result + +- **Receiver:** array +- **Parameters:** `fn` — lambda (one parameter: element → any) +- **Returns:** array +- **Examples:** + ```bloblang + [1, 2, 3].map(x -> x * 2) # [2, 4, 6] + [1, -2, 3].map(x -> if x > 0 { x } else { deleted() }) # [1, 3] + [1, -2, 3].map(x -> if x > 0 { x * 10 }) # ERROR: void when x <= 0 + ``` +- **See:** Section 4.1 for void and deleted() behavior in lambda returns + +### `.sort()` + +Sort an array in ascending order. Sort is **stable** (equal elements preserve relative order). All elements must belong to the same sortable type family — mixing across families is an error. + +**Sortable type families:** +- **Numeric** (int32, int64, uint32, uint64, float32, float64): promoted before comparison using standard rules (Section 2.3) +- **String**: compared lexicographically by Unicode codepoint +- **Timestamp**: compared chronologically + +Bool, null, bytes, array, and object are not sortable — an array containing these types will error. Cross-family mixing (e.g., numbers with strings) is also an error. + +**NaN ordering:** NaN values sort after all other numeric values (including Infinity). This follows the total ordering convention used by Go and Java rather than IEEE 754 comparison semantics. Multiple NaN values maintain their relative order (stable sort). 
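**Implementation sketch (non-normative):** One way to obtain the NaN ordering described for `.sort()` is to sort on a composite key that places every NaN in a separate, later bucket. The function name `sort_numeric` is hypothetical, and the sketch covers only the numeric family. Python compares `int` and `float` by value, which stands in for numeric promotion here.

```python
import math


def sort_numeric(values):
    """Stable ascending sort with NaN ordered after every other number."""

    def key(v):
        if isinstance(v, float) and math.isnan(v):
            # All NaNs share one key, so the stable sort keeps their input order.
            return (True, 0.0)
        return (False, v)

    # sorted() is guaranteed stable, matching the spec's stability requirement.
    return sorted(values, key=key)
```

Because `False < True`, every non-NaN value (including Infinity) sorts before any NaN, and elements keep their original types in the result.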
+ +- **Receiver:** array +- **Returns:** array +- **Examples:** + ```bloblang + [3, 1, 2].sort() # [1, 2, 3] + ["b", "a", "c"].sort() # ["a", "b", "c"] + [3, 1.5, 2].sort() # [1.5, 2, 3] (int64 promoted to float64) + [].sort() # [] (empty array, trivially valid) + [1, "a", true].sort() # ERROR: cannot sort mixed type families + ``` + +### `.sort_by(fn)` + +Sort an array using a key function. Sort is **stable**. The lambda extracts a sort key from each element; keys are compared using the same rules as `.sort()`. + +- **Receiver:** array +- **Parameters:** `fn` — lambda (one parameter: element → comparable value) +- **Returns:** array +- **Examples:** + ```bloblang + [{"name": "Bob"}, {"name": "Alice"}].sort_by(x -> x.name) + # [{"name": "Alice"}, {"name": "Bob"}] + + [3, -1, 2].sort_by(x -> x.abs()) # [-1, 2, 3] (sorted by absolute value) + ``` + +### `.append(value)` + +Return a new array with the value appended to the end. + +- **Receiver:** array +- **Parameters:** `value` (any) +- **Returns:** array +- **Example:** `[1, 2].append(3)` → `[1, 2, 3]` + +### `.concat(other)` + +Concatenate two arrays. Returns a new array with all elements from both. + +- **Receiver:** array +- **Parameters:** `other` (array) +- **Returns:** array +- **Example:** `[1, 2].concat([3, 4])` → `[1, 2, 3, 4]` + +### `.flatten()` + +Flatten nested arrays by one level. Non-array elements are kept as-is. + +- **Receiver:** array +- **Returns:** array +- **Examples:** + ```bloblang + [[1, 2], [3, 4]].flatten() # [1, 2, 3, 4] + [[1, [2]], [3]].flatten() # [1, [2], 3] (one level only) + [1, 2, 3].flatten() # [1, 2, 3] (no nesting, unchanged) + [1, [], 2].flatten() # [1, 2] (empty array spliced as zero elements) + ``` + +### `.unique(fn?)` + +Remove duplicate elements, preserving the first occurrence of each value. When `fn` is provided, the lambda extracts a comparison key from each element — elements are considered duplicates if their keys are equal. 
When `fn` is omitted, elements are compared directly by equality. Comparison uses equality semantics (Section 2.3), except that all NaN values are considered equal (consistent with `.sort()`'s total ordering). At most one NaN is retained. + +- **Receiver:** array +- **Parameters:** `fn` — lambda (one parameter: element → any), optional. When provided, extracts a comparison key from each element. +- **Returns:** array +- **Examples:** + ```bloblang + [1, 2, 2, 3, 1].unique() # [1, 2, 3] + ["a", "b", "a"].unique() # ["a", "b"] + [{"id": 1, "v": "a"}, {"id": 1, "v": "b"}].unique(x -> x.id) # [{"id": 1, "v": "a"}] + ["hello", "HELLO", "world"].unique(x -> x.lowercase()) # ["hello", "world"] + ``` + +### `.without_index(index)` + +Return a new array with the element at the given index removed. Remaining elements shift down. Negative indices count from the end. Out-of-bounds indices are an error. + +- **Receiver:** array +- **Parameters:** `index` (int64) +- **Returns:** array +- **Examples:** + ```bloblang + [10, 20, 30].without_index(1) # [10, 30] + [10, 20, 30].without_index(0) # [20, 30] + [10, 20, 30].without_index(-1) # [10, 20] + [10, 20, 30].without_index(5) # ERROR: index out of bounds + ``` +- **Design note:** Unlike `.without()` on objects (which accepts an array of keys), `.without_index()` takes a single index. Chaining multiple `.without_index()` calls is error-prone because indices shift after each removal. To remove multiple elements, use `.filter()` or `.enumerate().filter(...).map(e -> e.value)` instead. + +### `.enumerate()` + +Convert an array to an array of `{"index": i, "value": v}` objects. + +- **Receiver:** array +- **Returns:** array of objects +- **Example:** + ```bloblang + ["a", "b", "c"].enumerate() + # [{"index": 0, "value": "a"}, {"index": 1, "value": "b"}, {"index": 2, "value": "c"}] + ``` + +### `.any(fn)` + +Test if any element satisfies the predicate. Returns `false` for empty arrays. 
**Must** short-circuit on first `true` — subsequent elements are not evaluated (this is a required semantic, not an optimization). + +- **Receiver:** array +- **Parameters:** `fn` — lambda (one parameter: element → bool) +- **Returns:** bool +- **Examples:** + ```bloblang + [1, 2, 3].any(x -> x > 2) # true + [1, 2, 3].any(x -> x > 5) # false + [].any(x -> true) # false + ``` + +### `.all(fn)` + +Test if all elements satisfy the predicate. Returns `true` for empty arrays. **Must** short-circuit on first `false` — subsequent elements are not evaluated (this is a required semantic, not an optimization). + +- **Receiver:** array +- **Parameters:** `fn` — lambda (one parameter: element → bool) +- **Returns:** bool +- **Examples:** + ```bloblang + [1, 2, 3].all(x -> x > 0) # true + [1, 2, 3].all(x -> x > 2) # false + [].all(x -> false) # true + ``` + +### `.find(fn)` + +Return the first element that satisfies the predicate. **Must** short-circuit — subsequent elements are not evaluated after the first match (this is a required semantic, not an optimization). If no element matches, produces **void** — use `.or()` to provide a fallback. + +**Design note:** `.find()` produces void on no match (rather than returning `null` or erroring) because `null` could be a legitimate array element, and "no match" is genuinely the absence of a value — exactly what void represents. This is consistent with if-without-else and match-without-`_`, which also produce void when no branch yields a value. In contrast, `.index_of()` returns `-1` because indices are non-negative integers, making `-1` an unambiguous "not found" sentinel. 
+ +- **Receiver:** array +- **Parameters:** `fn` — lambda (one parameter: element → bool) +- **Returns:** any (the element), or void if no element matches +- **Examples:** + ```bloblang + [1, 2, 3].find(x -> x > 1) # 2 + [1, 2, 3].find(x -> x > 5) # void (no match) + [1, 2, 3].find(x -> x > 5).or(0) # 0 (void rescued) + output.val = [1, 2].find(x -> x > 5) # assignment skipped (void) + $x = [1, 2].find(x -> x > 5) # ERROR: void in variable declaration + $x = [1, 2].find(x -> x > 5).or(0) # 0 + ``` + +### `.join(delimiter)` + +Join array elements into a string with a delimiter. All elements must be strings — non-string elements are an error. + +- **Receiver:** array of strings +- **Parameters:** `delimiter` (string) +- **Returns:** string +- **Examples:** + ```bloblang + ["a", "b", "c"].join(",") # "a,b,c" + ["hello", "world"].join(" ") # "hello world" + [].join(",") # "" + ``` + +### `.sum()` + +Sum all numeric elements. All elements must be numeric — non-numeric elements are an error. Elements are pairwise promoted using the same rules as `+` (Section 2.3) — e.g., mixing int64 and float64 promotes all to float64. Returns `0` (int64) for empty arrays. + +- **Receiver:** array of numeric values +- **Returns:** numeric (promoted type) +- **Examples:** + ```bloblang + [1, 2, 3].sum() # 6 (int64) + [1.5, 2.5].sum() # 4.0 (float64) + [1, 1.5, 2].sum() # 4.5 (float64: int64 promoted to float64) + [].sum() # 0 (int64) + ``` + +### `.min()` + +Return the minimum element of an array. All elements must belong to the same sortable type family (same rules as `.sort()`). Empty arrays are an error. 
+ +- **Receiver:** array of sortable values (numeric, string, or timestamp — not mixed) +- **Returns:** same type as elements (promoted type for mixed numeric subtypes) +- **Examples:** + ```bloblang + [3, 1, 2].min() # 1 (int64) + [3.5, 1.2, 2.8].min() # 1.2 (float64) + [3, 1.5, 2].min() # 1.5 (float64: int64 promoted) + ["c", "a", "b"].min() # "a" + [].min() # ERROR: empty array + ``` + +### `.max()` + +Return the maximum element of an array. All elements must belong to the same sortable type family (same rules as `.sort()`). Empty arrays are an error. + +- **Receiver:** array of sortable values (numeric, string, or timestamp — not mixed) +- **Returns:** same type as elements (promoted type for mixed numeric subtypes) +- **Examples:** + ```bloblang + [3, 1, 2].max() # 3 (int64) + [3.5, 1.2, 2.8].max() # 3.5 (float64) + [3, 1.5, 2].max() # 3.0 (float64: int64 promoted) + ["c", "a", "b"].max() # "c" + [].max() # ERROR: empty array + ``` + +### `.fold(initial, fn)` + +Reduce an array to a single value by applying an accumulator function to each element. The lambda receives the running tally and the current element, and returns the new tally. + +- **Receiver:** array +- **Parameters:** `initial` (any — starting value), `fn` — lambda (two parameters: tally, element → any) +- **Returns:** any (the final tally) +- **Examples:** + ```bloblang + [1, 2, 3].fold(0, (tally, x) -> tally + x) # 6 + [1, 2, 3].fold(1, (tally, x) -> tally * x) # 6 + ["a", "b"].fold("", (tally, x) -> tally + x + ",") # "a,b," + ``` + +### `.collect()` + +Convert an array of `{"key": k, "value": v}` objects back into an object. Last value wins on duplicate keys. 
+ +- **Receiver:** array of objects (each must have `"key"` (string) and `"value"` (any) fields; extra fields are ignored) +- **Returns:** object +- **Errors:** if any element is not an object, is missing `"key"` or `"value"` fields, or if `"key"` is not a string +- **Examples:** + ```bloblang + [{"key": "a", "value": 1}, {"key": "b", "value": 2}].collect() # {"a": 1, "b": 2} + [{"key": "a", "value": 1}, {"key": "a", "value": 2}].collect() # {"a": 2} (last value wins) + [{"key": "a", "value": 1, "extra": true}].collect() # {"a": 1} (extra fields ignored) + [{"key": "a", "value": 1}, {"bad": true}].collect() # ERROR: element missing "key"/"value" fields + ``` +- **Note:** `.collect()` returns an object, and object key ordering is not preserved (Section 2.3). Sorting entries before `.collect()` (e.g., `.iter().sort_by(e -> e.key).collect()`) does not produce an object with ordered iteration — the sort order is lost. JSON serialization is deterministic (keys sorted lexicographically by `.format_json()` and `.string()`), but iteration order via `.iter()`, `.keys()`, and `.values()` is not guaranteed. + +--- + +## 13.7 Object Methods + +### `.iter()` + +Convert an object to an array of `{"key": k, "value": v}` objects. Order is not guaranteed. 
+ +- **Receiver:** object +- **Returns:** array of objects (each with string field `"key"` and any-typed field `"value"`) +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.iter() + # [{"key": "a", "value": 1}, {"key": "b", "value": 2}] (order not guaranteed) + + # Extract keys + {"a": 1, "b": 2}.iter().map(e -> e.key) # ["a", "b"] (order not guaranteed) + + # Extract values + {"a": 1, "b": 2}.iter().map(e -> e.value) # [1, 2] (order not guaranteed) + + # Complex transforms — use iter/collect + {"a": 1, "b": 2}.iter().map(e -> {"key": e.key.uppercase(), "value": e.value * 10}).collect() + # {"A": 10, "B": 20} + + # Filter entries + {"a": 1, "b": 2, "c": 3}.iter().filter(e -> e.value > 1).collect() + # {"b": 2, "c": 3} + ``` +- **Note:** For common transforms, prefer the dedicated methods `.map_values()`, `.map_keys()`, `.map_entries()`, and `.filter_entries()` — they are more concise. Use `.iter()`/`.collect()` for complex transforms that don't fit those patterns. + +### `.keys()` + +Return the keys of an object as an array of strings. Order is not guaranteed. + +- **Receiver:** object +- **Returns:** array of strings +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.keys() # ["a", "b"] (order not guaranteed) + {}.keys() # [] + ``` + +### `.values()` + +Return the values of an object as an array. Order is not guaranteed, but corresponds to the same order as `.keys()` within a single call. + +- **Receiver:** object +- **Returns:** array of any +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.values() # [1, 2] (order not guaranteed) + {}.values() # [] + ``` + +### `.has_key(key)` + +Check if an object contains the given key. + +- **Receiver:** object +- **Parameters:** `key` (string) +- **Returns:** bool +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.has_key("a") # true + {"a": 1, "b": 2}.has_key("c") # false + ``` + +### `.merge(other)` + +Merge two objects. If both objects contain the same key, the value from `other` wins. 
+ +- **Receiver:** object +- **Parameters:** `other` (object) +- **Returns:** object +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.merge({"b": 3, "c": 4}) # {"a": 1, "b": 3, "c": 4} + {"a": 1}.merge({}) # {"a": 1} + ``` + +### `.without(keys)` + +Return a new object with the specified keys removed. Keys that don't exist are ignored. + +- **Receiver:** object +- **Parameters:** `keys` (array of strings) +- **Returns:** object +- **Examples:** + ```bloblang + {"a": 1, "b": 2, "c": 3}.without(["a", "c"]) # {"b": 2} + {"a": 1}.without(["x"]) # {"a": 1} + {"a": 1, "b": 2}.without([]) # {"a": 1, "b": 2} + ``` + +### `.map_values(fn)` + +Transform the values of an object, keeping keys unchanged. Returns a new object. + +- The lambda must return a value for every entry — void is an error +- If the lambda returns `deleted()`, the entry is omitted from the result + +- **Receiver:** object +- **Parameters:** `fn` — lambda (one parameter: value → any) +- **Returns:** object +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.map_values(v -> v * 10) # {"a": 10, "b": 20} + {"a": "hello", "b": "world"}.map_values(v -> v.uppercase()) # {"a": "HELLO", "b": "WORLD"} + {"a": 1, "b": -2}.map_values(v -> if v > 0 { v } else { deleted() }) # {"a": 1} + ``` + +### `.map_keys(fn)` + +Transform the keys of an object, keeping values unchanged. Returns a new object. If multiple keys map to the same new key, last value wins. 
+ +- The lambda must return a value for every entry — void is an error +- If the lambda returns `deleted()`, the entry is omitted from the result (the `deleted()` check runs before the type check, so this does not trigger a type error) +- Otherwise, the lambda must return a string — non-string return values are an error + +- **Receiver:** object +- **Parameters:** `fn` — lambda (one parameter: key (string) → string | deleted) +- **Returns:** object +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.map_keys(k -> k.uppercase()) # {"A": 1, "B": 2} + {"user_name": "Alice"}.map_keys(k -> k.replace_all("_", "-")) # {"user-name": "Alice"} + ``` + +### `.map_entries(fn)` + +Transform both keys and values of an object. The lambda receives two parameters (key, value) and must return an object with `"key"` (string) and `"value"` (any) fields. If multiple entries produce the same key, last value wins. **Errors:** lambda returns a non-object, returned object is missing `"key"` or `"value"` field, or `"key"` is not a string. + +- The lambda must return a value for every entry — void is an error +- If the lambda returns `deleted()`, the entry is omitted from the result + +- **Receiver:** object +- **Parameters:** `fn` — lambda (two parameters: key (string), value (any) → object with `"key"` and `"value"` fields) +- **Returns:** object +- **Examples:** + ```bloblang + {"a": 1, "b": 2}.map_entries((k, v) -> {"key": k.uppercase(), "value": v * 10}) + # {"A": 10, "B": 20} + ``` + +### `.filter_entries(fn)` + +Filter entries of an object. The lambda receives two parameters (key, value) and must return a boolean. Returns a new object containing only entries for which the lambda returns `true`. 
+ +- **Receiver:** object +- **Parameters:** `fn` — lambda (two parameters: key (string), value (any) → bool) +- **Returns:** object +- **Examples:** + ```bloblang + {"a": 1, "b": 2, "c": 3}.filter_entries((k, v) -> v > 1) # {"b": 2, "c": 3} + {"aa": 1, "b": 2}.filter_entries((k, v) -> k.length() > 1) # {"aa": 1} + ``` + +--- + +## 13.8 Numeric Methods + +### `.abs()` + +Return the absolute value. For signed integer types, errors if the result overflows (the most-negative value of each signed type has no positive counterpart). For unsigned types, returns the value unchanged. + +- **Receiver:** any numeric type +- **Returns:** same type as receiver +- **Examples:** + ```bloblang + (-5).abs() # 5 (int64) + 3.14.abs() # 3.14 (float64) + (-3.14).abs() # 3.14 (float64) + (-2147483648).int32().abs() # ERROR: int32 overflow + ``` + +### `.floor()` + +Return the largest integer value less than or equal to the number. + +- **Receiver:** float32, float64 +- **Returns:** same float type as receiver +- **Examples:** + ```bloblang + 3.7.floor() # 3.0 (float64) + (-3.2).floor() # -4.0 (float64) + ``` + +### `.ceil()` + +Return the smallest integer value greater than or equal to the number. + +- **Receiver:** float32, float64 +- **Returns:** same float type as receiver +- **Examples:** + ```bloblang + 3.2.ceil() # 4.0 (float64) + (-3.7).ceil() # -3.0 (float64) + ``` + +### `.round(n = 0)` + +Round a float to `n` decimal places using **half-even rounding** (banker's rounding, IEEE 754 default). Defaults to `0` (round to nearest integer). Negative `n` rounds to powers of 10: `-1` rounds to nearest 10, `-2` to nearest 100, etc. 
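**Implementation sketch (non-normative):** Half-even rounding to `n` decimal places, including negative `n`, can be modelled with Python's `decimal` module. The function name `round_half_even` is hypothetical, and the sketch assumes finite inputs only (NaN and Infinity are not handled).

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(x, n=0):
    """Round a finite float to n decimal places using half-even rounding."""
    # 10**-n as a Decimal: n=2 gives 0.01, n=-2 gives 1E+2.
    exponent = Decimal(1).scaleb(-n)
    # Decimal(x) is the exact binary value of the float, so ties are detected
    # on the stored value, not on the decimal literal the user typed.
    return float(Decimal(x).quantize(exponent, rounding=ROUND_HALF_EVEN))
```

Ties are resolved toward the even neighbour at the target position, which is why `2.5` rounds down to `2.0` while `3.5` rounds up to `4.0`.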
+ +- **Receiver:** float32, float64 +- **Parameters:** `n` (int64, default `0` — number of decimal places; negative values round to powers of 10) +- **Returns:** same float type as receiver +- **Examples:** + ```bloblang + 3.7.round() # 4.0 (default: round to nearest integer) + 2.5.round() # 2.0 (half-even: rounds to nearest even) + 3.456.round(2) # 3.46 + 2.5.round(0) # 2.0 (half-even: rounds to nearest even) + 3.5.round(0) # 4.0 (half-even: rounds to nearest even) + 1234.0.round(-2) # 1200.0 (round to nearest hundred) + 1250.0.round(-2) # 1200.0 (half-even: rounds to nearest even hundred) + ``` + +--- + +## 13.9 Time Methods + +### `.ts_parse(format = "%Y-%m-%dT%H:%M:%S%f%z")` + +Parse a string into a timestamp using the given format string. Defaults to RFC 3339 format when no format is specified. + +**Stored zone on the result:** +- If the format string contains `%z` or `%Z` and the parsed input provides zone information, the result's stored zone is taken from the input (e.g., `+05:30` or `America/New_York`). +- If the format string has no zone directive, or it has one but the input value matches an empty zone position, the result is **UTC** — the stored zone is set to UTC and all parsed clock components are interpreted as UTC wall-clock time. + +- **Receiver:** string +- **Parameters:** `format` (string, default `"%Y-%m-%dT%H:%M:%S%f%z"` — strftime format) +- **Returns:** timestamp +- **Errors:** if the string does not match the format, or if the format string contains unrecognized directives +- **Examples:** + ```bloblang + "2024-03-01T12:00:00Z".ts_parse() # UTC (Z in input, stored zone UTC) + "2024-03-01T12:00:00.123Z".ts_parse() # UTC with fractional seconds + "2024-03-01T12:00:00+05:30".ts_parse() # Stored zone +05:30 + "2024-03-01".ts_parse("%Y-%m-%d") # No zone in format → UTC + # (result is 2024-03-01T00:00:00Z) + ``` + +**Required strftime directives:** All implementations must support the following subset. 
Additional directives are implementation-defined. + +| Directive | Meaning | Example | +|-----------|---------|---------| +| `%Y` | 4-digit year | `2024` | +| `%m` | Month (01–12) | `03` | +| `%d` | Day of month (01–31) | `01` | +| `%H` | Hour, 24-hour (00–23) | `14` | +| `%M` | Minute (00–59) | `30` | +| `%S` | Second (00–59) | `05` | +| `%f` | Fractional seconds (nanosecond precision, optional — see note) | `.123456789` | +| `%z` | UTC offset or `Z` (see note) | `Z`, `+05:30` | +| `%Z` | Timezone name (IANA or abbreviation) | `UTC`, `America/New_York` | +| `%a` | Abbreviated weekday name | `Mon` | +| `%A` | Full weekday name | `Monday` | +| `%b` | Abbreviated month name | `Jan` | +| `%B` | Full month name | `January` | +| `%p` | AM/PM (uppercase) | `PM` | +| `%I` | Hour, 12-hour (01–12) | `02` | +| `%j` | Day of year (001–366) | `061` | +| `%%` | Literal `%` | `%` | + +**`%f` semantics:** `%f` is not part of POSIX strftime but is widely supported for sub-second precision. **Parsing:** `%f` is optional — it consumes the leading `.` and 1–9 fractional digits if present, padding to nanoseconds (e.g., `.123` → 123000000 ns). If no `.` follows the seconds, `%f` matches zero characters and contributes zero fractional seconds. This allows a single format string like `"%Y-%m-%dT%H:%M:%S%f%z"` to parse both `"2024-03-01T12:00:00Z"` and `"2024-03-01T12:00:00.123Z"`. **Formatting:** `%f` emits the shortest representation that retains precision — trailing zeros are removed, and the directive is omitted entirely (including the `.`) when fractional seconds are zero. Examples: 123456789 ns → `.123456789`, 123000000 ns → `.123`, 0 ns → (empty). This differs from Python's `%f`, which always emits exactly 6 digits. + +**`%z` semantics:** **Parsing:** `%z` accepts both `Z` (UTC) and UTC offsets in the forms `+HH:MM`, `-HH:MM`, `+HHMM`, or `-HHMM`. **Formatting:** `%z` emits `Z` for UTC (the shortest RFC 3339 representation) and `±HH:MM` for all other offsets. 
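**Implementation sketch (non-normative):** The `%f` and `%z` rules above can be modelled as three small Python helpers. The names `format_frac`, `parse_frac`, and `format_offset` are hypothetical and not part of this spec. Treating any zero offset as UTC in `format_offset` is an assumption; an implementation that tracks zone names may need to distinguish them.

```python
import re

_FRAC = re.compile(r"\.([0-9]{1,9})")


def format_frac(nanos):
    """%f formatting: shortest fraction, empty string when zero."""
    if nanos == 0:
        return ""
    return "." + f"{nanos:09d}".rstrip("0")


def parse_frac(text, pos):
    """%f parsing: return (nanoseconds, new position).

    Consumes '.' plus 1-9 digits if present; otherwise matches zero
    characters and contributes zero fractional seconds.
    """
    m = _FRAC.match(text, pos)
    if m is None:
        return 0, pos
    return int(m.group(1).ljust(9, "0")), m.end()


def format_offset(offset_minutes):
    """%z formatting: 'Z' for UTC, otherwise +HH:MM or -HH:MM."""
    if offset_minutes == 0:  # assumption: zero offset is treated as UTC
        return "Z"
    sign = "+" if offset_minutes > 0 else "-"
    hours, minutes = divmod(abs(offset_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"
```

Making `parse_frac` succeed on an absent fraction is what lets a single default format string accept inputs both with and without sub-second digits.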
+ +### `.ts_format(format = "%Y-%m-%dT%H:%M:%S%f%z")` + +Format a timestamp as a string using the given format string. Defaults to RFC 3339 format when no format is specified. Supports the same required directives as `.ts_parse()`. + +**Output uses the receiver's stored zone** (Section 2.3). Clock components (`%Y`, `%m`, `%d`, `%H`, `%M`, `%S`, `%f`, `%I`, `%p`, `%j`, `%a`, `%A`, `%b`, `%B`) are rendered as they read in the stored zone, and `%z` / `%Z` emit that zone's offset / name. A timestamp produced by `now()` therefore renders in the process's local zone when the format string omits `%z`/`%Z` — include a zone directive (as the default format does) or construct timestamps explicitly via `timestamp(..., timezone: "UTC")` for deterministic output. Unrecognized directives in the format string are an error, matching `.ts_parse()`. + +- **Receiver:** timestamp +- **Parameters:** `format` (string, default `"%Y-%m-%dT%H:%M:%S%f%z"` — strftime format) +- **Returns:** string +- **Examples:** + ```bloblang + timestamp(2024, 3, 1, 12).ts_format() # "2024-03-01T12:00:00Z" (UTC default on constructor) + timestamp(2024, 3, 1, 12).ts_format("%Y-%m-%d %H:%M:%S") # "2024-03-01 12:00:00" + timestamp(2024, 3, 1, 12, 0, 0, 0, "America/New_York") + .ts_format() # "2024-03-01T12:00:00-05:00" + now().ts_format() # RFC 3339 in process local zone + # (e.g., "2024-03-01T14:32:05+01:00") + now().ts_format("%Y-%m-%d") # Local-zone date + # (host-dependent near midnight) + ``` +- **Note:** `.ts_format()` with default arguments produces the same output as `.string()` on a timestamp. Both use RFC 3339 with shortest-precision fractional seconds and the receiver's stored zone. + +### `.ts_unix()` + +Convert a timestamp to a Unix timestamp (seconds since epoch). + +- **Receiver:** timestamp +- **Returns:** int64 +- **Example:** `now().ts_unix()` → `1709500000` + +### `.ts_unix_milli()` + +Convert a timestamp to a Unix timestamp in milliseconds. 
+ +- **Receiver:** timestamp +- **Returns:** int64 +- **Example:** `now().ts_unix_milli()` → `1709500000000` + +### `.ts_unix_micro()` + +Convert a timestamp to a Unix timestamp in microseconds. + +- **Receiver:** timestamp +- **Returns:** int64 +- **Example:** `now().ts_unix_micro()` → `1709500000000000` + +### `.ts_unix_nano()` + +Convert a timestamp to a Unix timestamp in nanoseconds. + +- **Receiver:** timestamp +- **Returns:** int64 +- **Example:** `now().ts_unix_nano()` → `1709500000000000000` + +### `.ts_from_unix()` + +Convert a Unix timestamp (seconds since epoch) to a timestamp. Integer receivers produce second-precision timestamps. Float receivers provide sub-second precision — the fractional part is interpreted as fractions of a second. **Precision note:** float64 has ~15-17 significant decimal digits. For current Unix timestamps (~10 integer digits), this leaves ~6-7 fractional digits of precision — sufficient for microseconds but not nanoseconds. For full nanosecond precision, use `.ts_from_unix_nano()` with an int64 value instead. + +**Stored zone:** the result's stored zone is the **process's local zone** (same convention as `now()`), affecting only `.ts_format()` / `.string()` output. This applies to `.ts_from_unix()`, `.ts_from_unix_milli()`, `.ts_from_unix_micro()`, and `.ts_from_unix_nano()`. + +- **Receiver:** any numeric type (integers are widened to int64; float32 is widened to float64). uint64 values exceeding int64 range are a runtime error, consistent with signed+unsigned promotion rules (Section 2.3) +- **Returns:** timestamp +- **Examples:** + ```bloblang + 1709500000.ts_from_unix() # timestamp: 2024-03-03T...Z (second precision) + 1709500000.5.ts_from_unix() # timestamp: 2024-03-03T...500000000Z (sub-second) + 1709500000.123456.ts_from_unix() # ~microsecond precision (float64 limit) + ``` + +### `.ts_from_unix_milli()` + +Convert a Unix timestamp in milliseconds to a timestamp. 
Provides exact millisecond precision using integer arithmetic. + +- **Receiver:** int64 +- **Returns:** timestamp +- **Examples:** + ```bloblang + 1709500000000.ts_from_unix_milli() # same as 1709500000.ts_from_unix() + 1709500000123.ts_from_unix_milli() # exact millisecond precision + ``` + +### `.ts_from_unix_micro()` + +Convert a Unix timestamp in microseconds to a timestamp. Provides exact microsecond precision using integer arithmetic. + +- **Receiver:** int64 +- **Returns:** timestamp +- **Examples:** + ```bloblang + 1709500000000000.ts_from_unix_micro() # same as 1709500000.ts_from_unix() + 1709500000123456.ts_from_unix_micro() # exact microsecond precision + ``` + +### `.ts_from_unix_nano()` + +Convert a Unix timestamp in nanoseconds to a timestamp. Provides exact nanosecond precision using integer arithmetic. This is the lossless round-trip counterpart to `.ts_unix_nano()`. + +- **Receiver:** int64 +- **Returns:** timestamp +- **Examples:** + ```bloblang + 1709500000000000000.ts_from_unix_nano() # same as 1709500000.ts_from_unix() + 1709500000123456789.ts_from_unix_nano() # exact nanosecond precision + now().ts_unix_nano().ts_from_unix_nano() # lossless round-trip + ``` + +### `.ts_add(nanos)` + +Add a duration in nanoseconds to a timestamp. Negative values subtract. Use `second()` to avoid raw nanosecond constants. If the resulting timestamp would be outside the representable range, a runtime error is thrown (consistent with integer overflow rules in Section 2.3). + +- **Receiver:** timestamp +- **Parameters:** `nanos` (int64 — duration in nanoseconds) +- **Returns:** timestamp +- **Examples:** + ```bloblang + now().ts_add(second()) # 1 second later + now().ts_add(second() * -60) # 1 minute ago + now().ts_add(second() * 86400) # 1 day later + ``` + +--- + +## 13.10 Error Handling Methods + +### `.catch(fn)` + +Handle errors. Called only when the expression to its left produces an error. If the expression succeeds, `.catch()` returns its value unchanged. 
The error object has a single field: `.what` (string, the error message). + +- **Receiver:** any expression (catches errors from the left-hand side) +- **Parameters:** `fn` — lambda (one parameter: error object → any) +- **Returns:** any (either the original value or the lambda's result) +- **`deleted()` and void from handler:** The handler lambda may return `deleted()` or void — these flow to the calling context with normal semantics. For example, in a field assignment, `deleted()` removes the field and void skips the assignment. This mirrors `.or()`, which also supports `deleted()` (Section 8.3). +- **Examples:** + ```bloblang + input.date.ts_parse("%Y-%m-%d").catch(err -> null) + input.date.ts_parse("%Y-%m-%d").catch(err -> throw("parse failed: " + err.what)) + + # deleted() from handler — removes field on error + output.date = input.raw_date.ts_parse("%Y-%m-%d").catch(err -> deleted()) + + # void from handler — skips assignment on error + output.date = input.raw_date.ts_parse("%Y-%m-%d").catch(err -> if input.strict { throw(err.what) }) + ``` +- **See:** Section 8.2 + +### `.or(default)` + +Provide a default value for null, void, or `deleted()`. Takes exactly one argument — zero or multiple arguments are a compile-time error. Uses **short-circuit evaluation** — the argument is only evaluated if the receiver is null, void, or `deleted()`. Along with `.catch()`, this is one of only two methods that can be called on void or `deleted()`. `.catch()` passes them through unchanged; `.or()` actively rescues them. 
+ +- **Receiver:** any expression (including void and `deleted()`) +- **Parameters:** `default` (any expression, lazily evaluated; exactly one argument required) +- **Returns:** any (either the original value, or the default if receiver was null/void/deleted) +- **Examples:** + ```bloblang + input.name.or("Anonymous") + input.name.or(throw("name is required")) # throw() only evaluated if name is null + (if false { "hello" }).or("world") # "world" (void rescued) + (match input.x { "a" => 1 }).or(0) # 0 if no case matched (void rescued) + some_map(input.value).or("fallback") # "fallback" if map returned deleted() + ``` +- **See:** Section 8.3; Section 8.6 for how `.or()` and `.catch()` compose when chained together + +### `.not_null(message = "unexpected null value")` + +Assert that a value is not null. Returns the value unchanged if it is not null; throws an error with the given message if it is null. This is a concise alternative to `.or(throw("message"))` for null validation. + +- **Receiver:** any type (including null) +- **Parameters:** `message` (string, default `"unexpected null value"`) +- **Returns:** the receiver value, unchanged (if not null) +- **Errors:** if the receiver is null, throws an error with the given message +- **Examples:** + ```bloblang + "hello".not_null() # "hello" (not null, returned as-is) + 42.not_null() # 42 + null.not_null() # ERROR: unexpected null value + null.not_null("name is required") # ERROR: name is required + input.name.not_null("name is required") # value if present, error if null + ``` +- **Note:** `.not_null()` is a regular method — it cannot be called on void or `deleted()` (those error before reaching the method). To validate against null, void, and `deleted()` simultaneously, use `.or(throw("message"))` instead (Section 8.3). + +--- + +## 13.11 Parsing Methods + +### `.parse_json()` + +Parse a JSON string into a value. Errors if the string is not valid JSON. 
+ +**Numeric type mapping:** JSON numbers without a decimal point or exponent are parsed as int64 if the value fits in int64 range; if it exceeds int64 range, the value is parsed as float64 (which may lose precision for very large integers). JSON numbers with a decimal point or exponent are parsed as float64 (matching Bloblang float literal rules). **Note:** Large unsigned integers (between 2^63 and 2^64-1) exceed int64 range and are parsed as float64, which loses precision. To handle these values losslessly, represent them as JSON strings and convert explicitly: `"18446744073709551615".uint64()`. + +- **Receiver:** string, bytes (bytes are interpreted as UTF-8-encoded JSON; errors if bytes are not valid UTF-8) +- **Returns:** any (the parsed value) +- **Examples:** + ```bloblang + `{"name":"Alice"}`.parse_json() # {"name": "Alice"} + `[1,2,3]`.parse_json() # [1, 2, 3] (int64 elements) + `"hello"`.parse_json() # "hello" + `42`.parse_json() # 42 (int64: no decimal point) + `3.14`.parse_json() # 3.14 (float64: has decimal point) + `1e3`.parse_json() # 1000.0 (float64: has exponent) + ``` + +### `.format_json(indent = "", no_indent = false, escape_html = true)` + +Serialize a value to a JSON string. Object keys are sorted lexicographically by Unicode codepoint value (consistent with string comparison semantics in Section 2.3). Timestamp values are formatted as RFC 3339 strings (Section 2.3). **Note:** Since object key ordering is not preserved (Section 2.3) and keys are sorted on output, `.parse_json().format_json()` may produce different key ordering than the original JSON string. + +- **Receiver:** any type (except bytes) +- **Parameters:** + - `indent` (string, default `""`) — Indentation string. When non-empty, each element in a JSON object or array begins on a new line, indented with one or more copies of this string according to nesting depth. + - `no_indent` (bool, default `false`) — Disable indentation entirely, overriding `indent`. 
Produces compact output with no extra whitespace. + - `escape_html` (bool, default `true`) — Escape HTML-sensitive characters (`<`, `>`, `&`) as Unicode escape sequences in strings. +- **Returns:** string +- **Errors:** if the value is or contains a bytes value (at any nesting depth). Bytes have no implicit JSON serialization — use `.encode("base64")` or `.encode("hex")` before serializing. NaN and Infinity float values also error (not representable in JSON). +- **Numeric serialization:** + - Integer types (int32, int64, uint32, uint64): serialized as JSON integers (no decimal point, no quotes). Large uint64 values (> 2^53) are serialized as-is — the JSON spec imposes no range limit, though consumers using float64 may lose precision. + - Float types (float32, float64): serialized as any shortest decimal representation that round-trips exactly, always including either a decimal point or exponent to distinguish from integer serialization. This matches `.string()` behavior (including the cross-implementation note) and ensures that `.format_json()` → `.parse_json()` preserves the float type. Exponent notation is permitted (e.g., `1e+06`). +- **Examples:** + ```bloblang + {"name": "Alice"}.format_json() # `{"name":"Alice"}` + [1, 2, 3].format_json() # `[1,2,3]` + {"time": now()}.format_json() # `{"time":"2024-03-01T12:00:00Z"}` + {"name": "Alice", "age": 30}.format_json(indent: "  ") # pretty-printed with 2-space indent + {"name": "Alice"}.format_json(indent: "\t") # pretty-printed with tab indent + {"html": "<b>hi</b>"}.format_json() # `{"html":"\u003cb\u003ehi\u003c/b\u003e"}` + {"html": "<b>hi</b>"}.format_json(escape_html: false) # `{"html":"<b>hi</b>"}` + ``` + +### `.encode(scheme)` + +Encode a value into a string using the specified encoding scheme.
+ +- **Receiver:** string, bytes +- **Parameters:** `scheme` — string, one of: `"base64"`, `"base64url"`, `"base64rawurl"`, `"hex"` +- **Returns:** string +- **Schemes:** + - `"base64"` — standard Base64 with padding (RFC 4648) + - `"base64url"` — URL-safe Base64 with padding (RFC 4648) + - `"base64rawurl"` — URL-safe Base64 without padding (RFC 4648) + - `"hex"` — lowercase hexadecimal +- **Examples:** + ```bloblang + "hello".bytes().encode("base64") # "aGVsbG8=" + "hello".bytes().encode("hex") # "68656c6c6f" + "hello".encode("base64") # "aGVsbG8=" (string treated as UTF-8 bytes) + ``` + +### `.decode(scheme)` + +Decode a string from the specified encoding scheme into bytes. + +- **Receiver:** string +- **Parameters:** `scheme` — string, one of: `"base64"`, `"base64url"`, `"base64rawurl"`, `"hex"` +- **Returns:** bytes +- **Errors:** if the input is not valid for the specified scheme +- **Examples:** + ```bloblang + "aGVsbG8=".decode("base64") # bytes("hello") + "68656c6c6f".decode("hex") # bytes("hello") + "aGVsbG8=".decode("base64").string() # "hello" + ``` + +--- + +## 13.12 Pipeline Methods + +### `.into(fn)` + +Pass the receiver to a single-parameter lambda and return the lambda's result. Equivalent to binding the receiver to a name for use inside an expression-shaped body. Useful when a derived intermediate value is referenced multiple times in a transformation and extracting a full `let` statement or a named `map` would be heavier than warranted. + +- **Receiver:** any value type. As with every method other than `.or()` and `.catch()`, void and `deleted()` receivers are a runtime error (see Section 4.1); errors on the receiver propagate through `.into()` unchanged — the lambda is not called — and reach any downstream `.catch()` with normal semantics. +- **Parameters:** `fn` — lambda (one parameter: receiver → any). Arity must be exactly one; zero or multiple parameters is a compile-time error. +- **Returns:** whatever the lambda returns. 
If the lambda returns `deleted()`, void, or an error, those propagate through `.into()` unchanged; the calling context decides how to handle them. +- **Examples:** + ```bloblang + # Reuse a derived intermediate value without a separate let statement + output.summary = input.events + .filter(e -> e.kind == "purchase") + .into(purchases -> { + "count": purchases.length(), + "total": purchases.map(p -> p.amount).sum(), + }) + + # Chainable — behaves like any other method + output.normalized = input.name.trim().lowercase().into(s -> s.replace_all(" ", "_")) + + # Equivalent to a match _ as name binding, but reads as a pipeline step + output.label = input.score.into(s -> if s > 90 { "A" } else if s > 80 { "B" } else { "C" }) + + # Rescue void *before* .into(), since .into() errors on a void receiver + output.status = input.last_event.find(e -> e.kind == "login").or(null).into(login -> + if login == null { "never" } else { "recent" } + ) + ``` +- **Notes:** + - The lambda parameter is the only way to reference the receiver inside the body. V2 has no implicit-receiver keyword: `input`, `output`, and outer variables keep their normal meanings (see Section 3 on lambda scoping). + - `.into()` runs its lambda exactly once. It is not an iteration method; for array/object transformations use `.map()` / `.map_values()` / `.map_entries()`. + - If the lambda returns void (e.g. an `if` with no matching case), `.into()`'s result is void — the caller handles it like any other void (assignment skipped, collection-literal error, etc.). + - If the receiver might be void or `deleted()` (e.g. from a `.find()` call or a non-exhaustive match), rescue it with `.or(...)` *before* `.into()`; calling `.into()` on a void or `deleted()` receiver errors just like any other method call would. 
diff --git a/internal/bloblang2/spec/PROPOSAL.md b/internal/bloblang2/spec/PROPOSAL.md new file mode 100644 index 000000000..ea02f6b41 --- /dev/null +++ b/internal/bloblang2/spec/PROPOSAL.md @@ -0,0 +1,188 @@ +# Bloblang V2: Proposal for Redpanda Connect V5 + +## Context + +Redpanda Connect V5 is an opportunity to commit breaking changes. Any breaking change must meet a high bar: it should either reduce maintenance burden without impacting the majority of users, or improve the product in ways that aren't possible with backwards compatibility — ideally both. + +After consulting with a large number of customers we've determined that Bloblang is well-liked. Users generally don't want to migrate away from it, but they do want improvements to the language and better development tooling. This positions Bloblang as something to improve upon, not replace. + +## The Problem + +Bloblang V1 was never backed by a formal specification. The language grew organically, accumulating inconsistencies and ambiguities that make it: + +- **Hard to parse correctly** — edge cases and implicit behaviors make writing a correct parser difficult, and writing multiple parsers (e.g. in different languages) impractical. +- **Hard to build tooling for** — without a formal grammar, building LSP servers, linters, formatters, and other developer tools requires reverse-engineering behavior from the Go implementation. +- **Hard for AI to assist with** — LLMs cannot reliably help users write or debug mappings without a complete, unambiguous reference to ground their output against. +- **Expensive to maintain** — the existing parser and execution engine carry the weight of every historical ambiguity and implicit behavior. + +## V1 Pain Points + +These are concrete examples of ambiguities and traps in Bloblang V1 that cost us maintenance time and cost users debugging time. Each exists because V1 evolved without a formal spec, and each is resolved in V2. 
+ +### `this` silently changes meaning + +The most significant design problem. At top level, `this` refers to the input document. Inside `map_each`, it silently shifts to refer to the current array element — or, for objects, to a synthetic `{"key": k, "value": v}` wrapper that appears from nowhere: + +```bloblang +# V1: the two uses of `this` mean completely different things +root.names = this.users.map_each(this.name.uppercase()) +#            ^^^^                ^^^^ +#            input document      current array element +``` + +For objects it's worse — bare identifiers like `key` and `value` implicitly refer to fields on a hidden context object: + +```bloblang +# V1: where do `key` and `value` come from? nowhere obvious +root.result = this.dict.map_each(key + ": " + value) +``` + +V2 fixes this with explicit lambda parameters: +```bloblang +# V2: no ambiguity +output.names = input.users.map(user -> user.name.uppercase()) +output.result = input.dict.map_entries((k, v) -> {"key": k, "value": k + ": " + v}) +``` + +### Bare identifiers silently resolve to `this.field` + +Any unrecognized bare identifier is silently treated as `this.<identifier>`. A typo like `inpt.name` instead of `input.name` parses without error and returns `null` at runtime. The parser has a `TODO V5: Remove this and force this, root, or named contexts` comment acknowledging this. + +### Assignment targets don't require `root` + +`root.foo = bar` and `foo = bar` are silently identical — the `root.` prefix is stripped if present but not required. This means the left and right sides of an assignment use different naming conventions: on the left, `foo` means `root.foo`; on the right, `foo` means `this.foo`. The parser has a `TODO V5: Enforce root here` comment. + +### `null` is silently treated as `false` in conditionals + +`if this.field { ... }` treats a `null` value (field doesn't exist) the same as `false` (field exists and is explicitly false). These are semantically different — a missing field is not the same as a false field.
The implementation has a `TODO V5: Remove this` comment explaining this was discovered after users were already relying on it, so it couldn't be fixed without a breaking change. + +### `or()` conflates null, errors, deletion, and "nothing" + +The `.or()` method activates on errors, null values, `deleted()` markers, and the internal `Nothing` type — four semantically distinct conditions handled by one operator. Users cannot distinguish between "the value was legitimately null" and "an error occurred during evaluation": + +```bloblang +# V1: all of these return "fallback" — but for very different reasons +this.missing_field.or("fallback") # null +(5 / 0).or("fallback") # error +deleted().or("fallback") # deletion marker +nothing().or("fallback") # internal nothing type +``` + +V2 separates these cleanly: `.or()` handles null and void (absence of a value), `.catch()` handles only errors. + +### The pipe `|` looks like logical OR but is coalesce + +The `|` operator is a null/error coalesce, not a logical OR — but it uses the same symbol most developers associate with boolean logic. Like `.or()`, it silently swallows both errors and null values: + +```bloblang +# V1: this silently catches JSON parse errors, not just null +root.city = this.user.address.city | "Unknown" +``` + +V2 removes `|` entirely and provides `?.` for null-safe navigation and `.or()` for explicit null/void fallback. + +### Numbers silently coerce to booleans + +The `&&` and `||` operators accept numbers, coercing nonzero to `true` and zero to `false`. This means `5 && true` silently evaluates to `true` and `0 || false` evaluates to `false` — C-style implicit coercion that leads to subtle bugs. V2 requires booleans for logical operators. 
+ +### Overlapping metadata APIs + +V1 has four-plus ways to access metadata: `meta("key")` (deprecated, string-only, reads input), `root_meta("key")` (deprecated, string-only, reads output), `metadata("key")` (any-typed, reads input), and `@key` (any-typed, reads output). The deprecated `meta()` function also can't distinguish between an empty string value and a missing key — both return `null`. V2 has exactly two: `input@.key` and `output@.key`. + +### No null-safe navigation + +V1 has no `?.` operator. The only way to handle potentially-null nested access is the pipe operator `|`, which also swallows errors (see above). There is no way to say "navigate this path, short-circuiting on null but still surfacing type errors." + +### Match mixes equality and boolean cases + +A V1 `match` block decides between equality comparison and boolean evaluation based on whether each case is a literal value or a dynamic expression. You can mix both forms in the same block, and the parser silently picks the mode per-case: + +```bloblang +# V1: first case is boolean evaluation, second is equality comparison +match this.score { + this.score >= 100 => "gold" # boolean: is the result true? + 50 => "exactly fifty" # equality: is score == 50? + _ => "other" +} +``` + +V2 has three explicit match forms and makes mixing them a compile error. + +## The Proposal + +I propose introducing **Bloblang V2**: a redesigned version of the language with a formal specification, developed and validated with the assistance of AI coding agents. The language preserves what users like about Bloblang (expressive, composable, familiar) while fixing what costs us (ambiguity, inconsistency, implicit behavior). + +### Design Principles + +1. **Radical Explicitness** — no implicit context shifting; all references are explicit (`input`/`output` instead of overloaded `this`/`root`). +2. 
**One Clear Way** — a single obvious approach for each operation, reducing cognitive load and eliminating "which way do I do this?" moments. +3. **Consistent Syntax** — symmetrical keywords, consistent prefixes, predictable patterns. +4. **Fail Loudly** — errors are explicit, not silent. No more wondering whether a mapping silently dropped data. + +### Key Language Improvements + +- **Explicit contexts**: `input`/`output` replace overloaded `this`/`root` +- **Expanded type system**: explicit timestamps, explicit lambda parameters for methods, and multiple integer/float widths (see Section 2.1) +- **Null-safe navigation**: `?.` and `?[]` operators +- **Isolated maps with parameters**: maps become proper functions — no implicit access to input/output context +- **Namespace imports**: `import "./utils.blobl" as utils` with `utils::function()` syntax +- **Formal grammar**: unambiguous, machine-parseable specification + +## Development Plan + +The plan leverages AI coding agents throughout the process — not as a shortcut, but as a forcing function. If the spec is good enough for AI to build correct implementations and tooling from, it's good enough for humans too. + +### Phase 1: Specification (in progress) + +Build a formal spec and iterate until both humans and AI are satisfied with its consistency and ergonomics. The spec must be: +- Unambiguous enough for a parser to be generated from it +- Complete enough that an AI agent can answer any question about the language +- Compressed enough to fit in an LLM context window + +**Status**: In progress. The specification lives on a branch as 13 sections covering the full language, from lexical structure through a formal grammar to a complete standard library reference. + +### Phase 2: Test Suite + +Generate a thorough test suite of mappings covering the entire breadth of the language. These are spec-level conformance tests — any correct implementation of Bloblang V2 should pass them all. 
+ +### Phase 3: Multi-Implementation Validation + +Prove the robustness of the spec by having multiple independent AI agents generate implementations, then run those implementations against the conformance test suite from Phase 2. If different agents, working only from the spec, produce implementations that agree on all test cases, the spec is unambiguous. Where they disagree, we've found a gap in the spec to fix. + +### Phase 4: Tooling Generation + +Prove that agents can generate useful development tooling directly from the spec: LSP servers, syntax highlighters, formatters, linters. This validates that the spec supports the tooling ecosystem users have asked for. + +### Phase 5: AI-Assisted Development + +Test agents' ability to assist with language development when provided the spec. Can they help users write mappings? Debug them? Explain error messages? This validates the spec as a grounding document for AI-assisted workflows. + +### Phase 6: Opt-In Introduction + +If phases 2–5 are successful, introduce Bloblang V2 as an optional processor in Redpanda Connect (still V4), and gather real user feedback before any forced migration. + +### Phase 7: Migration Tooling + +Build a tool that converts Bloblang V1 mappings to V2. Critically, this tool must surface areas where V1 mappings rely on poorly defined or implicit behavior, so users can verify the converted output rather than trusting a silent best-effort translation. + +### Phase 8: Full Config Migration + +Expand the migration tool to convert entire Redpanda Connect V4 configs to V5, including Bloblang V1→V2 conversion as one component of the broader migration. + +## Why This Approach + +### AI changes the cost equation for DSLs + +Maintaining a domain-specific language has historically been expensive: parsers, tooling, documentation, and support all require ongoing investment. AI coding agents are changing this equation. 
With a good formal spec, much of this work — parser generation, tooling, documentation, user assistance — can be substantially automated. The investment shifts from "maintain everything by hand" to "maintain one excellent spec and generate the rest." + +### AI will be helping users write Bloblang regardless + +Users are already asking AI to help them write and debug Bloblang mappings. Today that assistance is grounded in whatever the LLM absorbed from documentation and blog posts during training — an incomplete and sometimes inaccurate picture. A formal spec gives AI a complete, authoritative reference to work from, making the assistance users are already seeking dramatically more reliable. + +### A spec-first approach de-risks the whole effort + +By validating the spec through multiple independent implementations and generated tooling before committing to a release, we catch design problems early. If agents struggle to implement a feature correctly from the spec, that's a signal the feature is under-specified or poorly designed — and we can fix it before users ever see it. + +### Multi-language parsers unlock new deployment models + +Bloblang V1 exists only as a Go implementation. A spec that's proven to be implementable by AI agents in multiple languages opens the door to native Bloblang support in other runtimes — WebAssembly for browser-based tooling, TypeScript for VS Code extensions, Rust for performance-critical paths — without maintaining each implementation by hand. 
diff --git a/internal/bloblang2/spec/README.md b/internal/bloblang2/spec/README.md new file mode 100644 index 000000000..84c669dd0 --- /dev/null +++ b/internal/bloblang2/spec/README.md @@ -0,0 +1,105 @@ +# Bloblang V2 Technical Specification + +## Core Principles + +- **Explicit Context Management** - No implicit behavior +- **One Clear Way** - Single obvious approach +- **Consistent Syntax** - Predictable patterns +- **Fail Loudly** - Errors are explicit + +--- + +## Specification Sections + +1. **[Overview & Lexical Structure](01_overview.md)** - Introduction, design philosophy, tokens, literals +2. **[Type System & Coercion](02_type_system.md)** - Runtime types, type conversion, coercion rules +3. **[Expressions & Statements](03_expressions.md)** - Paths, operators, functions, lambdas, assignments, variables +4. **[Control Flow](04_control_flow.md)** - If expressions/statements, match expressions/statements +5. **[Maps](05_maps.md)** - User-defined reusable transformations +6. **[Imports & Modules](06_imports.md)** - Namespace imports, file resolution +7. **[Execution Model](07_execution_model.md)** - Immutability, contexts, scoping, metadata +8. **[Error Handling](08_error_handling.md)** - `.catch()`, `.or()`, `throw()`, validation +9. **[Special Features](09_special_features.md)** - Dynamic fields, message filtering, non-structured data +10. **[Grammar Reference](10_grammar.md)** - Formal grammar definition +11. **[Common Patterns](11_common_patterns.md)** - Practical examples and idioms +12. **[Implementation Guide](12_implementation_guide.md)** - Optional optimizations, performance +13. 
**[Standard Library](13_standard_library.md)** - Required functions and methods reference + +--- + +## Quick Reference + +### Basic Syntax + +```bloblang +# Assignment +output.field = input.field +output@.key = input@.key + +# Variables +$user = input.user +$name = $user.name.uppercase() + +# Null-safe +output.city = input.user?.address?.city.or("Unknown") + +# Functional +output.results = input.items + .filter(item -> item.active) + .map(item -> item.value * 2) + .sort() + +# Conditionals +output.tier = match input.score as s { + s >= 100 => "gold", + s >= 50 => "silver", + _ => "bronze", +} + +# Maps (isolated functions) +map normalize(data) { + { + "id": data.user_id, + "name": data.full_name + } +} +output.user = normalize(input.user_data) + +# Imports +import "./utils.blobl" as utils +output.result = utils::transform(input.data) +``` + +### Key Features + +- **Immutable input:** `input` never changes +- **Mutable output:** `output` built incrementally +- **Mutable variables:** `$var` can be reassigned, block-scoped with shadowing +- **Null-safe operators:** `?.` and `?[]` +- **Explicit type coercion:** No implicit conversion +- **Function-style maps:** Called as `name(arg)` or `namespace::name(arg)` +- **Namespace imports:** `import "..." as name` +- **Lambda syntax:** Multi-param, multi-statement, for method arguments and map bodies + +## For Implementers + +See **Section 12: Implementation Guide** for: +- Optional optimization strategies (iterators, fusion) +- Performance expectations +- Testing requirements + +See **Section 13: Standard Library** for: +- Complete reference of all required functions and methods + +## Contributing to the Spec + +**Behavioural and feature changes must be expressed as tests in `./tests`.** Prose in the Markdown files is guidance for humans; the YAML conformance suite is what implementations have to pass. 
A change that is only in the prose is effectively optional — implementations may or may not adopt it, and there is no mechanical way to detect drift. + +When you modify the spec: + +- **Add or update tests** in `./tests` that exercise the new or changed behaviour. The test should fail on an implementation that has not adopted the change and pass on one that has. +- **Pure clarifications** (rewording, table reshaping, adding cross-references, resolving ambiguity in a way already consistent with existing tests) do not require new tests. Note this explicitly in the commit or PR so reviewers don't chase a missing test. +- **Removing or loosening a constraint** still counts as a behaviour change — add a test that would have failed under the old constraint. + +See `./tests/README.md` for the test schema. + From 4c946d5a15a0754e7bd70e5f4b68a2b3586441b7 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Thu, 30 Apr 2026 15:57:28 +0100 Subject: [PATCH 02/20] bloblang(v2): Add Go scanner, parser, AST, optimizer and printer Adds the syntax half of the Go runtime under internal/bloblang2/go/pratt/syntax/: a hand-written scanner, a Pratt parser producing an AST, a name resolver, a post-parse optimizer pass, and a V2 pretty-printer used by the migrator's output stage. Includes parser and scanner unit tests plus native Go fuzz harnesses (FuzzParse, FuzzScanner) with a seed corpus. 
--- internal/bloblang2/go/pratt/syntax/ast.go | 495 +++++++ .../go/pratt/syntax/fuzz_parser_test.go | 61 + .../go/pratt/syntax/fuzz_scanner_test.go | 29 + .../bloblang2/go/pratt/syntax/fuzz_test.go | 151 ++ .../bloblang2/go/pratt/syntax/optimize.go | 593 ++++++++ internal/bloblang2/go/pratt/syntax/parser.go | 1281 +++++++++++++++++ .../bloblang2/go/pratt/syntax/parser_test.go | 645 +++++++++ internal/bloblang2/go/pratt/syntax/print.go | 1025 +++++++++++++ .../bloblang2/go/pratt/syntax/print_test.go | 571 ++++++++ .../go/pratt/syntax/print_trivia_test.go | 103 ++ .../bloblang2/go/pratt/syntax/resolver.go | 1056 ++++++++++++++ internal/bloblang2/go/pratt/syntax/scanner.go | 566 ++++++++ .../bloblang2/go/pratt/syntax/scanner_test.go | 438 ++++++ .../testdata/fuzz/FuzzParse/2fa037f88dc1a315 | 2 + internal/bloblang2/go/pratt/syntax/token.go | 288 ++++ 15 files changed, 7304 insertions(+) create mode 100644 internal/bloblang2/go/pratt/syntax/ast.go create mode 100644 internal/bloblang2/go/pratt/syntax/fuzz_parser_test.go create mode 100644 internal/bloblang2/go/pratt/syntax/fuzz_scanner_test.go create mode 100644 internal/bloblang2/go/pratt/syntax/fuzz_test.go create mode 100644 internal/bloblang2/go/pratt/syntax/optimize.go create mode 100644 internal/bloblang2/go/pratt/syntax/parser.go create mode 100644 internal/bloblang2/go/pratt/syntax/parser_test.go create mode 100644 internal/bloblang2/go/pratt/syntax/print.go create mode 100644 internal/bloblang2/go/pratt/syntax/print_test.go create mode 100644 internal/bloblang2/go/pratt/syntax/print_trivia_test.go create mode 100644 internal/bloblang2/go/pratt/syntax/resolver.go create mode 100644 internal/bloblang2/go/pratt/syntax/scanner.go create mode 100644 internal/bloblang2/go/pratt/syntax/scanner_test.go create mode 100644 internal/bloblang2/go/pratt/syntax/testdata/fuzz/FuzzParse/2fa037f88dc1a315 create mode 100644 internal/bloblang2/go/pratt/syntax/token.go diff --git a/internal/bloblang2/go/pratt/syntax/ast.go 
b/internal/bloblang2/go/pratt/syntax/ast.go new file mode 100644 index 000000000..3adab500f --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/ast.go @@ -0,0 +1,495 @@ +package syntax + +// Node is the interface implemented by all AST nodes. +type Node interface { + nodePos() Pos +} + +// Expr is the interface implemented by all expression nodes. +type Expr interface { + Node + exprNode() +} + +// Stmt is the interface implemented by all statement nodes. +// +// Every Stmt carries a TriviaSet — this is how comments and blank lines +// survive a V1→V2 translation round-trip. The V2 parser leaves these empty +// (it doesn't attempt trivia preservation on V2 source); translators +// populate them; the printer renders them. +type Stmt interface { + Node + stmtNode() + // Trivia returns the statement's leading+trailing trivia bucket. The + // returned pointer is the statement's own storage — mutation sticks. + Trivia() *TriviaSet +} + +// TriviaKind identifies the kind of a trivia entry. +type TriviaKind int + +// Trivia kinds. +const ( + // TriviaComment is a `# ...` line comment. Text excludes the leading + // `#` and the trailing newline, verbatim otherwise. + TriviaComment TriviaKind = iota + // TriviaBlankLine marks an empty line between statements. + TriviaBlankLine +) + +// Trivia is a single entry in a TriviaSet. +type Trivia struct { + Kind TriviaKind + // Text is the comment text (without `#` or trailing newline). Empty + // for blank-line trivia. + Text string + Pos Pos +} + +// TriviaSet groups leading and trailing trivia for a node. Leading trivia +// is what appears before the statement (own-line comments, blank lines); +// trailing trivia is a same-line comment after the statement. +type TriviaSet struct { + Leading []Trivia + Trailing []Trivia +} + +// Trivia returns the set itself so *TriviaSet satisfies the Stmt contract +// when embedded on a concrete statement type. 
+func (t *TriviaSet) Trivia() *TriviaSet { return t } + +// +// Top-level structures +// + +// Program is the root AST node for a complete mapping. +type Program struct { + Stmts []Stmt // top-level statements (assignments, if/match stmts) + Maps []*MapDecl // map declarations (hoisted) + Imports []*ImportStmt // import statements + Namespaces map[string][]*MapDecl // imported maps keyed by namespace + MaxSlots int // max variable stack slots needed (set by resolver) + ReadsOutput bool // true if any expression reads output/output@ (set by resolver) +} + +func (p *Program) nodePos() Pos { + if len(p.Stmts) > 0 { + return p.Stmts[0].nodePos() + } + return Pos{Line: 1, Column: 1} +} + +// MapDecl is a user-defined function declaration. +type MapDecl struct { + TriviaSet + TokenPos Pos + Name string + Params []Param + Body *ExprBody + Namespaces map[string][]*MapDecl // namespaces available to this map (from its file's imports) + MaxSlots int // max variable stack slots needed for this map body (set by resolver) +} + +func (m *MapDecl) nodePos() Pos { return m.TokenPos } + +// Param is a parameter in a map or lambda declaration. +type Param struct { + Name string // empty for discard (_) + Default Expr // nil if no default + Discard bool // true for _ params + Pos Pos + SlotIndex int // resolver-assigned stack slot (-1 = unassigned) +} + +// ImportStmt is an import declaration. +type ImportStmt struct { + TriviaSet + TokenPos Pos + Path string // the import path string + Namespace string // the alias (as name) +} + +func (i *ImportStmt) nodePos() Pos { return i.TokenPos } + +// +// Expression body +// + +// ExprBody is a sequence of variable assignments followed by a final +// expression. Used in map bodies, lambda blocks, and if/match expression arms. +type ExprBody struct { + Assignments []*VarAssign // $var = expr (zero or more) + Result Expr // the final expression (required) +} + +// VarAssign is a variable assignment within an expression body. 
+type VarAssign struct { + TriviaSet + TokenPos Pos + Name string // variable name (without $) + Path []PathSegment // optional path components ($var.field[0] = ...) + Value Expr + SlotIndex int // resolver-assigned stack slot (-1 = unassigned) +} + +func (v *VarAssign) nodePos() Pos { return v.TokenPos } + +// +// Statements +// + +// Assignment is an assignment to output, output metadata, or a variable. +type Assignment struct { + TriviaSet + TokenPos Pos + Target AssignTarget + Value Expr +} + +func (a *Assignment) nodePos() Pos { return a.TokenPos } +func (a *Assignment) stmtNode() {} + +// AssignTarget represents the left-hand side of an assignment. +type AssignTarget struct { + Pos Pos // position of the target root token + Root AssignTargetRoot + VarName string // variable name (only for AssignVar root) + MetaAccess bool // true for output@ targets + Path []PathSegment // path components after root + SlotIndex int // resolver-assigned stack slot for AssignVar (-1 = unassigned) +} + +// AssignTargetRoot is the root of an assignment target. +type AssignTargetRoot int + +const ( + // AssignOutput targets the output document. + AssignOutput AssignTargetRoot = iota + // AssignVar targets a variable. + AssignVar +) + +// IfStmt is a standalone if statement containing output assignments. +type IfStmt struct { + TriviaSet + TokenPos Pos + Branches []IfBranch // first is the if, rest are else-if + Else []Stmt // else body (nil if no else) +} + +func (i *IfStmt) nodePos() Pos { return i.TokenPos } +func (i *IfStmt) stmtNode() {} + +// IfBranch is a single if or else-if branch in a statement. +type IfBranch struct { + Cond Expr + Body []Stmt +} + +// MatchStmt is a standalone match statement containing output assignments. 
+type MatchStmt struct { + TriviaSet + TokenPos Pos + Subject Expr // nil for boolean match without expression + Binding string // as-binding name (empty if no as) + Cases []MatchCase // match cases + BindingSlot int // resolver-assigned stack slot for as-binding (-1 = unassigned) +} + +func (m *MatchStmt) nodePos() Pos { return m.TokenPos } +func (m *MatchStmt) stmtNode() {} + +// MatchCase is a single case in a match statement or expression. +type MatchCase struct { + Pattern Expr // nil for wildcard (_) + Wildcard bool // true for _ + Body any // []Stmt (statement) or Expr (expression) or *ExprBody +} + +// +// Expressions +// + +// IfExpr is an if expression that returns a value. +type IfExpr struct { + TokenPos Pos + Branches []IfExprBranch // first is the if, rest are else-if + Else *ExprBody // else body (nil = void when no branch matches) +} + +func (i *IfExpr) nodePos() Pos { return i.TokenPos } +func (i *IfExpr) exprNode() {} + +// IfExprBranch is a single if or else-if branch in an expression. +type IfExprBranch struct { + Cond Expr + Body *ExprBody +} + +// MatchExpr is a match expression that returns a value. +type MatchExpr struct { + TokenPos Pos + Subject Expr // nil for boolean match without expression + Binding string // as-binding name (empty if no as) + Cases []MatchCase // cases with Expr or *ExprBody bodies + BindingSlot int // resolver-assigned stack slot for as-binding (-1 = unassigned) +} + +func (m *MatchExpr) nodePos() Pos { return m.TokenPos } +func (m *MatchExpr) exprNode() {} + +// BinaryExpr is a binary operation (a + b, a == b, etc.). +type BinaryExpr struct { + Left Expr + Op TokenType + OpPos Pos + Right Expr +} + +func (b *BinaryExpr) nodePos() Pos { return b.Left.nodePos() } +func (b *BinaryExpr) exprNode() {} + +// UnaryExpr is a unary operation (!, -). 
+type UnaryExpr struct { + Op TokenType + OpPos Pos + Operand Expr +} + +func (u *UnaryExpr) nodePos() Pos { return u.OpPos } +func (u *UnaryExpr) exprNode() {} + +// CallExpr is a function call (name(args) or namespace::name(args)). +// +// Prebound: see MethodCallExpr.Prebound — same mechanism, applied to +// functions. +type CallExpr struct { + TokenPos Pos + Name string // function name + Namespace string // namespace (empty for unqualified calls) + Args []CallArg + Named bool // true if using named arguments + FunctionOpcode uint16 // resolver-assigned opcode for stdlib functions (0 = user map or unresolved) + Prebound any +} + +func (c *CallExpr) nodePos() Pos { return c.TokenPos } +func (c *CallExpr) exprNode() {} + +// CallArg is a single argument in a function or method call. +// +// Folded is the parse-time-precomputed form of this argument, populated +// by the resolver when the receiving method/function exposes an +// ArgFolder (see MethodInfo / FunctionInfo) and the argument shape +// allows folding (typically: the Value is a string literal). When set, +// the interpreter substitutes Folded for the evaluated Value verbatim, +// skipping repeat work on every call. This is how e.g. regex patterns +// get compiled once at parse time rather than on every invocation. +// When nil, the argument is evaluated normally at runtime. +type CallArg struct { + Name string // empty for positional args + Value Expr + Folded any +} + +// MethodCallExpr is a method call on a receiver (receiver.method(args)). +// +// Prebound, if non-nil, is the parse-time-bound dispatch target populated by +// the resolver when the receiving method exposes a CallFolder (see +// MethodInfo). The interpreter uses it in place of the normal spec lookup, +// skipping per-call constructor invocation. Currently this mechanism +// underpins the public plugin surface's static-args optimisation. 
+type MethodCallExpr struct { + Receiver Expr + Method string + MethodPos Pos + Args []CallArg + Named bool // true if using named arguments + NullSafe bool // true for ?.method() + MethodOpcode uint16 // resolver-assigned opcode for stdlib methods (0 = intrinsic or unresolved) + Prebound any +} + +func (m *MethodCallExpr) nodePos() Pos { return m.Receiver.nodePos() } +func (m *MethodCallExpr) exprNode() {} + +// FieldAccessExpr is a field access on a receiver (receiver.field). +type FieldAccessExpr struct { + Receiver Expr + Field string + FieldPos Pos + NullSafe bool // true for ?.field +} + +func (f *FieldAccessExpr) nodePos() Pos { return f.Receiver.nodePos() } +func (f *FieldAccessExpr) exprNode() {} + +// IndexExpr is an index access on a receiver (receiver[index]). +type IndexExpr struct { + Receiver Expr + Index Expr + LBracketPos Pos + NullSafe bool // true for ?[index] +} + +func (i *IndexExpr) nodePos() Pos { return i.Receiver.nodePos() } +func (i *IndexExpr) exprNode() {} + +// LambdaExpr is a lambda expression (params -> body). +type LambdaExpr struct { + TokenPos Pos + Params []Param + Body *ExprBody // single expression or block +} + +func (l *LambdaExpr) nodePos() Pos { return l.TokenPos } +func (l *LambdaExpr) exprNode() {} + +// LiteralExpr is a literal value (int, float, string, bool, null). +type LiteralExpr struct { + TokenPos Pos + TokenType TokenType // INT, FLOAT, STRING, RAW_STRING, TRUE, FALSE, NULL + Value string // raw literal text (for INT/FLOAT) or processed string content +} + +func (l *LiteralExpr) nodePos() Pos { return l.TokenPos } +func (l *LiteralExpr) exprNode() {} + +// ArrayLiteral is an array literal [elem, ...]. +type ArrayLiteral struct { + LBracketPos Pos + Elements []Expr +} + +func (a *ArrayLiteral) nodePos() Pos { return a.LBracketPos } +func (a *ArrayLiteral) exprNode() {} + +// ObjectLiteral is an object literal {key: value, ...}. 
+type ObjectLiteral struct { + LBracePos Pos + Entries []ObjectEntry +} + +func (o *ObjectLiteral) nodePos() Pos { return o.LBracePos } +func (o *ObjectLiteral) exprNode() {} + +// ObjectEntry is a single key-value pair in an object literal. +type ObjectEntry struct { + Key Expr + Value Expr +} + +// InputExpr is the "input" keyword as an expression atom. +type InputExpr struct { + TokenPos Pos +} + +func (i *InputExpr) nodePos() Pos { return i.TokenPos } +func (i *InputExpr) exprNode() {} + +// InputMetaExpr is "input@" as an expression atom. +type InputMetaExpr struct { + TokenPos Pos +} + +func (i *InputMetaExpr) nodePos() Pos { return i.TokenPos } +func (i *InputMetaExpr) exprNode() {} + +// OutputExpr is the "output" keyword as an expression atom (read context). +type OutputExpr struct { + TokenPos Pos +} + +func (o *OutputExpr) nodePos() Pos { return o.TokenPos } +func (o *OutputExpr) exprNode() {} + +// OutputMetaExpr is "output@" as an expression atom (read context). +type OutputMetaExpr struct { + TokenPos Pos +} + +func (o *OutputMetaExpr) nodePos() Pos { return o.TokenPos } +func (o *OutputMetaExpr) exprNode() {} + +// VarExpr is a variable reference ($name) as an expression atom. +type VarExpr struct { + TokenPos Pos + Name string // without the $ + SlotIndex int // resolver-assigned stack slot (-1 = unassigned) +} + +func (v *VarExpr) nodePos() Pos { return v.TokenPos } +func (v *VarExpr) exprNode() {} + +// IdentExpr is a bare identifier in expression position. Resolved by the +// name resolution pass to a parameter, map name, or stdlib function. 
+type IdentExpr struct { + TokenPos Pos + Namespace string // non-empty for qualified references (e.g., math::double) + Name string + SlotIndex int // resolver-assigned stack slot when identifier is a variable/parameter (-1 = not a variable) +} + +func (i *IdentExpr) nodePos() Pos { return i.TokenPos } +func (i *IdentExpr) exprNode() {} + +// +// Path expressions (produced by post-parse collapse pass) +// + +// PathRoot is the root of a collapsed path expression. +type PathRoot int + +const ( + // PathRootInput is the "input" root. + PathRootInput PathRoot = iota + // PathRootInputMeta is the "input@" root. + PathRootInputMeta + // PathRootOutput is the "output" root. + PathRootOutput + // PathRootOutputMeta is the "output@" root. + PathRootOutputMeta + // PathRootVar is a "$variable" root. + PathRootVar +) + +// PathExpr is a collapsed path expression: root + segments. +// Produced by the post-parse optimization pass from chains like +// FieldAccess(FieldAccess(InputExpr, "user"), "name") → PathExpr(input, ["user", "name"]). +type PathExpr struct { + TokenPos Pos + Root PathRoot + VarName string // only set when Root == PathRootVar + Segments []PathSegment + VarSlotIndex int // resolver-assigned stack slot for PathRootVar (-1 = unassigned) +} + +func (p *PathExpr) nodePos() Pos { return p.TokenPos } +func (p *PathExpr) exprNode() {} + +// PathSegment is a single segment in a path expression. Prebound, for a +// method segment, mirrors MethodCallExpr.Prebound and lets plugin methods +// cache a constructor's result at parse time. 
+type PathSegment struct { + Kind PathSegmentKind + Name string // for FieldAccess and MethodCall + Index Expr // for Index + Args []CallArg // for MethodCall + Named bool // for MethodCall: named arguments + NullSafe bool + Pos Pos + MethodOpcode uint16 // resolver-assigned opcode for method segments (0 = unresolved) + Prebound any // for MethodCall: parse-time-bound dispatch target (plugin methods) +} + +// PathSegmentKind is the type of a path segment. +type PathSegmentKind int + +const ( + // PathSegField is a field access (.name or ?.name). + PathSegField PathSegmentKind = iota + // PathSegIndex is an index access ([expr] or ?[expr]). + PathSegIndex + // PathSegMethod is a method call (.name(args) or ?.name(args)). + PathSegMethod +) diff --git a/internal/bloblang2/go/pratt/syntax/fuzz_parser_test.go b/internal/bloblang2/go/pratt/syntax/fuzz_parser_test.go new file mode 100644 index 000000000..cfc237dc1 --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/fuzz_parser_test.go @@ -0,0 +1,61 @@ +package syntax + +import "testing" + +// FuzzParse exercises the parser with arbitrary bytes and asserts: +// +// 1. No panic, regardless of input. +// 2. Determinism: two parses of the same input produce the same error count. +// 3. Resolve never panics on a parsed program (errors are fine). +// 4. Print round-trip stability: for any cleanly-parsed input, the second +// application of Parse → Print produces the same string as the first. +// 5. Optimize idempotence: running Optimize twice yields the same printed +// output as running it once. 
+func FuzzParse(f *testing.F) { + for _, s := range loadFuzzCorpus(f) { + f.Add(s) + } + f.Fuzz(func(t *testing.T, src string) { + if len(src) > fuzzMaxInputSize { + return + } + + prog1, errs1 := Parse(src, "", nil) + _, errs2 := Parse(src, "", nil) + if len(errs1) != len(errs2) { + t.Fatalf("non-deterministic parse: %d vs %d errors", len(errs1), len(errs2)) + } + + // Resolve must not panic; with empty opts every method/function + // is unknown but errors-vs-no-errors is irrelevant here. + _ = Resolve(prog1, ResolveOptions{}) + + // Round-trip and idempotence properties only apply to clean parses. + if len(errs1) > 0 { + return + } + + printed1 := Print(prog1) + prog3, errs3 := Parse(printed1, "", nil) + if len(errs3) > 0 { + t.Fatalf("round-trip failed: re-parse of printed output errored\nprinted:\n%s\nerrors:\n%s", printed1, FormatErrors(errs3)) + } + printed2 := Print(prog3) + if printed1 != printed2 { + t.Fatalf("Print not stable across round-trip:\nfirst:\n%s\nsecond:\n%s", printed1, printed2) + } + + progA, _ := Parse(src, "", nil) + Optimize(progA) + printA := Print(progA) + + progB, _ := Parse(src, "", nil) + Optimize(progB) + Optimize(progB) + printB := Print(progB) + + if printA != printB { + t.Fatalf("Optimize not idempotent:\nonce:\n%s\ntwice:\n%s", printA, printB) + } + }) +} diff --git a/internal/bloblang2/go/pratt/syntax/fuzz_scanner_test.go b/internal/bloblang2/go/pratt/syntax/fuzz_scanner_test.go new file mode 100644 index 000000000..d483f0232 --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/fuzz_scanner_test.go @@ -0,0 +1,29 @@ +package syntax + +import "testing" + +// FuzzScanner drives the tokenizer with arbitrary bytes and verifies it +// always terminates without panicking and within a bounded token count. +// +// Each call to scanner.next() must consume at least one byte of source or +// emit EOF. 
So total tokens cannot exceed input length by more than a small +// constant; we allow generous headroom and treat anything beyond as a +// likely infinite loop. +func FuzzScanner(f *testing.F) { + for _, s := range loadFuzzCorpus(f) { + f.Add(s) + } + f.Fuzz(func(t *testing.T, src string) { + if len(src) > fuzzMaxInputSize { + return + } + s := newScanner(src, "") + maxTokens := len(src)*8 + 64 + for range maxTokens { + if s.next().Type == EOF { + return + } + } + t.Fatalf("scanner produced > %d tokens for %d-byte input — possible infinite loop", maxTokens, len(src)) + }) +} diff --git a/internal/bloblang2/go/pratt/syntax/fuzz_test.go b/internal/bloblang2/go/pratt/syntax/fuzz_test.go new file mode 100644 index 000000000..923252c5d --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/fuzz_test.go @@ -0,0 +1,151 @@ +package syntax + +import ( + "io/fs" + "os" + "path/filepath" + "strings" + "testing" + + "gopkg.in/yaml.v3" +) + +// fuzzMaxInputSize caps fuzz input length; bigger inputs are skipped because +// they rarely add coverage that smaller inputs don't already hit. +const fuzzMaxInputSize = 16 * 1024 + +// curatedFuzzSeeds covers a broad cross-section of V2 syntax in compact form. 
+var curatedFuzzSeeds = []string{ + ``, + `output = 42`, + `output = "hello"`, + `output = -3.14`, + `output.x = true`, + `output.x = null`, + `output = "line\n\ttab"`, + `output = "uni é"`, + "output = `raw`", + `output = [1, 2, 3]`, + `output = {"a": 1, "b": 2}`, + `output = [1, [2, [3, [4]]]]`, + `output = 1 + 2 * 3 - 4 / 5 % 6`, + `output = a && b || !c`, + `output = (1 == 2) != (3 < 4) && (5 >= 6)`, + `output = -input.value`, + `output = input.foo.bar?.baz[0]?[1]`, + `output = input@.key`, + `$x = input.value`, + `output = uuid_v4()`, + `output = input.uppercase()`, + `output = input.map(x -> x * 2)`, + `output = input.fold(0, (acc, x) -> acc + x)`, + `output = math::double(5)`, + `output = foo(a: 1, b: 2)`, + "output = input.map(x -> {\n $y = x * 2\n $y + 1\n})", + "output = if input.x > 0 { \"big\" } else { \"small\" }", + "output = match input {\n this > 0 => \"pos\"\n _ => \"other\"\n}", + "map double { this * 2 }\noutput = input.apply(double)", + `import "lib.blobl" as lib`, + "output.a = 1\noutput.b = 2\noutput.c = 3", + "# leading\noutput = 1 # trailing\n\n# blank above", + `output = deleted()`, + `output = throw("boom")`, + `output = void()`, +} + +// adversarialFuzzSeeds are pathological inputs targeting known edge cases. +var adversarialFuzzSeeds = []string{ + "output = " + strings.Repeat("(", 200) + "1" + strings.Repeat(")", 200), + "output = " + strings.Repeat("input.x.", 100) + "y", + "\x00\x00\x00", + "\xef\xbb\xbfoutput = 1", + "output = \"\\uD800\"", + "output = \"unterminated", + "output = `unterminated raw", + "output = #incomplete", + "output = 1\r\n2\r\n3", + "output = 0x", + "output = 1.", + "output = .1", + "output = $", + "output = ::", + "output = []", + "output = {}", +} + +// loadFuzzCorpus returns the deduplicated full seed corpus: curated + +// adversarial + spec-derived mappings. 
+func loadFuzzCorpus(tb testing.TB) []string { + tb.Helper() + seen := make(map[string]struct{}) + var out []string + add := func(s string) { + if _, ok := seen[s]; ok { + return + } + seen[s] = struct{}{} + out = append(out, s) + } + for _, s := range curatedFuzzSeeds { + add(s) + } + for _, s := range adversarialFuzzSeeds { + add(s) + } + for _, s := range loadSpecMappings(tb) { + add(s) + } + return out +} + +// loadSpecMappings walks the spec/tests directory and extracts every +// `mapping:` field from each YAML test file (top-level and nested within +// `cases:`). Best-effort: returns whatever it can parse, silently skipping +// files it can't. +func loadSpecMappings(tb testing.TB) []string { + tb.Helper() + specDir := filepath.Join("..", "..", "..", "spec", "tests") + if _, err := os.Stat(specDir); err != nil { + tb.Logf("spec dir not found at %s; skipping spec corpus", specDir) + return nil + } + var mappings []string + walkErr := filepath.WalkDir(specDir, func(path string, d fs.DirEntry, err error) error { + if err != nil || d.IsDir() { + return nil + } + if !strings.HasSuffix(path, ".yaml") { + return nil + } + data, readErr := os.ReadFile(path) + if readErr != nil { + return nil + } + var f struct { + Tests []struct { + Mapping string `yaml:"mapping"` + Cases []struct { + Mapping string `yaml:"mapping"` + } `yaml:"cases"` + } `yaml:"tests"` + } + if err := yaml.Unmarshal(data, &f); err != nil { + return nil + } + for _, t := range f.Tests { + if t.Mapping != "" { + mappings = append(mappings, t.Mapping) + } + for _, c := range t.Cases { + if c.Mapping != "" { + mappings = append(mappings, c.Mapping) + } + } + } + return nil + }) + if walkErr != nil { + tb.Logf("walking spec dir: %v", walkErr) + } + return mappings +} diff --git a/internal/bloblang2/go/pratt/syntax/optimize.go b/internal/bloblang2/go/pratt/syntax/optimize.go new file mode 100644 index 000000000..5960dc2ba --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/optimize.go @@ -0,0 +1,593 @@ 
+package syntax + +import ( + "math" + "strconv" +) + +// Optimize performs post-parse AST optimizations on a program: +// - Path collapse: chains of field/index/method access → PathExpr +// - Constant folding: literal-only expressions evaluated at compile time +// - Dead code elimination: unreachable if/match branches pruned +// +// Call after Parse and before Resolve. +func Optimize(prog *Program) { + o := &optimizer{} + for i, stmt := range prog.Stmts { + prog.Stmts[i] = o.optimizeStmt(stmt) + } + for _, m := range prog.Maps { + o.optimizeExprBody(m.Body) + } +} + +type optimizer struct{} + +// ----------------------------------------------------------------------- +// Statement optimization +// ----------------------------------------------------------------------- + +func (o *optimizer) optimizeStmt(stmt Stmt) Stmt { + switch s := stmt.(type) { + case *Assignment: + s.Value = o.optimizeExpr(s.Value) + case *IfStmt: + o.optimizeIfStmt(s) + case *MatchStmt: + o.optimizeMatchStmt(s) + } + return stmt +} + +func (o *optimizer) optimizeIfStmt(s *IfStmt) { + // Dead code elimination: prune branches with literal boolean conditions. + var kept []IfBranch + for _, branch := range s.Branches { + branch.Cond = o.optimizeExpr(branch.Cond) + if lit, ok := branch.Cond.(*LiteralExpr); ok { + if lit.TokenType == TRUE { + // Condition is always true — keep this branch, discard the rest. + for i := range branch.Body { + branch.Body[i] = o.optimizeStmt(branch.Body[i]) + } + kept = append(kept, branch) + s.Branches = kept + s.Else = nil // unreachable + return + } + if lit.TokenType == FALSE { + // Condition is always false — skip this branch. + continue + } + // Non-boolean literal (string, int, null, etc.) — keep it; will error at runtime. 
+ } + for i := range branch.Body { + branch.Body[i] = o.optimizeStmt(branch.Body[i]) + } + kept = append(kept, branch) + } + s.Branches = kept + for i := range s.Else { + s.Else[i] = o.optimizeStmt(s.Else[i]) + } +} + +func (o *optimizer) optimizeMatchStmt(s *MatchStmt) { + if s.Subject != nil { + s.Subject = o.optimizeExpr(s.Subject) + } + for i := range s.Cases { + if s.Cases[i].Pattern != nil { + s.Cases[i].Pattern = o.optimizeExpr(s.Cases[i].Pattern) + } + if body, ok := s.Cases[i].Body.([]Stmt); ok { + for j := range body { + body[j] = o.optimizeStmt(body[j]) + } + } + } +} + +// ----------------------------------------------------------------------- +// Expression optimization +// ----------------------------------------------------------------------- + +func (o *optimizer) optimizeExpr(expr Expr) Expr { + if expr == nil { + return nil + } + + switch e := expr.(type) { + case *BinaryExpr: + e.Left = o.optimizeExpr(e.Left) + e.Right = o.optimizeExpr(e.Right) + if folded := o.foldBinary(e); folded != nil { + return folded + } + return e + + case *UnaryExpr: + e.Operand = o.optimizeExpr(e.Operand) + if folded := o.foldUnary(e); folded != nil { + return folded + } + return e + + case *FieldAccessExpr: + e.Receiver = o.optimizeExpr(e.Receiver) + return o.tryCollapsePath(e) + + case *IndexExpr: + e.Receiver = o.optimizeExpr(e.Receiver) + e.Index = o.optimizeExpr(e.Index) + return o.tryCollapsePath(e) + + case *MethodCallExpr: + e.Receiver = o.optimizeExpr(e.Receiver) + for i := range e.Args { + e.Args[i].Value = o.optimizeExpr(e.Args[i].Value) + } + return o.tryCollapsePath(e) + + case *CallExpr: + for i := range e.Args { + e.Args[i].Value = o.optimizeExpr(e.Args[i].Value) + } + return e + + case *ArrayLiteral: + for i := range e.Elements { + e.Elements[i] = o.optimizeExpr(e.Elements[i]) + } + return e + + case *ObjectLiteral: + for i := range e.Entries { + e.Entries[i].Key = o.optimizeExpr(e.Entries[i].Key) + e.Entries[i].Value = o.optimizeExpr(e.Entries[i].Value) 
+ } + return e + + case *IfExpr: + return o.optimizeIfExpr(e) + + case *MatchExpr: + return o.optimizeMatchExpr(e) + + case *LambdaExpr: + o.optimizeExprBody(e.Body) + return e + + case *PathExpr: + // Already collapsed — optimize sub-expressions in index segments. + for i := range e.Segments { + if e.Segments[i].Index != nil { + e.Segments[i].Index = o.optimizeExpr(e.Segments[i].Index) + } + for j := range e.Segments[i].Args { + e.Segments[i].Args[j].Value = o.optimizeExpr(e.Segments[i].Args[j].Value) + } + } + return e + + default: + // LiteralExpr, InputExpr, InputMetaExpr, OutputExpr, OutputMetaExpr, + // VarExpr, IdentExpr — no children to optimize. + return expr + } +} + +func (o *optimizer) optimizeExprBody(body *ExprBody) { + if body == nil { + return + } + for i := range body.Assignments { + body.Assignments[i].Value = o.optimizeExpr(body.Assignments[i].Value) + } + body.Result = o.optimizeExpr(body.Result) +} + +func (o *optimizer) optimizeIfExpr(e *IfExpr) Expr { + // Dead code elimination: prune branches with literal boolean conditions. + var kept []IfExprBranch + for _, branch := range e.Branches { + branch.Cond = o.optimizeExpr(branch.Cond) + if lit, ok := branch.Cond.(*LiteralExpr); ok { + if lit.TokenType == TRUE { + // Always true — this branch always executes. + o.optimizeExprBody(branch.Body) + kept = append(kept, branch) + e.Branches = kept + e.Else = nil + return e + } + if lit.TokenType == FALSE { + // Always false — skip this branch. + continue + } + // Non-boolean literal — keep it; will error at runtime. 
+ } + o.optimizeExprBody(branch.Body) + kept = append(kept, branch) + } + e.Branches = kept + if e.Else != nil { + o.optimizeExprBody(e.Else) + } + return e +} + +func (o *optimizer) optimizeMatchExpr(e *MatchExpr) Expr { + if e.Subject != nil { + e.Subject = o.optimizeExpr(e.Subject) + } + for i := range e.Cases { + if e.Cases[i].Pattern != nil { + e.Cases[i].Pattern = o.optimizeExpr(e.Cases[i].Pattern) + } + switch body := e.Cases[i].Body.(type) { + case Expr: + e.Cases[i].Body = o.optimizeExpr(body) + case *ExprBody: + o.optimizeExprBody(body) + } + } + return e +} + +// ----------------------------------------------------------------------- +// Path collapse +// ----------------------------------------------------------------------- + +// tryCollapsePath attempts to collapse a postfix chain (field access, index, +// method call) rooted at a known context (input, output, variable, etc.) +// into a single PathExpr. +func (o *optimizer) tryCollapsePath(expr Expr) Expr { + // Unwind the chain bottom-up to find the root and collect segments. + var segments []PathSegment + current := expr + + for { + switch e := current.(type) { + case *FieldAccessExpr: + segments = append(segments, PathSegment{ + Kind: PathSegField, + Name: e.Field, + NullSafe: e.NullSafe, + Pos: e.FieldPos, + }) + current = e.Receiver + continue + + case *IndexExpr: + segments = append(segments, PathSegment{ + Kind: PathSegIndex, + Index: e.Index, + NullSafe: e.NullSafe, + Pos: e.LBracketPos, + }) + current = e.Receiver + continue + + case *MethodCallExpr: + // Intrinsic methods (catch, or) require special dispatch in the + // interpreter (short-circuit evaluation, error interception) and + // cannot be collapsed into PathExpr segments. 
+ if e.Method == "catch" || e.Method == "or" { + return expr + } + segments = append(segments, PathSegment{ + Kind: PathSegMethod, + Name: e.Method, + Args: e.Args, + Named: e.Named, + NullSafe: e.NullSafe, + Pos: e.MethodPos, + }) + current = e.Receiver + continue + + case *InputExpr: + if len(segments) == 0 { + return expr + } + reverseSegments(segments) + return &PathExpr{TokenPos: e.TokenPos, Root: PathRootInput, Segments: segments} + + case *InputMetaExpr: + if len(segments) == 0 { + return expr + } + reverseSegments(segments) + return &PathExpr{TokenPos: e.TokenPos, Root: PathRootInputMeta, Segments: segments} + + case *OutputExpr: + if len(segments) == 0 { + return expr + } + reverseSegments(segments) + return &PathExpr{TokenPos: e.TokenPos, Root: PathRootOutput, Segments: segments} + + case *OutputMetaExpr: + if len(segments) == 0 { + return expr + } + reverseSegments(segments) + return &PathExpr{TokenPos: e.TokenPos, Root: PathRootOutputMeta, Segments: segments} + + case *VarExpr: + if len(segments) == 0 { + return expr + } + reverseSegments(segments) + return &PathExpr{TokenPos: e.TokenPos, Root: PathRootVar, VarName: e.Name, Segments: segments} + + case *IdentExpr: + // Bare identifier (parameter, match binding) — cannot collapse + // because we don't have a PathRoot for bare identifiers. + return expr + + default: + // Non-collapsible root (call expression, literal, etc.) + return expr + } + } +} + +func reverseSegments(segs []PathSegment) { + for i, j := 0, len(segs)-1; i < j; i, j = i+1, j-1 { + segs[i], segs[j] = segs[j], segs[i] + } +} + +// ----------------------------------------------------------------------- +// Constant folding +// ----------------------------------------------------------------------- + +// foldBinary attempts to evaluate a binary expression with literal operands +// at compile time. Returns nil if folding is not possible. 
+func (o *optimizer) foldBinary(e *BinaryExpr) Expr { + left, lok := e.Left.(*LiteralExpr) + right, rok := e.Right.(*LiteralExpr) + if !lok || !rok { + return nil + } + + pos := left.TokenPos + + // String concatenation. + if e.Op == PLUS && isStringLiteral(left) && isStringLiteral(right) { + return &LiteralExpr{TokenPos: pos, TokenType: STRING, Value: left.Value + right.Value} + } + + // Integer arithmetic. + if left.TokenType == INT && right.TokenType == INT { + a, aErr := strconv.ParseInt(left.Value, 10, 64) + b, bErr := strconv.ParseInt(right.Value, 10, 64) + if aErr != nil || bErr != nil { + return nil + } + result, ok := foldIntOp(a, b, e.Op) + if !ok { + return nil // overflow or unsupported op + } + return &LiteralExpr{TokenPos: pos, TokenType: INT, Value: strconv.FormatInt(result, 10)} + } + + // Float arithmetic (only when at least one operand is a float literal). + // Skip folding if an integer operand exceeds 2^53 (would lose precision). + if isNumericLiteral(left) && isNumericLiteral(right) && + (left.TokenType == FLOAT || right.TokenType == FLOAT) { + if !canSafelyPromoteToFloat(left) || !canSafelyPromoteToFloat(right) { + return nil // precision loss — let runtime handle the error + } + a, aOk := parseLiteralFloat(left) + b, bOk := parseLiteralFloat(right) + if !aOk || !bOk { + return nil + } + result, ok := foldFloatOp(a, b, e.Op) + if !ok { + return nil + } + return &LiteralExpr{TokenPos: pos, TokenType: FLOAT, Value: strconv.FormatFloat(result, 'g', -1, 64)} + } + + // Boolean logic. + if isBoolLiteral(left) && isBoolLiteral(right) { + a := left.TokenType == TRUE + b := right.TokenType == TRUE + var result bool + switch e.Op { + case AND: + result = a && b + case OR: + result = a || b + case EQ: + result = a == b + case NE: + result = a != b + default: + return nil + } + return boolLiteral(pos, result) + } + + // Equality of same-type literals. 
+ if e.Op == EQ || e.Op == NE { + if left.TokenType == right.TokenType { + eq := left.Value == right.Value + if e.Op == NE { + eq = !eq + } + return boolLiteral(pos, eq) + } + // Cross-type equality is always false. + if isLiteralCrossType(left, right) { + return boolLiteral(pos, e.Op == NE) + } + } + + return nil +} + +// foldUnary attempts to evaluate a unary expression with a literal operand +// at compile time. +func (o *optimizer) foldUnary(e *UnaryExpr) Expr { + lit, ok := e.Operand.(*LiteralExpr) + if !ok { + return nil + } + pos := lit.TokenPos + + switch e.Op { + case BANG: + if lit.TokenType == TRUE { + return boolLiteral(pos, false) + } + if lit.TokenType == FALSE { + return boolLiteral(pos, true) + } + case MINUS: + if lit.TokenType == INT { + n, err := strconv.ParseInt(lit.Value, 10, 64) + if err != nil { + return nil + } + if n == math.MinInt64 { + return nil // -MinInt64 overflows + } + return &LiteralExpr{TokenPos: pos, TokenType: INT, Value: strconv.FormatInt(-n, 10)} + } + if lit.TokenType == FLOAT { + f, err := strconv.ParseFloat(lit.Value, 64) + if err != nil { + return nil + } + return &LiteralExpr{TokenPos: pos, TokenType: FLOAT, Value: strconv.FormatFloat(-f, 'g', -1, 64)} + } + } + return nil +} + +// ----------------------------------------------------------------------- +// Constant folding helpers +// ----------------------------------------------------------------------- + +// canSafelyPromoteToFloat checks whether an integer literal can be exactly +// represented as float64. Integers with magnitude > 2^53 cannot. 
+func canSafelyPromoteToFloat(l *LiteralExpr) bool { + if l.TokenType == FLOAT { + return true // already float + } + if l.TokenType != INT { + return false + } + n, err := strconv.ParseInt(l.Value, 10, 64) + if err != nil { + return false + } + const maxSafeInt = int64(1 << 53) + return n >= -maxSafeInt && n <= maxSafeInt +} + +func isStringLiteral(l *LiteralExpr) bool { + return l.TokenType == STRING || l.TokenType == RAW_STRING +} + +func isNumericLiteral(l *LiteralExpr) bool { + return l.TokenType == INT || l.TokenType == FLOAT +} + +func isBoolLiteral(l *LiteralExpr) bool { + return l.TokenType == TRUE || l.TokenType == FALSE +} + +func parseLiteralFloat(l *LiteralExpr) (float64, bool) { + f, err := strconv.ParseFloat(l.Value, 64) + return f, err == nil +} + +func boolLiteral(pos Pos, v bool) *LiteralExpr { + if v { + return &LiteralExpr{TokenPos: pos, TokenType: TRUE, Value: "true"} + } + return &LiteralExpr{TokenPos: pos, TokenType: FALSE, Value: "false"} +} + +func isLiteralCrossType(a, b *LiteralExpr) bool { + aFamily := literalFamily(a) + bFamily := literalFamily(b) + return aFamily != bFamily && aFamily != 0 && bFamily != 0 +} + +func literalFamily(l *LiteralExpr) int { + switch l.TokenType { + case INT, FLOAT: + return 1 // numeric + case STRING, RAW_STRING: + return 2 + case TRUE, FALSE: + return 3 + case NULL: + return 4 + default: + return 0 + } +} + +func foldIntOp(a, b int64, op TokenType) (int64, bool) { + switch op { + case PLUS: + r := a + b + if (b > 0 && r < a) || (b < 0 && r > a) { + return 0, false // overflow + } + return r, true + case MINUS: + r := a - b + if (b > 0 && r > a) || (b < 0 && r < a) { + return 0, false + } + return r, true + case STAR: + if a == 0 || b == 0 { + return 0, true + } + r := a * b + if r/a != b || (a == -1 && b == math.MinInt64) { + return 0, false // overflow; MinInt64 / -1 wraps, so r/a alone misses -1 * MinInt64 + } + return r, true + case PERCENT: + if b == 0 { + return 0, false + } + return a % b, true + default: + return 0, false + } +} + +func foldFloatOp(a, b float64, op TokenType) (float64, bool) { +
switch op { + case PLUS: + return a + b, true + case MINUS: + return a - b, true + case STAR: + return a * b, true + case SLASH: + if b == 0 { + return 0, false + } + return a / b, true + case PERCENT: + if b == 0 { + return 0, false + } + return math.Mod(a, b), true + default: + return 0, false + } +} diff --git a/internal/bloblang2/go/pratt/syntax/parser.go b/internal/bloblang2/go/pratt/syntax/parser.go new file mode 100644 index 000000000..5d9990d6e --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/parser.go @@ -0,0 +1,1281 @@ +package syntax + +import ( + "fmt" + "strings" +) + +// Parse parses a Bloblang V2 mapping and returns the AST. +// files provides a virtual filesystem for import resolution. +func Parse(src, file string, files map[string]string) (*Program, []PosError) { + p := &parser{ + files: files, + parsing: map[string]bool{file: true}, + currentFile: file, + } + p.init(src, file) + prog := p.parseProgram() + return prog, p.errors +} + +type parser struct { + s *scanner + tok Token // current token + files map[string]string + parsing map[string]bool // files currently being parsed (circular import detection) + currentFile string + errors []PosError +} + +func (p *parser) init(src, file string) { + p.s = newScanner(src, file) + p.currentFile = file + p.advance() // prime the first token +} + +// advance consumes the current token and moves to the next. +func (p *parser) advance() { + p.tok = p.s.next() + // Collect scanner errors. + for len(p.s.errors) > 0 { + p.errors = append(p.errors, p.s.errors...) + p.s.errors = p.s.errors[:0] + } +} + +// expect consumes the current token if it matches the expected type, +// otherwise adds an error. +func (p *parser) expect(typ TokenType) Token { + tok := p.tok + if tok.Type != typ { + p.error(tok.Pos, fmt.Sprintf("expected %s, got %s", typ, tok.Type)) + return tok + } + p.advance() + return tok +} + +// at reports whether the current token is of the given type. 
+func (p *parser) at(typ TokenType) bool { + return p.tok.Type == typ +} + +// skipNL consumes any NL tokens. +func (p *parser) skipNL() { + for p.tok.Type == NL { + p.advance() + } +} + +func (p *parser) error(pos Pos, msg string) { + p.errors = append(p.errors, PosError{Pos: pos, Msg: msg}) +} + +// recover skips tokens until the next NL or EOF for error recovery. +func (p *parser) recover() { + for p.tok.Type != NL && p.tok.Type != EOF { + p.advance() + } +} + +// ----------------------------------------------------------------------- +// Top-level parsing +// ----------------------------------------------------------------------- + +func (p *parser) parseProgram() *Program { + prog := &Program{ + Namespaces: make(map[string][]*MapDecl), + } + + p.skipNL() + for p.tok.Type != EOF { + switch p.tok.Type { + case MAP: + m := p.parseMapDecl() + if m != nil { + prog.Maps = append(prog.Maps, m) + } + case IMPORT: + imp := p.parseImport(prog) + if imp != nil { + prog.Imports = append(prog.Imports, imp) + } + default: + stmt := p.parseStatement() + if stmt != nil { + prog.Stmts = append(prog.Stmts, stmt) + } + } + // Consume statement separator. 
+ if p.tok.Type == NL { + p.advance() + p.skipNL() + } else if p.tok.Type != EOF { + p.error(p.tok.Pos, fmt.Sprintf("expected newline or end of input, got %s", p.tok.Type)) + p.recover() + p.skipNL() + } + } + + return prog +} + +func (p *parser) parseMapDecl() *MapDecl { + pos := p.tok.Pos + p.advance() // skip 'map' + + nameTok := p.expect(IDENT) + p.expect(LPAREN) + params := p.parseParamList() + p.expect(RPAREN) + p.expect(LBRACE) + + body := p.parseExprBody() + + p.skipNL() + p.expect(RBRACE) + + return &MapDecl{ + TokenPos: pos, + Name: nameTok.Literal, + Params: params, + Body: body, + } +} + +func (p *parser) parseParamList() []Param { + if p.at(RPAREN) { + return nil + } + + var params []Param + params = append(params, p.parseParam()) + for p.at(COMMA) { + p.advance() + params = append(params, p.parseParam()) + } + return params +} + +func (p *parser) parseParam() Param { + pos := p.tok.Pos + + if p.at(UNDERSCORE) { + p.advance() + if p.at(ASSIGN) { + p.error(pos, "discard parameter _ cannot have a default value") + p.advance() // skip = + p.parseLiteral() // consume the value + } + return Param{Discard: true, Pos: pos} + } + + nameTok := p.expect(IDENT) + param := Param{Name: nameTok.Literal, Pos: pos} + + if p.at(ASSIGN) { + p.advance() + param.Default = p.parseLiteral() + // Check that the default is actually a single literal (not an expression). + if !p.at(COMMA) && !p.at(RPAREN) { + p.error(p.tok.Pos, "default parameter values must be literals, not expressions") + // Skip the rest of the expression to recover. 
+ for !p.at(COMMA) && !p.at(RPAREN) && !p.at(EOF) { + p.advance() + } + } + } + + return param +} + +func (p *parser) parseLiteral() Expr { + tok := p.tok + switch tok.Type { + case INT, FLOAT, STRING, RAW_STRING, TRUE, FALSE, NULL: + p.advance() + return &LiteralExpr{TokenPos: tok.Pos, TokenType: tok.Type, Value: tok.Literal} + default: + p.error(tok.Pos, fmt.Sprintf("expected literal value, got %s", tok.Type)) + return &LiteralExpr{TokenPos: tok.Pos, TokenType: NULL, Value: "null"} + } +} + +func (p *parser) parseImport(prog *Program) *ImportStmt { + pos := p.tok.Pos + p.advance() // skip 'import' + + pathTok := p.tok + if pathTok.Type != STRING && pathTok.Type != RAW_STRING { + p.error(pathTok.Pos, "expected string literal for import path") + p.recover() + return nil + } + p.advance() + + p.expect(AS) + nsTok := p.expect(IDENT) + + imp := &ImportStmt{ + TokenPos: pos, + Path: pathTok.Literal, + Namespace: nsTok.Literal, + } + + // Resolve the import. + p.resolveImport(prog, imp) + + return imp +} + +func (p *parser) resolveImport(prog *Program, imp *ImportStmt) { + if _, exists := prog.Namespaces[imp.Namespace]; exists { + p.error(imp.TokenPos, fmt.Sprintf("duplicate namespace %q", imp.Namespace)) + return + } + + src, ok := p.files[imp.Path] + if !ok { + p.error(imp.TokenPos, fmt.Sprintf("import file %q not found", imp.Path)) + return + } + + if p.parsing[imp.Path] { + p.error(imp.TokenPos, fmt.Sprintf("circular import: %q", imp.Path)) + return + } + + // Parse the imported file recursively. + sub := &parser{ + files: p.files, + parsing: p.parsing, + currentFile: imp.Path, + } + p.parsing[imp.Path] = true + sub.init(src, imp.Path) + importProg := sub.parseProgram() + delete(p.parsing, imp.Path) + + // Collect errors from the imported file. + p.errors = append(p.errors, sub.errors...) + + // Only map declarations and imports are allowed in imported files.
+ if len(importProg.Stmts) > 0 { + p.error(imp.TokenPos, fmt.Sprintf("imported file %q contains statements (only map declarations and imports are allowed)", imp.Path)) + } + + // Collect maps from the imported file. Attach the imported file's + // namespace tables to each map so the interpreter can resolve + // qualified calls (e.g., core::square) within those maps. + for _, m := range importProg.Maps { + if m.Namespaces == nil { + m.Namespaces = make(map[string][]*MapDecl) + } + // Merge the imported program's namespaces into each map. + for ns, maps := range importProg.Namespaces { + m.Namespaces[ns] = maps + } + // Also include the imported program's own maps (for self-references). + // These are accessible without namespace qualification within the file. + } + prog.Namespaces[imp.Namespace] = importProg.Maps +} + +// ----------------------------------------------------------------------- +// Statement parsing +// ----------------------------------------------------------------------- + +func (p *parser) parseStatement() Stmt { + switch p.tok.Type { + case IF: + return p.parseIfStmt() + case MATCH: + return p.parseMatchStmt() + default: + return p.parseAssignment() + } +} + +func (p *parser) parseAssignment() Stmt { + target, ok := p.parseAssignTarget() + if !ok { + p.recover() + return nil + } + + p.expect(ASSIGN) + value := p.parseExpr(0) + + return &Assignment{ + TokenPos: target.Pos, + Target: target, + Value: value, + } +} + +func (p *parser) parseAssignTarget() (AssignTarget, bool) { + var target AssignTarget + target.Pos = p.tok.Pos + + switch p.tok.Type { + case OUTPUT: + target.Root = AssignOutput + p.advance() + if p.at(AT) { + target.MetaAccess = true + p.advance() + } + + case VAR: + target.Root = AssignVar + target.VarName = p.tok.Literal + p.advance() + + default: + p.error(p.tok.Pos, fmt.Sprintf("unexpected expression in statement context (expected output or $variable assignment, got %s)", p.tok.Type)) + return target, false + } + + // Parse path 
components. + target.Path = p.parsePathSegments() + return target, true +} + +func (p *parser) parsePathSegments() []PathSegment { + var segs []PathSegment + for { + switch p.tok.Type { + case DOT: + pos := p.tok.Pos + p.advance() + name := p.expectWord() + // Check for method call: name( + if p.at(LPAREN) { + p.advance() + args, named := p.parseArgList() + p.expect(RPAREN) + segs = append(segs, PathSegment{Kind: PathSegMethod, Name: name, Args: args, Named: named, Pos: pos}) + } else { + segs = append(segs, PathSegment{Kind: PathSegField, Name: name, Pos: pos}) + } + case QDOT: + pos := p.tok.Pos + p.advance() + name := p.expectWord() + if p.at(LPAREN) { + p.advance() + args, named := p.parseArgList() + p.expect(RPAREN) + segs = append(segs, PathSegment{Kind: PathSegMethod, Name: name, Args: args, Named: named, NullSafe: true, Pos: pos}) + } else { + segs = append(segs, PathSegment{Kind: PathSegField, Name: name, NullSafe: true, Pos: pos}) + } + case LBRACKET: + pos := p.tok.Pos + p.advance() + idx := p.parseExpr(0) + p.expect(RBRACKET) + segs = append(segs, PathSegment{Kind: PathSegIndex, Index: idx, Pos: pos}) + case QLBRACKET: + pos := p.tok.Pos + p.advance() + idx := p.parseExpr(0) + p.expect(RBRACKET) + segs = append(segs, PathSegment{Kind: PathSegIndex, Index: idx, NullSafe: true, Pos: pos}) + default: + return segs + } + } +} + +// expectWord consumes the current token as a word (identifier, keyword, +// or quoted string). Keywords are valid as field names after dot. +// Quoted strings (."field with spaces") are also valid per spec Section 3.1. 
+func (p *parser) expectWord() string { + tok := p.tok + if tok.Type == IDENT || tok.Type.IsKeyword() || tok.Type == DELETED || tok.Type == THROW || tok.Type == VOID { + p.advance() + return tok.Literal + } + if tok.Type == STRING { + p.advance() + return tok.Literal + } + p.error(tok.Pos, fmt.Sprintf("expected field name, got %s", tok.Type)) + return "" +} + +func (p *parser) parseIfStmt() Stmt { + pos := p.tok.Pos + p.advance() // skip 'if' + + var branches []IfBranch + + cond := p.parseExpr(0) + p.expect(LBRACE) + body := p.parseStmtBody() + p.expect(RBRACE) + branches = append(branches, IfBranch{Cond: cond, Body: body}) + + // else-if / else + var elseBody []Stmt + for p.at(ELSE) { + p.advance() + if p.at(IF) { + p.advance() + cond := p.parseExpr(0) + p.expect(LBRACE) + body := p.parseStmtBody() + p.expect(RBRACE) + branches = append(branches, IfBranch{Cond: cond, Body: body}) + } else { + p.expect(LBRACE) + elseBody = p.parseStmtBody() + p.expect(RBRACE) + break + } + } + + return &IfStmt{TokenPos: pos, Branches: branches, Else: elseBody} +} + +func (p *parser) parseMatchStmt() Stmt { + pos := p.tok.Pos + p.advance() // skip 'match' + + var subject Expr + var binding string + + // Disambiguate: match { cases } vs match expr { cases } + if !p.at(LBRACE) { + subject = p.parseExpr(0) + if p.at(AS) { + p.advance() + binding = p.expect(IDENT).Literal + } + } + + p.expect(LBRACE) + p.skipNL() + + var cases []MatchCase + for !p.at(RBRACE) && !p.at(EOF) { + mc := p.parseMatchCaseStmt() + cases = append(cases, mc) + if p.at(COMMA) { + p.advance() + } + p.skipNL() + } + + p.expect(RBRACE) + + return &MatchStmt{TokenPos: pos, Subject: subject, Binding: binding, Cases: cases} +} + +func (p *parser) parseMatchCaseStmt() MatchCase { + var mc MatchCase + + if p.at(UNDERSCORE) { + mc.Wildcard = true + p.advance() + } else { + mc.Pattern = p.parseExpr(0) + } + + p.expect(FATARROW) + p.expect(LBRACE) + body := p.parseStmtBody() + p.expect(RBRACE) + mc.Body = body + + return mc 
+} + +func (p *parser) parseStmtBody() []Stmt { + p.skipNL() + var stmts []Stmt + for !p.at(RBRACE) && !p.at(EOF) { + stmt := p.parseStatement() + if stmt != nil { + stmts = append(stmts, stmt) + } + if p.at(NL) { + p.advance() + p.skipNL() + } else if !p.at(RBRACE) && !p.at(EOF) { + p.error(p.tok.Pos, fmt.Sprintf("expected newline or }, got %s", p.tok.Type)) + p.recover() + p.skipNL() + } + } + return stmts +} + +// ----------------------------------------------------------------------- +// Expression parsing (Pratt parser) +// ----------------------------------------------------------------------- + +// Binding powers. +const ( + bpNone = 0 + bpOr = 10 + bpAnd = 20 + bpEquality = 40 + bpComparison = 60 + bpAdditive = 80 + bpMultiply = 100 + bpUnary = 120 + bpPostfix = 140 +) + +func (p *parser) parseExpr(minBP int) Expr { + left := p.parsePrefix() + + for { + bp, rightBP, nonAssoc := infixBP(p.tok.Type) + if bp == bpNone || bp < minBP { + break + } + + switch p.tok.Type { + case DOT, QDOT: + left = p.parsePostfixDot(left) + case LBRACKET, QLBRACKET: + left = p.parsePostfixIndex(left) + default: + // Binary operator. + op := p.tok + p.advance() + right := p.parseExpr(rightBP) + + // Non-associative check: if the next token is at the same level, error. 
+ if nonAssoc { + nextBP, _, _ := infixBP(p.tok.Type) + if nextBP == bp { + p.error(p.tok.Pos, fmt.Sprintf("cannot chain non-associative operator %s", p.tok.Type)) + } + } + + left = &BinaryExpr{Left: left, Op: op.Type, OpPos: op.Pos, Right: right} + } + } + + return left +} + +func infixBP(typ TokenType) (leftBP, rightBP int, nonAssoc bool) { + switch typ { + case OR: + return bpOr, bpOr + 1, false + case AND: + return bpAnd, bpAnd + 1, false + case EQ, NE: + return bpEquality, bpEquality + 1, true + case GT, GE, LT, LE: + return bpComparison, bpComparison + 1, true + case PLUS, MINUS: + return bpAdditive, bpAdditive + 1, false + case STAR, SLASH, PERCENT: + return bpMultiply, bpMultiply + 1, false + case DOT, QDOT, LBRACKET, QLBRACKET: + return bpPostfix, bpPostfix, false + default: + return bpNone, bpNone, false + } +} + +// ----------------------------------------------------------------------- +// Prefix / atom parsers (null-denotation) +// ----------------------------------------------------------------------- + +func (p *parser) parsePrefix() Expr { + tok := p.tok + + switch tok.Type { + case INT, FLOAT, STRING, RAW_STRING, TRUE, FALSE, NULL: + p.advance() + return &LiteralExpr{TokenPos: tok.Pos, TokenType: tok.Type, Value: tok.Literal} + + case MINUS: + p.advance() + operand := p.parseExpr(bpUnary) + return &UnaryExpr{Op: MINUS, OpPos: tok.Pos, Operand: operand} + + case BANG: + p.advance() + operand := p.parseExpr(bpUnary) + return &UnaryExpr{Op: BANG, OpPos: tok.Pos, Operand: operand} + + case LPAREN: + return p.parseParenOrLambda() + + case LBRACKET: + return p.parseArrayLiteral() + + case LBRACE: + return p.parseObjectLiteral() + + case IF: + return p.parseIfExpr() + + case MATCH: + return p.parseMatchExpr() + + case INPUT: + p.advance() + if p.at(AT) { + p.advance() + return &InputMetaExpr{TokenPos: tok.Pos} + } + return &InputExpr{TokenPos: tok.Pos} + + case OUTPUT: + p.advance() + if p.at(AT) { + p.advance() + return &OutputMetaExpr{TokenPos: 
tok.Pos} + } + return &OutputExpr{TokenPos: tok.Pos} + + case VAR: + p.advance() + if p.at(LPAREN) { + p.error(tok.Pos, fmt.Sprintf("$%s is a variable, not a callable function (use a named map instead)", tok.Literal)) + } + return &VarExpr{TokenPos: tok.Pos, Name: tok.Literal} + + case IDENT: + return p.parseIdentOrCall() + + case DELETED, THROW, VOID: + return p.parseReservedCall() + + case UNDERSCORE: + p.advance() + // _ -> body: discard lambda. + if p.at(THINARROW) { + p.advance() + body := p.parseLambdaBody() + return &LambdaExpr{ + TokenPos: tok.Pos, + Params: []Param{{Discard: true, Pos: tok.Pos}}, + Body: body, + } + } + // Underscore in other expression positions is not valid. + p.error(tok.Pos, "unexpected _ in expression position") + return &LiteralExpr{TokenPos: tok.Pos, TokenType: NULL, Value: "null"} + + default: + p.error(tok.Pos, fmt.Sprintf("expected expression, got %s", tok.Type)) + p.advance() + return &LiteralExpr{TokenPos: tok.Pos, TokenType: NULL, Value: "null"} + } +} + +func (p *parser) parseIdentOrCall() Expr { + tok := p.tok + p.advance() + + // Check for qualified name: namespace::name or namespace::name(args) + if p.at(DCOLON) { + p.advance() + name := p.expect(IDENT) + if p.at(LPAREN) { + // Qualified call: namespace::name(args) + p.advance() + args, named := p.parseArgList() + p.expect(RPAREN) + return &CallExpr{ + TokenPos: tok.Pos, + Namespace: tok.Literal, + Name: name.Literal, + Args: args, + Named: named, + } + } + // Bare qualified reference: namespace::name (for higher-order method args) + return &IdentExpr{ + TokenPos: tok.Pos, + Namespace: tok.Literal, + Name: name.Literal, + } + } + + // Check for function call: name( + if p.at(LPAREN) { + p.advance() + args, named := p.parseArgList() + p.expect(RPAREN) + return &CallExpr{ + TokenPos: tok.Pos, + Name: tok.Literal, + Args: args, + Named: named, + } + } + + // Check for single-param lambda: ident -> + if p.at(THINARROW) { + p.advance() + body := p.parseLambdaBody() + return 
&LambdaExpr{ + TokenPos: tok.Pos, + Params: []Param{{Name: tok.Literal, Pos: tok.Pos}}, + Body: body, + } + } + + // Bare identifier (parameter reference, map name reference). + return &IdentExpr{TokenPos: tok.Pos, Name: tok.Literal} +} + +func (p *parser) parseReservedCall() Expr { + tok := p.tok + p.advance() + p.expect(LPAREN) + args, named := p.parseArgList() + p.expect(RPAREN) + return &CallExpr{ + TokenPos: tok.Pos, + Name: tok.Literal, + Args: args, + Named: named, + } +} + +func (p *parser) parseParenOrLambda() Expr { + pos := p.tok.Pos + + // Lookahead: scan past matching ) and check for ->. + if p.isLambdaAhead() { + return p.parseMultiParamLambda(pos) + } + + // Grouped expression. + p.advance() // skip ( + expr := p.parseExpr(0) + p.expect(RPAREN) + return expr +} + +// isLambdaAhead scans forward from the current ( to the matching ) +// and checks if -> follows. Does not consume tokens. +func (p *parser) isLambdaAhead() bool { + // Save scanner state. + savedTok := p.tok + savedS := *p.s + + depth := 0 + p.advance() // skip ( + depth++ + for depth > 0 && p.tok.Type != EOF { + switch p.tok.Type { + case LPAREN: + depth++ + case RPAREN: + depth-- + } + if depth > 0 { + p.advance() + } + } + // Now at the matching ) — peek past it. + p.advance() // skip ) + isLambda := p.tok.Type == THINARROW + + // Restore state. + *p.s = savedS + p.tok = savedTok + + return isLambda +} + +func (p *parser) parseMultiParamLambda(pos Pos) Expr { + p.advance() // skip ( + params := p.parseParamList() + p.expect(RPAREN) + p.expect(THINARROW) + body := p.parseLambdaBody() + return &LambdaExpr{TokenPos: pos, Params: params, Body: body} +} + +func (p *parser) parseLambdaBody() *ExprBody { + if p.at(LBRACE) { + // Disambiguate: lambda block vs object literal. + // Per spec Section 10: {} is parsed as empty object literal. + // A lambda block requires at least one var assignment ($var = ...) + // followed by a final expression. 
If the content after { doesn't + // start with $var, parse as a single expression (object literal). + if p.isLambdaBlock() { + p.advance() // skip { + body := p.parseExprBody() + p.skipNL() + p.expect(RBRACE) + return body + } + // Object literal or other expression starting with {. + expr := p.parseExpr(0) + return &ExprBody{Result: expr} + } + // Single expression. + expr := p.parseExpr(0) + return &ExprBody{Result: expr} +} + +// isLambdaBlock peeks inside { to determine if it's a lambda block +// (has var assignments, output assignments, or identifier assignments) +// or an object literal. +func (p *parser) isLambdaBlock() bool { + savedTok := p.tok + savedS := *p.s + + p.advance() // skip { + // Skip optional NL. + for p.tok.Type == NL { + p.advance() + } + + var isBlock bool + switch p.tok.Type { + case VAR, OUTPUT: + // Definitely a block. + isBlock = true + case IDENT: + // Could be block (x = ...) or object literal (key: value). + // Peek ahead: if followed by = it's an assignment attempt. 
+ savedInner := p.tok + savedInnerS := *p.s + p.advance() + isBlock = p.tok.Type == ASSIGN + *p.s = savedInnerS + p.tok = savedInner + } + + *p.s = savedS + p.tok = savedTok + return isBlock +} + +func (p *parser) parseArrayLiteral() Expr { + pos := p.tok.Pos + p.advance() // skip [ + + var elems []Expr + for !p.at(RBRACKET) && !p.at(EOF) { + elems = append(elems, p.parseExpr(0)) + if !p.at(RBRACKET) { + p.expect(COMMA) + } + } + p.expect(RBRACKET) + + return &ArrayLiteral{LBracketPos: pos, Elements: elems} +} + +func (p *parser) parseObjectLiteral() Expr { + pos := p.tok.Pos + p.advance() // skip { + p.skipNL() + + var entries []ObjectEntry + for !p.at(RBRACE) && !p.at(EOF) { + key := p.parseExpr(0) + p.expect(COLON) + value := p.parseExpr(0) + entries = append(entries, ObjectEntry{Key: key, Value: value}) + if !p.at(RBRACE) { + p.expect(COMMA) + p.skipNL() + } + } + p.skipNL() + p.expect(RBRACE) + + return &ObjectLiteral{LBracePos: pos, Entries: entries} +} + +// ----------------------------------------------------------------------- +// If/match expressions +// ----------------------------------------------------------------------- + +func (p *parser) parseIfExpr() Expr { + pos := p.tok.Pos + p.advance() // skip 'if' + + var branches []IfExprBranch + + cond := p.parseExpr(0) + p.expect(LBRACE) + body := p.parseExprBody() + p.skipNL() + p.expect(RBRACE) + branches = append(branches, IfExprBranch{Cond: cond, Body: body}) + + var elseBody *ExprBody + for p.at(ELSE) { + p.advance() + if p.at(IF) { + p.advance() + cond := p.parseExpr(0) + p.expect(LBRACE) + body := p.parseExprBody() + p.skipNL() + p.expect(RBRACE) + branches = append(branches, IfExprBranch{Cond: cond, Body: body}) + } else { + p.expect(LBRACE) + elseBody = p.parseExprBody() + p.skipNL() + p.expect(RBRACE) + break + } + } + + return &IfExpr{TokenPos: pos, Branches: branches, Else: elseBody} +} + +func (p *parser) parseMatchExpr() Expr { + pos := p.tok.Pos + p.advance() // skip 'match' + + var subject 
Expr + var binding string + + if !p.at(LBRACE) { + subject = p.parseExpr(0) + if p.at(AS) { + p.advance() + binding = p.expect(IDENT).Literal + } + } + + p.expect(LBRACE) + p.skipNL() + + var cases []MatchCase + for !p.at(RBRACE) && !p.at(EOF) { + mc := p.parseMatchCaseExpr() + cases = append(cases, mc) + if p.at(COMMA) { + p.advance() + } + p.skipNL() + } + + p.expect(RBRACE) + + return &MatchExpr{TokenPos: pos, Subject: subject, Binding: binding, Cases: cases} +} + +func (p *parser) parseMatchCaseExpr() MatchCase { + var mc MatchCase + + if p.at(UNDERSCORE) { + mc.Wildcard = true + p.advance() + } else { + mc.Pattern = p.parseExpr(0) + } + + p.expect(FATARROW) + + // Case body: bare expression or braced expr body. + if p.at(LBRACE) { + p.advance() + body := p.parseExprBody() + p.skipNL() + p.expect(RBRACE) + mc.Body = body + } else { + mc.Body = p.parseExpr(0) + } + + return mc +} + +// ----------------------------------------------------------------------- +// Expression body +// ----------------------------------------------------------------------- + +func (p *parser) parseExprBody() *ExprBody { + p.skipNL() + body := &ExprBody{} + + for { + // Check for output assignment in expression context — not allowed. + if p.at(OUTPUT) && p.isOutputAssignAhead() { + p.error(p.tok.Pos, "cannot assign to output in expression context (only $variable assignments are allowed)") + p.recover() + p.skipNL() + continue + } + + // Check for bare identifier assignment (param = value) — parameters are read-only. + if p.at(IDENT) { + savedTok := p.tok + savedS := *p.s + p.advance() + isAssign := p.tok.Type == ASSIGN + *p.s = savedS + p.tok = savedTok + if isAssign { + p.error(p.tok.Pos, "cannot assign to identifier (parameters are read-only, use $variable for local assignments)") + p.recover() + p.skipNL() + continue + } + } + + // Try to parse var assignment: $var[.path...] 
= expr + if p.at(VAR) && p.isVarAssignAhead() { + va := p.parseVarAssign() + body.Assignments = append(body.Assignments, va) + if p.at(NL) { + p.advance() + p.skipNL() + } + continue + } + break + } + + // Final expression. + body.Result = p.parseExpr(0) + return body +} + +// isOutputAssignAhead checks whether output is being used as an assignment +// target by scanning forward for `=` after `output[.path...]` or `output@[.path...]`. +func (p *parser) isOutputAssignAhead() bool { + savedTok := p.tok + savedS := *p.s + + p.advance() // skip OUTPUT + if p.tok.Type == AT { + p.advance() // skip @ + } + // Skip path components. + for p.tok.Type == DOT || p.tok.Type == LBRACKET || p.tok.Type == QLBRACKET || p.tok.Type == QDOT { + if p.tok.Type == LBRACKET || p.tok.Type == QLBRACKET { + depth := 1 + p.advance() + for depth > 0 && p.tok.Type != EOF { + switch p.tok.Type { + case LBRACKET, QLBRACKET: + depth++ + case RBRACKET: + depth-- + } + p.advance() + } + } else { + p.advance() // skip . or ?. + p.advance() // skip field name + } + } + isAssign := p.tok.Type == ASSIGN + + *p.s = savedS + p.tok = savedTok + return isAssign +} + +// isVarAssignAhead checks whether the current VAR token starts a var +// assignment ($var[.path...] = expr) by scanning ahead for '='. +func (p *parser) isVarAssignAhead() bool { + savedTok := p.tok + savedS := *p.s + + p.advance() // skip VAR + // Skip path components. + for p.tok.Type == DOT || p.tok.Type == LBRACKET || p.tok.Type == QLBRACKET || p.tok.Type == QDOT { + if p.tok.Type == LBRACKET || p.tok.Type == QLBRACKET { + // Skip past bracket contents (count depth). + depth := 1 + p.advance() + for depth > 0 && p.tok.Type != EOF { + switch p.tok.Type { + case LBRACKET, QLBRACKET: + depth++ + case RBRACKET: + depth-- + } + p.advance() + } + } else { + p.advance() // skip . or ?. + p.advance() // skip field name + } + } + isAssign := p.tok.Type == ASSIGN + + // Restore. 
+ *p.s = savedS + p.tok = savedTok + + return isAssign +} + +func (p *parser) parseVarAssign() *VarAssign { + pos := p.tok.Pos + name := p.tok.Literal + p.advance() // skip VAR + + path := p.parsePathSegments() + p.expect(ASSIGN) + value := p.parseExpr(0) + + return &VarAssign{ + TokenPos: pos, + Name: name, + Path: path, + Value: value, + } +} + +// ----------------------------------------------------------------------- +// Postfix parsers (left-denotation) +// ----------------------------------------------------------------------- + +func (p *parser) parsePostfixDot(receiver Expr) Expr { + nullSafe := p.tok.Type == QDOT + dotPos := p.tok.Pos + p.advance() // skip . or ?. + + name := p.expectWord() + + // Method call: .name(args) + if p.at(LPAREN) { + p.advance() + args, named := p.parseArgList() + p.expect(RPAREN) + return &MethodCallExpr{ + Receiver: receiver, + Method: name, + MethodPos: dotPos, + Args: args, + Named: named, + NullSafe: nullSafe, + } + } + + // Field access: .name + return &FieldAccessExpr{ + Receiver: receiver, + Field: name, + FieldPos: dotPos, + NullSafe: nullSafe, + } +} + +func (p *parser) parsePostfixIndex(receiver Expr) Expr { + nullSafe := p.tok.Type == QLBRACKET + pos := p.tok.Pos + p.advance() // skip [ or ?[ + + index := p.parseExpr(0) + p.expect(RBRACKET) + + return &IndexExpr{ + Receiver: receiver, + Index: index, + LBracketPos: pos, + NullSafe: nullSafe, + } +} + +// ----------------------------------------------------------------------- +// Argument lists +// ----------------------------------------------------------------------- + +func (p *parser) parseArgList() ([]CallArg, bool) { + if p.at(RPAREN) { + return nil, false + } + + // Detect named vs positional: peek for "ident :" pattern. + named := p.isNamedArgList() + + var args []CallArg + for { + if named { + // Named mode: expect "name: value". 
+ if p.tok.Type != IDENT { + p.error(p.tok.Pos, "cannot mix named and positional arguments in the same call") + // Recovery: skip to ) or EOF. + for !p.at(RPAREN) && !p.at(EOF) { + p.advance() + } + break + } + nameTok := p.expect(IDENT) + p.expect(COLON) + value := p.parseExpr(0) + args = append(args, CallArg{Name: nameTok.Literal, Value: value}) + } else { + // Positional mode: after parsing a value, check if ":" + // follows an identifier — that indicates named arg mixing. + value := p.parseExpr(0) + if p.at(COLON) { + p.error(p.tok.Pos, "cannot mix positional and named arguments in the same call") + for !p.at(RPAREN) && !p.at(EOF) { + p.advance() + } + break + } + args = append(args, CallArg{Value: value}) + } + if !p.at(COMMA) { + break + } + p.advance() // skip comma + } + return args, named +} + +// isNamedArgList checks if the argument list uses named arguments +// by peeking for the "ident :" pattern. +func (p *parser) isNamedArgList() bool { + if p.tok.Type != IDENT { + return false + } + savedTok := p.tok + savedS := *p.s + + p.advance() // skip ident + isNamed := p.tok.Type == COLON + + *p.s = savedS + p.tok = savedTok + return isNamed +} + +// ----------------------------------------------------------------------- +// Trailing comma support +// ----------------------------------------------------------------------- + +// Note: trailing commas in arrays, objects, and arg lists are handled +// by the parsing loops — they consume a comma then check if the closing +// delimiter follows. The grammar allows optional trailing commas: +// array := '[' [expression (',' expression)* ','?] ']' + +// ----------------------------------------------------------------------- +// Helpers +// ----------------------------------------------------------------------- + +// FormatErrors returns the collected parse errors as a formatted string. 
+func FormatErrors(errs []PosError) string { + if len(errs) == 0 { + return "" + } + var sb strings.Builder + for i, e := range errs { + if i > 0 { + sb.WriteByte('\n') + } + sb.WriteString(e.Error()) + } + return sb.String() +} diff --git a/internal/bloblang2/go/pratt/syntax/parser_test.go b/internal/bloblang2/go/pratt/syntax/parser_test.go new file mode 100644 index 000000000..32ab633ed --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/parser_test.go @@ -0,0 +1,645 @@ +package syntax + +import ( + "testing" +) + +func mustParse(t *testing.T, src string) *Program { + t.Helper() + prog, errs := Parse(src, "", nil) + if len(errs) > 0 { + t.Fatalf("unexpected parse errors:\n%s", FormatErrors(errs)) + } + return prog +} + +func expectError(t *testing.T, src string, substring string) { + t.Helper() + _, errs := Parse(src, "", nil) + if len(errs) == 0 { + t.Fatalf("expected parse error containing %q, but parsing succeeded", substring) + } + combined := FormatErrors(errs) + for _, e := range errs { + if contains(e.Msg, substring) { + return + } + } + t.Fatalf("no error contains %q, got:\n%s", substring, combined) +} + +func contains(s, substr string) bool { + return len(substr) == 0 || len(s) >= len(substr) && containsStr(s, substr) +} + +func containsStr(s, substr string) bool { + for i := 0; i <= len(s)-len(substr); i++ { + if s[i:i+len(substr)] == substr { + return true + } + } + return false +} + +// ----------------------------------------------------------------------- +// Basic assignments +// ----------------------------------------------------------------------- + +func TestParse_SimpleAssignment(t *testing.T) { + prog := mustParse(t, `output.x = 42`) + if len(prog.Stmts) != 1 { + t.Fatalf("expected 1 statement, got %d", len(prog.Stmts)) + } + assign, ok := prog.Stmts[0].(*Assignment) + if !ok { + t.Fatalf("expected *Assignment, got %T", prog.Stmts[0]) + } + if assign.Target.Root != AssignOutput { + t.Fatalf("expected AssignOutput root") + } + if 
len(assign.Target.Path) != 1 || assign.Target.Path[0].Name != "x" { + t.Fatalf("expected path [x], got %v", assign.Target.Path) + } + lit, ok := assign.Value.(*LiteralExpr) + if !ok || lit.Value != "42" { + t.Fatalf("expected LiteralExpr(42), got %T %v", assign.Value, assign.Value) + } +} + +func TestParse_VarAssignment(t *testing.T) { + prog := mustParse(t, `$x = 42`) + assign := prog.Stmts[0].(*Assignment) + if assign.Target.Root != AssignVar { + t.Fatal("expected AssignVar root") + } + if assign.Target.VarName != "x" { + t.Fatalf("expected var name 'x', got %q", assign.Target.VarName) + } +} + +func TestParse_MetadataAssignment(t *testing.T) { + prog := mustParse(t, `output@.key = "value"`) + assign := prog.Stmts[0].(*Assignment) + if !assign.Target.MetaAccess { + t.Fatal("expected metadata access") + } + if len(assign.Target.Path) != 1 || assign.Target.Path[0].Name != "key" { + t.Fatal("expected path [key]") + } +} + +func TestParse_MultipleStatements(t *testing.T) { + prog := mustParse(t, "output.a = 1\noutput.b = 2") + if len(prog.Stmts) != 2 { + t.Fatalf("expected 2 statements, got %d", len(prog.Stmts)) + } +} + +// ----------------------------------------------------------------------- +// Expressions +// ----------------------------------------------------------------------- + +func TestParse_BinaryExpr(t *testing.T) { + prog := mustParse(t, `output = 1 + 2 * 3`) + assign := prog.Stmts[0].(*Assignment) + // Should be 1 + (2 * 3) due to precedence. + bin, ok := assign.Value.(*BinaryExpr) + if !ok { + t.Fatalf("expected BinaryExpr, got %T", assign.Value) + } + if bin.Op != PLUS { + t.Fatalf("expected PLUS, got %s", bin.Op) + } + // Right should be 2 * 3. 
+ right, ok := bin.Right.(*BinaryExpr) + if !ok || right.Op != STAR { + t.Fatalf("expected right to be STAR, got %T %v", bin.Right, bin.Right) + } +} + +func TestParse_UnaryMinus(t *testing.T) { + prog := mustParse(t, `output = -5`) + assign := prog.Stmts[0].(*Assignment) + unary, ok := assign.Value.(*UnaryExpr) + if !ok || unary.Op != MINUS { + t.Fatalf("expected UnaryExpr(-), got %T", assign.Value) + } +} + +func TestParse_MethodCallBindsTighterThanUnary(t *testing.T) { + // -5.string() should parse as -(5.string()) + prog := mustParse(t, `output = -5.string()`) + assign := prog.Stmts[0].(*Assignment) + unary, ok := assign.Value.(*UnaryExpr) + if !ok { + t.Fatalf("expected UnaryExpr, got %T", assign.Value) + } + _, ok = unary.Operand.(*MethodCallExpr) + if !ok { + t.Fatalf("expected MethodCallExpr inside unary, got %T", unary.Operand) + } +} + +func TestParse_NonAssociativeChaining(t *testing.T) { + expectError(t, `output = 1 < 2 < 3`, "chain") + expectError(t, `output = 1 == 2 == 3`, "chain") +} + +func TestParse_ComparisonBeforeEquality(t *testing.T) { + // 3 > 2 == true is valid: (3 > 2) == true + prog := mustParse(t, `output = 3 > 2 == true`) + assign := prog.Stmts[0].(*Assignment) + bin, ok := assign.Value.(*BinaryExpr) + if !ok || bin.Op != EQ { + t.Fatalf("expected outer EQ, got %T %v", assign.Value, assign.Value) + } +} + +// ----------------------------------------------------------------------- +// Field access and method calls +// ----------------------------------------------------------------------- + +func TestParse_FieldAccess(t *testing.T) { + prog := mustParse(t, `output = input.user.name`) + assign := prog.Stmts[0].(*Assignment) + fa, ok := assign.Value.(*FieldAccessExpr) + if !ok { + t.Fatalf("expected FieldAccessExpr, got %T", assign.Value) + } + if fa.Field != "name" { + t.Fatalf("expected field 'name', got %q", fa.Field) + } +} + +func TestParse_NullSafeFieldAccess(t *testing.T) { + prog := mustParse(t, `output = input?.name`) + assign := 
prog.Stmts[0].(*Assignment) + fa := assign.Value.(*FieldAccessExpr) + if !fa.NullSafe { + t.Fatal("expected null-safe") + } +} + +func TestParse_MethodCall(t *testing.T) { + prog := mustParse(t, `output = input.name.uppercase()`) + assign := prog.Stmts[0].(*Assignment) + mc, ok := assign.Value.(*MethodCallExpr) + if !ok { + t.Fatalf("expected MethodCallExpr, got %T", assign.Value) + } + if mc.Method != "uppercase" { + t.Fatalf("expected method 'uppercase', got %q", mc.Method) + } +} + +func TestParse_KeywordAsFieldName(t *testing.T) { + // input.map is valid (map is a keyword but valid as field name after .) + prog := mustParse(t, `output = input.map`) + assign := prog.Stmts[0].(*Assignment) + fa := assign.Value.(*FieldAccessExpr) + if fa.Field != "map" { + t.Fatalf("expected field 'map', got %q", fa.Field) + } +} + +func TestParse_IndexAccess(t *testing.T) { + prog := mustParse(t, `output = input.items[0]`) + assign := prog.Stmts[0].(*Assignment) + idx, ok := assign.Value.(*IndexExpr) + if !ok { + t.Fatalf("expected IndexExpr, got %T", assign.Value) + } + if idx.NullSafe { + t.Fatal("expected non-null-safe") + } +} + +func TestParse_NullSafeIndex(t *testing.T) { + prog := mustParse(t, `output = input?[0]`) + assign := prog.Stmts[0].(*Assignment) + idx := assign.Value.(*IndexExpr) + if !idx.NullSafe { + t.Fatal("expected null-safe") + } +} + +// ----------------------------------------------------------------------- +// Calls +// ----------------------------------------------------------------------- + +func TestParse_FunctionCall(t *testing.T) { + prog := mustParse(t, `output = uuid_v4()`) + assign := prog.Stmts[0].(*Assignment) + call, ok := assign.Value.(*CallExpr) + if !ok { + t.Fatalf("expected CallExpr, got %T", assign.Value) + } + if call.Name != "uuid_v4" || len(call.Args) != 0 { + t.Fatalf("expected uuid_v4(), got %s(%d args)", call.Name, len(call.Args)) + } +} + +func TestParse_QualifiedCall(t *testing.T) { + prog := mustParse(t, `output = 
math::double(5)`) + assign := prog.Stmts[0].(*Assignment) + call := assign.Value.(*CallExpr) + if call.Namespace != "math" || call.Name != "double" { + t.Fatalf("expected math::double, got %s::%s", call.Namespace, call.Name) + } +} + +func TestParse_NamedArgs(t *testing.T) { + prog := mustParse(t, `output = foo(a: 1, b: 2)`) + assign := prog.Stmts[0].(*Assignment) + call := assign.Value.(*CallExpr) + if !call.Named { + t.Fatal("expected named args") + } + if len(call.Args) != 2 || call.Args[0].Name != "a" || call.Args[1].Name != "b" { + t.Fatalf("expected named args a, b") + } +} + +func TestParse_DeletedCall(t *testing.T) { + prog := mustParse(t, `output = deleted()`) + assign := prog.Stmts[0].(*Assignment) + call, ok := assign.Value.(*CallExpr) + if !ok || call.Name != "deleted" { + t.Fatalf("expected CallExpr(deleted), got %T", assign.Value) + } +} + +func TestParse_ThrowCall(t *testing.T) { + prog := mustParse(t, `output = throw("error")`) + assign := prog.Stmts[0].(*Assignment) + call := assign.Value.(*CallExpr) + if call.Name != "throw" || len(call.Args) != 1 { + t.Fatal("expected throw with 1 arg") + } +} + +// ----------------------------------------------------------------------- +// Literals +// ----------------------------------------------------------------------- + +func TestParse_ArrayLiteral(t *testing.T) { + prog := mustParse(t, `output = [1, 2, 3]`) + assign := prog.Stmts[0].(*Assignment) + arr, ok := assign.Value.(*ArrayLiteral) + if !ok { + t.Fatalf("expected ArrayLiteral, got %T", assign.Value) + } + if len(arr.Elements) != 3 { + t.Fatalf("expected 3 elements, got %d", len(arr.Elements)) + } +} + +func TestParse_ObjectLiteral(t *testing.T) { + prog := mustParse(t, `output = {"a": 1, "b": 2}`) + assign := prog.Stmts[0].(*Assignment) + obj, ok := assign.Value.(*ObjectLiteral) + if !ok { + t.Fatalf("expected ObjectLiteral, got %T", assign.Value) + } + if len(obj.Entries) != 2 { + t.Fatalf("expected 2 entries, got %d", len(obj.Entries)) + } +} + 
+func TestParse_TrailingComma(t *testing.T) { + // Trailing comma in array. + mustParse(t, `output = [1, 2, 3,]`) + // Trailing comma in object. + mustParse(t, `output = {"a": 1, "b": 2,}`) +} + +// ----------------------------------------------------------------------- +// Lambdas +// ----------------------------------------------------------------------- + +func TestParse_SingleParamLambda(t *testing.T) { + prog := mustParse(t, `output = input.map(x -> x * 2)`) + assign := prog.Stmts[0].(*Assignment) + mc := assign.Value.(*MethodCallExpr) + lambda, ok := mc.Args[0].Value.(*LambdaExpr) + if !ok { + t.Fatalf("expected LambdaExpr, got %T", mc.Args[0].Value) + } + if len(lambda.Params) != 1 || lambda.Params[0].Name != "x" { + t.Fatal("expected single param x") + } +} + +func TestParse_MultiParamLambda(t *testing.T) { + prog := mustParse(t, `output = input.fold(0, (acc, x) -> acc + x)`) + assign := prog.Stmts[0].(*Assignment) + mc := assign.Value.(*MethodCallExpr) + lambda, ok := mc.Args[1].Value.(*LambdaExpr) + if !ok { + t.Fatalf("expected LambdaExpr, got %T", mc.Args[1].Value) + } + if len(lambda.Params) != 2 { + t.Fatalf("expected 2 params, got %d", len(lambda.Params)) + } +} + +func TestParse_DiscardParamLambda(t *testing.T) { + prog := mustParse(t, `output = input.map(_ -> 42)`) + assign := prog.Stmts[0].(*Assignment) + mc := assign.Value.(*MethodCallExpr) + lambda := mc.Args[0].Value.(*LambdaExpr) + if !lambda.Params[0].Discard { + t.Fatal("expected discard param") + } +} + +func TestParse_LambdaBlock(t *testing.T) { + prog := mustParse(t, "output = input.map(x -> {\n $y = x * 2\n $y + 1\n})") + assign := prog.Stmts[0].(*Assignment) + mc := assign.Value.(*MethodCallExpr) + lambda := mc.Args[0].Value.(*LambdaExpr) + if len(lambda.Body.Assignments) != 1 { + t.Fatalf("expected 1 var assignment in lambda body, got %d", len(lambda.Body.Assignments)) + } +} + +func TestParse_GroupedExprNotLambda(t *testing.T) { + // (1 + 2) is a grouped expression, not a lambda. 
+ prog := mustParse(t, `output = (1 + 2) * 3`) + assign := prog.Stmts[0].(*Assignment) + bin := assign.Value.(*BinaryExpr) + if bin.Op != STAR { + t.Fatalf("expected STAR, got %s", bin.Op) + } +} + +// ----------------------------------------------------------------------- +// If expression +// ----------------------------------------------------------------------- + +func TestParse_IfExpr(t *testing.T) { + prog := mustParse(t, `output = if true { 1 } else { 2 }`) + assign := prog.Stmts[0].(*Assignment) + ifExpr, ok := assign.Value.(*IfExpr) + if !ok { + t.Fatalf("expected IfExpr, got %T", assign.Value) + } + if len(ifExpr.Branches) != 1 { + t.Fatalf("expected 1 branch, got %d", len(ifExpr.Branches)) + } + if ifExpr.Else == nil { + t.Fatal("expected else branch") + } +} + +func TestParse_IfExprWithoutElse(t *testing.T) { + prog := mustParse(t, `output = if true { 1 }`) + assign := prog.Stmts[0].(*Assignment) + ifExpr := assign.Value.(*IfExpr) + if ifExpr.Else != nil { + t.Fatal("expected no else branch") + } +} + +func TestParse_IfElseIfElse(t *testing.T) { + prog := mustParse(t, `output = if false { 1 } else if true { 2 } else { 3 }`) + assign := prog.Stmts[0].(*Assignment) + ifExpr := assign.Value.(*IfExpr) + if len(ifExpr.Branches) != 2 { + t.Fatalf("expected 2 branches (if + else-if), got %d", len(ifExpr.Branches)) + } + if ifExpr.Else == nil { + t.Fatal("expected else branch") + } +} + +// ----------------------------------------------------------------------- +// If statement +// ----------------------------------------------------------------------- + +func TestParse_IfStmt(t *testing.T) { + prog := mustParse(t, "if true {\n output.x = 1\n}") + if len(prog.Stmts) != 1 { + t.Fatalf("expected 1 statement, got %d", len(prog.Stmts)) + } + ifStmt, ok := prog.Stmts[0].(*IfStmt) + if !ok { + t.Fatalf("expected IfStmt, got %T", prog.Stmts[0]) + } + if len(ifStmt.Branches) != 1 { + t.Fatalf("expected 1 branch, got %d", len(ifStmt.Branches)) + } + if 
len(ifStmt.Branches[0].Body) != 1 { + t.Fatalf("expected 1 statement in body, got %d", len(ifStmt.Branches[0].Body)) + } +} + +// ----------------------------------------------------------------------- +// Match expression +// ----------------------------------------------------------------------- + +func TestParse_MatchEqualityExpr(t *testing.T) { + prog := mustParse(t, `output = match input.x { "a" => 1, "b" => 2, _ => 3 }`) + assign := prog.Stmts[0].(*Assignment) + matchExpr, ok := assign.Value.(*MatchExpr) + if !ok { + t.Fatalf("expected MatchExpr, got %T", assign.Value) + } + if len(matchExpr.Cases) != 3 { + t.Fatalf("expected 3 cases, got %d", len(matchExpr.Cases)) + } + if !matchExpr.Cases[2].Wildcard { + t.Fatal("expected last case to be wildcard") + } +} + +func TestParse_MatchAsExpr(t *testing.T) { + prog := mustParse(t, `output = match input.score as s { s >= 90 => "A", _ => "F" }`) + assign := prog.Stmts[0].(*Assignment) + matchExpr := assign.Value.(*MatchExpr) + if matchExpr.Binding != "s" { + t.Fatalf("expected binding 's', got %q", matchExpr.Binding) + } +} + +func TestParse_MatchBooleanExpr(t *testing.T) { + // match { bool_cases } + prog := mustParse(t, `output = match { input.x > 0 => "pos", _ => "neg" }`) + assign := prog.Stmts[0].(*Assignment) + matchExpr := assign.Value.(*MatchExpr) + if matchExpr.Subject != nil { + t.Fatal("expected no subject for boolean match") + } +} + +func TestParse_MatchCaseWithBracedBody(t *testing.T) { + src := `output = match input.x { + "a" => { + $v = 1 + $v + 10 + }, + _ => 0, +}` + prog := mustParse(t, src) + assign := prog.Stmts[0].(*Assignment) + matchExpr := assign.Value.(*MatchExpr) + body, ok := matchExpr.Cases[0].Body.(*ExprBody) + if !ok { + t.Fatalf("expected *ExprBody for braced case, got %T", matchExpr.Cases[0].Body) + } + if len(body.Assignments) != 1 { + t.Fatalf("expected 1 var assignment, got %d", len(body.Assignments)) + } +} + +// 
----------------------------------------------------------------------- +// Map declarations +// ----------------------------------------------------------------------- + +func TestParse_MapDecl(t *testing.T) { + prog := mustParse(t, "map double(x) {\n x * 2\n}") + if len(prog.Maps) != 1 { + t.Fatalf("expected 1 map, got %d", len(prog.Maps)) + } + m := prog.Maps[0] + if m.Name != "double" { + t.Fatalf("expected map name 'double', got %q", m.Name) + } + if len(m.Params) != 1 || m.Params[0].Name != "x" { + t.Fatal("expected single param x") + } +} + +func TestParse_MapDeclWithDefaults(t *testing.T) { + prog := mustParse(t, `map fmt(amount, currency = "USD") { currency + " " + amount.string() }`) + m := prog.Maps[0] + if len(m.Params) != 2 { + t.Fatalf("expected 2 params, got %d", len(m.Params)) + } + if m.Params[1].Default == nil { + t.Fatal("expected default for second param") + } +} + +func TestParse_MapDeclWithDiscard(t *testing.T) { + prog := mustParse(t, `map ignore(_, x) { x }`) + m := prog.Maps[0] + if !m.Params[0].Discard { + t.Fatal("expected first param to be discard") + } +} + +// ----------------------------------------------------------------------- +// Imports +// ----------------------------------------------------------------------- + +func TestParse_Import(t *testing.T) { + files := map[string]string{ + "helpers.blobl": `map double(x) { x * 2 }`, + } + prog, errs := Parse(`import "helpers.blobl" as h`+"\n"+`output = h::double(5)`, "", files) + if len(errs) > 0 { + t.Fatalf("unexpected errors:\n%s", FormatErrors(errs)) + } + if len(prog.Imports) != 1 { + t.Fatalf("expected 1 import, got %d", len(prog.Imports)) + } + if len(prog.Namespaces["h"]) != 1 { + t.Fatalf("expected 1 map in namespace h, got %d", len(prog.Namespaces["h"])) + } +} + +func TestParse_ImportFileNotFound(t *testing.T) { + _, errs := Parse(`import "missing.blobl" as m`, "", nil) + if len(errs) == 0 { + t.Fatal("expected error for missing file") + } +} + +func 
TestParse_CircularImport(t *testing.T) { + files := map[string]string{ + "a.blobl": `import "b.blobl" as b`, + "b.blobl": `import "a.blobl" as a`, + } + _, errs := Parse(`import "a.blobl" as a`, "", files) + found := false + for _, e := range errs { + if containsStr(e.Msg, "circular") { + found = true + } + } + if !found { + t.Fatalf("expected circular import error, got:\n%s", FormatErrors(errs)) + } +} + +// ----------------------------------------------------------------------- +// Discard _ as lambda parameter +// ----------------------------------------------------------------------- + +func TestParse_UnderscoreInExprIsError(t *testing.T) { + expectError(t, `output = _ + 1`, "_") +} + +// ----------------------------------------------------------------------- +// Expression body with var assignments +// ----------------------------------------------------------------------- + +func TestParse_ExprBodyWithVarAssign(t *testing.T) { + prog := mustParse(t, "output = if true {\n $x = 10\n $x + 1\n}") + assign := prog.Stmts[0].(*Assignment) + ifExpr := assign.Value.(*IfExpr) + body := ifExpr.Branches[0].Body + if len(body.Assignments) != 1 { + t.Fatalf("expected 1 var assignment, got %d", len(body.Assignments)) + } + if body.Assignments[0].Name != "x" { + t.Fatalf("expected var name 'x', got %q", body.Assignments[0].Name) + } +} + +// ----------------------------------------------------------------------- +// Input/output atoms +// ----------------------------------------------------------------------- + +func TestParse_InputAtom(t *testing.T) { + prog := mustParse(t, `output = input`) + assign := prog.Stmts[0].(*Assignment) + _, ok := assign.Value.(*InputExpr) + if !ok { + t.Fatalf("expected InputExpr, got %T", assign.Value) + } +} + +func TestParse_InputMetaAtom(t *testing.T) { + prog := mustParse(t, `output = input@`) + assign := prog.Stmts[0].(*Assignment) + _, ok := assign.Value.(*InputMetaExpr) + if !ok { + t.Fatalf("expected InputMetaExpr, got %T", 
assign.Value) + } +} + +func TestParse_OutputMetaAssignment(t *testing.T) { + prog := mustParse(t, `output@ = {}`) + assign := prog.Stmts[0].(*Assignment) + if assign.Target.Root != AssignOutput || !assign.Target.MetaAccess { + t.Fatal("expected output@ target") + } +} + +// ----------------------------------------------------------------------- +// Error recovery +// ----------------------------------------------------------------------- + +func TestParse_ErrorRecovery(t *testing.T) { + // First line has error, second line should still parse. + prog, errs := Parse("output = @@@\noutput.x = 1", "", nil) + if len(errs) == 0 { + t.Fatal("expected errors") + } + // Should have recovered and parsed the second statement. + if len(prog.Stmts) < 1 { + t.Fatal("expected at least 1 statement after error recovery") + } +} diff --git a/internal/bloblang2/go/pratt/syntax/print.go b/internal/bloblang2/go/pratt/syntax/print.go new file mode 100644 index 000000000..3ebcfa72c --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/print.go @@ -0,0 +1,1025 @@ +package syntax + +import ( + "fmt" + "io" + "strings" +) + +// Print emits p as a formatted Bloblang V2 source string. +func Print(p *Program) string { + var sb strings.Builder + pr := newPrinter(&sb) + pr.printProgram(p) + return sb.String() +} + +// PrintTo emits p to the given writer. 
+func PrintTo(w io.Writer, p *Program) error { + var sb strings.Builder + pr := newPrinter(&sb) + pr.printProgram(p) + _, err := io.WriteString(w, sb.String()) + return err +} + +// ----------------------------------------------------------------------- +// printer state +// ----------------------------------------------------------------------- + +type printer struct { + w *strings.Builder + indent int +} + +func newPrinter(w *strings.Builder) *printer { + return &printer{w: w} +} + +func (p *printer) write(s string) { + p.w.WriteString(s) +} + +func (p *printer) writeIndent() { + for range p.indent { + p.w.WriteString(" ") + } +} + +func (p *printer) newline() { + p.w.WriteByte('\n') +} + +// ----------------------------------------------------------------------- +// Top-level +// ----------------------------------------------------------------------- + +func (p *printer) printProgram(prog *Program) { + wrote := false + + // Imports first. + for _, imp := range prog.Imports { + if wrote { + p.newline() + } + p.printLeadingTrivia(imp.Trivia().Leading) + p.printImport(imp) + p.printTrailingTrivia(imp.Trivia().Trailing) + wrote = true + } + + // Blank line before maps if we had imports — unless the first map + // already carries a user-supplied blank line in its leading trivia. + if len(prog.Maps) > 0 && wrote && !leadingStartsWithBlank(prog.Maps[0]) { + p.newline() + p.newline() + } else if len(prog.Maps) > 0 && wrote { + p.newline() + } + + for i, m := range prog.Maps { + if i > 0 { + if leadingStartsWithBlank(m) { + p.newline() + } else { + p.newline() + p.newline() + } + } + p.printLeadingTrivia(m.Trivia().Leading) + p.printMapDecl(m) + p.printTrailingTrivia(m.Trivia().Trailing) + wrote = true + } + + // Blank line before top-level statements. 
+ if len(prog.Stmts) > 0 && wrote && !leadingStartsWithBlank(prog.Stmts[0]) { + p.newline() + p.newline() + } else if len(prog.Stmts) > 0 && wrote { + p.newline() + } + + for i, s := range prog.Stmts { + if i > 0 { + p.newline() + } + p.printLeadingTrivia(s.Trivia().Leading) + p.writeIndent() + p.printStmt(s) + p.printTrailingTrivia(s.Trivia().Trailing) + } + + if wrote || len(prog.Stmts) > 0 { + p.newline() + } +} + +// printLeadingTrivia emits a node's leading trivia: blank-line markers +// become a bare newline, comments become `#<text>\n`. +func (p *printer) printLeadingTrivia(tri []Trivia) { + for _, t := range tri { + switch t.Kind { + case TriviaBlankLine: + p.newline() + case TriviaComment: + p.writeIndent() + p.write("#") + p.write(t.Text) + p.newline() + } + } +} + +// printTrailingTrivia emits a node's trailing trivia — a same-line comment +// appended after the statement's last character (before the newline the +// caller adds). +func (p *printer) printTrailingTrivia(tri []Trivia) { + for _, t := range tri { + if t.Kind == TriviaComment { + p.write(" #") + p.write(t.Text) + } + } +} + +// leadingStartsWithBlank reports whether a statement-like node's leading +// trivia starts with a blank-line marker, so we can suppress auto-inserted +// blank separators in favour of the user's version.
+func leadingStartsWithBlank(s interface{ Trivia() *TriviaSet }) bool { + tri := s.Trivia() + if tri == nil || len(tri.Leading) == 0 { + return false + } + return tri.Leading[0].Kind == TriviaBlankLine +} + +func (p *printer) printImport(imp *ImportStmt) { + p.write("import ") + p.write(quoteString(imp.Path)) + p.write(" as ") + p.write(imp.Namespace) +} + +func (p *printer) printMapDecl(m *MapDecl) { + p.writeIndent() + p.write("map ") + p.write(m.Name) + p.write("(") + for i, param := range m.Params { + if i > 0 { + p.write(", ") + } + p.printParam(param) + } + p.write(") {") + p.newline() + p.indent++ + p.printExprBody(m.Body) + p.indent-- + p.newline() + p.writeIndent() + p.write("}") +} + +func (p *printer) printParam(param Param) { + if param.Discard { + p.write("_") + return + } + p.write(param.Name) + if param.Default != nil { + p.write(" = ") + p.printExpr(param.Default, 0) + } +} + +// ----------------------------------------------------------------------- +// Statements +// ----------------------------------------------------------------------- + +func (p *printer) printStmt(stmt Stmt) { + switch s := stmt.(type) { + case *Assignment: + p.printAssignment(s) + case *IfStmt: + p.printIfStmt(s) + case *MatchStmt: + p.printMatchStmt(s) + default: + p.write(fmt.Sprintf("/* unknown stmt: %T */", stmt)) + } +} + +func (p *printer) printAssignment(a *Assignment) { + p.printAssignTarget(a.Target) + p.write(" = ") + p.printExpr(a.Value, 0) +} + +func (p *printer) printAssignTarget(t AssignTarget) { + switch t.Root { + case AssignOutput: + p.write("output") + if t.MetaAccess { + p.write("@") + } + case AssignVar: + p.write("$") + p.write(t.VarName) + } + for _, seg := range t.Path { + p.printPathSegment(seg) + } +} + +func (p *printer) printIfStmt(s *IfStmt) { + for i, b := range s.Branches { + if i == 0 { + p.write("if ") + } else { + p.write(" else if ") + } + p.printExpr(b.Cond, 0) + p.write(" {") + if len(b.Body) == 0 { + p.write("}") + continue + } + 
p.newline() + p.indent++ + p.printNestedStmts(b.Body) + p.indent-- + p.newline() + p.writeIndent() + p.write("}") + } + if s.Else != nil { + p.write(" else {") + if len(s.Else) == 0 { + p.write("}") + return + } + p.newline() + p.indent++ + p.printNestedStmts(s.Else) + p.indent-- + p.newline() + p.writeIndent() + p.write("}") + } +} + +// printNestedStmts emits a list of statements one per line, rendering the +// leading/trailing trivia on each. +func (p *printer) printNestedStmts(stmts []Stmt) { + for j, st := range stmts { + if j > 0 { + p.newline() + } + p.printLeadingTrivia(st.Trivia().Leading) + p.writeIndent() + p.printStmt(st) + p.printTrailingTrivia(st.Trivia().Trailing) + } +} + +func (p *printer) printMatchStmt(s *MatchStmt) { + p.write("match") + if s.Subject != nil { + p.write(" ") + p.printExpr(s.Subject, 0) + } + if s.Binding != "" { + p.write(" as ") + p.write(s.Binding) + } + p.write(" {") + p.newline() + p.indent++ + for i, c := range s.Cases { + if i > 0 { + p.newline() + } + p.writeIndent() + p.printMatchCase(c, true) + } + p.indent-- + p.newline() + p.writeIndent() + p.write("}") +} + +// printMatchCase prints a match case. isStmt indicates whether this is a +// statement-context match (always braced body) or an expression-context +// match (body may be bare Expr or braced ExprBody). +func (p *printer) printMatchCase(c MatchCase, isStmt bool) { + if c.Wildcard { + p.write("_") + } else { + p.printExpr(c.Pattern, 0) + } + p.write(" => ") + + switch body := c.Body.(type) { + case []Stmt: + // Statement match case. + p.write("{") + if len(body) == 0 { + p.write("}") + return + } + p.newline() + p.indent++ + p.printNestedStmts(body) + p.indent-- + p.newline() + p.writeIndent() + p.write("}") + + case *ExprBody: + // Expression match case with braced body. Always emit braces — the + // parser distinguishes between a bare-expression case and a braced + // expr-body case, so we must preserve the AST shape. 
+ if len(body.Assignments) == 0 && isSimpleExpr(body.Result) { + p.write("{ ") + p.printExpr(body.Result, 0) + p.write(" }") + return + } + p.write("{") + p.newline() + p.indent++ + p.printExprBody(body) + p.indent-- + p.newline() + p.writeIndent() + p.write("}") + + case Expr: + // Bare expression body. Object literals must be wrapped in parens + // because `{` would be parsed as a braced body introducer. + if _, isObj := body.(*ObjectLiteral); isObj { + p.write("(") + p.printExpr(body, 0) + p.write(")") + return + } + p.printExpr(body, 0) + + case nil: + // empty + default: + if isStmt { + p.write(fmt.Sprintf("/* unknown stmt body: %T */", body)) + } else { + p.write(fmt.Sprintf("/* unknown expr body: %T */", body)) + } + } +} + +// ----------------------------------------------------------------------- +// Expression bodies +// ----------------------------------------------------------------------- + +// printExprBody prints an ExprBody indented to the current level. +// Each assignment and the result live on their own line. +func (p *printer) printExprBody(body *ExprBody) { + for i, va := range body.Assignments { + if i > 0 { + p.newline() + } + p.printLeadingTrivia(va.Trivia().Leading) + p.writeIndent() + p.printVarAssign(va) + p.printTrailingTrivia(va.Trivia().Trailing) + } + if len(body.Assignments) > 0 { + p.newline() + } + p.writeIndent() + p.printExpr(body.Result, 0) +} + +func (p *printer) printVarAssign(va *VarAssign) { + p.write("$") + p.write(va.Name) + for _, seg := range va.Path { + p.printPathSegment(seg) + } + p.write(" = ") + p.printExpr(va.Value, 0) +} + +// ----------------------------------------------------------------------- +// Expressions — precedence-aware +// ----------------------------------------------------------------------- + +// Precedence levels used to determine whether parens are needed. +// Higher = tighter binding. 
+const ( + precLowest = 0 + precOr = 1 + precAnd = 2 + precEquality = 3 + precComparison = 4 + precAdditive = 5 + precMultiply = 6 + precUnary = 7 + precPostfix = 8 + precAtom = 9 +) + +func binaryPrec(op TokenType) int { + switch op { + case OR: + return precOr + case AND: + return precAnd + case EQ, NE: + return precEquality + case GT, GE, LT, LE: + return precComparison + case PLUS, MINUS: + return precAdditive + case STAR, SLASH, PERCENT: + return precMultiply + default: + return precLowest + } +} + +// exprPrec returns the precedence of the outermost operator of expr. +// Atoms (literals, idents, input/output, var, paren-wrappable things) +// and postfix-ish nodes return precAtom. +func exprPrec(expr Expr) int { + switch e := expr.(type) { + case *BinaryExpr: + return binaryPrec(e.Op) + case *UnaryExpr: + return precUnary + case *IfExpr, *MatchExpr, *LambdaExpr: + // These need parens when used inside binary/unary/postfix chains, + // treat as low precedence. + return precLowest + default: + return precAtom + } +} + +// printExpr prints expr, wrapping in parens if its precedence is lower +// than minPrec (i.e. the context binds tighter). 
+func (p *printer) printExpr(expr Expr, minPrec int) { + ep := exprPrec(expr) + needParens := ep < minPrec + if needParens { + p.write("(") + } + p.printExprRaw(expr) + if needParens { + p.write(")") + } +} + +func (p *printer) printExprRaw(expr Expr) { + switch e := expr.(type) { + case *LiteralExpr: + p.printLiteral(e) + case *InputExpr: + p.write("input") + case *InputMetaExpr: + p.write("input@") + case *OutputExpr: + p.write("output") + case *OutputMetaExpr: + p.write("output@") + case *VarExpr: + p.write("$") + p.write(e.Name) + case *IdentExpr: + if e.Namespace != "" { + p.write(e.Namespace) + p.write("::") + } + p.write(e.Name) + case *BinaryExpr: + p.printBinary(e) + case *UnaryExpr: + p.printUnary(e) + case *CallExpr: + p.printCall(e) + case *MethodCallExpr: + p.printMethodCall(e) + case *FieldAccessExpr: + p.printFieldAccess(e) + case *IndexExpr: + p.printIndex(e) + case *LambdaExpr: + p.printLambda(e) + case *ArrayLiteral: + p.printArray(e) + case *ObjectLiteral: + p.printObject(e) + case *IfExpr: + p.printIfExpr(e) + case *MatchExpr: + p.printMatchExpr(e) + case *PathExpr: + p.printPath(e) + default: + p.write(fmt.Sprintf("/* unknown expr: %T */", expr)) + } +} + +// ----------------------------------------------------------------------- +// Atoms / postfix +// ----------------------------------------------------------------------- + +func (p *printer) printLiteral(l *LiteralExpr) { + switch l.TokenType { + case INT, FLOAT: + p.write(l.Value) + case STRING: + p.write(quoteString(l.Value)) + case RAW_STRING: + // Prefer raw if no backticks present; otherwise fall back to quoted. 
+ if !strings.Contains(l.Value, "`") { + p.write("`") + p.write(l.Value) + p.write("`") + } else { + p.write(quoteString(l.Value)) + } + case TRUE: + p.write("true") + case FALSE: + p.write("false") + case NULL: + p.write("null") + default: + p.write(l.Value) + } +} + +func (p *printer) printBinary(b *BinaryExpr) { + myPrec := binaryPrec(b.Op) + + // Left side: same prec is OK for left-associative operators. For + // non-associative operators (==, !=, <, <=, >, >=) the parser rejects + // chains at the same precedence, so equal-prec children must be + // parenthesised on the left as well as the right. + leftMin := myPrec + if isNonAssocOp(b.Op) { + leftMin = myPrec + 1 + } + p.printExpr(b.Left, leftMin) + p.write(" ") + p.write(b.Op.String()) + p.write(" ") + // Right side needs tighter binding; use myPrec+1 to force parens around + // a right-hand child with equal or lower precedence. + p.printExpr(b.Right, myPrec+1) +} + +func isNonAssocOp(op TokenType) bool { + switch op { + case EQ, NE, GT, GE, LT, LE: + return true + } + return false +} + +func (p *printer) printUnary(u *UnaryExpr) { + p.write(u.Op.String()) + p.printExpr(u.Operand, precUnary) +} + +func (p *printer) printCall(c *CallExpr) { + if c.Namespace != "" { + p.write(c.Namespace) + p.write("::") + } + p.write(c.Name) + p.write("(") + p.printArgs(c.Args, c.Named) + p.write(")") +} + +func (p *printer) printArgs(args []CallArg, named bool) { + for i, a := range args { + if i > 0 { + p.write(", ") + } + if named { + p.write(a.Name) + p.write(": ") + } + p.printExpr(a.Value, 0) + } +} + +func (p *printer) printMethodCall(m *MethodCallExpr) { + // Receiver needs postfix-level binding. 
+ p.printExpr(m.Receiver, precPostfix) + if m.NullSafe { + p.write("?.") + } else { + p.write(".") + } + p.write(m.Method) + p.write("(") + p.printArgs(m.Args, m.Named) + p.write(")") +} + +func (p *printer) printFieldAccess(f *FieldAccessExpr) { + p.printExpr(f.Receiver, precPostfix) + if f.NullSafe { + p.write("?.") + } else { + p.write(".") + } + p.write(fieldName(f.Field)) +} + +func (p *printer) printIndex(i *IndexExpr) { + p.printExpr(i.Receiver, precPostfix) + if i.NullSafe { + p.write("?[") + } else { + p.write("[") + } + p.printExpr(i.Index, 0) + p.write("]") +} + +func (p *printer) printPath(e *PathExpr) { + switch e.Root { + case PathRootInput: + p.write("input") + case PathRootInputMeta: + p.write("input@") + case PathRootOutput: + p.write("output") + case PathRootOutputMeta: + p.write("output@") + case PathRootVar: + p.write("$") + p.write(e.VarName) + } + for _, seg := range e.Segments { + p.printPathSegment(seg) + } +} + +func (p *printer) printPathSegment(seg PathSegment) { + switch seg.Kind { + case PathSegField: + if seg.NullSafe { + p.write("?.") + } else { + p.write(".") + } + p.write(fieldName(seg.Name)) + case PathSegIndex: + if seg.NullSafe { + p.write("?[") + } else { + p.write("[") + } + p.printExpr(seg.Index, 0) + p.write("]") + case PathSegMethod: + if seg.NullSafe { + p.write("?.") + } else { + p.write(".") + } + p.write(seg.Name) + p.write("(") + p.printArgs(seg.Args, seg.Named) + p.write(")") + } +} + +// fieldName returns the textual form of a field name — bare word if it is +// a valid identifier or keyword word, otherwise a quoted string. 
+func fieldName(name string) string { + if isWord(name) { + return name + } + return quoteString(name) +} + +func isWord(s string) bool { + if s == "" { + return false + } + for i := range len(s) { + ch := s[i] + isStart := (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_' + if i == 0 { + if !isStart { + return false + } + continue + } + isCont := isStart || (ch >= '0' && ch <= '9') + if !isCont { + return false + } + } + return true +} + +// ----------------------------------------------------------------------- +// Literals +// ----------------------------------------------------------------------- + +// quoteString produces a Go-style double-quoted string literal whose +// contents round-trip through Bloblang's string scanner. +func quoteString(s string) string { + var sb strings.Builder + sb.WriteByte('"') + for _, r := range s { + switch r { + case '\\': + sb.WriteString(`\\`) + case '"': + sb.WriteString(`\"`) + case '\n': + sb.WriteString(`\n`) + case '\r': + sb.WriteString(`\r`) + case '\t': + sb.WriteString(`\t`) + default: + if r < 0x20 { + sb.WriteString(fmt.Sprintf(`\u%04X`, r)) + } else { + sb.WriteRune(r) + } + } + } + sb.WriteByte('"') + return sb.String() +} + +// ----------------------------------------------------------------------- +// Array / object +// ----------------------------------------------------------------------- + +func (p *printer) printArray(a *ArrayLiteral) { + if shouldWrapArray(a) { + p.write("[") + p.newline() + p.indent++ + for i, el := range a.Elements { + p.writeIndent() + p.printExpr(el, 0) + if i < len(a.Elements)-1 { + p.write(",") + } + p.newline() + } + p.indent-- + p.writeIndent() + p.write("]") + return + } + p.write("[") + for i, el := range a.Elements { + if i > 0 { + p.write(", ") + } + p.printExpr(el, 0) + } + p.write("]") +} + +func (p *printer) printObject(o *ObjectLiteral) { + if len(o.Entries) == 0 { + p.write("{}") + return + } + if shouldWrapObject(o) { + p.write("{") + p.newline() + 
p.indent++ + for _, entry := range o.Entries { + p.writeIndent() + p.printExpr(entry.Key, 0) + p.write(": ") + p.printExpr(entry.Value, 0) + // Always emit a trailing comma so newlines between entries and the + // closing brace are handled uniformly by the parser. + p.write(",") + p.newline() + } + p.indent-- + p.writeIndent() + p.write("}") + return + } + p.write("{") + for i, entry := range o.Entries { + if i > 0 { + p.write(", ") + } + p.printExpr(entry.Key, 0) + p.write(": ") + p.printExpr(entry.Value, 0) + } + p.write("}") +} + +// shouldWrapArray returns true if the array should be emitted multi-line. +func shouldWrapArray(a *ArrayLiteral) bool { + if len(a.Elements) >= 3 { + for _, el := range a.Elements { + if hasStructured(el) { + return true + } + } + } + for _, el := range a.Elements { + if hasStructured(el) { + return true + } + } + return false +} + +// shouldWrapObject returns true if the object should be emitted multi-line. +func shouldWrapObject(o *ObjectLiteral) bool { + if len(o.Entries) >= 3 { + return true + } + for _, entry := range o.Entries { + if hasStructured(entry.Value) || hasStructured(entry.Key) { + return true + } + } + return false +} + +// hasStructured reports whether e contains (or is) a nested array/object. +func hasStructured(e Expr) bool { + switch v := e.(type) { + case *ArrayLiteral: + return len(v.Elements) > 0 + case *ObjectLiteral: + return len(v.Entries) > 0 + } + return false +} + +// ----------------------------------------------------------------------- +// Lambda / control flow +// ----------------------------------------------------------------------- + +func (p *printer) printLambda(l *LambdaExpr) { + // Single-param form: name -> body (only when one non-default non-discard + // param without parens ambiguity). 
+ if len(l.Params) == 1 && !l.Params[0].Discard && l.Params[0].Default == nil { + p.write(l.Params[0].Name) + } else if len(l.Params) == 1 && l.Params[0].Discard { + p.write("_") + } else { + p.write("(") + for i, param := range l.Params { + if i > 0 { + p.write(", ") + } + p.printParam(param) + } + p.write(")") + } + p.write(" -> ") + // If body has variable assignments, use a block. Otherwise, bare expression. + if len(l.Body.Assignments) == 0 { + // An object-literal result needs no wrapping parens here: the parser + // peeks inside the braces to disambiguate object literal from block, + // and a bare { without a leading $var/output is parsed as an object + // literal expression. So a bare { key: value } is fine as a lambda + // body. + p.printExpr(l.Body.Result, 0) + return + } + p.write("{") + p.newline() + p.indent++ + p.printExprBody(l.Body) + p.indent-- + p.newline() + p.writeIndent() + p.write("}") +} + +func (p *printer) printIfExpr(e *IfExpr) { + for i, b := range e.Branches { + if i == 0 { + p.write("if ") + } else { + p.write(" else if ") + } + p.printExpr(b.Cond, 0) + p.write(" {") + p.printBracedExprBody(b.Body) + p.write("}") + } + if e.Else != nil { + p.write(" else {") + p.printBracedExprBody(e.Else) + p.write("}") + } +} + +// printBracedExprBody emits an expression body inside braces. If the body is +// a single simple result, it is printed inline with spaces. Otherwise, it is +// printed on multiple indented lines. +func (p *printer) printBracedExprBody(body *ExprBody) { + if body == nil { + return + } + if len(body.Assignments) == 0 && isSimpleExpr(body.Result) { + p.write(" ") + p.printExpr(body.Result, 0) + p.write(" ") + return + } + p.newline() + p.indent++ + p.printExprBody(body) + p.indent-- + p.newline() + p.writeIndent() +} + +// isSimpleExpr reports whether expr is simple enough to fit inline in a +// braced body. 
+func isSimpleExpr(expr Expr) bool { + switch v := expr.(type) { + case *LiteralExpr, *InputExpr, *InputMetaExpr, *OutputExpr, *OutputMetaExpr, + *VarExpr, *IdentExpr: + return true + case *ArrayLiteral: + if len(v.Elements) == 0 { + return true + } + return false + case *ObjectLiteral: + return len(v.Entries) == 0 + case *PathExpr: + // Simple unless a method segment takes more than one argument. + for _, s := range v.Segments { + if s.Kind == PathSegMethod && len(s.Args) > 1 { + return false + } + } + return true + case *FieldAccessExpr, *IndexExpr: + return true + case *MethodCallExpr: + return len(v.Args) <= 1 + case *CallExpr: + return len(v.Args) <= 2 + case *UnaryExpr: + return isSimpleExpr(v.Operand) + case *BinaryExpr: + return isSimpleExpr(v.Left) && isSimpleExpr(v.Right) + } + return false +} + +func (p *printer) printMatchExpr(m *MatchExpr) { + p.write("match") + if m.Subject != nil { + p.write(" ") + p.printExpr(m.Subject, 0) + } + if m.Binding != "" { + p.write(" as ") + p.write(m.Binding) + } + p.write(" {") + p.newline() + p.indent++ + for i, c := range m.Cases { + if i > 0 { + p.write(",") + p.newline() + } + p.writeIndent() + p.printMatchCase(c, false) + } + // Trailing comma only when there are cases — `match {}` must not + // emit a stray `,` between the braces. + if len(m.Cases) > 0 { + p.write(",") + p.newline() + } + p.indent-- + p.writeIndent() + p.write("}") +} diff --git a/internal/bloblang2/go/pratt/syntax/print_test.go b/internal/bloblang2/go/pratt/syntax/print_test.go new file mode 100644 index 000000000..ad52bd242 --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/print_test.go @@ -0,0 +1,571 @@ +package syntax + +import ( + "fmt" + "os" + "path/filepath" + "reflect" + "sort" + "strings" + "testing" + + "gopkg.in/yaml.v3" +) + +// ----------------------------------------------------------------------- +// Round-trip test against the spec corpus. 
+// ----------------------------------------------------------------------- + +// testYAMLFile is a minimal mirror of spectest.TestFile, defined locally +// to avoid an import cycle (spectest imports this package transitively). +type testYAMLFile struct { + Description string `yaml:"description"` + Files map[string]string `yaml:"files"` + Tests []testYAMLCase `yaml:"tests"` +} + +type testYAMLCase struct { + Name string `yaml:"name"` + Mapping string `yaml:"mapping"` + CompileError string `yaml:"compile_error"` + Error string `yaml:"error"` + Cases []yaml.Node `yaml:"cases"` + Files map[string]string `yaml:"files"` +} + +func loadYAML(path string) (*testYAMLFile, error) { + data, err := os.ReadFile(path) + if err != nil { + return nil, err + } + var tf testYAMLFile + if err := yaml.Unmarshal(data, &tf); err != nil { + return nil, err + } + return &tf, nil +} + +func discoverYAML(root string) ([]string, error) { + var files []string + err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error { + if err != nil { + return err + } + if info.IsDir() { + return nil + } + if strings.HasSuffix(info.Name(), ".yaml") { + files = append(files, path) + } + return nil + }) + if err != nil { + return nil, err + } + sort.Strings(files) + return files, nil +} + +// specTestsDir locates the spec/tests directory relative to this source file. +func specTestsDir(t *testing.T) string { + t.Helper() + // go/pratt/syntax/print_test.go — spec/tests is ../../../spec/tests + // from the file's directory. + dir, err := os.Getwd() + if err != nil { + t.Fatalf("getwd: %v", err) + } + // Walk up until we find spec/tests. 
+ cur := dir + for i := 0; i < 6; i++ { + candidate := filepath.Join(cur, "spec", "tests") + if info, err := os.Stat(candidate); err == nil && info.IsDir() { + return candidate + } + cur = filepath.Dir(cur) + } + t.Fatalf("could not locate spec/tests relative to %s", dir) + return "" +} + +// TestPrintRoundTrip parses every mapping in the spec corpus, prints it, +// re-parses the print, and asserts structural equivalence. +func TestPrintRoundTrip(t *testing.T) { + root := specTestsDir(t) + files, err := discoverYAML(root) + if err != nil { + t.Fatalf("discover: %v", err) + } + + var ( + total int + okCount int + failures []string + ) + + for _, path := range files { + tf, err := loadYAML(path) + if err != nil { + t.Logf("skip %s: %v", path, err) + continue + } + for _, tc := range tf.Tests { + // Skip multi-case tests for simplicity. + if len(tc.Cases) > 0 { + continue + } + // Skip tests that are expected to fail compilation. + if tc.CompileError != "" { + continue + } + if tc.Mapping == "" { + continue + } + + total++ + caseFiles := tc.Files + if caseFiles == nil { + caseFiles = tf.Files + } + + if roundTripOK(t, tc.Mapping, caseFiles) { + okCount++ + continue + } + + rel, _ := filepath.Rel(root, path) + failures = append(failures, fmt.Sprintf("%s :: %s", rel, tc.Name)) + } + } + + if total == 0 { + t.Fatal("no round-trip test cases collected") + } + + rate := float64(okCount) / float64(total) + t.Logf("round-trip: %d/%d passed (%.2f%%)", okCount, total, rate*100) + for _, f := range failures { + t.Logf(" fail: %s", f) + } + + if rate < 0.95 { + t.Fatalf("round-trip success rate %.2f%% is below 95%%", rate*100) + } +} + +// roundTripOK parses src, prints the result, re-parses the printed form, +// and reports whether the two ASTs are structurally equivalent (positions +// ignored). 
+func roundTripOK(t *testing.T, src string, files map[string]string) bool { + t.Helper() + + prog1, errs1 := Parse(src, "", files) + if len(errs1) > 0 { + // Skip — parent mapping was not expected to parse cleanly. + return true + } + + printed := Print(prog1) + + prog2, errs2 := Parse(printed, "", files) + if len(errs2) > 0 { + t.Logf("re-parse failed for:\n--- original ---\n%s\n--- printed ---\n%s\n--- errors ---\n%s", + src, printed, FormatErrors(errs2)) + return false + } + + p1 := cloneAndZero(prog1) + p2 := cloneAndZero(prog2) + + if !reflect.DeepEqual(p1, p2) { + t.Logf("AST mismatch for:\n--- original ---\n%s\n--- printed ---\n%s", src, printed) + return false + } + return true +} + +// cloneAndZero returns a copy of p with all position data zeroed so that +// two ASTs parsed from different strings can be compared for structural +// equivalence. +func cloneAndZero(p *Program) *Program { + c := &Program{ + Stmts: make([]Stmt, len(p.Stmts)), + Maps: make([]*MapDecl, len(p.Maps)), + Imports: make([]*ImportStmt, len(p.Imports)), + MaxSlots: 0, + ReadsOutput: false, + } + for i, s := range p.Stmts { + c.Stmts[i] = zeroStmt(s) + } + for i, m := range p.Maps { + c.Maps[i] = zeroMap(m) + } + for i, imp := range p.Imports { + c.Imports[i] = &ImportStmt{Path: imp.Path, Namespace: imp.Namespace} + } + // Skip Namespaces — transitive structures can differ in pointer identity + // even when content is equivalent; not relevant for round-trip checks. 
+ return c +} + +func zeroStmt(s Stmt) Stmt { + switch v := s.(type) { + case *Assignment: + return &Assignment{ + Target: zeroAssignTarget(v.Target), + Value: zeroExpr(v.Value), + } + case *IfStmt: + out := &IfStmt{} + for _, b := range v.Branches { + newB := IfBranch{Cond: zeroExpr(b.Cond)} + for _, st := range b.Body { + newB.Body = append(newB.Body, zeroStmt(st)) + } + out.Branches = append(out.Branches, newB) + } + for _, st := range v.Else { + out.Else = append(out.Else, zeroStmt(st)) + } + return out + case *MatchStmt: + out := &MatchStmt{ + Subject: zeroExpr(v.Subject), + Binding: v.Binding, + } + for _, c := range v.Cases { + out.Cases = append(out.Cases, zeroMatchCase(c)) + } + return out + } + return s +} + +func zeroAssignTarget(t AssignTarget) AssignTarget { + out := AssignTarget{ + Root: t.Root, + VarName: t.VarName, + MetaAccess: t.MetaAccess, + } + for _, seg := range t.Path { + out.Path = append(out.Path, zeroPathSegment(seg)) + } + return out +} + +func zeroMap(m *MapDecl) *MapDecl { + out := &MapDecl{ + Name: m.Name, + Body: zeroExprBody(m.Body), + } + for _, param := range m.Params { + out.Params = append(out.Params, zeroParam(param)) + } + return out +} + +func zeroParam(p Param) Param { + return Param{ + Name: p.Name, + Default: zeroExpr(p.Default), + Discard: p.Discard, + } +} + +func zeroMatchCase(c MatchCase) MatchCase { + out := MatchCase{ + Wildcard: c.Wildcard, + Pattern: zeroExpr(c.Pattern), + } + switch body := c.Body.(type) { + case []Stmt: + var stmts []Stmt + for _, s := range body { + stmts = append(stmts, zeroStmt(s)) + } + out.Body = stmts + case *ExprBody: + out.Body = zeroExprBody(body) + case Expr: + out.Body = zeroExpr(body) + } + return out +} + +func zeroExprBody(b *ExprBody) *ExprBody { + if b == nil { + return nil + } + out := &ExprBody{Result: zeroExpr(b.Result)} + for _, va := range b.Assignments { + newVA := &VarAssign{Name: va.Name, Value: zeroExpr(va.Value)} + for _, seg := range va.Path { + newVA.Path = 
append(newVA.Path, zeroPathSegment(seg)) + } + out.Assignments = append(out.Assignments, newVA) + } + return out +} + +func zeroPathSegment(s PathSegment) PathSegment { + out := PathSegment{ + Kind: s.Kind, + Name: s.Name, + Index: zeroExpr(s.Index), + NullSafe: s.NullSafe, + Named: s.Named, + } + for _, a := range s.Args { + out.Args = append(out.Args, CallArg{Name: a.Name, Value: zeroExpr(a.Value)}) + } + return out +} + +func zeroExpr(e Expr) Expr { + if e == nil { + return nil + } + switch v := e.(type) { + case *LiteralExpr: + return &LiteralExpr{TokenType: v.TokenType, Value: v.Value} + case *InputExpr: + return &InputExpr{} + case *InputMetaExpr: + return &InputMetaExpr{} + case *OutputExpr: + return &OutputExpr{} + case *OutputMetaExpr: + return &OutputMetaExpr{} + case *VarExpr: + return &VarExpr{Name: v.Name} + case *IdentExpr: + return &IdentExpr{Name: v.Name, Namespace: v.Namespace} + case *BinaryExpr: + return &BinaryExpr{Left: zeroExpr(v.Left), Op: v.Op, Right: zeroExpr(v.Right)} + case *UnaryExpr: + return &UnaryExpr{Op: v.Op, Operand: zeroExpr(v.Operand)} + case *CallExpr: + out := &CallExpr{Name: v.Name, Namespace: v.Namespace, Named: v.Named} + for _, a := range v.Args { + out.Args = append(out.Args, CallArg{Name: a.Name, Value: zeroExpr(a.Value)}) + } + return out + case *MethodCallExpr: + out := &MethodCallExpr{ + Receiver: zeroExpr(v.Receiver), + Method: v.Method, + Named: v.Named, + NullSafe: v.NullSafe, + } + for _, a := range v.Args { + out.Args = append(out.Args, CallArg{Name: a.Name, Value: zeroExpr(a.Value)}) + } + return out + case *FieldAccessExpr: + return &FieldAccessExpr{ + Receiver: zeroExpr(v.Receiver), + Field: v.Field, + NullSafe: v.NullSafe, + } + case *IndexExpr: + return &IndexExpr{ + Receiver: zeroExpr(v.Receiver), + Index: zeroExpr(v.Index), + NullSafe: v.NullSafe, + } + case *LambdaExpr: + out := &LambdaExpr{Body: zeroExprBody(v.Body)} + for _, p := range v.Params { + out.Params = append(out.Params, zeroParam(p)) + } + 
return out + case *ArrayLiteral: + out := &ArrayLiteral{} + for _, el := range v.Elements { + out.Elements = append(out.Elements, zeroExpr(el)) + } + return out + case *ObjectLiteral: + out := &ObjectLiteral{} + for _, entry := range v.Entries { + out.Entries = append(out.Entries, ObjectEntry{ + Key: zeroExpr(entry.Key), + Value: zeroExpr(entry.Value), + }) + } + return out + case *IfExpr: + out := &IfExpr{Else: zeroExprBody(v.Else)} + for _, b := range v.Branches { + out.Branches = append(out.Branches, IfExprBranch{ + Cond: zeroExpr(b.Cond), + Body: zeroExprBody(b.Body), + }) + } + return out + case *MatchExpr: + out := &MatchExpr{Subject: zeroExpr(v.Subject), Binding: v.Binding} + for _, c := range v.Cases { + out.Cases = append(out.Cases, zeroMatchCase(c)) + } + return out + case *PathExpr: + out := &PathExpr{Root: v.Root, VarName: v.VarName} + for _, seg := range v.Segments { + out.Segments = append(out.Segments, zeroPathSegment(seg)) + } + return out + } + return e +} + +// ----------------------------------------------------------------------- +// Hand-crafted formatting expectations. 
+// ----------------------------------------------------------------------- + +func TestPrintFormatting(t *testing.T) { + cases := []struct { + name string + src string + want string + }{ + { + name: "simple assignment stays single line", + src: `output.x = 1 + 2`, + want: "output.x = 1 + 2\n", + }, + { + name: "four-entry object wraps", + src: `output = {"a": 1, "b": 2, "c": 3, "d": 4}`, + want: `output = { + "a": 1, + "b": 2, + "c": 3, + "d": 4, +} +`, + }, + { + name: "two-entry object stays compact", + src: `output = {"a": 1, "b": 2}`, + want: "output = {\"a\": 1, \"b\": 2}\n", + }, + { + name: "nested object wraps parent", + src: `output = {"a": {"x": 1, "y": 2}}`, + want: `output = { + "a": {"x": 1, "y": 2}, +} +`, + }, + { + name: "simple array stays single line", + src: `output = [1, 2]`, + want: "output = [1, 2]\n", + }, + { + name: "if-expression inline when simple", + src: `output.x = if true { 1 } else { 2 }`, + want: "output.x = if true { 1 } else { 2 }\n", + }, + { + name: "import before map before stmts", + src: `output.y = 1 +map greet(name) { "hi " + name } +import "h.blobl" as h`, + want: `import "h.blobl" as h + +map greet(name) { + "hi " + name +} + +output.y = 1 +`, + }, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + prog, errs := Parse(tc.src, "", map[string]string{ + "h.blobl": `map dummy(x) { x }`, + }) + if len(errs) > 0 { + t.Fatalf("parse errors: %s", FormatErrors(errs)) + } + got := Print(prog) + if got != tc.want { + t.Fatalf("mismatch:\n--- got ---\n%s\n--- want ---\n%s", got, tc.want) + } + }) + } +} + +// TestPrintEmptyMatch locks in the rule that an empty match expression +// prints as `match {}` without a stray trailing comma between the braces. +// Surfaced by FuzzParse on input `$A=match{}`. 
+func TestPrintEmptyMatch(t *testing.T) { + src := `$x = match {}` + prog, errs := Parse(src, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors: %s", FormatErrors(errs)) + } + got := Print(prog) + if _, errs := Parse(got, "", nil); len(errs) > 0 { + t.Fatalf("re-parse of printed output errored:\n%s\nerrors: %s", got, FormatErrors(errs)) + } +} + +// TestPrintNonAssocParens locks in the printer rule that non-associative +// operators (==, !=, <, <=, >, >=) preserve parens around equal-precedence +// children — both left and right — so the printed output remains parseable. +func TestPrintNonAssocParens(t *testing.T) { + cases := []struct { + name string + src string + want string + }{ + { + name: "equality LHS is comparison chain", + src: `output = (1 == 2) != (3 < 4) && (5 >= 6)`, + // The original parens around (3 < 4) and (5 >= 6) are redundant — + // < binds tighter than != and >= binds tighter than &&, so the + // printed form remains unambiguous and re-parses cleanly. The + // parens around (1 == 2) on the LHS are required: != is + // non-associative with == at the same precedence. + want: "output = (1 == 2) != 3 < 4 && 5 >= 6\n", + }, + { + name: "equality LHS is equality", + src: `output = (a == b) == c`, + want: "output = (a == b) == c\n", + }, + { + name: "comparison LHS is comparison", + src: `output = (a < b) < c`, + want: "output = (a < b) < c\n", + }, + { + name: "left-associative additive needs no LHS parens", + src: `output = 1 + 2 + 3`, + want: "output = 1 + 2 + 3\n", + }, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + prog, errs := Parse(tc.src, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors: %s", FormatErrors(errs)) + } + got := Print(prog) + if got != tc.want { + t.Fatalf("print mismatch:\n--- got ---\n%s\n--- want ---\n%s", got, tc.want) + } + // Re-parse must succeed — the printed form is the actual contract. 
+ if _, errs := Parse(got, "", nil); len(errs) > 0 { + t.Fatalf("re-parse of printed output errored:\n%s\nerrors: %s", got, FormatErrors(errs)) + } + }) + } +} diff --git a/internal/bloblang2/go/pratt/syntax/print_trivia_test.go b/internal/bloblang2/go/pratt/syntax/print_trivia_test.go new file mode 100644 index 000000000..40e6e75e6 --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/print_trivia_test.go @@ -0,0 +1,103 @@ +package syntax + +import ( + "strings" + "testing" +) + +// Helper: build a bare output.x = input.x assignment with the given trivia. +func testAssign(name string, leading, trailing []Trivia) *Assignment { + return &Assignment{ + TriviaSet: TriviaSet{Leading: leading, Trailing: trailing}, + Target: AssignTarget{ + Root: AssignOutput, + Path: []PathSegment{{Kind: PathSegField, Name: name}}, + }, + Value: &PathExpr{Root: PathRootInput, Segments: []PathSegment{{Kind: PathSegField, Name: name}}}, + } +} + +func TestPrintLeadingComment(t *testing.T) { + prog := &Program{ + Stmts: []Stmt{ + testAssign("a", []Trivia{{Kind: TriviaComment, Text: " the answer"}}, nil), + }, + } + got := Print(prog) + want := "# the answer\noutput.a = input.a\n" + if got != want { + t.Errorf("got:\n%s\nwant:\n%s", got, want) + } +} + +func TestPrintTrailingComment(t *testing.T) { + prog := &Program{ + Stmts: []Stmt{ + testAssign("a", nil, []Trivia{{Kind: TriviaComment, Text: " why"}}), + }, + } + got := Print(prog) + want := "output.a = input.a # why\n" + if got != want { + t.Errorf("got: %q, want: %q", got, want) + } +} + +func TestPrintBlankLineBetweenStatements(t *testing.T) { + prog := &Program{ + Stmts: []Stmt{ + testAssign("a", nil, nil), + testAssign("b", []Trivia{{Kind: TriviaBlankLine}}, nil), + }, + } + got := Print(prog) + want := "output.a = input.a\n\noutput.b = input.b\n" + if got != want { + t.Errorf("got:\n%q\nwant:\n%q", got, want) + } +} + +func TestPrintCommentBlockAndBlankBeforeStatement(t *testing.T) { + prog := &Program{ + Stmts: []Stmt{ + 
testAssign("a", nil, nil), + testAssign("b", []Trivia{ + {Kind: TriviaBlankLine}, + {Kind: TriviaComment, Text: " section B"}, + }, []Trivia{{Kind: TriviaComment, Text: " inline"}}), + }, + } + got := Print(prog) + want := "output.a = input.a\n\n# section B\noutput.b = input.b # inline\n" + if got != want { + t.Errorf("got:\n%s\nwant:\n%s", got, want) + } +} + +func TestPrintCommentInsideMapBody(t *testing.T) { + prog := &Program{ + Maps: []*MapDecl{{ + TokenPos: Pos{Line: 1, Column: 1}, + Name: "m", + Params: []Param{{Name: "v"}}, + Body: &ExprBody{ + Assignments: []*VarAssign{ + { + TriviaSet: TriviaSet{Leading: []Trivia{{Kind: TriviaComment, Text: " inside body"}}}, + Name: "t", + Value: &LiteralExpr{TokenType: INT, Value: "1"}, + }, + }, + Result: &VarExpr{Name: "t"}, + }, + }}, + } + got := Print(prog) + // Must contain the comment indented one level inside the map body. + if !strings.Contains(got, " # inside body\n") { + t.Errorf("missing indented comment in:\n%s", got) + } + if !strings.Contains(got, "$t = 1") { + t.Errorf("missing var assign in:\n%s", got) + } +} diff --git a/internal/bloblang2/go/pratt/syntax/resolver.go b/internal/bloblang2/go/pratt/syntax/resolver.go new file mode 100644 index 000000000..c45cb4f0e --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/resolver.go @@ -0,0 +1,1056 @@ +package syntax + +import "fmt" + +// ArgFolder performs parse-time evaluation of a stdlib call's arguments +// so the runtime can skip repeat work. The folder inspects the AST args +// (typically checking for string-literal shapes) and returns a +// same-length slice of folded values, using nil for argument positions +// that aren't eligible for folding. On success the resolver writes +// each non-nil entry onto the corresponding CallArg.Folded, and the +// interpreter substitutes the folded value for the arg at runtime. +// +// Returning a non-nil error surfaces as a resolver diagnostic anchored +// at the call site. 
That's the right behaviour for cases like an +// invalid regex pattern — the caller learns about the problem at +// parse time rather than on first call. +type ArgFolder func(args []CallArg) (folded []any, err error) + +// CallFolder is the per-call-site analogue of ArgFolder. Where ArgFolder +// precomputes individual argument values, CallFolder precomputes a dispatch +// target for the call as a whole. The returned value (when non-nil) is +// written to the call node's Prebound slot and consulted by the interpreter +// before normal dispatch. +// +// The primary consumer is the public plugin surface: when every argument +// at a call site is a literal, the plugin's constructor can be invoked once +// at parse time and the resulting closure cached on the AST, eliminating +// per-call constructor overhead. Runs after ArgFolder so it can see already- +// folded values. +type CallFolder func(args []CallArg) (prebound any, err error) + +// FunctionInfo carries compile-time metadata about a stdlib function. +type FunctionInfo struct { + // Required is the number of required parameters. + Required int + // Total is the total number of parameters (required + optional). + // -1 means no arity checking (variadic or handled at runtime). + Total int + // Params is per-parameter metadata, parallel to declared positions. + // Empty when the function does not declare per-position lambda + // acceptance (the default). + Params []FunctionParamInfo + // ArgFolder, if set, is invoked by the resolver to precompute + // literal arguments (see ArgFolder docs). + ArgFolder ArgFolder + // CallFolder, if set, is invoked by the resolver to precompute a + // call-site dispatch target (see CallFolder docs). + CallFolder CallFolder +} + +// FunctionParamInfo carries compile-time metadata about one function +// parameter. Mirrors MethodParamInfo for the function dispatch path. 
+type FunctionParamInfo struct { + Name string + HasDefault bool + AcceptsLambda bool +} + +// ParamAcceptsLambda reports whether a lambda is accepted at the given +// argument position. Mirrors MethodInfo.ParamAcceptsLambda. +func (fi FunctionInfo) ParamAcceptsLambda(position int, name string) bool { + if len(fi.Params) == 0 { + return false + } + if name != "" { + for _, p := range fi.Params { + if p.Name == name { + return p.AcceptsLambda + } + } + return false + } + if position < 0 || position >= len(fi.Params) { + return false + } + return fi.Params[position].AcceptsLambda +} + +// MethodInfo carries compile-time metadata about a stdlib method. +type MethodInfo struct { + // Required is the number of required parameters. + Required int + // Total is the total number of parameters (required + optional). + // -1 means no arity checking (params not declared, validated at runtime). + Total int + // Params is per-parameter metadata, parallel to declared positions. + // Empty when the method doesn't declare params (variadic — e.g. .sort); + // in that case AcceptsLambda is the method-level fallback. + Params []MethodParamInfo + // AcceptsLambda is the method-level fallback used when Params is empty. + AcceptsLambda bool + // ArgFolder, if set, is invoked by the resolver to precompute + // literal arguments (see ArgFolder docs). + ArgFolder ArgFolder + // CallFolder, if set, is invoked by the resolver to precompute a + // call-site dispatch target (see CallFolder docs). + CallFolder CallFolder +} + +// MethodParamInfo carries compile-time metadata about one method parameter. +type MethodParamInfo struct { + Name string + HasDefault bool + AcceptsLambda bool +} + +// ParamAcceptsLambda reports whether a lambda is accepted at the given +// argument position. For named args, name selects the param; for positional +// args, position is used. 
+func (mi MethodInfo) ParamAcceptsLambda(position int, name string) bool { + if len(mi.Params) == 0 { + return mi.AcceptsLambda + } + if name != "" { + for _, p := range mi.Params { + if p.Name == name { + return p.AcceptsLambda + } + } + return false + } + if position < 0 || position >= len(mi.Params) { + return false + } + return mi.Params[position].AcceptsLambda +} + +// ResolveOptions configures the semantic analysis pass. +type ResolveOptions struct { + // Methods maps method names to their compile-time arity metadata. + Methods map[string]MethodInfo + // Functions maps function names to their compile-time arity metadata. + Functions map[string]FunctionInfo + // MethodOpcodes maps method names to opcode IDs for runtime dispatch. + // Nil to skip opcode annotation (e.g. LSP diagnostics-only mode). + MethodOpcodes map[string]uint16 + // FunctionOpcodes maps function names to opcode IDs for runtime dispatch. + // Nil to skip opcode annotation. + FunctionOpcodes map[string]uint16 +} + +// Resolve performs semantic analysis on a parsed program, checking for: +// - Undeclared variable references +// - Block-scoped variable visibility +// - Map isolation (no input/output in map bodies) +// - Lambda purity (no output assignments) +// - Boolean literal cases in equality match +// - Duplicate map names +// - Function arity mismatches +// +// When opcode maps are provided in opts, AST nodes are annotated with +// compile-time opcode IDs for fast runtime dispatch. 
+func Resolve(prog *Program, opts ResolveOptions) []PosError { + r := &resolver{ + prog: prog, + knownMethods: opts.Methods, + knownFunctions: opts.Functions, + methodOpcodes: opts.MethodOpcodes, + functionOpcodes: opts.FunctionOpcodes, + } + r.resolve() + return r.errors +} + +type resolver struct { + prog *Program + knownMethods map[string]MethodInfo + knownFunctions map[string]FunctionInfo + methodOpcodes map[string]uint16 // nil = skip annotation + functionOpcodes map[string]uint16 // nil = skip annotation + errors []PosError + scope *resolveScope + inMap bool // true when inside a map body + inMethodArg bool // true when resolving a method argument + maxSlots int // high-water mark for the current scope tree (program or map) +} + +// trackSlots updates the high-water mark for the current scope tree and +// propagates the child scope's slot usage back to its parent, so that +// subsequent parent allocations cannot collide with slots handed out +// inside the child scope. +func (r *resolver) trackSlots() { + if r.scope.nextSlot > r.maxSlots { + r.maxSlots = r.scope.nextSlot + } + // Propagate child's nextSlot to parent so parent doesn't reuse + // slots that were allocated inside the child scope. + if r.scope.parent != nil && r.scope.nextSlot > r.scope.parent.nextSlot { + r.scope.parent.nextSlot = r.scope.nextSlot + } +} + +// resolveScopeMode determines how variable assignment interacts with outer scopes. +type resolveScopeMode int + +const ( + // resolveScopeStatement: assigning to an existing outer variable targets + // the ancestor's slot. New variables are block-scoped. + resolveScopeStatement resolveScopeMode = iota + // resolveScopeExpression: assignment always shadows (writes locally). 
+ resolveScopeExpression +) + +type resolveScope struct { + parent *resolveScope + vars map[string]int // declared variables → slot index + params map[string]int // parameters → slot index + nextSlot int // next available slot in this scope tree + mode resolveScopeMode +} + +func newResolveScope(parent *resolveScope, mode resolveScopeMode) *resolveScope { + // Slot 0 is reserved (Go zero-value for int fields on AST nodes means + // "unresolved"), so root scopes start allocating from slot 1. + nextSlot := 1 + if parent != nil { + nextSlot = parent.nextSlot + } + return &resolveScope{ + parent: parent, + vars: make(map[string]int), + params: make(map[string]int), + nextSlot: nextSlot, + mode: mode, + } +} + +func (s *resolveScope) isDeclared(name string) bool { + for cur := s; cur != nil; cur = cur.parent { + if _, ok := cur.vars[name]; ok { + return true + } + if _, ok := cur.params[name]; ok { + return true + } + } + return false +} + +// lookupSlot finds the slot index for a variable/parameter by walking the scope chain. +func (s *resolveScope) lookupSlot(name string) (int, bool) { + for cur := s; cur != nil; cur = cur.parent { + if slot, ok := cur.vars[name]; ok { + return slot, true + } + if slot, ok := cur.params[name]; ok { + return slot, true + } + } + return -1, false +} + +// lookupParamSlot finds the slot index for a parameter (not a variable) by +// walking the scope chain. Used for bare identifier resolution — bare +// identifiers must not resolve to $variables. +func (s *resolveScope) lookupParamSlot(name string) (int, bool) { + for cur := s; cur != nil; cur = cur.parent { + if slot, ok := cur.params[name]; ok { + return slot, true + } + } + return -1, false +} + +// allocSlot assigns the next available slot index and returns it. +func (s *resolveScope) allocSlot() int { + slot := s.nextSlot + s.nextSlot++ + return slot +} + +// declareVar declares a variable in this scope. 
In statement mode, if the +// variable exists in an ancestor, returns the ancestor's slot. Otherwise +// allocates a new slot. +func (s *resolveScope) declareVar(name string) int { + if s.mode == resolveScopeStatement { + // Check ancestors for existing declaration (write-through). + for cur := s.parent; cur != nil; cur = cur.parent { + if slot, ok := cur.vars[name]; ok { + return slot + } + } + } + // Check if already declared in this scope. + if slot, ok := s.vars[name]; ok { + return slot + } + slot := s.allocSlot() + s.vars[name] = slot + return slot +} + +func (r *resolver) error(pos Pos, msg string) { + r.errors = append(r.errors, PosError{Pos: pos, Msg: msg}) +} + +func (r *resolver) resolve() { + // Check for duplicate map names. + seen := make(map[string]Pos) + for _, m := range r.prog.Maps { + if prev, exists := seen[m.Name]; exists { + r.error(m.TokenPos, fmt.Sprintf("duplicate map name %q (previously declared at %s)", m.Name, prev)) + } + seen[m.Name] = m.TokenPos + } + + // Build top-level scope. + r.scope = newResolveScope(nil, resolveScopeStatement) + r.maxSlots = 0 + + // Resolve map bodies (isolated scope trees with independent slot spaces). + for _, m := range r.prog.Maps { + r.resolveMapDecl(m) + } + + // Resolve top-level statements. + for _, stmt := range r.prog.Stmts { + r.resolveStmt(stmt) + } + r.trackSlots() + r.prog.MaxSlots = r.maxSlots +} + +func (r *resolver) resolveMapDecl(m *MapDecl) { + // Validate parameter list. 
+ r.validateParams(m.Params, m.TokenPos) + + saved := r.scope + savedInMap := r.inMap + savedMaxSlots := r.maxSlots + + r.inMap = true + r.maxSlots = 0 + mapScope := newResolveScope(nil, resolveScopeExpression) // isolated: no parent + for i := range m.Params { + p := &m.Params[i] + if !p.Discard { + p.SlotIndex = mapScope.allocSlot() + mapScope.params[p.Name] = p.SlotIndex + } + } + r.scope = mapScope + r.resolveExprBody(m.Body) + r.trackSlots() + m.MaxSlots = r.maxSlots + + r.scope = saved + r.inMap = savedInMap + r.maxSlots = savedMaxSlots +} + +func (r *resolver) validateParams(params []Param, _ Pos) { + seenDefault := false + for _, p := range params { + if p.Discard { + if p.Default != nil { + r.error(p.Pos, "discard parameter _ cannot have a default value") + } + continue + } + if p.Default != nil { + seenDefault = true + } else if seenDefault { + r.error(p.Pos, "required parameter after default parameter") + } + } +} + +func (r *resolver) resolveStmt(stmt Stmt) { + switch s := stmt.(type) { + case *Assignment: + r.resolveAssignment(s) + case *IfStmt: + r.resolveIfStmt(s) + case *MatchStmt: + r.resolveMatchStmt(s) + } +} + +func (r *resolver) resolveAssignment(a *Assignment) { + // Lambdas in any non-argument position are rejected by resolveExpr's + // *LambdaExpr case (spec Section 3.4). No additional check needed here. + + // Map/function names cannot be stored in variables. + if ident, ok := a.Value.(*IdentExpr); ok { + _, isFn := r.knownFunctions[ident.Name] + if a.Target.Root == AssignVar && (r.isKnownMap(ident.Name) || isFn) { + r.error(a.TokenPos, fmt.Sprintf("cannot store %s in a variable (it is not a value)", ident.Name)) + } + } + + r.resolveExpr(a.Value) + + // Resolve expressions inside assignment target path segments (e.g., output[$key]). + for _, seg := range a.Target.Path { + if seg.Index != nil { + r.resolveExpr(seg.Index) + } + } + + // Track variable declarations. 
+ if a.Target.Root == AssignVar { + a.Target.SlotIndex = r.scope.declareVar(a.Target.VarName) + } +} + +func (r *resolver) resolveIfStmt(s *IfStmt) { + for _, branch := range s.Branches { + r.resolveExpr(branch.Cond) + child := newResolveScope(r.scope, resolveScopeStatement) + saved := r.scope + r.scope = child + for _, stmt := range branch.Body { + r.resolveStmt(stmt) + } + r.trackSlots() + r.scope = saved + } + if s.Else != nil { + child := newResolveScope(r.scope, resolveScopeStatement) + saved := r.scope + r.scope = child + for _, stmt := range s.Else { + r.resolveStmt(stmt) + } + r.trackSlots() + r.scope = saved + } +} + +func (r *resolver) resolveMatchStmt(s *MatchStmt) { + if s.Subject != nil { + r.resolveExpr(s.Subject) + } + // Allocate binding slot once so all cases share it. + if s.Binding != "" { + s.BindingSlot = r.scope.allocSlot() + } + for _, c := range s.Cases { + child := newResolveScope(r.scope, resolveScopeStatement) + if s.Binding != "" { + child.params[s.Binding] = s.BindingSlot + } + saved := r.scope + r.scope = child + + if c.Pattern != nil && !c.Wildcard { + r.resolveExpr(c.Pattern) + } + if body, ok := c.Body.([]Stmt); ok { + for _, stmt := range body { + r.resolveStmt(stmt) + } + } + r.trackSlots() + r.scope = saved + } +} + +func (r *resolver) resolveExprBody(body *ExprBody) { + if body == nil { + return + } + for _, va := range body.Assignments { + // Lambdas in non-argument positions are caught by resolveExpr's + // *LambdaExpr case (spec Section 3.4). + r.resolveExpr(va.Value) + // Resolve expressions inside path segments (e.g., $acc[item.k] = ...). + for _, seg := range va.Path { + if seg.Index != nil { + r.resolveExpr(seg.Index) + } + } + if len(va.Path) > 0 { + // Path assignment: mutate the existing variable if it's declared + // anywhere in scope, otherwise declare it in the current scope + // (Section 3.7: path assignment to undeclared is a declaration). 
+ if slot, ok := r.scope.lookupSlot(va.Name); ok { + va.SlotIndex = slot + } else { + va.SlotIndex = r.scope.declareVar(va.Name) + } + } else { + va.SlotIndex = r.scope.declareVar(va.Name) + } + } + r.resolveExpr(body.Result) +} + +func (r *resolver) resolveExpr(expr Expr) { + if expr == nil { + return + } + switch e := expr.(type) { + case *LiteralExpr: + // no-op + case *InputExpr: + if r.inMap { + r.error(e.TokenPos, "cannot access input inside a map body") + } + case *InputMetaExpr: + if r.inMap { + r.error(e.TokenPos, "cannot access input inside a map body") + } + case *OutputExpr: + if r.inMap { + r.error(e.TokenPos, "cannot access output inside a map body") + } + r.prog.ReadsOutput = true + case *OutputMetaExpr: + if r.inMap { + r.error(e.TokenPos, "cannot access output inside a map body") + } + r.prog.ReadsOutput = true + case *VarExpr: + if !r.scope.isDeclared(e.Name) { + r.error(e.TokenPos, "undeclared variable $"+e.Name) + } else if slot, ok := r.scope.lookupSlot(e.Name); ok { + e.SlotIndex = slot + } + case *IdentExpr: + if e.Namespace != "" { + // Qualified reference (e.g., math::double) — only valid as + // a method argument to higher-order methods. + if !r.inMethodArg { + r.error(e.TokenPos, e.Namespace+"::"+e.Name+" is not a valid expression (call it with parentheses or pass to a method)") + } + r.resolveQualifiedIdent(e) + } else if slot, ok := r.scope.lookupParamSlot(e.Name); ok { + // Resolves to a parameter (map param, lambda param, match-as + // binding) — annotate with slot. Bare identifiers must NOT + // resolve to $variables (those require the $ prefix via VarExpr). + e.SlotIndex = slot + } else { + // Not a variable/parameter — check if it's a map or function name. 
+ _, isFn := r.knownFunctions[e.Name] + if r.isKnownMap(e.Name) || isFn { + if !r.inMethodArg { + r.error(e.TokenPos, e.Name+" is not a valid expression (call it with parentheses or pass to a method)") + } + } else { + r.error(e.TokenPos, fmt.Sprintf("undeclared identifier %q", e.Name)) + } + } + case *BinaryExpr: + r.resolveExpr(e.Left) + r.resolveExpr(e.Right) + case *UnaryExpr: + r.resolveExpr(e.Operand) + case *CallExpr: + r.resolveCall(e) + case *MethodCallExpr: + r.resolveExpr(e.Receiver) + mi, miKnown := r.knownMethods[e.Method] + if miKnown { + r.checkMethodArity(e, mi) + r.applyArgFolder(mi.ArgFolder, e.Args, e.MethodPos, "."+e.Method+"()") + } + if r.methodOpcodes != nil { + e.MethodOpcode = r.methodOpcodes[e.Method] + } + saved := r.inMethodArg + r.inMethodArg = true + for i, arg := range e.Args { + // Check map name references passed to higher-order methods. + if ident, ok := arg.Value.(*IdentExpr); ok { + if ident.Namespace != "" { + // Qualified reference: check arity in namespace. 
+ if m := r.findNamespacedMap(ident.Namespace, ident.Name); m != nil { + r.checkMapRefArity(ident.TokenPos, ident.Namespace+"::"+ident.Name, m) + } + } else if m := r.findMap(ident.Name); m != nil { + r.checkMapRefArity(ident.TokenPos, ident.Name, m) + } + } + acceptsLambda := !miKnown || mi.ParamAcceptsLambda(i, arg.Name) + r.resolveArgValue(arg.Value, acceptsLambda, e.Method) + } + r.inMethodArg = saved + if miKnown { + e.Prebound = r.applyCallFolder(mi.CallFolder, e.Args, e.MethodPos, "."+e.Method+"()") + } + case *FieldAccessExpr: + r.resolveExpr(e.Receiver) + case *IndexExpr: + r.resolveExpr(e.Receiver) + r.resolveExpr(e.Index) + case *ArrayLiteral: + for _, elem := range e.Elements { + r.resolveExpr(elem) + } + case *ObjectLiteral: + for _, entry := range e.Entries { + r.resolveExpr(entry.Key) + r.resolveExpr(entry.Value) + } + case *IfExpr: + r.resolveIfExpr(e) + case *MatchExpr: + r.resolveMatchExpr(e) + case *LambdaExpr: + r.error(e.TokenPos, "lambda is only valid as a call argument (spec Section 3.4)") + // Still resolve the body so downstream passes don't see unresolved + // parameter slots. Errors already emitted will surface the issue. + r.resolveLambda(e) + case *PathExpr: + // Check map isolation for the path root. 
+ if r.inMap { + switch e.Root { + case PathRootInput, PathRootInputMeta: + r.error(e.TokenPos, "cannot access input inside a map body") + case PathRootOutput, PathRootOutputMeta: + r.error(e.TokenPos, "cannot access output inside a map body") + } + } + if e.Root == PathRootOutput || e.Root == PathRootOutputMeta { + r.prog.ReadsOutput = true + } + if e.Root == PathRootVar { + if !r.scope.isDeclared(e.VarName) { + r.error(e.TokenPos, "undeclared variable $"+e.VarName) + } else if slot, ok := r.scope.lookupSlot(e.VarName); ok { + e.VarSlotIndex = slot + } + } + for i := range e.Segments { + seg := &e.Segments[i] + if seg.Index != nil { + r.resolveExpr(seg.Index) + } + if seg.Kind == PathSegMethod { + if mi, ok := r.knownMethods[seg.Name]; ok { + r.checkMethodArityAt(seg.Pos, seg.Name, len(seg.Args), mi) + r.applyArgFolder(mi.ArgFolder, seg.Args, seg.Pos, "."+seg.Name+"()") + } + if r.methodOpcodes != nil { + seg.MethodOpcode = r.methodOpcodes[seg.Name] + } + } + if len(seg.Args) > 0 { + saved := r.inMethodArg + r.inMethodArg = true + var segMi MethodInfo + segMiKnown := false + if seg.Kind == PathSegMethod { + segMi, segMiKnown = r.knownMethods[seg.Name] + } + for i, arg := range seg.Args { + if ident, ok := arg.Value.(*IdentExpr); ok { + if ident.Namespace != "" { + if m := r.findNamespacedMap(ident.Namespace, ident.Name); m != nil { + r.checkMapRefArity(ident.TokenPos, ident.Namespace+"::"+ident.Name, m) + } + } else if m := r.findMap(ident.Name); m != nil { + r.checkMapRefArity(ident.TokenPos, ident.Name, m) + } + } + acceptsLambda := !segMiKnown || segMi.ParamAcceptsLambda(i, arg.Name) + r.resolveArgValue(arg.Value, acceptsLambda, seg.Name) + } + r.inMethodArg = saved + } + if seg.Kind == PathSegMethod { + if mi, ok := r.knownMethods[seg.Name]; ok { + seg.Prebound = r.applyCallFolder(mi.CallFolder, seg.Args, seg.Pos, "."+seg.Name+"()") + } + } + } + } +} + +func (r *resolver) resolveCall(e *CallExpr) { + // Validate named arg consistency. 
+ if e.Named && len(e.Args) > 0 { + seen := make(map[string]bool) + for _, arg := range e.Args { + if arg.Name == "" { + r.error(e.TokenPos, "cannot mix positional and named arguments") + break + } + if seen[arg.Name] { + r.error(e.TokenPos, fmt.Sprintf("duplicate named argument %q", arg.Name)) + } + seen[arg.Name] = true + } + } + + // Check that the function/map exists and validate arity. + // User maps take priority over stdlib functions (maps shadow stdlib). + if e.Namespace == "" { + m := r.findMap(e.Name) + if m != nil { + r.checkMapArity(e, m) + } else if fi, ok := r.knownFunctions[e.Name]; ok { + r.checkFunctionArity(e, fi) + r.applyArgFolder(fi.ArgFolder, e.Args, e.TokenPos, e.Name+"()") + if r.functionOpcodes != nil { + e.FunctionOpcode = r.functionOpcodes[e.Name] + } + e.Prebound = r.applyCallFolder(fi.CallFolder, e.Args, e.TokenPos, e.Name+"()") + } else { + r.error(e.TokenPos, fmt.Sprintf("unknown function or map %q", e.Name)) + } + + // Special compile-time check: throw() literal arg must be a string. + if e.Name == "throw" && len(e.Args) == 1 { + if lit, ok := e.Args[0].Value.(*LiteralExpr); ok { + if lit.TokenType != STRING && lit.TokenType != RAW_STRING { + r.error(e.TokenPos, "throw() requires a string argument") + } + } + } + } + + // Namespace-qualified call: check namespace and map exist. + if e.Namespace != "" { + maps, nsExists := r.prog.Namespaces[e.Namespace] + if !nsExists { + r.error(e.TokenPos, fmt.Sprintf("unknown namespace %q", e.Namespace)) + } else { + found := false + for _, m := range maps { + if m.Name == e.Name { + found = true + break + } + } + if !found { + r.error(e.TokenPos, fmt.Sprintf("nonexistent map %s::%s()", e.Namespace, e.Name)) + } + } + } + + // Functions may accept lambda arguments per-position when their + // FunctionInfo declares Params with AcceptsLambda set; user maps never + // accept lambdas as arguments (a map parameter always receives a + // value). 
+ var fi FunctionInfo + var fiKnown bool + if e.Namespace == "" && r.findMap(e.Name) == nil { + fi, fiKnown = r.knownFunctions[e.Name] + } + for i, arg := range e.Args { + acceptsLambda := false + if fiKnown { + acceptsLambda = fi.ParamAcceptsLambda(i, arg.Name) + } + r.resolveArgValue(arg.Value, acceptsLambda, e.Name) + } +} + +// applyCallFolder runs a CallFolder against the call site's args and +// returns the Prebound value (or nil if the folder declined to fold). +// Folder errors are recorded as resolver diagnostics anchored at pos. +func (r *resolver) applyCallFolder(folder CallFolder, args []CallArg, pos Pos, calleeLabel string) any { + if folder == nil { + return nil + } + prebound, err := folder(args) + if err != nil { + r.error(pos, calleeLabel+": "+err.Error()) + return nil + } + return prebound +} + +// applyArgFolder runs folder against args and, on success, attaches +// non-nil folded values to the matching CallArg.Folded field. A folder +// error is recorded as a resolver diagnostic anchored at pos. Silently +// tolerates folder-returned slices of the wrong length (a contract +// violation we don't want to block compilation for). +func (r *resolver) applyArgFolder(folder ArgFolder, args []CallArg, pos Pos, calleeLabel string) { + if folder == nil || len(args) == 0 { + return + } + folded, err := folder(args) + if err != nil { + r.error(pos, calleeLabel+": "+err.Error()) + return + } + if len(folded) != len(args) { + return + } + for i := range args { + if folded[i] != nil { + args[i].Folded = folded[i] + } + } +} + +// resolveArgValue resolves a call argument's value. Lambdas are only legal +// in this position (spec Section 3.4); they're rejected everywhere else by +// resolveExpr's *LambdaExpr case. When acceptsLambda is false, a lambda +// argument is rejected with a compile error that names the callee. 
+func (r *resolver) resolveArgValue(value Expr, acceptsLambda bool, calleeName string) { + if lam, ok := value.(*LambdaExpr); ok { + if !acceptsLambda { + r.error(lam.TokenPos, calleeName+"() does not accept a lambda argument") + } + r.resolveLambda(lam) + return + } + r.resolveExpr(value) +} + +// findNamespacedMap looks up a map by namespace and name. +func (r *resolver) findNamespacedMap(namespace, name string) *MapDecl { + maps, ok := r.prog.Namespaces[namespace] + if !ok { + return nil + } + for _, m := range maps { + if m.Name == name { + return m + } + } + return nil +} + +// checkMapRefArity verifies a map reference passed to a higher-order method +// has exactly 1 required parameter. +func (r *resolver) checkMapRefArity(pos Pos, displayName string, m *MapDecl) { + required := 0 + for _, p := range m.Params { + if p.Default == nil && !p.Discard { + required++ + } + } + if required != 1 { + r.error(pos, fmt.Sprintf("arity mismatch: %s() requires %d arguments, but higher-order methods pass 1", displayName, required)) + } +} + +// resolveQualifiedIdent checks that a qualified identifier (namespace::name) +// refers to a valid namespace and map. 
+func (r *resolver) resolveQualifiedIdent(e *IdentExpr) { + maps, nsExists := r.prog.Namespaces[e.Namespace] + if !nsExists { + r.error(e.TokenPos, fmt.Sprintf("unknown namespace %q", e.Namespace)) + return + } + found := false + for _, m := range maps { + if m.Name == e.Name { + found = true + break + } + } + if !found { + r.error(e.TokenPos, fmt.Sprintf("nonexistent map %s::%s", e.Namespace, e.Name)) + } +} + +func (r *resolver) checkMapArity(e *CallExpr, m *MapDecl) { + required := 0 + total := 0 + hasDiscard := false + for _, p := range m.Params { + total++ + if p.Discard { + hasDiscard = true + required++ // discard params still need an argument + } else if p.Default == nil { + required++ + } + } + + if e.Named && hasDiscard { + r.error(e.TokenPos, "cannot use named arguments with discard parameters") + return + } + + if e.Named { + // Named args: check for unknown arg names. + paramNames := make(map[string]bool) + for _, p := range m.Params { + if !p.Discard { + paramNames[p.Name] = true + } + } + for _, arg := range e.Args { + if !paramNames[arg.Name] { + r.error(e.TokenPos, fmt.Sprintf("unknown named argument %q", arg.Name)) + } + } + // Check required params are provided. + provided := make(map[string]bool) + for _, arg := range e.Args { + provided[arg.Name] = true + } + for _, p := range m.Params { + if p.Discard { + continue + } + if !provided[p.Name] && p.Default == nil { + r.error(e.TokenPos, fmt.Sprintf("arity mismatch: missing required named argument %q", p.Name)) + } + } + } else { + // Positional args: check count. 
+ if len(e.Args) < required { + r.error(e.TokenPos, fmt.Sprintf("arity mismatch: %s() requires at least %d arguments, got %d", + e.Name, required, len(e.Args))) + } + if len(e.Args) > total { + r.error(e.TokenPos, fmt.Sprintf("arity mismatch: %s() accepts at most %d arguments, got %d", + e.Name, total, len(e.Args))) + } + } +} + +func (r *resolver) checkFunctionArity(e *CallExpr, fi FunctionInfo) { + if fi.Total < 0 { + return // no arity checking + } + nArgs := len(e.Args) + if nArgs < fi.Required { + r.error(e.TokenPos, fmt.Sprintf("%s() requires at least %d arguments, got %d", + e.Name, fi.Required, nArgs)) + } + if nArgs > fi.Total { + r.error(e.TokenPos, fmt.Sprintf("%s() accepts at most %d arguments, got %d", + e.Name, fi.Total, nArgs)) + } +} + +func (r *resolver) checkMethodArity(e *MethodCallExpr, mi MethodInfo) { + r.checkMethodArityAt(e.MethodPos, e.Method, len(e.Args), mi) +} + +func (r *resolver) checkMethodArityAt(pos Pos, name string, nArgs int, mi MethodInfo) { + if mi.Total < 0 { + return // no arity checking + } + if nArgs < mi.Required { + r.error(pos, fmt.Sprintf("%s() requires at least %d arguments, got %d", + name, mi.Required, nArgs)) + } + if nArgs > mi.Total { + r.error(pos, fmt.Sprintf("%s() accepts at most %d arguments, got %d", + name, mi.Total, nArgs)) + } +} + +func (r *resolver) findMap(name string) *MapDecl { + for _, m := range r.prog.Maps { + if m.Name == name { + return m + } + } + return nil +} + +func (r *resolver) resolveIfExpr(e *IfExpr) { + for _, branch := range e.Branches { + r.resolveExpr(branch.Cond) + child := newResolveScope(r.scope, resolveScopeExpression) + saved := r.scope + r.scope = child + r.resolveExprBody(branch.Body) + r.trackSlots() + r.scope = saved + } + if e.Else != nil { + child := newResolveScope(r.scope, resolveScopeExpression) + saved := r.scope + r.scope = child + r.resolveExprBody(e.Else) + r.trackSlots() + r.scope = saved + } +} + +func (r *resolver) resolveMatchExpr(e *MatchExpr) { + if e.Subject != 
nil { + r.resolveExpr(e.Subject) + } + + // Allocate the as-binding slot ONCE in the parent scope so all cases + // share the same slot. + if e.Binding != "" { + e.BindingSlot = r.scope.allocSlot() + } + + // Check for boolean literal cases in equality match (no 'as', has subject). + isEqualityMatch := e.Subject != nil && e.Binding == "" + + for _, c := range e.Cases { + if c.Pattern != nil && !c.Wildcard { + if isEqualityMatch { + if lit, ok := c.Pattern.(*LiteralExpr); ok { + if lit.TokenType == TRUE || lit.TokenType == FALSE { + r.error(lit.TokenPos, "boolean literal as case value in equality match (use 'as' for boolean conditions)") + } + } + } + child := newResolveScope(r.scope, resolveScopeExpression) + if e.Binding != "" { + child.params[e.Binding] = e.BindingSlot + } + saved := r.scope + r.scope = child + r.resolveExpr(c.Pattern) + r.trackSlots() + r.scope = saved + } + switch body := c.Body.(type) { + case Expr: + child := newResolveScope(r.scope, resolveScopeExpression) + if e.Binding != "" { + child.params[e.Binding] = e.BindingSlot + } + saved := r.scope + r.scope = child + r.resolveExpr(body) + r.trackSlots() + r.scope = saved + case *ExprBody: + child := newResolveScope(r.scope, resolveScopeExpression) + if e.Binding != "" { + child.params[e.Binding] = e.BindingSlot + } + saved := r.scope + r.scope = child + r.resolveExprBody(body) + r.trackSlots() + r.scope = saved + } + } +} + +func (r *resolver) resolveLambda(e *LambdaExpr) { + r.validateParams(e.Params, e.TokenPos) + + child := newResolveScope(r.scope, resolveScopeExpression) + for i := range e.Params { + if !e.Params[i].Discard { + e.Params[i].SlotIndex = child.allocSlot() + child.params[e.Params[i].Name] = e.Params[i].SlotIndex + } + } + saved := r.scope + r.scope = child + r.resolveExprBody(e.Body) + r.trackSlots() + r.scope = saved +} + +func (r *resolver) isKnownMap(name string) bool { + for _, m := range r.prog.Maps { + if m.Name == name { + return true + } + } + return false +} diff --git 
a/internal/bloblang2/go/pratt/syntax/scanner.go b/internal/bloblang2/go/pratt/syntax/scanner.go new file mode 100644 index 000000000..37b922184 --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/scanner.go @@ -0,0 +1,566 @@ +package syntax + +import ( + "fmt" + "strconv" + "strings" + "unicode/utf8" +) + +// scanner tokenizes Bloblang V2 source code. +type scanner struct { + src string // source code + file string // filename for positions + + pos int // current byte offset + line int // 1-based current line + col int // 1-based current column + prevTok TokenType // type of the last non-NL token emitted + + parenDepth int // () nesting depth + bracketDepth int // [] nesting depth + + // Buffered token for lookahead (used for newline suppression). + peeked *Token + + errors []PosError +} + +// PosError is a compile error with source position. +type PosError struct { + Pos Pos + Msg string +} + +func (e PosError) Error() string { + return fmt.Sprintf("%s: %s", e.Pos, e.Msg) +} + +func newScanner(src, file string) *scanner { + return &scanner{ + src: src, + file: file, + line: 1, + col: 1, + prevTok: NL, // suppress leading newlines + } +} + +// next returns the next token. Returns EOF repeatedly after input is exhausted. +func (s *scanner) next() Token { + if s.peeked != nil { + tok := *s.peeked + s.peeked = nil + s.trackToken(tok) + return tok + } + return s.scan() +} + +func (s *scanner) trackToken(tok Token) { + if tok.Type != NL { + s.prevTok = tok.Type + } + switch tok.Type { + case LPAREN: + s.parenDepth++ + case RPAREN: + if s.parenDepth > 0 { + s.parenDepth-- + } + case LBRACKET, QLBRACKET: + s.bracketDepth++ + case RBRACKET: + if s.bracketDepth > 0 { + s.bracketDepth-- + } + } +} + +// scan produces the next token, applying newline suppression rules. +func (s *scanner) scan() Token { + for { + tok := s.scanRaw() + if tok.Type != NL { + s.trackToken(tok) + return tok + } + + // Newline suppression mechanism 1: inside () or []. 
+ if s.parenDepth > 0 || s.bracketDepth > 0 { + continue + } + + // Newline suppression mechanism 3: previous token suppresses NL. + if suppressesFollowingNL(s.prevTok) { + continue + } + + // Newline suppression mechanism 2: next token is postfix continuation. + nextTok := s.peekNextNonNL() + if isPostfixContinuation(nextTok.Type) { + continue + } + + // Collapse consecutive NLs: if previous emitted token was already NL, skip. + if s.prevTok == NL { + continue + } + + // Emit the newline. + s.prevTok = NL + return tok + } +} + +// peekNextNonNL scans forward past any NL tokens to find the next +// substantive token, without consuming it. +func (s *scanner) peekNextNonNL() Token { + // Save state. + savedPos := s.pos + savedLine := s.line + savedCol := s.col + savedErrors := len(s.errors) + + for { + tok := s.scanRaw() + if tok.Type != NL { + // Restore scanner to before we started peeking. + s.pos = savedPos + s.line = savedLine + s.col = savedCol + s.errors = s.errors[:savedErrors] + return tok + } + } +} + +// scanRaw produces the next raw token without newline suppression. +func (s *scanner) scanRaw() Token { + s.skipWhitespaceAndComments() + + if s.pos >= len(s.src) { + return s.makeToken(EOF, "") + } + + ch := s.src[s.pos] + + // Newlines. + if ch == '\n' { + tok := s.makeToken(NL, "\n") + s.advance() + return tok + } + if ch == '\r' { + tok := s.makeToken(NL, "\n") + s.advance() + if s.pos < len(s.src) && s.src[s.pos] == '\n' { + s.advance() + } + return tok + } + + // String literals. + if ch == '"' { + return s.scanString() + } + if ch == '`' { + return s.scanRawString() + } + + // Numbers. + if isDigit(ch) { + return s.scanNumber() + } + + // Variable $name. + if ch == '$' { + return s.scanVar() + } + + // Identifiers and keywords. + if isIdentStart(ch) { + return s.scanWord() + } + + // Multi-character operators and delimiters. 
+ return s.scanOperator() +} + +func (s *scanner) scanString() Token { + startPos := s.currentPos() + s.advance() // skip opening " + + var sb strings.Builder + for s.pos < len(s.src) { + ch := s.src[s.pos] + if ch == '"' { + s.advance() // skip closing " + return Token{Type: STRING, Literal: sb.String(), Pos: startPos} + } + if ch == '\n' || ch == '\r' { + s.addError(s.currentPos(), "unterminated string literal") + return Token{Type: ILLEGAL, Literal: sb.String(), Pos: startPos} + } + if ch == '\\' { + s.advance() + escaped, ok := s.scanEscapeSeq() + if !ok { + return Token{Type: ILLEGAL, Literal: sb.String(), Pos: startPos} + } + sb.WriteString(escaped) + continue + } + // Regular character — read full UTF-8 rune. + r, size := utf8.DecodeRuneInString(s.src[s.pos:]) + sb.WriteRune(r) + s.advanceN(size) + } + s.addError(startPos, "unterminated string literal") + return Token{Type: ILLEGAL, Literal: sb.String(), Pos: startPos} +} + +func (s *scanner) scanEscapeSeq() (string, bool) { + if s.pos >= len(s.src) { + s.addError(s.currentPos(), "unterminated escape sequence") + return "", false + } + ch := s.src[s.pos] + chPos := s.currentPos() + s.advance() + switch ch { + case '"': + return "\"", true + case '\\': + return "\\", true + case 'n': + return "\n", true + case 't': + return "\t", true + case 'r': + return "\r", true + case 'u': + return s.scanUnicodeEscape() + default: + s.addError(chPos, fmt.Sprintf("invalid escape character %q", ch)) + return "", false + } +} + +func (s *scanner) scanUnicodeEscape() (string, bool) { + if s.pos >= len(s.src) { + s.addError(s.currentPos(), "unterminated unicode escape") + return "", false + } + + // \u{X...} form: 1-6 hex digits. 
+ if s.src[s.pos] == '{' { + s.advance() // skip { + start := s.pos + for s.pos < len(s.src) && isHexDigit(s.src[s.pos]) { + s.advance() + } + hexStr := s.src[start:s.pos] + if len(hexStr) == 0 || len(hexStr) > 6 { + s.addError(s.currentPos(), "\\u{} requires 1-6 hex digits") + return "", false + } + if s.pos >= len(s.src) || s.src[s.pos] != '}' { + s.addError(s.currentPos(), "unterminated \\u{} escape") + return "", false + } + s.advance() // skip } + codepoint, _ := strconv.ParseUint(hexStr, 16, 32) + if codepoint > 0x10FFFF { + s.addError(s.currentPos(), fmt.Sprintf("unicode codepoint U+%X out of range", codepoint)) + return "", false + } + if codepoint >= 0xD800 && codepoint <= 0xDFFF { + s.addError(s.currentPos(), fmt.Sprintf("surrogate codepoint U+%X is invalid", codepoint)) + return "", false + } + return string(rune(codepoint)), true + } + + // \uXXXX form: exactly 4 hex digits. + if s.pos+4 > len(s.src) { + s.addError(s.currentPos(), "\\uXXXX requires exactly 4 hex digits") + return "", false + } + hexStr := s.src[s.pos : s.pos+4] + for _, c := range []byte(hexStr) { + if !isHexDigit(c) { + s.addError(s.currentPos(), fmt.Sprintf("invalid hex digit %q in \\uXXXX", c)) + return "", false + } + } + s.advanceN(4) + codepoint, _ := strconv.ParseUint(hexStr, 16, 32) + if codepoint >= 0xD800 && codepoint <= 0xDFFF { + s.addError(s.currentPos(), fmt.Sprintf("surrogate codepoint U+%04X is invalid", codepoint)) + return "", false + } + return string(rune(codepoint)), true +} + +func (s *scanner) scanRawString() Token { + startPos := s.currentPos() + s.advance() // skip opening ` + + start := s.pos + for s.pos < len(s.src) { + if s.src[s.pos] == '`' { + lit := s.src[start:s.pos] + s.advance() // skip closing ` + return Token{Type: RAW_STRING, Literal: lit, Pos: startPos} + } + s.advance() // advance() handles newline tracking + } + s.addError(startPos, "unterminated raw string literal") + return Token{Type: ILLEGAL, Literal: s.src[start:], Pos: startPos} +} + +func 
(s *scanner) scanNumber() Token { + startPos := s.currentPos() + start := s.pos + + for s.pos < len(s.src) && isDigit(s.src[s.pos]) { + s.advance() + } + + // Check for float: digits.digits + if s.pos < len(s.src) && s.src[s.pos] == '.' { + // Peek ahead — must have digit after dot for float literal. + if s.pos+1 < len(s.src) && isDigit(s.src[s.pos+1]) { + s.advance() // skip . + for s.pos < len(s.src) && isDigit(s.src[s.pos]) { + s.advance() + } + return Token{Type: FLOAT, Literal: s.src[start:s.pos], Pos: startPos} + } + // Dot without digit after — it's an int followed by a dot operator. + } + + // Integer — validate range at scan time. + lit := s.src[start:s.pos] + _, err := strconv.ParseInt(lit, 10, 64) + if err != nil { + s.addError(startPos, fmt.Sprintf("integer literal %s exceeds int64 range", lit)) + return Token{Type: ILLEGAL, Literal: lit, Pos: startPos} + } + return Token{Type: INT, Literal: lit, Pos: startPos} +} + +func (s *scanner) scanVar() Token { + startPos := s.currentPos() + s.advance() // skip $ + + if s.pos >= len(s.src) || !isIdentStart(s.src[s.pos]) { + s.addError(startPos, "expected identifier after $") + return Token{Type: ILLEGAL, Literal: "$", Pos: startPos} + } + + start := s.pos + for s.pos < len(s.src) && isIdentContinue(s.src[s.pos]) { + s.advance() + } + + name := s.src[start:s.pos] + if _, reserved := reservedNames[name]; reserved { + s.addError(startPos, fmt.Sprintf("%q is a reserved function name and cannot be used as a variable name", name)) + } + return Token{Type: VAR, Literal: name, Pos: startPos} +} + +func (s *scanner) scanWord() Token { + startPos := s.currentPos() + start := s.pos + for s.pos < len(s.src) && isIdentContinue(s.src[s.pos]) { + s.advance() + } + word := s.src[start:s.pos] + return Token{Type: LookupIdent(word), Literal: word, Pos: startPos} +} + +func (s *scanner) scanOperator() Token { + startPos := s.currentPos() + ch := s.src[s.pos] + s.advance() + + switch ch { + case '.': + return Token{Type: DOT, 
Literal: ".", Pos: startPos} + case '@': + return Token{Type: AT, Literal: "@", Pos: startPos} + case '(': + return Token{Type: LPAREN, Literal: "(", Pos: startPos} + case ')': + return Token{Type: RPAREN, Literal: ")", Pos: startPos} + case '{': + return Token{Type: LBRACE, Literal: "{", Pos: startPos} + case '}': + return Token{Type: RBRACE, Literal: "}", Pos: startPos} + case '[': + return Token{Type: LBRACKET, Literal: "[", Pos: startPos} + case ']': + return Token{Type: RBRACKET, Literal: "]", Pos: startPos} + case ',': + return Token{Type: COMMA, Literal: ",", Pos: startPos} + case '+': + return Token{Type: PLUS, Literal: "+", Pos: startPos} + case '*': + return Token{Type: STAR, Literal: "*", Pos: startPos} + case '/': + return Token{Type: SLASH, Literal: "/", Pos: startPos} + case '%': + return Token{Type: PERCENT, Literal: "%", Pos: startPos} + + case '?': + if s.pos < len(s.src) { + switch s.src[s.pos] { + case '.': + s.advance() + return Token{Type: QDOT, Literal: "?.", Pos: startPos} + case '[': + s.advance() + return Token{Type: QLBRACKET, Literal: "?[", Pos: startPos} + } + } + s.addError(startPos, "unexpected character '?'") + return Token{Type: ILLEGAL, Literal: "?", Pos: startPos} + + case ':': + if s.pos < len(s.src) && s.src[s.pos] == ':' { + s.advance() + return Token{Type: DCOLON, Literal: "::", Pos: startPos} + } + return Token{Type: COLON, Literal: ":", Pos: startPos} + + case '=': + if s.pos < len(s.src) { + switch s.src[s.pos] { + case '=': + s.advance() + return Token{Type: EQ, Literal: "==", Pos: startPos} + case '>': + s.advance() + return Token{Type: FATARROW, Literal: "=>", Pos: startPos} + } + } + return Token{Type: ASSIGN, Literal: "=", Pos: startPos} + + case '!': + if s.pos < len(s.src) && s.src[s.pos] == '=' { + s.advance() + return Token{Type: NE, Literal: "!=", Pos: startPos} + } + return Token{Type: BANG, Literal: "!", Pos: startPos} + + case '>': + if s.pos < len(s.src) && s.src[s.pos] == '=' { + s.advance() + return 
Token{Type: GE, Literal: ">=", Pos: startPos} + } + return Token{Type: GT, Literal: ">", Pos: startPos} + + case '<': + if s.pos < len(s.src) && s.src[s.pos] == '=' { + s.advance() + return Token{Type: LE, Literal: "<=", Pos: startPos} + } + return Token{Type: LT, Literal: "<", Pos: startPos} + + case '&': + if s.pos < len(s.src) && s.src[s.pos] == '&' { + s.advance() + return Token{Type: AND, Literal: "&&", Pos: startPos} + } + s.addError(startPos, "unexpected character '&', did you mean '&&'?") + return Token{Type: ILLEGAL, Literal: "&", Pos: startPos} + + case '|': + if s.pos < len(s.src) && s.src[s.pos] == '|' { + s.advance() + return Token{Type: OR, Literal: "||", Pos: startPos} + } + s.addError(startPos, "unexpected character '|', did you mean '||'?") + return Token{Type: ILLEGAL, Literal: "|", Pos: startPos} + + case '-': + if s.pos < len(s.src) && s.src[s.pos] == '>' { + s.advance() + return Token{Type: THINARROW, Literal: "->", Pos: startPos} + } + return Token{Type: MINUS, Literal: "-", Pos: startPos} + } + + r, _ := utf8.DecodeRuneInString(s.src[s.pos-1:]) + s.addError(startPos, fmt.Sprintf("unexpected character %q", r)) + return Token{Type: ILLEGAL, Literal: string(r), Pos: startPos} +} + +// skipWhitespaceAndComments skips horizontal whitespace and comments +// (but not newlines — those are significant tokens). +func (s *scanner) skipWhitespaceAndComments() { + for s.pos < len(s.src) { + ch := s.src[s.pos] + if ch == ' ' || ch == '\t' { + s.advance() + continue + } + if ch == '#' { + // Comment: skip to end of line (but don't consume the newline). 
+ for s.pos < len(s.src) && s.src[s.pos] != '\n' && s.src[s.pos] != '\r' { + s.advance() + } + continue + } + break + } +} + +func (s *scanner) currentPos() Pos { + return Pos{File: s.file, Line: s.line, Column: s.col} +} + +func (s *scanner) makeToken(typ TokenType, lit string) Token { + return Token{Type: typ, Literal: lit, Pos: s.currentPos()} +} + +func (s *scanner) advance() { + if s.pos < len(s.src) { + if s.src[s.pos] == '\n' { + s.line++ + s.col = 1 + } else { + s.col++ + } + s.pos++ + } +} + +func (s *scanner) advanceN(n int) { + for range n { + s.advance() + } +} + +func (s *scanner) addError(pos Pos, msg string) { + s.errors = append(s.errors, PosError{Pos: pos, Msg: msg}) +} + +func isDigit(ch byte) bool { + return ch >= '0' && ch <= '9' +} + +func isHexDigit(ch byte) bool { + return (ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f') || (ch >= 'A' && ch <= 'F') +} + +func isIdentStart(ch byte) bool { + return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_' +} + +func isIdentContinue(ch byte) bool { + return isIdentStart(ch) || isDigit(ch) +} diff --git a/internal/bloblang2/go/pratt/syntax/scanner_test.go b/internal/bloblang2/go/pratt/syntax/scanner_test.go new file mode 100644 index 000000000..3494a273e --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/scanner_test.go @@ -0,0 +1,438 @@ +package syntax + +import ( + "testing" +) + +func scanAll(src string) []Token { + s := newScanner(src, "") + var tokens []Token + for { + tok := s.next() + tokens = append(tokens, tok) + if tok.Type == EOF { + break + } + } + return tokens +} + +func scanTypes(src string) []TokenType { + tokens := scanAll(src) + types := make([]TokenType, len(tokens)) + for i, t := range tokens { + types[i] = t.Type + } + return types +} + +func requireTypes(t *testing.T, src string, expected ...TokenType) { + t.Helper() + got := scanTypes(src) + if len(got) != len(expected) { + t.Fatalf("token count: expected %d, got %d\nsource: %q\nexpected: %v\ngot: %v", 
len(expected), len(got), src, expected, got) + } + for i := range expected { + if got[i] != expected[i] { + t.Fatalf("token %d: expected %s, got %s\nsource: %q\nfull: %v", i, expected[i], got[i], src, got) + } + } +} + +func TestScanner_BasicTokens(t *testing.T) { + requireTypes(t, "42", INT, EOF) + requireTypes(t, "3.14", FLOAT, EOF) + requireTypes(t, `"hello"`, STRING, EOF) + requireTypes(t, "`raw`", RAW_STRING, EOF) + requireTypes(t, "true", TRUE, EOF) + requireTypes(t, "false", FALSE, EOF) + requireTypes(t, "null", NULL, EOF) + requireTypes(t, "foo", IDENT, EOF) + requireTypes(t, "$x", VAR, EOF) + requireTypes(t, "_", UNDERSCORE, EOF) +} + +func TestScanner_Keywords(t *testing.T) { + requireTypes(t, "input", INPUT, EOF) + requireTypes(t, "output", OUTPUT, EOF) + requireTypes(t, "if", IF, EOF) + requireTypes(t, "else", ELSE, EOF) + requireTypes(t, "match", MATCH, EOF) + requireTypes(t, "as", AS, EOF) + requireTypes(t, "map", MAP, EOF) + requireTypes(t, "import", IMPORT, EOF) + requireTypes(t, "deleted", DELETED, EOF) + requireTypes(t, "throw", THROW, EOF) +} + +func TestScanner_Operators(t *testing.T) { + requireTypes(t, ".", DOT, EOF) + requireTypes(t, "?.", QDOT, EOF) + requireTypes(t, "@", AT, EOF) + requireTypes(t, "::", DCOLON, EOF) + requireTypes(t, "=", ASSIGN, EOF) + requireTypes(t, "+", PLUS, EOF) + requireTypes(t, "-", MINUS, EOF) + requireTypes(t, "*", STAR, EOF) + requireTypes(t, "/", SLASH, EOF) + requireTypes(t, "%", PERCENT, EOF) + requireTypes(t, "!", BANG, EOF) + requireTypes(t, ">", GT, EOF) + requireTypes(t, ">=", GE, EOF) + requireTypes(t, "==", EQ, EOF) + requireTypes(t, "!=", NE, EOF) + requireTypes(t, "<", LT, EOF) + requireTypes(t, "<=", LE, EOF) + requireTypes(t, "&&", AND, EOF) + requireTypes(t, "||", OR, EOF) + requireTypes(t, "=>", FATARROW, EOF) + requireTypes(t, "->", THINARROW, EOF) + requireTypes(t, "?[", QLBRACKET, EOF) +} + +func TestScanner_Delimiters(t *testing.T) { + requireTypes(t, "(", LPAREN, EOF) + requireTypes(t, ")", 
RPAREN, EOF) + requireTypes(t, "{", LBRACE, EOF) + requireTypes(t, "}", RBRACE, EOF) + requireTypes(t, "[", LBRACKET, EOF) + requireTypes(t, "]", RBRACKET, EOF) + requireTypes(t, ",", COMMA, EOF) + requireTypes(t, ":", COLON, EOF) +} + +func TestScanner_StringEscapes(t *testing.T) { + tests := []struct { + src string + expected string + }{ + {`"hello"`, "hello"}, + {`"line\none"`, "line\none"}, + {`"tab\there"`, "tab\there"}, + {`"cr\rhere"`, "cr\rhere"}, + {`"quote\"here"`, "quote\"here"}, + {`"slash\\here"`, "slash\\here"}, + {`"\u0041"`, "A"}, + {`"\u{41}"`, "A"}, + {`"\u{1F600}"`, "\U0001F600"}, + } + + for _, tt := range tests { + t.Run(tt.src, func(t *testing.T) { + tokens := scanAll(tt.src) + if tokens[0].Type != STRING { + t.Fatalf("expected STRING, got %s", tokens[0].Type) + } + if tokens[0].Literal != tt.expected { + t.Fatalf("expected literal %q, got %q", tt.expected, tokens[0].Literal) + } + }) + } +} + +func TestScanner_RawString(t *testing.T) { + tokens := scanAll("`no\\escapes`") + if tokens[0].Type != RAW_STRING { + t.Fatalf("expected RAW_STRING, got %s", tokens[0].Type) + } + if tokens[0].Literal != "no\\escapes" { + t.Fatalf("expected literal %q, got %q", "no\\escapes", tokens[0].Literal) + } +} + +func TestScanner_IntegerRange(t *testing.T) { + // Max int64 is valid. + tokens := scanAll("9223372036854775807") + if tokens[0].Type != INT { + t.Fatalf("expected INT, got %s", tokens[0].Type) + } + + // Max int64 + 1 is illegal. + s := newScanner("9223372036854775808", "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for overflow, got %s", tok.Type) + } + if len(s.errors) == 0 { + t.Fatal("expected error for overflow") + } +} + +func TestScanner_FloatRequiresDigitsBothSides(t *testing.T) { + // 5. is int + dot (not a float). + requireTypes(t, "5.", INT, DOT, EOF) + + // .5 is dot + int (not a float). + requireTypes(t, ".5", DOT, INT, EOF) + + // 5.0 is a float. 
+ requireTypes(t, "5.0", FLOAT, EOF) +} + +func TestScanner_VarToken(t *testing.T) { + tokens := scanAll("$count") + if tokens[0].Type != VAR { + t.Fatalf("expected VAR, got %s", tokens[0].Type) + } + if tokens[0].Literal != "count" { + t.Fatalf("expected literal %q, got %q", "count", tokens[0].Literal) + } +} + +func TestScanner_Comments(t *testing.T) { + requireTypes(t, "42 # comment", INT, EOF) + requireTypes(t, "# full line comment\n42", INT, EOF) +} + +func TestScanner_NewlineSuppression_ParenNesting(t *testing.T) { + // Inside parens, newlines are suppressed. + requireTypes(t, "(1\n+\n2)", LPAREN, INT, PLUS, INT, RPAREN, EOF) + + // Inside brackets, newlines are suppressed. + requireTypes(t, "[1\n2\n3]", LBRACKET, INT, INT, INT, RBRACKET, EOF) + + // Inside braces, newlines are NOT suppressed. + requireTypes(t, "{\n}", LBRACE, NL, RBRACE, EOF) +} + +func TestScanner_NewlineSuppression_PostfixContinuation(t *testing.T) { + // . on next line suppresses NL. + requireTypes(t, "input\n.field", INPUT, DOT, IDENT, EOF) + + // ?. on next line suppresses NL. + requireTypes(t, "input\n?.field", INPUT, QDOT, IDENT, EOF) + + // [ on next line suppresses NL. + requireTypes(t, "arr\n[0]", IDENT, LBRACKET, INT, RBRACKET, EOF) + + // ?[ on next line suppresses NL. + requireTypes(t, "arr\n?[0]", IDENT, QLBRACKET, INT, RBRACKET, EOF) + + // else on next line suppresses NL. + requireTypes(t, "}\nelse", RBRACE, ELSE, EOF) +} + +func TestScanner_NewlineSuppression_OperatorContinuation(t *testing.T) { + // Trailing + suppresses NL. + requireTypes(t, "a +\nb", IDENT, PLUS, IDENT, EOF) + + // Trailing && suppresses NL. + requireTypes(t, "a &&\nb", IDENT, AND, IDENT, EOF) + + // Trailing = suppresses NL. + requireTypes(t, "output.x =\n42", OUTPUT, DOT, IDENT, ASSIGN, INT, EOF) + + // Trailing => suppresses NL. + requireTypes(t, "\"a\" =>\n1", STRING, FATARROW, INT, EOF) + + // Trailing -> suppresses NL. 
+ requireTypes(t, "x ->\nx", IDENT, THINARROW, IDENT, EOF) +} + +func TestScanner_NewlineEmitted(t *testing.T) { + // Normal statement separator. + requireTypes(t, "a = 1\nb = 2", IDENT, ASSIGN, INT, NL, IDENT, ASSIGN, INT, EOF) + + // Multiple newlines collapse to one. + requireTypes(t, "a = 1\n\n\nb = 2", IDENT, ASSIGN, INT, NL, IDENT, ASSIGN, INT, EOF) +} + +func TestScanner_Positions(t *testing.T) { + tokens := scanAll("ab\ncd") + // ab at 1:1 + if tokens[0].Pos.Line != 1 || tokens[0].Pos.Column != 1 { + t.Fatalf("expected ab at 1:1, got %s", tokens[0].Pos) + } + // NL + // cd at 2:1 + if tokens[2].Pos.Line != 2 || tokens[2].Pos.Column != 1 { + t.Fatalf("expected cd at 2:1, got %s", tokens[2].Pos) + } +} + +func TestScanner_Expression(t *testing.T) { + requireTypes(t, "output.result = input.x + 5", + OUTPUT, DOT, IDENT, ASSIGN, INPUT, DOT, IDENT, PLUS, INT, EOF) +} + +func TestScanner_MethodChain(t *testing.T) { + requireTypes(t, `input.name.uppercase().length()`, + INPUT, DOT, IDENT, DOT, IDENT, LPAREN, RPAREN, DOT, IDENT, LPAREN, RPAREN, EOF) +} + +func TestScanner_MatchExpression(t *testing.T) { + requireTypes(t, `match input.x as v { v > 0 => "pos", _ => "neg" }`, + MATCH, INPUT, DOT, IDENT, AS, IDENT, LBRACE, + IDENT, GT, INT, FATARROW, STRING, COMMA, + UNDERSCORE, FATARROW, STRING, + RBRACE, EOF) +} + +func TestScanner_Lambda(t *testing.T) { + requireTypes(t, "x -> x * 2", + IDENT, THINARROW, IDENT, STAR, INT, EOF) +} + +func TestScanner_MultiParamLambda(t *testing.T) { + requireTypes(t, "(a, b) -> a + b", + LPAREN, IDENT, COMMA, IDENT, RPAREN, THINARROW, IDENT, PLUS, IDENT, EOF) +} + +func TestScanner_QualifiedName(t *testing.T) { + requireTypes(t, "math::double(5)", + IDENT, DCOLON, IDENT, LPAREN, INT, RPAREN, EOF) +} + +func TestScanner_MultilineMethodChain(t *testing.T) { + src := "input.items\n .filter(x -> x > 0)\n .map(x -> x * 2)" + requireTypes(t, src, + INPUT, DOT, IDENT, + // NL suppressed by postfix continuation (.) 
+ DOT, IDENT, LPAREN, IDENT, THINARROW, IDENT, GT, INT, RPAREN, + // NL suppressed by postfix continuation (.) + DOT, MAP, LPAREN, IDENT, THINARROW, IDENT, STAR, INT, RPAREN, + // Note: .map() scans "map" as the MAP keyword. The parser + // handles keywords-as-method-names after dot. + EOF) +} + +func TestScanner_IfElseMultiline(t *testing.T) { + src := "if true { 1 }\nelse { 2 }" + requireTypes(t, src, + IF, TRUE, LBRACE, INT, RBRACE, + // NL suppressed by postfix continuation (else) + ELSE, LBRACE, INT, RBRACE, EOF) +} + +func TestScanner_SurrogateCodepointError(t *testing.T) { + s := newScanner(`"\uD800"`, "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for surrogate, got %s", tok.Type) + } +} + +func TestScanner_NullSafeIndexBracketDepth(t *testing.T) { + // Newlines inside ?[...] should be suppressed by bracket nesting. + requireTypes(t, "arr?[\n0\n]", IDENT, QLBRACKET, INT, RBRACKET, EOF) +} + +func TestScanner_WindowsLineEndings(t *testing.T) { + requireTypes(t, "a = 1\r\nb = 2", IDENT, ASSIGN, INT, NL, IDENT, ASSIGN, INT, EOF) +} + +func TestScanner_UnterminatedString(t *testing.T) { + s := newScanner(`"unterminated`, "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for unterminated string, got %s", tok.Type) + } + if len(s.errors) == 0 { + t.Fatal("expected error for unterminated string") + } +} + +func TestScanner_UnterminatedRawString(t *testing.T) { + s := newScanner("`unterminated", "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for unterminated raw string, got %s", tok.Type) + } +} + +func TestScanner_LoneAmpersand(t *testing.T) { + s := newScanner("&", "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for lone &, got %s", tok.Type) + } +} + +func TestScanner_LonePipe(t *testing.T) { + s := newScanner("|", "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for lone |, got %s", tok.Type) + } +} + +func 
TestScanner_DollarWithoutIdent(t *testing.T) { + s := newScanner("$ ", "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for bare $, got %s", tok.Type) + } +} + +func TestScanner_UnicodeEscapeEdgeCases(t *testing.T) { + // Empty braces. + s := newScanner(`"\u{}"`, "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for empty \\u{}, got %s", tok.Type) + } + + // Too many hex digits. + s = newScanner(`"\u{0000000}"`, "") + tok = s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for 7-digit \\u{}, got %s", tok.Type) + } + + // Out of unicode range. + s = newScanner(`"\u{110000}"`, "") + tok = s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for out-of-range codepoint, got %s", tok.Type) + } +} + +func TestScanner_MultipleErrors(t *testing.T) { + // Two illegal tokens on separate lines — both errors collected. + s := newScanner("&\n|", "") + s.next() // ILLEGAL & + s.next() // NL (or suppressed) + s.next() // ILLEGAL | + if len(s.errors) < 2 { + t.Fatalf("expected at least 2 errors, got %d", len(s.errors)) + } +} + +func TestScanner_RawStringMultiline(t *testing.T) { + // Raw string spanning lines — positions after it should be correct. + tokens := scanAll("`a\nb`\nfoo") + // raw string, NL, foo, EOF + if tokens[0].Type != RAW_STRING { + t.Fatalf("expected RAW_STRING, got %s", tokens[0].Type) + } + if tokens[0].Literal != "a\nb" { + t.Fatalf("expected literal %q, got %q", "a\nb", tokens[0].Literal) + } + // "foo" should be on line 3. 
+ fooTok := tokens[2] + if fooTok.Type != IDENT { + t.Fatalf("expected IDENT, got %s", fooTok.Type) + } + if fooTok.Pos.Line != 3 { + t.Fatalf("expected foo on line 3, got line %d", fooTok.Pos.Line) + } +} + +func TestScanner_FilePosition(t *testing.T) { + s := newScanner("42", "test.blobl") + tok := s.next() + if tok.Pos.File != "test.blobl" { + t.Fatalf("expected file %q, got %q", "test.blobl", tok.Pos.File) + } +} + +func TestScanner_InvalidEscapeCharacter(t *testing.T) { + s := newScanner(`"\x"`, "") + tok := s.next() + if tok.Type != ILLEGAL { + t.Fatalf("expected ILLEGAL for invalid escape, got %s", tok.Type) + } + if len(s.errors) == 0 { + t.Fatal("expected error for invalid escape") + } +} diff --git a/internal/bloblang2/go/pratt/syntax/testdata/fuzz/FuzzParse/2fa037f88dc1a315 b/internal/bloblang2/go/pratt/syntax/testdata/fuzz/FuzzParse/2fa037f88dc1a315 new file mode 100644 index 000000000..6da7dd009 --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/testdata/fuzz/FuzzParse/2fa037f88dc1a315 @@ -0,0 +1,2 @@ +go test fuzz v1 +string("$A0000=match{}") diff --git a/internal/bloblang2/go/pratt/syntax/token.go b/internal/bloblang2/go/pratt/syntax/token.go new file mode 100644 index 000000000..ac00fe220 --- /dev/null +++ b/internal/bloblang2/go/pratt/syntax/token.go @@ -0,0 +1,288 @@ +package syntax + +import "fmt" + +// TokenType represents the type of a lexical token. +type TokenType int + +const ( + // ILLEGAL represents an invalid token. + ILLEGAL TokenType = iota + // EOF signals end of input. + EOF + // NL is a newline statement separator. + NL + + // INT is an integer literal (e.g., 42). + INT + // FLOAT is a float literal (e.g., 3.14). + FLOAT + // STRING is an escape-processed string literal. + STRING + // RAW_STRING is a raw backtick string literal. + RAW_STRING + + // IDENT is a user-defined identifier (excludes keywords and reserved names). + IDENT + // VAR is a variable reference (e.g., $name — literal is "name" without the $). 
+ VAR + + // INPUT is the "input" keyword. + INPUT + // OUTPUT is the "output" keyword. + OUTPUT + // IF is the "if" keyword. + IF + // ELSE is the "else" keyword. + ELSE + // MATCH is the "match" keyword. + MATCH + // AS is the "as" keyword. + AS + // MAP is the "map" keyword. + MAP + // IMPORT is the "import" keyword. + IMPORT + // TRUE is the "true" keyword. + TRUE + // FALSE is the "false" keyword. + FALSE + // NULL is the "null" keyword. + NULL + // UNDERSCORE is the "_" keyword. + UNDERSCORE + + // DELETED is the reserved function name "deleted". + DELETED + // THROW is the reserved function name "throw". + THROW + // VOID is the reserved function name "void". + VOID + + // DOT is the "." operator. + DOT + // QDOT is the "?." null-safe operator. + QDOT + // AT is the "@" metadata accessor. + AT + // DCOLON is the "::" namespace separator. + DCOLON + // ASSIGN is the "=" assignment operator. + ASSIGN + // PLUS is the "+" operator. + PLUS + // MINUS is the "-" operator. + MINUS + // STAR is the "*" operator. + STAR + // SLASH is the "/" operator. + SLASH + // PERCENT is the "%" operator. + PERCENT + // BANG is the "!" operator. + BANG + // GT is the ">" operator. + GT + // GE is the ">=" operator. + GE + // EQ is the "==" operator. + EQ + // NE is the "!=" operator. + NE + // LT is the "<" operator. + LT + // LE is the "<=" operator. + LE + // AND is the "&&" operator. + AND + // OR is the "||" operator. + OR + // FATARROW is the "=>" operator. + FATARROW + // THINARROW is the "->" operator. + THINARROW + + // LPAREN is the "(" delimiter. + LPAREN + // RPAREN is the ")" delimiter. + RPAREN + // LBRACE is the "{" delimiter. + LBRACE + // RBRACE is the "}" delimiter. + RBRACE + // LBRACKET is the "[" delimiter. + LBRACKET + // RBRACKET is the "]" delimiter. + RBRACKET + // QLBRACKET is the "?[" null-safe index delimiter. + QLBRACKET + // COMMA is the "," delimiter. + COMMA + // COLON is the ":" delimiter. 
+ COLON +) + +var tokenNames = map[TokenType]string{ + ILLEGAL: "ILLEGAL", + EOF: "EOF", + NL: "NL", + INT: "INT", + FLOAT: "FLOAT", + STRING: "STRING", + RAW_STRING: "RAW_STRING", + IDENT: "IDENT", + VAR: "VAR", + INPUT: "input", + OUTPUT: "output", + IF: "if", + ELSE: "else", + MATCH: "match", + AS: "as", + MAP: "map", + IMPORT: "import", + TRUE: "true", + FALSE: "false", + NULL: "null", + UNDERSCORE: "_", + DELETED: "deleted", + THROW: "throw", + VOID: "void", + DOT: ".", + QDOT: "?.", + AT: "@", + DCOLON: "::", + ASSIGN: "=", + PLUS: "+", + MINUS: "-", + STAR: "*", + SLASH: "/", + PERCENT: "%", + BANG: "!", + GT: ">", + GE: ">=", + EQ: "==", + NE: "!=", + LT: "<", + LE: "<=", + AND: "&&", + OR: "||", + FATARROW: "=>", + THINARROW: "->", + LPAREN: "(", + RPAREN: ")", + LBRACE: "{", + RBRACE: "}", + LBRACKET: "[", + RBRACKET: "]", + QLBRACKET: "?[", + COMMA: ",", + COLON: ":", +} + +func (t TokenType) String() string { + if s, ok := tokenNames[t]; ok { + return s + } + return fmt.Sprintf("TokenType(%d)", int(t)) +} + +// keywords maps keyword strings to their token types. +var keywords = map[string]TokenType{ + "input": INPUT, + "output": OUTPUT, + "if": IF, + "else": ELSE, + "match": MATCH, + "as": AS, + "map": MAP, + "import": IMPORT, + "true": TRUE, + "false": FALSE, + "null": NULL, + "_": UNDERSCORE, +} + +// reservedNames maps reserved function names to their token types. +var reservedNames = map[string]TokenType{ + "deleted": DELETED, + "throw": THROW, + "void": VOID, +} + +// LookupIdent returns the token type for a word: keyword, reserved name, +// or IDENT for user-defined identifiers. +func LookupIdent(word string) TokenType { + if tok, ok := keywords[word]; ok { + return tok + } + if tok, ok := reservedNames[word]; ok { + return tok + } + return IDENT +} + +// IsKeyword reports whether the token type is a keyword. 
+func (t TokenType) IsKeyword() bool { + _, ok := tokenNames[t] + return ok && t >= INPUT && t <= UNDERSCORE +} + +// Pos represents a source position. +type Pos struct { + File string // filename (empty for the main mapping) + Line int // 1-based line number + Column int // 1-based column number (byte offset in line) +} + +func (p Pos) String() string { + if p.File != "" { + return fmt.Sprintf("%s:%d:%d", p.File, p.Line, p.Column) + } + return fmt.Sprintf("%d:%d", p.Line, p.Column) +} + +// Token represents a single lexical token with its position and literal value. +type Token struct { + Type TokenType + Literal string // the literal text of the token + Pos Pos +} + +func (t Token) String() string { + if t.Literal != "" { + return fmt.Sprintf("%s(%q) at %s", t.Type, t.Literal, t.Pos) + } + return fmt.Sprintf("%s at %s", t.Type, t.Pos) +} + +// suppressesFollowingNL reports whether this token type suppresses a +// following newline (mechanism 3: operator continuation). These are +// tokens that cannot be the final token of a complete expression — +// the spec lists them explicitly. +func suppressesFollowingNL(t TokenType) bool { + switch t { + case PLUS, MINUS, STAR, SLASH, PERCENT, // binary/unary operators + EQ, NE, GT, GE, LT, LE, + AND, OR, + BANG, // unary not + ASSIGN, // = + FATARROW, // => + THINARROW, // -> + COLON: // : + return true + default: + return false + } +} + +// isPostfixContinuation reports whether this token type triggers +// newline suppression when it appears at the start of the next line +// (mechanism 2: postfix continuation). 
+func isPostfixContinuation(t TokenType) bool { + switch t { + case DOT, QDOT, LBRACKET, QLBRACKET, ELSE: + return true + default: + return false + } +} From 288e50763e180b3955aebbce684ef02472204946 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 28 Apr 2026 10:47:57 +0100 Subject: [PATCH 03/20] bloblang(v2): Add Go tree-walking interpreter and standard library Adds the runtime half of the Go implementation under internal/bloblang2/go/pratt/eval/: a tree-walking interpreter with a small opcode dispatch path for methods and functions, a variable stack with slot allocation handled by the resolver, message-context support, and the V2 standard library covering arithmetic, collections, string handling, lambdas, strftime, and message-coupled operations. Tested against the eval-side unit suite (interp, stdlib, strftime, argument folding). Spec conformance is exercised separately once the spectest runner and corpus land. --- .../bloblang2/go/pratt/eval/argfold_test.go | 156 ++ .../bloblang2/go/pratt/eval/arithmetic.go | 919 ++++++++ internal/bloblang2/go/pratt/eval/clone.go | 34 + internal/bloblang2/go/pratt/eval/interp.go | 2019 +++++++++++++++++ .../bloblang2/go/pratt/eval/interp_test.go | 615 +++++ .../bloblang2/go/pratt/eval/messagecontext.go | 58 + internal/bloblang2/go/pratt/eval/opcodes.go | 80 + internal/bloblang2/go/pratt/eval/scope.go | 54 + internal/bloblang2/go/pratt/eval/stdlib.go | 1894 ++++++++++++++++ .../bloblang2/go/pratt/eval/stdlib_lambda.go | 872 +++++++ .../bloblang2/go/pratt/eval/stdlib_message.go | 59 + internal/bloblang2/go/pratt/eval/strftime.go | 194 ++ .../bloblang2/go/pratt/eval/strftime_test.go | 92 + internal/bloblang2/go/pratt/eval/value.go | 56 + 14 files changed, 7102 insertions(+) create mode 100644 internal/bloblang2/go/pratt/eval/argfold_test.go create mode 100644 internal/bloblang2/go/pratt/eval/arithmetic.go create mode 100644 internal/bloblang2/go/pratt/eval/clone.go create mode 100644 
internal/bloblang2/go/pratt/eval/interp.go create mode 100644 internal/bloblang2/go/pratt/eval/interp_test.go create mode 100644 internal/bloblang2/go/pratt/eval/messagecontext.go create mode 100644 internal/bloblang2/go/pratt/eval/opcodes.go create mode 100644 internal/bloblang2/go/pratt/eval/scope.go create mode 100644 internal/bloblang2/go/pratt/eval/stdlib.go create mode 100644 internal/bloblang2/go/pratt/eval/stdlib_lambda.go create mode 100644 internal/bloblang2/go/pratt/eval/stdlib_message.go create mode 100644 internal/bloblang2/go/pratt/eval/strftime.go create mode 100644 internal/bloblang2/go/pratt/eval/strftime_test.go create mode 100644 internal/bloblang2/go/pratt/eval/value.go diff --git a/internal/bloblang2/go/pratt/eval/argfold_test.go b/internal/bloblang2/go/pratt/eval/argfold_test.go new file mode 100644 index 000000000..c2c7993e6 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/argfold_test.go @@ -0,0 +1,156 @@ +package eval_test + +import ( + "regexp" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// TestArgFolderSubstitutesValue is the end-to-end regression for the +// parse-time fold mechanism. Compiles a mapping with a literal regex +// pattern, inspects the resolved AST to confirm the translator stashed +// a *regexp.Regexp on the argument, then executes the mapping and +// checks the result — the runtime must have used the precompiled +// value rather than compiling again. 
+func TestArgFolderSubstitutesValue(t *testing.T) { + methods, functions := eval.StdlibNames() + methodOpcodes, functionOpcodes := eval.StdlibOpcodes() + + prog, errs := syntax.Parse(`output = input.re_match("[0-9]+")`, "", nil) + if len(errs) > 0 { + t.Fatalf("parse: %v", errs) + } + syntax.Optimize(prog) + if rerrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, + Functions: functions, + MethodOpcodes: methodOpcodes, + FunctionOpcodes: functionOpcodes, + }); len(rerrs) > 0 { + t.Fatalf("resolve: %v", rerrs) + } + + // Walk the AST: the output = .re_match(...) assignment should + // carry a *regexp.Regexp on the method seg's first argument. + found := false + var walk func(any) + walk = func(n any) { + if found { + return + } + switch v := n.(type) { + case *syntax.PathExpr: + for _, seg := range v.Segments { + if seg.Kind == syntax.PathSegMethod && seg.Name == "re_match" { + if len(seg.Args) == 0 { + continue + } + if _, ok := seg.Args[0].Folded.(*regexp.Regexp); ok { + found = true + return + } + } + } + case *syntax.Assignment: + walk(v.Value) + case *syntax.MethodCallExpr: + walk(v.Receiver) + } + } + for _, s := range prog.Stmts { + walk(s) + } + if !found { + t.Fatalf("expected re_match's literal pattern to be folded into *regexp.Regexp, but CallArg.Folded was nil") + } + + interp := eval.NewWithStdlib(prog) + out, _, _, err := interp.Run("abc123", nil) + if err != nil { + t.Fatalf("run: %v", err) + } + if b, ok := out.(bool); !ok || !b { + t.Fatalf("expected true, got %v (%T)", out, out) + } +} + +// TestArgFolderRejectsInvalidLiteralAtParseTime confirms that a folder +// returning an error surfaces as a resolver diagnostic anchored at the +// call site, not a runtime error on first call. 
+func TestArgFolderRejectsInvalidLiteralAtParseTime(t *testing.T) { + methods, functions := eval.StdlibNames() + methodOpcodes, functionOpcodes := eval.StdlibOpcodes() + + prog, errs := syntax.Parse(`output = input.re_match("[unclosed")`, "", nil) + if len(errs) > 0 { + t.Fatalf("parse: %v", errs) + } + syntax.Optimize(prog) + rerrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, + Functions: functions, + MethodOpcodes: methodOpcodes, + FunctionOpcodes: functionOpcodes, + }) + if len(rerrs) == 0 { + t.Fatal("expected a resolver diagnostic for the invalid regex, got none") + } + // Should mention the method name so users can find the offending call. + found := false + for _, e := range rerrs { + if containsAny(e.Msg, "re_match", "invalid regex") { + found = true + break + } + } + if !found { + t.Fatalf("resolver diagnostic did not name the method/error: %v", rerrs) + } +} + +// TestArgFolderLeavesDynamicArgsAlone confirms a non-literal pattern +// (e.g. from a $variable) skips folding and the runtime compiles the +// pattern normally. +func TestArgFolderLeavesDynamicArgsAlone(t *testing.T) { + methods, functions := eval.StdlibNames() + methodOpcodes, functionOpcodes := eval.StdlibOpcodes() + + src := `$pat = "[0-9]+" +output = input.re_match($pat)` + prog, errs := syntax.Parse(src, "", nil) + if len(errs) > 0 { + t.Fatalf("parse: %v", errs) + } + syntax.Optimize(prog) + if rerrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, + Functions: functions, + MethodOpcodes: methodOpcodes, + FunctionOpcodes: functionOpcodes, + }); len(rerrs) > 0 { + t.Fatalf("resolve: %v", rerrs) + } + interp := eval.NewWithStdlib(prog) + out, _, _, err := interp.Run("abc123", nil) + if err != nil { + t.Fatalf("run: %v", err) + } + if b, ok := out.(bool); !ok || !b { + t.Fatalf("expected true, got %v", out) + } +} + +// containsAny is a tiny helper; strings.Contains loop inlined to avoid +// importing the strings package just for the test. 
+func containsAny(haystack string, needles ...string) bool { + for _, n := range needles { + for i := 0; i+len(n) <= len(haystack); i++ { + if haystack[i:i+len(n)] == n { + return true + } + } + } + return false +} diff --git a/internal/bloblang2/go/pratt/eval/arithmetic.go b/internal/bloblang2/go/pratt/eval/arithmetic.go new file mode 100644 index 000000000..804831080 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/arithmetic.go @@ -0,0 +1,919 @@ +package eval + +import ( + "fmt" + "math" + "time" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +func (interp *Interpreter) evalBinaryOp(op syntax.TokenType, left, right any) any { + // Timestamp subtraction: ts - ts = int64 nanoseconds. + if op == syntax.MINUS { + if lt, ok := left.(time.Time); ok { + if rt, ok := right.(time.Time); ok { + // Check for int64 nanosecond overflow before computing. + // math.MaxInt64 ns ≈ 9223372036 seconds ≈ 292 years. + const maxNanoSeconds = int64(math.MaxInt64 / 1_000_000_000) + diffSec := lt.Unix() - rt.Unix() + if diffSec > maxNanoSeconds || diffSec < -maxNanoSeconds { + return NewError("timestamp subtraction overflow: difference exceeds int64 nanosecond range (~292 years)") + } + return lt.Sub(rt).Nanoseconds() + } + return NewError("cannot subtract timestamp and " + typeName(right)) + } + } + // Timestamp operations. + if lt, ok := left.(time.Time); ok { + if rt, ok := right.(time.Time); ok { + switch op { + case syntax.GT: + return lt.After(rt) + case syntax.GE: + return !lt.Before(rt) + case syntax.LT: + return lt.Before(rt) + case syntax.LE: + return !lt.After(rt) + case syntax.EQ: + return lt.Equal(rt) + case syntax.NE: + return !lt.Equal(rt) + default: + // ts + ts, ts * ts, etc. + return NewError("cannot " + opVerb(op.String()) + " timestamp and timestamp") + } + } + if op == syntax.EQ || op == syntax.NE { + return op == syntax.NE // cross-family: always false/true + } + // ts + number, ts * number, ts > number, etc. 
— all errors except ts - ts (handled above). + return NewError("cannot " + opVerb(op.String()) + " timestamp and " + typeName(right)) + } + if _, ok := right.(time.Time); ok { + if op == syntax.EQ || op == syntax.NE { + return op == syntax.NE // cross-family + } + return NewError("cannot " + opVerb(op.String()) + " " + typeName(left) + " and timestamp") + } + + switch op { + case syntax.PLUS: + return evalAdd(left, right) + case syntax.MINUS: + return evalArith(left, right, "-") + case syntax.STAR: + return evalArith(left, right, "*") + case syntax.SLASH: + return evalDivide(left, right) + case syntax.PERCENT: + return evalModulo(left, right) + case syntax.EQ: + return evalEquality(left, right, false) + case syntax.NE: + return evalEquality(left, right, true) + case syntax.GT: + return evalCompare(left, right, ">") + case syntax.GE: + return evalCompare(left, right, ">=") + case syntax.LT: + return evalCompare(left, right, "<") + case syntax.LE: + return evalCompare(left, right, "<=") + default: + return NewError(fmt.Sprintf("unknown binary operator %s", op)) + } +} + +// valuesEqual implements Bloblang equality semantics. +func valuesEqual(a, b any) bool { + if a == nil && b == nil { + return true + } + if a == nil || b == nil { + return false + } + + // Numeric equality with promotion (non-error path for internal use). + if isNumeric(a) && isNumeric(b) { + r := numericEqualChecked(a, b, false) + if IsError(r) { + return false // promotion failed — treat as unequal for internal callers + } + return r.(bool) + } + + // Timestamp equality. + if at, ok := a.(time.Time); ok { + if bt, ok := b.(time.Time); ok { + return at.Equal(bt) + } + return false // cross-family + } + + // Bytes equality. + if ab, ok := a.([]byte); ok { + if bb, ok := b.([]byte); ok { + if len(ab) != len(bb) { + return false + } + for i := range ab { + if ab[i] != bb[i] { + return false + } + } + return true + } + return false // cross-family + } + + // Same type required for non-numeric. 
+ switch av := a.(type) { + case string: + bv, ok := b.(string) + return ok && av == bv + case bool: + bv, ok := b.(bool) + return ok && av == bv + case []any: + bv, ok := b.([]any) + if !ok || len(av) != len(bv) { + return false + } + for i := range av { + if !valuesEqual(av[i], bv[i]) { + return false + } + } + return true + case map[string]any: + bv, ok := b.(map[string]any) + if !ok || len(av) != len(bv) { + return false + } + for k, v := range av { + bval, exists := bv[k] + if !exists || !valuesEqual(v, bval) { + return false + } + } + return true + default: + // Cross-family: always false. + return false + } +} + +func isNumeric(v any) bool { + switch v.(type) { + case int32, int64, uint32, uint64, float32, float64: + return true + default: + return false + } +} + +// evalEquality returns a bool (or an error if numeric promotion fails). +// When negate is true the sense is inverted (!=). +func evalEquality(a, b any, negate bool) any { + if a == nil && b == nil { + return !negate + } + if a == nil || b == nil { + return negate + } + + // Numeric equality with checked promotion. + if isNumeric(a) && isNumeric(b) { + return numericEqualChecked(a, b, negate) + } + + eq := valuesEqual(a, b) + if negate { + return !eq + } + return eq +} + +// numericEqualChecked compares two numeric values, returning an error if +// checked promotion fails (matching the behaviour of comparison operators). +func numericEqualChecked(a, b any, negate bool) any { + // Same type: compare directly (no promotion, no precision loss). 
+ switch av := a.(type) { + case int64: + if bv, ok := b.(int64); ok { + r := av == bv + if negate { + return !r + } + return r + } + case int32: + if bv, ok := b.(int32); ok { + r := av == bv + if negate { + return !r + } + return r + } + case uint32: + if bv, ok := b.(uint32); ok { + r := av == bv + if negate { + return !r + } + return r + } + case uint64: + if bv, ok := b.(uint64); ok { + r := av == bv + if negate { + return !r + } + return r + } + case float64: + if bv, ok := b.(float64); ok { + if math.IsNaN(av) || math.IsNaN(bv) { + return negate // NaN != NaN is true, NaN == NaN is false + } + r := av == bv + if negate { + return !r + } + return r + } + case float32: + if bv, ok := b.(float32); ok { + if math.IsNaN(float64(av)) || math.IsNaN(float64(bv)) { + return negate + } + r := av == bv + if negate { + return !r + } + return r + } + } + + // Different numeric types: use checked promotion to a common type. + pl, pr, kind, promErr := promoteChecked(a, b) + if promErr != "" { + return NewError(promErr) + } + + var eq bool + switch kind { + case promoteInt64: + eq = pl.(int64) == pr.(int64) + case promoteInt32: + eq = pl.(int32) == pr.(int32) + case promoteUint32: + eq = pl.(uint32) == pr.(uint32) + case promoteUint64: + eq = pl.(uint64) == pr.(uint64) + case promoteFloat64: + af, bf := pl.(float64), pr.(float64) + if math.IsNaN(af) || math.IsNaN(bf) { + return negate + } + eq = af == bf + case promoteFloat32: + af, bf := pl.(float32), pr.(float32) + if math.IsNaN(float64(af)) || math.IsNaN(float64(bf)) { + return negate + } + eq = af == bf + default: + return negate // different families: EQ=false, NE=true + } + if negate { + return !eq + } + return eq +} + +func toFloat64(v any) float64 { + switch n := v.(type) { + case int64: + return float64(n) + case int32: + return float64(n) + case uint32: + return float64(n) + case uint64: + return float64(n) + case float64: + return n + case float32: + return float64(n) + default: + return math.NaN() + } +} + +func 
evalAdd(left, right any) any { + // String concatenation. + if ls, ok := left.(string); ok { + rs, ok := right.(string) + if !ok { + return NewError("cannot add string and " + typeName(right) + ": not numeric") + } + return ls + rs + } + if _, ok := right.(string); ok { + return NewError("cannot add " + typeName(left) + " and string: not numeric") + } + // Bytes concatenation. + if lb, ok := left.([]byte); ok { + rb, ok := right.([]byte) + if !ok { + return NewError("cannot add bytes and " + typeName(right)) + } + result := make([]byte, len(lb)+len(rb)) + copy(result, lb) + copy(result[len(lb):], rb) + return result + } + // Numeric addition. + return evalArith(left, right, "+") +} + +func evalArith(left, right any, op string) any { + if !isNumeric(left) || !isNumeric(right) { + return arithError(left, right, op) + } + + // Promote to common type. + pl, pr, kind, promErr := promoteChecked(left, right) + if promErr != "" { + return NewError(promErr) + } + _ = kind + + switch kind { + case promoteInt64: + a, b := pl.(int64), pr.(int64) + return checkedInt64Arith(a, b, op) + case promoteInt32: + a, b := pl.(int32), pr.(int32) + return checkedInt32Arith(a, b, op) + case promoteUint32: + a, b := pl.(uint32), pr.(uint32) + return checkedUint32Arith(a, b, op) + case promoteUint64: + a, b := pl.(uint64), pr.(uint64) + return checkedUint64Arith(a, b, op) + case promoteFloat64: + a, b := pl.(float64), pr.(float64) + return floatArith(a, b, op) + case promoteFloat32: + a, b := pl.(float32), pr.(float32) + return float32Arith(a, b, op) + default: + return NewError("unexpected promotion result") + } +} + +func evalDivide(left, right any) any { + if !isNumeric(left) || !isNumeric(right) { + return NewError(fmt.Sprintf("cannot divide %T by %T", left, right)) + } + + // Division always produces float. + // float32 / float32 → float32, all else → float64. 
+ _, isLF32 := left.(float32) + _, isRF32 := right.(float32) + if isLF32 && isRF32 { + a, b := left.(float32), right.(float32) + if b == 0 { + return NewError("division by zero") + } + return a / b + } + + af, aOk := checkedToFloat64(left) + bf, bOk := checkedToFloat64(right) + if !aOk || !bOk { + return NewError("integer exceeds float64 exact range (magnitude > 2^53)") + } + if bf == 0 { + return NewError("division by zero") + } + return af / bf +} + +func evalModulo(left, right any) any { + if !isNumeric(left) || !isNumeric(right) { + return NewError(fmt.Sprintf("cannot modulo %T by %T", left, right)) + } + + pl, pr, kind, promErr := promoteChecked(left, right) + if promErr != "" { + return NewError(promErr) + } + + switch kind { + case promoteInt64: + a, b := pl.(int64), pr.(int64) + if b == 0 { + return NewError("modulo by zero") + } + if a == math.MinInt64 && b == -1 { + return NewError("int64 overflow") + } + return a % b + case promoteInt32: + a, b := pl.(int32), pr.(int32) + if b == 0 { + return NewError("modulo by zero") + } + if a == math.MinInt32 && b == -1 { + return NewError("int32 overflow") + } + return a % b + case promoteUint32: + a, b := pl.(uint32), pr.(uint32) + if b == 0 { + return NewError("modulo by zero") + } + return a % b + case promoteUint64: + a, b := pl.(uint64), pr.(uint64) + if b == 0 { + return NewError("modulo by zero") + } + return a % b + case promoteFloat64: + a, b := pl.(float64), pr.(float64) + if b == 0 { + return NewError("modulo by zero") + } + return math.Mod(a, b) + case promoteFloat32: + a, b := pl.(float32), pr.(float32) + if b == 0 { + return NewError("modulo by zero") + } + return float32(math.Mod(float64(a), float64(b))) + default: + return NewError("unexpected promotion result") + } +} + +func evalCompare(left, right any, op string) any { + // Note: timestamp comparisons are handled in evalBinaryOp before + // this function is called (the early-return block at the top). 
+ + if !isNumeric(left) && !isNumeric(right) { + // String comparison. + if ls, ok := left.(string); ok { + rs, ok := right.(string) + if !ok { + return NewError(fmt.Sprintf("cannot compare string and %s: not comparable", typeName(right))) + } + return stringCompare(ls, rs, op) + } + // Bytes comparison (lexicographic). + if lb, ok := left.([]byte); ok { + rb, ok := right.([]byte) + if !ok { + return NewError(fmt.Sprintf("cannot compare bytes and %s: not comparable", typeName(right))) + } + return bytesCompare(lb, rb, op) + } + return NewError(fmt.Sprintf("cannot compare %s and %s: not comparable types", typeName(left), typeName(right))) + } + if !isNumeric(left) || !isNumeric(right) { + return NewError(fmt.Sprintf("cannot compare %s and %s: not comparable types", typeName(left), typeName(right))) + } + + // Promote using checked rules (same as arithmetic). + pl, pr, kind, promErr := promoteChecked(left, right) + if promErr != "" { + return NewError(promErr) + } + + switch kind { + case promoteInt64: + a, b := pl.(int64), pr.(int64) + return compareOrdered(a, b, op) + case promoteInt32: + a, b := pl.(int32), pr.(int32) + return compareOrdered(int64(a), int64(b), op) + case promoteUint32: + a, b := pl.(uint32), pr.(uint32) + return compareOrdered(uint64(a), uint64(b), op) + case promoteUint64: + a, b := pl.(uint64), pr.(uint64) + return compareOrdered(a, b, op) + case promoteFloat64: + a, b := pl.(float64), pr.(float64) + return compareFloat(a, b, op) + case promoteFloat32: + a, b := pl.(float32), pr.(float32) + return compareFloat(float64(a), float64(b), op) + default: + return NewError("unexpected promotion result") + } +} + +type ordered interface { + ~int64 | ~uint64 +} + +func compareOrdered[T ordered](a, b T, op string) any { + switch op { + case ">": + return a > b + case ">=": + return a >= b + case "<": + return a < b + case "<=": + return a <= b + default: + return false + } +} + +func compareFloat(a, b float64, op string) any { + switch op { + case ">": + 
return a > b + case ">=": + return a >= b + case "<": + return a < b + case "<=": + return a <= b + default: + return false + } +} + +func bytesCompare(a, b []byte, op string) any { + cmp := 0 + for i := 0; i < len(a) && i < len(b); i++ { + if a[i] < b[i] { + cmp = -1 + break + } + if a[i] > b[i] { + cmp = 1 + break + } + } + if cmp == 0 { + if len(a) < len(b) { + cmp = -1 + } else if len(a) > len(b) { + cmp = 1 + } + } + switch op { + case ">": + return cmp > 0 + case ">=": + return cmp >= 0 + case "<": + return cmp < 0 + case "<=": + return cmp <= 0 + default: + return false + } +} + +func opVerb(op string) string { + switch op { + case "+": + return "add" + case "-": + return "subtract" + case "*": + return "multiply" + case ">", ">=", "<", "<=": + return "compare" + default: + return "perform arithmetic on" + } +} + +func arithError(left, right any, op string) any { + return NewError(fmt.Sprintf("cannot %s %s and %s: arithmetic requires numeric types", + opVerb(op), typeName(left), typeName(right))) +} + +func typeName(v any) string { + if v == nil { + return "null" + } + switch v.(type) { + case string: + return "string" + case bool: + return "bool" + case int64: + return "int64" + case int32: + return "int32" + case uint32: + return "uint32" + case uint64: + return "uint64" + case float64: + return "float64" + case float32: + return "float32" + case []any: + return "array" + case map[string]any: + return "object" + case []byte: + return "bytes" + default: + return fmt.Sprintf("%T", v) + } +} + +func stringCompare(a, b, op string) any { + switch op { + case ">": + return a > b + case ">=": + return a >= b + case "<": + return a < b + case "<=": + return a <= b + default: + return false + } +} + +// ----------------------------------------------------------------------- +// Numeric promotion +// ----------------------------------------------------------------------- + +type promoteKind int + +const ( + promoteError promoteKind = iota + promoteInt32 + 
promoteInt64 + promoteUint32 + promoteUint64 + promoteFloat32 + promoteFloat64 +) + +// promoteChecked promotes two values and returns a specific error message on failure. +func promoteChecked(a, b any) (any, any, promoteKind, string) { + pa, pb, kind := promote(a, b) + if kind == promoteError { + // Determine specific error. + ak, bk := numericKind(a), numericKind(b) + if (ak == promoteUint64 || bk == promoteUint64) && !isFloatKind(ak) && !isFloatKind(bk) { + return nil, nil, promoteError, "uint64 value exceeds int64 range" + } + return nil, nil, promoteError, "integer exceeds float64 exact range (magnitude > 2^53)" + } + return pa, pb, kind, "" +} + +func promote(a, b any) (any, any, promoteKind) { + ak, bk := numericKind(a), numericKind(b) + + if ak == bk { + return a, b, ak + } + + // Same signedness, different width: widen. + // uint32 + uint64 → uint64. + if (ak == promoteUint32 && bk == promoteUint64) || (ak == promoteUint64 && bk == promoteUint32) { + return toU64(a), toU64(b), promoteUint64 + } + + // Any float involved → float64 (except float32+float32 which stays float32, + // but that case is handled by ak == bk above). + if isFloatKind(ak) || isFloatKind(bk) { + af, aOk := checkedToFloat64(a) + bf, bOk := checkedToFloat64(b) + if !aOk || !bOk { + return nil, nil, promoteError + } + return af, bf, promoteFloat64 + } + + // Both integers: widen to int64. Check uint64 overflow. + ai := toI64(a) + bi := toI64(b) + if ai == nil || bi == nil { + return nil, nil, promoteError + } + return ai, bi, promoteInt64 +} + +func toU64(v any) any { + switch n := v.(type) { + case uint32: + return uint64(n) + case uint64: + return n + default: + return nil + } +} + +func isFloatKind(k promoteKind) bool { + return k == promoteFloat32 || k == promoteFloat64 +} + +// checkedToFloat64 converts a numeric value to float64, returning false +// if the value is an integer with magnitude > 2^53 (can't be represented exactly). 
+func checkedToFloat64(v any) (float64, bool) { + const maxSafeInt = 1 << 53 // 9007199254740992 + switch n := v.(type) { + case int64: + if n > maxSafeInt || n < -maxSafeInt { + return 0, false + } + return float64(n), true + case int32: + return float64(n), true + case uint32: + return float64(n), true + case uint64: + if n > maxSafeInt { + return 0, false + } + return float64(n), true + case float64: + return n, true + case float32: + return float64(n), true + default: + return 0, false + } +} + +func numericKind(v any) promoteKind { + switch v.(type) { + case int32: + return promoteInt32 + case int64: + return promoteInt64 + case uint32: + return promoteUint32 + case uint64: + return promoteUint64 + case float32: + return promoteFloat32 + case float64: + return promoteFloat64 + default: + return promoteError + } +} + +func toI64(v any) any { + switch n := v.(type) { + case int32: + return int64(n) + case int64: + return n + case uint32: + return int64(n) + case uint64: + if n > math.MaxInt64 { + return nil // caller should check + } + return int64(n) + default: + return nil + } +} + +// ----------------------------------------------------------------------- +// Checked integer arithmetic +// ----------------------------------------------------------------------- + +func checkedInt64Arith(a, b int64, op string) any { + switch op { + case "+": + if (b > 0 && a > math.MaxInt64-b) || (b < 0 && a < math.MinInt64-b) { + return NewError("int64 overflow") + } + return a + b + case "-": + if (b < 0 && a > math.MaxInt64+b) || (b > 0 && a < math.MinInt64+b) { + return NewError("int64 overflow") + } + return a - b + case "*": + if a != 0 && b != 0 { + result := a * b + if result/a != b || (a == -1 && b == math.MinInt64) { // MinInt64 / -1 wraps back to MinInt64 in Go, so the division check alone misses -1 * MinInt64 + return NewError("int64 overflow") + } + return result + } + return a * b + default: + return NewError("unsupported int64 operation " + op) + } +} + +func checkedInt32Arith(a, b int32, op string) any { + // Promote to int64, check, then narrow.
+ result := checkedInt64Arith(int64(a), int64(b), op) + if IsError(result) { + return result + } + r := result.(int64) + if r > math.MaxInt32 || r < math.MinInt32 { + return NewError("int32 overflow") + } + return int32(r) +} + +func checkedUint32Arith(a, b uint32, op string) any { + switch op { + case "+": + if a > math.MaxUint32-b { + return NewError("uint32 overflow") + } + return a + b + case "-": + if a < b { + return NewError("uint32 overflow") + } + return a - b + case "*": + if a != 0 && b != 0 { + result := a * b + if result/a != b { + return NewError("uint32 overflow") + } + return result + } + return a * b + default: + return NewError("unsupported uint32 operation " + op) + } +} + +func checkedUint64Arith(a, b uint64, op string) any { + switch op { + case "+": + if a > math.MaxUint64-b { + return NewError("uint64 overflow") + } + return a + b + case "-": + if a < b { + return NewError("uint64 overflow") + } + return a - b + case "*": + if a != 0 && b != 0 { + result := a * b + if result/a != b { + return NewError("uint64 overflow") + } + return result + } + return a * b + default: + return NewError("unsupported uint64 operation " + op) + } +} + +func floatArith(a, b float64, op string) any { + switch op { + case "+": + return a + b + case "-": + return a - b + case "*": + return a * b + default: + return NewError("unsupported float64 operation " + op) + } +} + +func float32Arith(a, b float32, op string) any { + switch op { + case "+": + return a + b + case "-": + return a - b + case "*": + return a * b + default: + return NewError("unsupported float32 operation " + op) + } +} diff --git a/internal/bloblang2/go/pratt/eval/clone.go b/internal/bloblang2/go/pratt/eval/clone.go new file mode 100644 index 000000000..b35d4dfe4 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/clone.go @@ -0,0 +1,34 @@ +package eval + +import "time" + +// DeepClone creates a deep copy of a value. 
Simple types (numbers, +// strings, bools, time.Time) are returned as-is since Go copies them +// by value. Maps, slices, and byte slices are recursively cloned. +func DeepClone(v any) any { + switch val := v.(type) { + case map[string]any: + out := make(map[string]any, len(val)) + for k, v := range val { + out[k] = DeepClone(v) + } + return out + case []any: + out := make([]any, len(val)) + for i, v := range val { + out[i] = DeepClone(v) + } + return out + case []byte: + out := make([]byte, len(val)) + copy(out, val) + return out + default: + // string, int32, int64, uint32, uint64, float32, float64, + // bool, nil, time.Time — all copied by value. + return v + } +} + +// Ensure time import is referenced. +var _ time.Time diff --git a/internal/bloblang2/go/pratt/eval/interp.go b/internal/bloblang2/go/pratt/eval/interp.go new file mode 100644 index 000000000..2ee7d57a2 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/interp.go @@ -0,0 +1,2019 @@ +package eval + +import ( + "errors" + "fmt" + "math" + "strconv" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// maxRecursionDepth is the maximum allowed recursion depth for map calls. +const maxRecursionDepth = 10000 + +// Interpreter executes a parsed Bloblang V2 program. +type Interpreter struct { + prog *syntax.Program + + // Runtime state. + input any + inputMeta map[string]any + output any + outputMeta map[string]any + deleted bool + + // Map table: local maps + namespaced imports. + maps map[string]*syntax.MapDecl + namespaces map[string]map[string]*syntax.MapDecl + + scope *scope + depth int // recursion depth + + // Variable stack — replaces scope chain for variable access when + // resolver-assigned slot indices are available on AST nodes. + // Access pattern: stack[frameBase + slotIndex]. + // frameBase is 0 for top-level, advanced per map call. + stack []any + frameBase int + stackTop int // next free stack region for frame allocation + + // Methods and functions. 
Static maps are shared across all interpreters + // (built once at init time). Lambda methods need a per-interpreter map + // because they close over the interpreter for callLambda dispatch. + staticMethods map[string]MethodSpec + staticFunctions map[string]FunctionSpec + lambdaMethods map[string]MethodSpec + + // messageCtx is bound for the duration of a single RunWithMessage + // call. Message-coupled stdlib functions (batch_index, content, + // error, ...) read from it; calls from the plain Run path see a + // nil context and return a runtime error. + messageCtx MessageContext + + // lambdaTable is the opcode-indexed dispatch table for lambda methods. + // Indexed by (opcode - lambdaOpcodeBase). + lambdaTable []MethodSpec +} + +// MethodFunc is a stdlib method implementation. +// Receiver is the value the method is called on. +type MethodFunc func(receiver any, args []any) any + +// MethodParam describes a method parameter for named argument support. +type MethodParam struct { + Name string + Default any // default value (nil means required) + HasDefault bool + AcceptsLambda bool // this position accepts a lambda argument +} + +// MethodSpec bundles a method implementation with its behavioral metadata. +// Metadata is colocated with the method definition so the interpreter dispatch +// does not need hardcoded name lists. 
+type MethodSpec struct { + Fn MethodFunc // regular method (mutually exclusive with LambdaFn / PluginFn) + LambdaFn lambdaMethodFunc // lambda method (mutually exclusive with Fn / PluginFn) + PluginFn pluginMethodFunc // plugin dispatch with interp + AST args (mutually exclusive with Fn / LambdaFn) + Intrinsic bool // marks catch/or — dispatch handled inline, registered for name resolution only + Params []MethodParam // nil for methods with no named-arg support + AcceptsNull bool // receiver can be nil (e.g., type, string, not_null) + AcceptsLambda bool // method accepts a lambda argument (implicit for LambdaFn / PluginFn methods) + // ArgFolder, if set, is surfaced on the MethodInfo so the resolver + // runs parse-time folding on literal arguments (e.g. precompiling + // regex patterns). See syntax.ArgFolder. + ArgFolder syntax.ArgFolder + // CallFolder, if set, is surfaced on the MethodInfo so the resolver + // can precompute a parse-time dispatch target. When the folder returns + // a non-nil PreboundMethod, the interpreter uses it directly and skips + // spec.Fn. See syntax.CallFolder and PreboundMethod. + CallFolder syntax.CallFolder +} + +// lambdaMethodFunc is a method that receives unevaluated AST args (for lambda/map-ref arguments). +type lambdaMethodFunc func(receiver any, args []syntax.CallArg) any + +// pluginMethodFunc is the dispatch shape used by the public plugin surface +// (public/bloblangv2). The interpreter is forwarded so plugins can extract +// lambda arguments via interp.CallLambda; non-lambda arguments stay as AST +// nodes until the plugin asks for them. +type pluginMethodFunc func(interp *Interpreter, receiver any, args []syntax.CallArg) any + +// FunctionFunc is a stdlib function implementation. +type FunctionFunc func(args []any) any + +// FunctionSpec bundles a function implementation with its behavioral metadata. 
+type FunctionSpec struct { + Fn FunctionFunc + PluginFn pluginFunctionFunc // plugin dispatch with interp + AST args (mutually exclusive with Fn / MessageFn) + MessageFn MessageFunctionFunc // message-coupled dispatch (mutually exclusive with Fn / PluginFn) + // RequiresMessageContext is set automatically when MessageFn is + // provided. The flag is surfaced on FunctionInfo for tooling + // (schema dumps, linters) and drives the dispatch-time check that + // the interpreter has a MessageContext bound. Folding bypass is + // implicit: MessageFn functions have no Fn for the resolver to + // fold against. + RequiresMessageContext bool + Params []FunctionParam // for compile-time arity checking + // ArgFolder, if set, is surfaced on the FunctionInfo so the resolver + // runs parse-time folding on literal arguments. See syntax.ArgFolder. + ArgFolder syntax.ArgFolder + // CallFolder, if set, is surfaced on the FunctionInfo so the resolver + // can precompute a parse-time dispatch target. See PreboundFunction. + CallFolder syntax.CallFolder +} + +// pluginFunctionFunc is the function analogue of pluginMethodFunc. +type pluginFunctionFunc func(interp *Interpreter, args []syntax.CallArg) any + +// PreboundMethod is the shape the interpreter expects to find in +// syntax.MethodCallExpr.Prebound / syntax.PathSegment.Prebound. When set +// by a CallFolder, it is invoked in place of the normal spec dispatch. +type PreboundMethod func(receiver any) any + +// PreboundFunction is the shape the interpreter expects to find in +// syntax.CallExpr.Prebound. It is invoked in place of spec.Fn dispatch. +type PreboundFunction func() any + +// FunctionParam describes a function parameter for compile-time validation +// and named argument resolution. 
+type FunctionParam struct { + Name string + Default any // default value (used for named arg resolution) + HasDefault bool + AcceptsLambda bool // true when the parameter accepts a lambda expression +} + +// New creates a new interpreter for the given program. +func New(prog *syntax.Program) *Interpreter { + interp := &Interpreter{ + prog: prog, + maps: make(map[string]*syntax.MapDecl), + namespaces: make(map[string]map[string]*syntax.MapDecl), + } + + if prog != nil { + // Hoist map declarations. + for _, m := range prog.Maps { + interp.maps[m.Name] = m + } + + // Build namespace tables from imports. + for ns, maps := range prog.Namespaces { + table := make(map[string]*syntax.MapDecl, len(maps)) + for _, m := range maps { + table[m.Name] = m + } + interp.namespaces[ns] = table + } + } + + return interp +} + +// NewWithStdlib creates a new interpreter with the shared stdlib already +// wired in. This is the fast path for repeated execution of compiled +// mappings — the static method/function tables are shared (not copied) +// across all interpreters and only the lambda methods (which close over +// the interpreter) are allocated per-instance. +func NewWithStdlib(prog *syntax.Program) *Interpreter { + interp := &Interpreter{ + prog: prog, + maps: make(map[string]*syntax.MapDecl), + namespaces: make(map[string]map[string]*syntax.MapDecl), + lambdaTable: make([]MethodSpec, len(lambdaOpcodeOffsets)), + } + + if prog != nil { + for _, m := range prog.Maps { + interp.maps[m.Name] = m + } + for ns, maps := range prog.Namespaces { + table := make(map[string]*syntax.MapDecl, len(maps)) + for _, m := range maps { + table[m.Name] = m + } + interp.namespaces[ns] = table + } + } + + interp.RegisterLambdaMethods() + return interp +} + +// RegisterMethod registers a stdlib method with its behavioral metadata. 
+func (interp *Interpreter) RegisterMethod(name string, spec MethodSpec) {
+	if interp.staticMethods == nil {
+		interp.staticMethods = make(map[string]MethodSpec)
+	}
+	interp.staticMethods[name] = spec
+}
+
+// RegisterFunction registers a stdlib function with its behavioral metadata.
+func (interp *Interpreter) RegisterFunction(name string, spec FunctionSpec) {
+	if interp.staticFunctions == nil {
+		interp.staticFunctions = make(map[string]FunctionSpec)
+	}
+	if spec.MessageFn != nil {
+		spec.RequiresMessageContext = true
+	}
+	interp.staticFunctions[name] = spec
+}
+
+// RegisterLambdaMethod registers a method that needs the interpreter for
+// lambda/map-ref dispatch. These are stored separately from static methods
+// and checked first during dispatch. Also populates the opcode-indexed
+// lambdaTable for fast dispatch.
+func (interp *Interpreter) RegisterLambdaMethod(name string, spec MethodSpec) {
+	if interp.lambdaMethods != nil {
+		interp.lambdaMethods[name] = spec
+	}
+	if offset, ok := lambdaOpcodeOffsets[name]; ok {
+		for int(offset) >= len(interp.lambdaTable) {
+			interp.lambdaTable = append(interp.lambdaTable, MethodSpec{})
+		}
+		interp.lambdaTable[offset] = spec
+	}
+}
+
+// lookupMethod resolves a method by name, checking lambda methods first
+// (per-interpreter) then static methods (shared).
+func (interp *Interpreter) lookupMethod(name string) (MethodSpec, bool) {
+	if spec, ok := interp.lambdaMethods[name]; ok {
+		return spec, true
+	}
+	spec, ok := interp.staticMethods[name]
+	return spec, ok
+}
+
+// stackGet reads a variable from the stack at frameBase + slot.
+func (interp *Interpreter) stackGet(slot int) any {
+	return interp.stack[interp.frameBase+slot]
+}
+
+// stackSet writes a variable to the stack at frameBase + slot.
+func (interp *Interpreter) stackSet(slot int, value any) {
+	interp.stack[interp.frameBase+slot] = value
+}
+
+// stackIsDeclared checks whether a stack slot has been assigned (is not the
+// uninitialized sentinel). Used for void-skip-on-reassignment checks.
+func (interp *Interpreter) stackIsDeclared(slot int) bool {
+	_, uninit := interp.stack[interp.frameBase+slot].(uninitializedVal)
+	return !uninit
+}
+
+// ensureStack ensures the stack can accommodate slots up to index `needed-1`,
+// filling the current frame region [frameBase, needed) with the uninitialized
+// sentinel. Slots beyond `needed` are left zero-initialized — callers MUST NOT
+// access slots beyond the value passed to ensureStack without calling it again.
+func (interp *Interpreter) ensureStack(needed int) {
+	if needed <= len(interp.stack) {
+		// Fill the new frame region with sentinels.
+		for i := interp.frameBase; i < needed; i++ {
+			interp.stack[i] = uninitialized
+		}
+		return
+	}
+	grown := make([]any, needed*2)
+	copy(grown, interp.stack)
+	for i := len(interp.stack); i < needed; i++ {
+		grown[i] = uninitialized
+	}
+	interp.stack = grown
+}
+
+// Exec runs the program against the given input and metadata.
+func (interp *Interpreter) Exec(input any, metadata map[string]any) (output any, outputMeta map[string]any, deleted bool, err error) {
+	interp.input = input
+	interp.inputMeta = metadata
+	interp.output = make(map[string]any)
+	interp.outputMeta = make(map[string]any)
+	interp.deleted = false
+	interp.depth = 0
+	if interp.scope == nil {
+		interp.scope = newScope(nil, scopeStatement)
+	} else {
+		clear(interp.scope.vars)
+	}
+	// Allocate or reuse variable stack. MaxSlots > 0 means the resolver
+	// assigned slot indices.
+	if interp.prog.MaxSlots > 0 {
+		interp.frameBase = 0
+		interp.stackTop = interp.prog.MaxSlots
+		if interp.stack == nil {
+			stackCap := interp.prog.MaxSlots * 4
+			if stackCap < 32 {
+				stackCap = 32
+			}
+			interp.stack = make([]any, stackCap)
+		}
+		// Reset the initial frame with sentinels.
+		for i := 0; i < interp.prog.MaxSlots && i < len(interp.stack); i++ {
+			interp.stack[i] = uninitialized
+		}
+	}
+
+	for _, stmt := range interp.prog.Stmts {
+		interp.execStmt(stmt)
+		if interp.deleted {
+			return nil, nil, true, nil
+		}
+	}
+
+	return interp.output, interp.outputMeta, false, nil
+}
+
+// -----------------------------------------------------------------------
+// Statement execution
+// -----------------------------------------------------------------------
+
+func (interp *Interpreter) execStmt(stmt syntax.Stmt) {
+	switch s := stmt.(type) {
+	case *syntax.Assignment:
+		interp.execAssignment(s)
+	case *syntax.IfStmt:
+		interp.execIfStmt(s)
+	case *syntax.MatchStmt:
+		interp.execMatchStmt(s)
+	}
+}
+
+func (interp *Interpreter) execAssignment(a *syntax.Assignment) {
+	value := interp.evalExpr(a.Value)
+
+	// Error propagation: if value is an error, it halts the mapping.
+	if IsError(value) {
+		panic(runtimeError{message: ErrorMessage(value)})
+	}
+
+	// Void handling.
+	if IsVoid(value) {
+		// For variable targets: declaration with void is an error,
+		// reassignment with void skips the assignment.
+		if a.Target.Root == syntax.AssignVar && len(a.Target.Path) == 0 {
+			var isDeclared bool
+			if slot := a.Target.SlotIndex; slot > 0 {
+				isDeclared = interp.stackIsDeclared(slot)
+			} else {
+				_, isDeclared = interp.scope.get(a.Target.VarName)
+			}
+			if !isDeclared {
+				panic(runtimeError{message: "void in variable declaration (use .or() to provide a default)"})
+			}
+		}
+		return
+	}
+
+	switch a.Target.Root {
+	case syntax.AssignOutput:
+		if a.Target.MetaAccess {
+			// Metadata root assignment validation (Section 7.4, 9.2).
+			if len(a.Target.Path) == 0 {
+				if IsDeleted(value) {
+					panic(runtimeError{message: "cannot delete metadata object"})
+				}
+				obj, ok := value.(map[string]any)
+				if !ok {
+					panic(runtimeError{message: fmt.Sprintf("metadata must be an object, got %T", value)})
+				}
+				interp.outputMeta = DeepClone(obj).(map[string]any)
+				return
+			}
+			var meta any = interp.outputMeta
+			interp.assignPath(&meta, a.Target.Path, value)
+			if m, ok := meta.(map[string]any); ok {
+				interp.outputMeta = m
+			}
+		} else {
+			// Message drop: output = deleted() — set flag and exit immediately
+			// without storing the sentinel in interp.output.
+			if len(a.Target.Path) == 0 && IsDeleted(value) {
+				interp.deleted = true
+				return
+			}
+			interp.assignPath(&interp.output, a.Target.Path, value)
+		}
+	case syntax.AssignVar:
+		if IsDeleted(value) {
+			if len(a.Target.Path) == 0 {
+				panic(runtimeError{message: "cannot assign deleted() to a variable"})
+			}
+		}
+		if slot := a.Target.SlotIndex; slot > 0 {
+			if len(a.Target.Path) == 0 {
+				interp.stackSet(slot, value)
+			} else {
+				existing := interp.stackGet(slot)
+				// Path assignment to an uninitialized variable declares it and
+				// auto-creates the root based on the first path segment (Section 3.7).
+				if _, uninit := existing.(uninitializedVal); uninit {
+					existing = nil
+				}
+				clone := DeepClone(existing)
+				interp.setPath(&clone, a.Target.Path, value)
+				interp.stackSet(slot, clone)
+			}
+		} else if len(a.Target.Path) == 0 {
+			interp.scope.set(a.Target.VarName, value)
+		} else {
+			// Path assignment to an undeclared variable declares it here.
+			existing, _ := interp.scope.get(a.Target.VarName)
+			clone := DeepClone(existing)
+			interp.setPath(&clone, a.Target.Path, value)
+			interp.scope.set(a.Target.VarName, clone)
+		}
+	}
+}
+
+func (interp *Interpreter) execIfStmt(s *syntax.IfStmt) {
+	for _, branch := range s.Branches {
+		cond := interp.evalExpr(branch.Cond)
+		if IsError(cond) {
+			panic(runtimeError{message: ErrorMessage(cond)})
+		}
+		b, ok := cond.(bool)
+		if !ok {
+			panic(runtimeError{message: fmt.Sprintf("if condition must be boolean, got %T", cond)})
+		}
+		if b {
+			childScope := newScope(interp.scope, scopeStatement)
+			saved := interp.scope
+			interp.scope = childScope
+			for _, stmt := range branch.Body {
+				interp.execStmt(stmt)
+				if interp.deleted {
+					interp.scope = saved
+					return
+				}
+			}
+			interp.scope = saved
+			return
+		}
+	}
+
+	if s.Else != nil {
+		childScope := newScope(interp.scope, scopeStatement)
+		saved := interp.scope
+		interp.scope = childScope
+		for _, stmt := range s.Else {
+			interp.execStmt(stmt)
+			if interp.deleted {
+				interp.scope = saved
+				return
+			}
+		}
+		interp.scope = saved
+	}
+}
+
+func (interp *Interpreter) execMatchStmt(s *syntax.MatchStmt) {
+	var subject any
+	if s.Subject != nil {
+		subject = interp.evalExpr(s.Subject)
+		if IsError(subject) {
+			panic(runtimeError{message: ErrorMessage(subject)})
+		}
+	}
+
+	for _, c := range s.Cases {
+		matched, errVal := interp.matchCaseMatches(c, subject, s.Binding, s.BindingSlot, s.Subject != nil)
+		if errVal != nil {
+			panic(runtimeError{message: ErrorMessage(errVal)})
+		}
+		if matched {
+			body, ok := c.Body.([]syntax.Stmt)
+			if !ok {
+				return
+			}
+			childScope := newScope(interp.scope, scopeStatement)
+			if s.Binding != "" {
+				if s.BindingSlot > 0 {
+					interp.stackSet(s.BindingSlot, subject)
+				} else {
+					childScope.vars[s.Binding] = subject
+				}
+			}
+			saved := interp.scope
+			interp.scope = childScope
+			for _, stmt := range body {
+				interp.execStmt(stmt)
+				if interp.deleted {
+					interp.scope = saved
+					return
+				}
+			}
+			interp.scope = saved
+			return
+		}
+	}
+}
+
+// -----------------------------------------------------------------------
+// Expression evaluation
+// -----------------------------------------------------------------------
+
+func (interp *Interpreter) evalExpr(expr syntax.Expr) any {
+	switch e := expr.(type) {
+	case *syntax.LiteralExpr:
+		return interp.evalLiteral(e)
+	case *syntax.BinaryExpr:
+		return interp.evalBinary(e)
+	case *syntax.UnaryExpr:
+		return interp.evalUnary(e)
+	case *syntax.InputExpr:
+		return interp.input // immutable, no clone needed
+	case *syntax.InputMetaExpr:
+		return interp.inputMeta // immutable, no clone needed
+	case *syntax.OutputExpr:
+		return DeepClone(interp.output) // mutable; must snapshot for COW semantics
+	case *syntax.OutputMetaExpr:
+		return DeepClone(interp.outputMeta) // mutable; must snapshot for COW semantics
+	case *syntax.VarExpr:
+		if e.SlotIndex > 0 {
+			return interp.stackGet(e.SlotIndex)
+		}
+		v, ok := interp.scope.get(e.Name)
+		if !ok {
+			panic(runtimeError{message: "undefined variable $" + e.Name})
+		}
+		return v
+	case *syntax.IdentExpr:
+		return interp.evalIdent(e)
+	case *syntax.CallExpr:
+		return interp.evalCall(e)
+	case *syntax.FieldAccessExpr:
+		return interp.evalFieldAccess(e)
+	case *syntax.MethodCallExpr:
+		return interp.evalMethodCall(e)
+	case *syntax.IndexExpr:
+		return interp.evalIndex(e)
+	case *syntax.IfExpr:
+		return interp.evalIfExpr(e)
+	case *syntax.MatchExpr:
+		return interp.evalMatchExpr(e)
+	case *syntax.ArrayLiteral:
+		return interp.evalArrayLiteral(e)
+	case *syntax.ObjectLiteral:
+		return interp.evalObjectLiteral(e)
+	case *syntax.LambdaExpr:
+		// Lambdas in expression position shouldn't be evaluated directly.
+		// They're handled by the method that receives them.
+		panic(runtimeError{message: "lambda expression cannot be used as a value"})
+	case *syntax.PathExpr:
+		return interp.evalPathExpr(e)
+	default:
+		panic(runtimeError{message: fmt.Sprintf("unknown expression type %T", expr)})
+	}
+}
+
+func (interp *Interpreter) evalLiteral(e *syntax.LiteralExpr) any {
+	switch e.TokenType {
+	case syntax.INT:
+		n, _ := strconv.ParseInt(e.Value, 10, 64)
+		return n
+	case syntax.FLOAT:
+		f, _ := strconv.ParseFloat(e.Value, 64)
+		return f
+	case syntax.STRING, syntax.RAW_STRING:
+		return e.Value
+	case syntax.TRUE:
+		return true
+	case syntax.FALSE:
+		return false
+	case syntax.NULL:
+		return nil
+	default:
+		return nil
+	}
+}
+
+func (interp *Interpreter) evalBinary(e *syntax.BinaryExpr) any {
+	left := interp.evalExpr(e.Left)
+	if IsError(left) {
+		return left
+	}
+	if IsVoid(left) {
+		return NewError("void in expression")
+	}
+	if IsDeleted(left) {
+		return NewError("deleted value in expression")
+	}
+
+	// Short-circuit for logical operators.
+	if e.Op == syntax.AND {
+		b, ok := left.(bool)
+		if !ok {
+			return NewError(fmt.Sprintf("&& requires boolean operands, got %T", left))
+		}
+		if !b {
+			return false
+		}
+		right := interp.evalExpr(e.Right)
+		if IsError(right) {
+			return right
+		}
+		rb, ok := right.(bool)
+		if !ok {
+			return NewError(fmt.Sprintf("&& requires boolean operands, got %T", right))
+		}
+		return rb
+	}
+	if e.Op == syntax.OR {
+		b, ok := left.(bool)
+		if !ok {
+			return NewError(fmt.Sprintf("|| requires boolean operands, got %T", left))
+		}
+		if b {
+			return true
+		}
+		right := interp.evalExpr(e.Right)
+		if IsError(right) {
+			return right
+		}
+		rb, ok := right.(bool)
+		if !ok {
+			return NewError(fmt.Sprintf("|| requires boolean operands, got %T", right))
+		}
+		return rb
+	}
+
+	right := interp.evalExpr(e.Right)
+	if IsError(right) {
+		return right
+	}
+	if IsVoid(right) {
+		return NewError("void in expression")
+	}
+	if IsDeleted(right) {
+		return NewError("deleted value in expression")
+	}
+
+	return interp.evalBinaryOp(e.Op, left, right)
+}
+
+func (interp *Interpreter) evalUnary(e *syntax.UnaryExpr) any {
+	operand := interp.evalExpr(e.Operand)
+	if IsError(operand) {
+		return operand
+	}
+	if IsVoid(operand) {
+		return NewError("void in expression")
+	}
+	if IsDeleted(operand) {
+		return NewError("deleted value in expression")
+	}
+
+	switch e.Op {
+	case syntax.MINUS:
+		return numericNegate(operand)
+	case syntax.BANG:
+		b, ok := operand.(bool)
+		if !ok {
+			return NewError(fmt.Sprintf("! requires boolean operand, got %T", operand))
+		}
+		return !b
+	default:
+		return NewError(fmt.Sprintf("unknown unary operator %s", e.Op))
+	}
+}
+
+func (interp *Interpreter) evalFieldAccess(e *syntax.FieldAccessExpr) any {
+	receiver := interp.evalExpr(e.Receiver)
+	if IsError(receiver) {
+		return receiver
+	}
+	if e.NullSafe && receiver == nil {
+		return nil
+	}
+	if receiver == nil {
+		return NewError(fmt.Sprintf("cannot access field %q on null", e.Field))
+	}
+	obj, ok := receiver.(map[string]any)
+	if !ok {
+		return NewError(fmt.Sprintf("cannot access field %q on %T", e.Field, receiver))
+	}
+	return obj[e.Field]
+}
+
+func (interp *Interpreter) evalIndex(e *syntax.IndexExpr) any {
+	receiver := interp.evalExpr(e.Receiver)
+	if IsError(receiver) {
+		return receiver
+	}
+	if e.NullSafe && receiver == nil {
+		return nil
+	}
+
+	index := interp.evalExpr(e.Index)
+	if IsError(index) {
+		return index
+	}
+
+	return interp.indexValue(receiver, index)
+}
+
+func (interp *Interpreter) evalMethodCall(e *syntax.MethodCallExpr) any {
+	// Intrinsic: .catch() — intercepts errors, passes void/deleted through.
+	if e.Method == "catch" {
+		return interp.evalCatch(e)
+	}
+
+	// Intrinsic: .or() — rescues null/void/deleted with short-circuit evaluation.
+	if e.Method == "or" {
+		return interp.evalOr(e)
+	}
+
+	receiver := interp.evalExpr(e.Receiver)
+
+	// Error propagation: errors skip method calls (except .catch handled above).
+	if IsError(receiver) {
+		return receiver
+	}
+
+	// Null-safe: ?.method() returns nil if receiver is null.
+	if e.NullSafe && receiver == nil {
+		return nil
+	}
+
+	// Parse-time-bound dispatch: a CallFolder populated e.Prebound with a
+	// ready-to-run closure. We still honour receiver null and void/deleted
+	// rules, but skip spec lookup and the per-call constructor path.
+	if pb, ok := e.Prebound.(PreboundMethod); ok {
+		if receiver == nil && !e.NullSafe {
+			return NewError(fmt.Sprintf(".%s() does not support null", e.Method))
+		}
+		if IsVoid(receiver) {
+			return NewError("cannot call method on void")
+		}
+		if IsDeleted(receiver) {
+			return NewError("cannot call method on deleted value")
+		}
+		return pb(receiver)
+	}
+
+	// Look up the method via opcode (fast path) or name (fallback).
+	var spec MethodSpec
+	if opc := e.MethodOpcode; opc != 0 {
+		if opc >= lambdaOpcodeBase {
+			spec = interp.lambdaTable[opc-lambdaOpcodeBase]
+		} else {
+			spec = methodTable[opc]
+		}
+	} else {
+		var ok bool
+		spec, ok = interp.lookupMethod(e.Method)
+		if !ok {
+			if receiver == nil {
+				return NewError(fmt.Sprintf(".%s() does not support null", e.Method))
+			}
+			return NewError(fmt.Sprintf("unknown method .%s()", e.Method))
+		}
+	}
+
+	// Null check using spec metadata.
+	if receiver == nil && !e.NullSafe && !spec.AcceptsNull {
+		return NewError(fmt.Sprintf(".%s() does not support null", e.Method))
+	}
+
+	// Void and deleted in method calls (except .or handled above) are errors.
+	if IsVoid(receiver) {
+		return NewError("cannot call method on void")
+	}
+	if IsDeleted(receiver) {
+		return NewError("cannot call method on deleted value")
+	}
+
+	// Lambda methods: receive unevaluated AST args (for lambdas/map-refs).
+	if spec.LambdaFn != nil {
+		args := e.Args
+		if e.Named && spec.Params != nil {
+			args = reorderNamedCallArgs(args, spec.Params)
+		}
+		return spec.LambdaFn(receiver, args)
+	}
+
+	// Plugin methods that need interp access.
+	if spec.PluginFn != nil {
+		args := e.Args
+		if e.Named && spec.Params != nil {
+			args = reorderNamedCallArgs(args, spec.Params)
+		}
+		return spec.PluginFn(interp, receiver, args)
+	}
+
+	// Evaluate arguments, resolving named args to positional if needed.
+	var args []any
+	if e.Named {
+		resolved := interp.resolveNamedMethodArgs(e)
+		if IsError(resolved) {
+			return resolved
+		}
+		args = resolved.([]any)
+	} else {
+		args = interp.evalArgs(e.Args)
+	}
+	for _, a := range args {
+		if IsError(a) {
+			return a
+		}
+	}
+
+	return spec.Fn(receiver, args)
+}
+
+func (interp *Interpreter) evalCatch(e *syntax.MethodCallExpr) any {
+	receiver := interp.evalExpr(e.Receiver)
+
+	// .catch() passes non-errors (including void and deleted) through unchanged.
+	if !IsError(receiver) {
+		return receiver
+	}
+
+	// Error: invoke the catch handler lambda.
+	if len(e.Args) != 1 {
+		return NewError(".catch() requires exactly one argument")
+	}
+	lambda, ok := e.Args[0].Value.(*syntax.LambdaExpr)
+	if !ok {
+		return NewError(".catch() argument must be a lambda")
+	}
+
+	// Build the error object: {"what": "error message"}.
+	errObj := map[string]any{"what": ErrorMessage(receiver)}
+
+	return interp.callLambda(lambda, []any{errObj})
+}
+
+func (interp *Interpreter) evalOr(e *syntax.MethodCallExpr) any {
+	receiver := interp.evalExpr(e.Receiver)
+
+	// .or() rescues null, void, and deleted — returns the argument.
+	// For all other values (including errors), returns the receiver unchanged.
+	if receiver != nil && !IsVoid(receiver) && !IsDeleted(receiver) {
+		return receiver
+	}
+
+	// Short-circuit: only evaluate the argument when rescuing.
+	if len(e.Args) != 1 {
+		return NewError(".or() requires exactly one argument")
+	}
+	return interp.evalExpr(e.Args[0].Value)
+}
+
+// CallLambda executes a lambda expression with the given arguments. It is
+// exported so plugin dispatch in public/bloblangv2 can invoke a lambda
+// argument captured from a method / function call.
+func (interp *Interpreter) CallLambda(lambda *syntax.LambdaExpr, args []any) any {
+	return interp.callLambda(lambda, args)
+}
+
+// EvalExpr evaluates an arbitrary expression node against the interpreter
+// state. Exported for plugin dispatch in public/bloblangv2 so non-lambda
+// arguments to a plugin call can be resolved before the constructor runs.
+func (interp *Interpreter) EvalExpr(expr syntax.Expr) any {
+	return interp.evalExpr(expr)
+}
+
+// callLambda executes a lambda expression with the given arguments.
+// Uses stack-based parameter binding when slot indices are available
+// (fast path), otherwise falls back to scope-based binding.
+func (interp *Interpreter) callLambda(lambda *syntax.LambdaExpr, args []any) any {
+	// Fast path: bind parameters via stack when slot indices are available.
+	useStack := false
+	for _, p := range lambda.Params {
+		if !p.Discard {
+			useStack = p.SlotIndex > 0
+			break
+		}
+	}
+	if useStack {
+		for i, p := range lambda.Params {
+			if p.Discard {
+				continue
+			}
+			if i < len(args) {
+				interp.stackSet(p.SlotIndex, args[i])
+			} else if p.Default != nil {
+				interp.stackSet(p.SlotIndex, interp.evalExpr(p.Default))
+			}
+		}
+		return interp.evalExprBody(lambda.Body)
+	}
+
+	// Fallback: scope-based parameter binding.
+	lambdaScope := newScope(interp.scope, scopeExpression)
+	for i, p := range lambda.Params {
+		if p.Discard {
+			continue
+		}
+		if i < len(args) {
+			lambdaScope.vars[p.Name] = args[i]
+		} else if p.Default != nil {
+			lambdaScope.vars[p.Name] = interp.evalExpr(p.Default)
+		}
+	}
+
+	saved := interp.scope
+	interp.scope = lambdaScope
+	result := interp.evalExprBody(lambda.Body)
+	interp.scope = saved
+
+	return result
+}
+
+func (interp *Interpreter) evalCall(e *syntax.CallExpr) any {
+	// Check for namespace-qualified call.
+	if e.Namespace != "" {
+		return interp.callNamespaced(e)
+	}
+
+	// Check for user-defined map.
+	if m, ok := interp.maps[e.Name]; ok {
+		return interp.callMap(m, e)
+	}
+
+	// Parse-time-bound function: plugin CallFolder stashed a ready Function.
+	if pb, ok := e.Prebound.(PreboundFunction); ok {
+		return pb()
+	}
+
+	// Check stdlib functions via opcode (fast path) or name (fallback).
+	var spec FunctionSpec
+	var isStdlib bool
+	if opc := e.FunctionOpcode; opc != 0 {
+		spec = functionTable[opc]
+		isStdlib = true
+	} else if s, ok := interp.staticFunctions[e.Name]; ok {
+		spec = s
+		isStdlib = true
+	}
+	if isStdlib {
+		// Plugin functions that need interp access (e.g., to evaluate
+		// lambda parameters).
+		if spec.PluginFn != nil {
+			args := e.Args
+			if e.Named && len(spec.Params) > 0 {
+				params := make([]MethodParam, len(spec.Params))
+				for i, p := range spec.Params {
+					params[i] = MethodParam{Name: p.Name, Default: p.Default, HasDefault: p.HasDefault}
+				}
+				args = reorderNamedCallArgs(args, params)
+			}
+			return spec.PluginFn(interp, args)
+		}
+		var args []any
+		if e.Named {
+			resolved := interp.resolveNamedFuncArgs(e, spec)
+			if IsError(resolved) {
+				return resolved
+			}
+			args = resolved.([]any)
+		} else {
+			args = interp.evalArgs(e.Args)
+		}
+		for _, a := range args {
+			if IsError(a) {
+				return a
+			}
+		}
+		// Message-coupled functions read from the bound MessageContext.
+		// If none is bound (the mapping was run via Run instead of
+		// RunWithMessage) the call is a hard runtime error rather than
+		// a silent null fallback.
+		if spec.MessageFn != nil {
+			if interp.messageCtx == nil {
+				return NewError(fmt.Sprintf("function %s requires a message context, but Run was called without one", e.Name))
+			}
+			return spec.MessageFn(interp.messageCtx, args)
+		}
+		return spec.Fn(args)
+	}
+
+	return NewError(fmt.Sprintf("unknown function %s()", e.Name))
+}
+
+func (interp *Interpreter) callNamespaced(e *syntax.CallExpr) any {
+	ns, ok := interp.namespaces[e.Namespace]
+	if !ok {
+		return NewError(fmt.Sprintf("unknown namespace %q", e.Namespace))
+	}
+	m, ok := ns[e.Name]
+	if !ok {
+		return NewError(fmt.Sprintf("unknown function %s::%s()", e.Namespace, e.Name))
+	}
+	return interp.callMap(m, e)
+}
+
+func (interp *Interpreter) callMap(m *syntax.MapDecl, e *syntax.CallExpr) any {
+	interp.depth++
+	if interp.depth > maxRecursionDepth {
+		panic(recursionError{})
+	}
+	defer func() { interp.depth-- }()
+
+	// Evaluate arguments BEFORE pushing the frame, so that argument
+	// expressions read from the caller's frame (not the callee's).
+	args := interp.evalArgs(e.Args)
+	for _, a := range args {
+		if IsError(a) {
+			return a
+		}
+	}
+
+	// Push stack frame for the map's isolated scope.
+	savedFrameBase := interp.frameBase
+	savedStackTop := interp.stackTop
+	interp.frameBase = interp.stackTop
+	interp.stackTop = interp.frameBase + m.MaxSlots
+	if interp.stackTop <= interp.frameBase {
+		interp.stackTop = interp.frameBase + 1
+	}
+	interp.ensureStack(interp.stackTop)
+
+	// Check if params have stack slots (check first non-discard param).
+	useStack := false
+	for _, p := range m.Params {
+		if !p.Discard {
+			useStack = p.SlotIndex > 0
+			break
+		}
+	}
+
+	// Bind parameters. Only create a scope when needed (no stack slots).
+	var mapScope *scope
+	if !useStack {
+		mapScope = newScope(nil, scopeExpression)
+	}
+	if e.Named {
+		// For named args, re-order the pre-evaluated args.
+		if err := interp.bindNamedMapParamsFromArgs(mapScope, m, e, args); err != "" {
+			interp.frameBase = savedFrameBase
+			interp.stackTop = savedStackTop
+			return NewError(err)
+		}
+	} else {
+		if useStack {
+			if err := interp.bindPositionalParamsStack(m.Params, args); err != "" {
+				interp.frameBase = savedFrameBase
+				interp.stackTop = savedStackTop
+				return NewError(err)
+			}
+		} else {
+			if err := interp.bindPositionalParams(mapScope, m.Params, args); err != "" {
+				interp.frameBase = savedFrameBase
+				interp.stackTop = savedStackTop
+				return NewError(err)
+			}
+		}
+	}
+
+	// Evaluate the map body. If the map has its own namespace context
+	// (from an imported file), temporarily switch to it so that
+	// qualified calls within the map resolve correctly.
+	savedScope := interp.scope
+	savedNamespaces := interp.namespaces
+	savedMaps := interp.maps
+
+	if mapScope != nil {
+		interp.scope = mapScope
+	}
+	if m.Namespaces != nil {
+		// Build namespace tables for this map's context.
+		nsTable := make(map[string]map[string]*syntax.MapDecl, len(m.Namespaces))
+		for ns, maps := range m.Namespaces {
+			table := make(map[string]*syntax.MapDecl, len(maps))
+			for _, md := range maps {
+				table[md.Name] = md
+			}
+			nsTable[ns] = table
+		}
+		interp.namespaces = nsTable
+	}
+
+	result := interp.evalExprBody(m.Body)
+
+	interp.scope = savedScope
+	interp.namespaces = savedNamespaces
+	interp.maps = savedMaps
+	interp.frameBase = savedFrameBase
+	interp.stackTop = savedStackTop
+
+	return result
+}
+
+func (interp *Interpreter) evalIdent(e *syntax.IdentExpr) any {
+	// Qualified reference — only valid in higher-order method args
+	// (handled by extractLambdaOrMapRef). If we reach here, it's misused.
+	if e.Namespace != "" {
+		return NewError(e.Namespace + "::" + e.Name + " cannot be used as a value (pass to a higher-order method or call with parentheses)")
+	}
+	// Fast path: stack-based variable/parameter access.
+	if e.SlotIndex > 0 {
+		return interp.stackGet(e.SlotIndex)
+	}
+	// Fallback: scope-based lookup.
+	if v, ok := interp.scope.get(e.Name); ok {
+		return v
+	}
+	// Bare map name without call — error per spec.
+	if _, ok := interp.maps[e.Name]; ok {
+		return NewError("map " + e.Name + " cannot be used as a value (call it with parentheses)")
+	}
+	return NewError("undefined identifier " + e.Name)
+}
+
+func (interp *Interpreter) evalIfExpr(e *syntax.IfExpr) any {
+	for _, branch := range e.Branches {
+		cond := interp.evalExpr(branch.Cond)
+		if IsError(cond) {
+			return cond
+		}
+		b, ok := cond.(bool)
+		if !ok {
+			return NewError(fmt.Sprintf("if condition must be boolean, got %T", cond))
+		}
+		if b {
+			childScope := newScope(interp.scope, scopeExpression)
+			saved := interp.scope
+			interp.scope = childScope
+			result := interp.evalExprBody(branch.Body)
+			interp.scope = saved
+			return result
+		}
+	}
+
+	if e.Else != nil {
+		childScope := newScope(interp.scope, scopeExpression)
+		saved := interp.scope
+		interp.scope = childScope
+		result := interp.evalExprBody(e.Else)
+		interp.scope = saved
+		return result
+	}
+
+	return Void
+}
+
+func (interp *Interpreter) evalMatchExpr(e *syntax.MatchExpr) any {
+	var subject any
+	if e.Subject != nil {
+		subject = interp.evalExpr(e.Subject)
+		if IsError(subject) {
+			return subject
+		}
+	}
+
+	for _, c := range e.Cases {
+		matched, errVal := interp.matchCaseMatches(c, subject, e.Binding, e.BindingSlot, e.Subject != nil)
+		if errVal != nil {
+			return errVal
+		}
+		if matched {
+			childScope := newScope(interp.scope, scopeExpression)
+			if e.Binding != "" {
+				if e.BindingSlot > 0 {
+					interp.stackSet(e.BindingSlot, subject)
+				} else {
+					childScope.vars[e.Binding] = subject
+				}
+			}
+			saved := interp.scope
+			interp.scope = childScope
+
+			var result any
+			switch body := c.Body.(type) {
+			case syntax.Expr:
+				result = interp.evalExpr(body)
+			case *syntax.ExprBody:
+				result = interp.evalExprBody(body)
+			}
+
+			interp.scope = saved
+			return result
+		}
+	}
+
+	return Void
+}
+
+// matchCaseMatches returns (matched, errorValue). If errorValue is non-nil,
+// the case expression produced an error that should be propagated.
+func (interp *Interpreter) matchCaseMatches(c syntax.MatchCase, subject any, binding string, bindingSlot int, hasSubject bool) (bool, any) {
+	if c.Wildcard {
+		return true, nil
+	}
+
+	if hasSubject && binding == "" {
+		// Equality match: compare pattern against subject.
+		patternVal := interp.evalExpr(c.Pattern)
+		if IsError(patternVal) {
+			return false, patternVal
+		}
+		// Boolean case values are a runtime error in equality match.
+		if _, ok := patternVal.(bool); ok {
+			return false, NewError("boolean case value in equality match (use 'as' for boolean conditions)")
+		}
+		return valuesEqual(subject, patternVal), nil
+	}
+
+	// Boolean match (with or without 'as'): case must evaluate to bool.
+	if binding != "" {
+		// Evaluate pattern with the binding visible.
+		if bindingSlot > 0 {
+			interp.stackSet(bindingSlot, subject)
+		}
+		childScope := newScope(interp.scope, interp.scope.mode)
+		childScope.vars[binding] = subject
+		saved := interp.scope
+		interp.scope = childScope
+		patternVal := interp.evalExpr(c.Pattern)
+		interp.scope = saved
+		if IsError(patternVal) {
+			return false, patternVal
+		}
+		b, ok := patternVal.(bool)
+		if !ok {
+			return false, NewError(fmt.Sprintf("boolean match case must evaluate to bool, got %T", patternVal))
+		}
+		return b, nil
+	}
+	patternVal := interp.evalExpr(c.Pattern)
+	if IsError(patternVal) {
+		return false, patternVal
+	}
+	b, ok := patternVal.(bool)
+	if !ok {
+		return false, NewError(fmt.Sprintf("boolean match case must evaluate to bool, got %T", patternVal))
+	}
+	return b, nil
+}
+
+func (interp *Interpreter) evalArrayLiteral(e *syntax.ArrayLiteral) any {
+	result := make([]any, 0, len(e.Elements))
+	for _, elem := range e.Elements {
+		val := interp.evalExpr(elem)
+		if IsError(val) {
+			return val
+		}
+		if IsVoid(val) {
+			return NewError("void in array literal (use deleted() to omit elements, or add an else branch)")
+		}
+		if IsDeleted(val) {
+			continue // deleted elements are removed
+		}
+		result = append(result, val)
+	}
+	return result
+}
+
+func (interp *Interpreter) evalObjectLiteral(e *syntax.ObjectLiteral) any {
+	result := make(map[string]any, len(e.Entries))
+	for _, entry := range e.Entries {
+		key := interp.evalExpr(entry.Key)
+		if IsError(key) {
+			return key
+		}
+		keyStr, ok := key.(string)
+		if !ok {
+			return NewError(fmt.Sprintf("object key must be string, got %T", key))
+		}
+		val := interp.evalExpr(entry.Value)
+		if IsError(val) {
+			return val
+		}
+		if IsVoid(val) {
+			return NewError("void in object literal (use deleted() to omit fields, or add an else branch)")
+		}
+		if IsDeleted(val) {
+			continue // deleted fields are removed
+		}
+		result[keyStr] = val
+	}
+	return result
+}
+
+func (interp *Interpreter) evalExprBody(body *syntax.ExprBody) any {
+	for _, va := range body.Assignments {
+		val := interp.evalExpr(va.Value)
+		if IsError(val) {
+			return val
+		}
+		if IsVoid(val) {
+			// Void in variable declaration is an error.
+			// Void in reassignment (variable already exists) skips.
+			if slot := va.SlotIndex; slot > 0 {
+				if interp.stackIsDeclared(slot) {
+					continue
+				}
+			} else if _, exists := interp.scope.get(va.Name); exists {
+				continue
+			}
+			return NewError("void in variable declaration (use .or() to provide a default)")
+		}
+		if IsDeleted(val) {
+			if len(va.Path) == 0 {
+				return NewError("cannot assign deleted() to a variable")
+			}
+		}
+		if slot := va.SlotIndex; slot > 0 {
+			if len(va.Path) == 0 {
+				interp.stackSet(slot, val)
+			} else {
+				existing := interp.stackGet(slot)
+				if _, uninit := existing.(uninitializedVal); uninit {
+					existing = nil
+				}
+				clone := DeepClone(existing)
+				interp.setPath(&clone, va.Path, val)
+				interp.stackSet(slot, clone)
+			}
+		} else {
+			if len(va.Path) == 0 {
+				interp.scope.set(va.Name, val)
+			} else {
+				existing, _ := interp.scope.get(va.Name)
+				clone := DeepClone(existing)
+				interp.setPath(&clone, va.Path, val)
+				interp.scope.set(va.Name, clone)
+			}
+		}
+	}
+	return interp.evalExpr(body.Result)
+}
+
+func (interp *Interpreter) evalPathExpr(e *syntax.PathExpr) any {
+	var root any
+	switch e.Root {
+	case syntax.PathRootInput:
+		root = interp.input
+	case syntax.PathRootInputMeta:
+		root = interp.inputMeta
+	case syntax.PathRootOutput:
+		root = DeepClone(interp.output)
+	case syntax.PathRootOutputMeta:
+		root = DeepClone(interp.outputMeta)
+	case syntax.PathRootVar:
+		if e.VarSlotIndex > 0 {
+			root = interp.stackGet(e.VarSlotIndex)
+			break
+		}
+		v, ok := interp.scope.get(e.VarName)
+		if !ok {
+			return NewError("undefined variable $" + e.VarName)
+		}
+		root = v
+	}
+
+	current := root
+	for _, seg := range e.Segments {
+		if IsError(current) {
+			return current
+		}
+		switch seg.Kind {
+		case syntax.PathSegField:
+			if seg.NullSafe && current == nil {
+				return nil
+			}
+			obj, ok := current.(map[string]any)
+			if !ok {
+				return NewError(fmt.Sprintf("cannot access field %q on %T", seg.Name, current))
+			}
+			current = obj[seg.Name]
+		case syntax.PathSegIndex:
+			if seg.NullSafe && current == nil {
+				return nil
+			}
+			idx := interp.evalExpr(seg.Index)
+			if IsError(idx) {
+				return idx
+			}
+			current = interp.indexValue(current, idx)
+			if IsError(current) {
+				return current
+			}
+		case syntax.PathSegMethod:
+			if seg.NullSafe && current == nil {
+				return nil
+			}
+			// Parse-time-bound plugin method: skip spec lookup entirely.
+			if pb, ok := seg.Prebound.(PreboundMethod); ok {
+				if current == nil && !seg.NullSafe {
+					return NewError(fmt.Sprintf(".%s() does not support null", seg.Name))
+				}
+				if IsVoid(current) {
+					return NewError("cannot call method on void")
+				}
+				if IsDeleted(current) {
+					return NewError("cannot call method on deleted value")
+				}
+				current = pb(current)
+				continue
+			}
+			var spec MethodSpec
+			if opc := seg.MethodOpcode; opc != 0 {
+				if opc >= lambdaOpcodeBase {
+					spec = interp.lambdaTable[opc-lambdaOpcodeBase]
+				} else {
+					spec = methodTable[opc]
+				}
+			} else {
+				var ok bool
+				spec, ok = interp.lookupMethod(seg.Name)
+				if !ok {
+					return NewError(fmt.Sprintf("unknown method .%s()", seg.Name))
+				}
+			}
+			// Intrinsic methods (catch/or) cannot appear in path expressions
+			// because they require control over receiver evaluation.
+ if spec.Intrinsic { + return NewError(fmt.Sprintf(".%s() cannot be used in path expressions", seg.Name)) + } + if current == nil && !seg.NullSafe && !spec.AcceptsNull { + return NewError(fmt.Sprintf(".%s() does not support null", seg.Name)) + } + if IsVoid(current) { + return NewError("cannot call method on void") + } + if IsDeleted(current) { + return NewError("cannot call method on deleted value") + } + if spec.LambdaFn != nil { + lambdaArgs := seg.Args + if seg.Named && spec.Params != nil { + lambdaArgs = reorderNamedCallArgs(lambdaArgs, spec.Params) + } + current = spec.LambdaFn(current, lambdaArgs) + } else if spec.PluginFn != nil { + pluginArgs := seg.Args + if seg.Named && spec.Params != nil { + pluginArgs = reorderNamedCallArgs(pluginArgs, spec.Params) + } + current = spec.PluginFn(interp, current, pluginArgs) + } else { + var args []any + if seg.Named { + resolved := interp.resolveNamedPathArgs(seg, spec) + if IsError(resolved) { + return resolved + } + args = resolved.([]any) + } else { + args = interp.evalArgs(seg.Args) + } + for _, a := range args { + if IsError(a) { + return a + } + } + current = spec.Fn(current, args) + } + } + } + return current +} + +// ----------------------------------------------------------------------- +// Helpers +// ----------------------------------------------------------------------- + +// namedArgParam is a unified parameter descriptor for named argument resolution, +// used by both methods and functions. +type namedArgParam struct { + Name string + Default any + HasDefault bool +} + +// resolveNamedArgs maps named call arguments to positional order using parameter +// metadata. context is used in error messages (e.g., ".replace_all()", "random_int()"). +// Returns []any or an errorVal. +func (interp *Interpreter) resolveNamedArgs(callArgs []syntax.CallArg, params []namedArgParam, context string) any { + if len(params) == 0 { + // No parameter metadata: evaluate the named args in call-site order. 
+ args := make([]any, 0, len(callArgs)) + for _, arg := range callArgs { + v := interp.evalExpr(arg.Value) + if IsError(v) { + return v + } + args = append(args, v) + } + return args + } + + // Build named arg map. + named := make(map[string]any, len(callArgs)) + for _, arg := range callArgs { + v := interp.evalExpr(arg.Value) + if IsError(v) { + return v + } + named[arg.Name] = v + } + + // Map to positional based on parameter metadata. + args := make([]any, len(params)) + for i, p := range params { + if v, ok := named[p.Name]; ok { + args[i] = v + } else if p.HasDefault { + args[i] = p.Default + } else { + return NewError(fmt.Sprintf("%s: missing required argument %q", context, p.Name)) + } + } + return args +} + +// resolveNamedMethodArgs maps named arguments to positional using the method's +// parameter metadata. Returns []any or an errorVal. +func (interp *Interpreter) resolveNamedMethodArgs(e *syntax.MethodCallExpr) any { + var spec MethodSpec + var specOK bool + if opc := e.MethodOpcode; opc != 0 { + if opc >= lambdaOpcodeBase { + spec = interp.lambdaTable[opc-lambdaOpcodeBase] + } else { + spec = methodTable[opc] + } + specOK = true + } else { + spec, specOK = interp.lookupMethod(e.Method) + } + var params []namedArgParam + if specOK && spec.Params != nil { + params = make([]namedArgParam, len(spec.Params)) + for i, p := range spec.Params { + params[i] = namedArgParam{Name: p.Name, Default: p.Default, HasDefault: p.HasDefault} + } + } + return interp.resolveNamedArgs(e.Args, params, "."+e.Method+"()") +} + +// resolveNamedPathArgs is the path-expression analogue of +// resolveNamedMethodArgs — it reorders named call arguments for a method +// invoked through a path segment (e.g. `input.parse_csv(delimiter: "|")`). +// Returns []any or an errorVal. 
+func (interp *Interpreter) resolveNamedPathArgs(seg syntax.PathSegment, spec MethodSpec) any { + var params []namedArgParam + if spec.Params != nil { + params = make([]namedArgParam, len(spec.Params)) + for i, p := range spec.Params { + params[i] = namedArgParam{Name: p.Name, Default: p.Default, HasDefault: p.HasDefault} + } + } + return interp.resolveNamedArgs(seg.Args, params, "."+seg.Name+"()") +} + +// resolveNamedFuncArgs maps named arguments to positional using the function's +// parameter metadata. Trailing unspecified optional args are truncated so that +// functions using len(args) for optional parameter detection continue to work. +// Returns []any or an errorVal. +func (interp *Interpreter) resolveNamedFuncArgs(e *syntax.CallExpr, spec FunctionSpec) any { + params := make([]namedArgParam, len(spec.Params)) + for i, p := range spec.Params { + params[i] = namedArgParam{Name: p.Name, Default: p.Default, HasDefault: p.HasDefault} + } + resolved := interp.resolveNamedArgs(e.Args, params, e.Name+"()") + if IsError(resolved) { + return resolved + } + args := resolved.([]any) + + // Truncate trailing default-filled args: find the last parameter position + // that was explicitly provided and trim the slice there. + provided := make(map[string]bool, len(e.Args)) + for _, arg := range e.Args { + provided[arg.Name] = true + } + lastExplicit := -1 + for i, p := range spec.Params { + if provided[p.Name] { + lastExplicit = i + } + } + if lastExplicit >= 0 && lastExplicit < len(args)-1 { + args = args[:lastExplicit+1] + } + return args +} + +// reorderNamedCallArgs reorders named CallArgs to positional order based on +// parameter metadata. Missing optional args are omitted (the method handles +// missing trailing args via len(args) checks internally). 
+func reorderNamedCallArgs(args []syntax.CallArg, params []MethodParam) []syntax.CallArg { + byName := make(map[string]syntax.CallArg, len(args)) + for _, arg := range args { + byName[arg.Name] = arg + } + result := make([]syntax.CallArg, 0, len(params)) + for _, p := range params { + if arg, ok := byName[p.Name]; ok { + result = append(result, arg) + } else if !p.HasDefault { + // Required param missing — append a placeholder that will trigger + // an error when the method tries to use it. But in practice, the + // resolver catches arity mismatches at compile time. + result = append(result, syntax.CallArg{}) + } + // Optional param missing: omit — method handles via len(args). + } + return result +} + +func (interp *Interpreter) evalArgs(args []syntax.CallArg) []any { + result := make([]any, len(args)) + for i, a := range args { + // Parse-time-folded values (e.g. precompiled regex patterns + // produced by the resolver's ArgFolder hook) substitute for the + // AST expression verbatim, skipping re-evaluation. See + // syntax.CallArg.Folded. + if a.Folded != nil { + result[i] = a.Folded + continue + } + v := interp.evalExpr(a.Value) + if IsVoid(v) { + result[i] = NewError("void passed as argument (use .or() to provide a default)") + } else if IsDeleted(v) { + result[i] = NewError("deleted() passed as argument") + } else { + result[i] = v + } + } + return result +} + +// bindPositionalParams binds evaluated positional args to map parameters, +// handling discard params and AST-expression defaults. 
+func (interp *Interpreter) bindPositionalParams(s *scope, params []syntax.Param, args []any) string { + argIdx := 0 + for _, p := range params { + if p.Discard { + if argIdx < len(args) { + argIdx++ + } + continue + } + if argIdx < len(args) { + s.vars[p.Name] = args[argIdx] + argIdx++ + } else if p.Default != nil { + s.vars[p.Name] = interp.evalExpr(p.Default) + } else { + return fmt.Sprintf("missing argument for parameter %q", p.Name) + } + } + return "" +} + +// bindPositionalParamsStack binds positional args directly to stack slots. +func (interp *Interpreter) bindPositionalParamsStack(params []syntax.Param, args []any) string { + argIdx := 0 + for _, p := range params { + if p.Discard { + if argIdx < len(args) { + argIdx++ + } + continue + } + if argIdx < len(args) { + interp.stackSet(p.SlotIndex, args[argIdx]) + argIdx++ + } else if p.Default != nil { + interp.stackSet(p.SlotIndex, interp.evalExpr(p.Default)) + } else { + return fmt.Sprintf("missing argument for parameter %q", p.Name) + } + } + return "" +} + +// bindNamedMapParamsFromArgs binds pre-evaluated named arguments to map params. +func (interp *Interpreter) bindNamedMapParamsFromArgs(s *scope, m *syntax.MapDecl, e *syntax.CallExpr, evaledArgs []any) string { + // Build name→value map from pre-evaluated args. + named := make(map[string]any, len(evaledArgs)) + for i, arg := range e.Args { + if i < len(evaledArgs) { + named[arg.Name] = evaledArgs[i] + } + } + + // Bind each non-discard param from named args or defaults. 
+ for _, mp := range m.Params { + if mp.Discard { + continue + } + var val any + if v, ok := named[mp.Name]; ok { + val = v + } else if mp.Default != nil { + val = interp.evalExpr(mp.Default) + } else { + return fmt.Sprintf("%s(): missing required argument %q", e.Name, mp.Name) + } + if mp.SlotIndex > 0 { + interp.stackSet(mp.SlotIndex, val) + } else { + s.vars[mp.Name] = val + } + } + return "" +} + +// assignPath sets a value at a path within a root value, auto-creating +// intermediate objects and arrays as needed. +func (interp *Interpreter) assignPath(root *any, path []syntax.PathSegment, value any) { + if len(path) == 0 { + *root = DeepClone(value) + return + } + + interp.assignPathRecursive(root, path, value) +} + +func (interp *Interpreter) assignPathRecursive(current *any, path []syntax.PathSegment, value any) { + seg := path[0] + isLast := len(path) == 1 + + switch seg.Kind { + case syntax.PathSegField: + // Ensure current is an object. Auto-create only from nil. + obj, ok := (*current).(map[string]any) + if !ok { + if *current != nil { + panic(runtimeError{message: fmt.Sprintf( + "cannot access field %q on %T (expected object)", seg.Name, *current)}) + } + obj = make(map[string]any) + *current = obj + } + + if isLast { + if IsDeleted(value) { + delete(obj, seg.Name) + } else { + obj[seg.Name] = value + } + return + } + + child, exists := obj[seg.Name] + if !exists { + child = nil // will be auto-created by recursive call + } + interp.assignPathRecursive(&child, path[1:], value) + obj[seg.Name] = child + + case syntax.PathSegIndex: + idx := interp.evalExpr(seg.Index) + if IsError(idx) { + return + } + + // String index → object field. 
+ if key, ok := idx.(string); ok { + obj, ok := (*current).(map[string]any) + if !ok { + obj = make(map[string]any) + *current = obj + } + if isLast { + if IsDeleted(value) { + delete(obj, key) + } else { + obj[key] = value + } + return + } + child, exists := obj[key] + if !exists { + child = make(map[string]any) + } + interp.assignPathRecursive(&child, path[1:], value) + obj[key] = child + return + } + + // Integer index → array element. + i, ok := toInt64(idx) + if !ok { + return + } + + arr, isArr := (*current).([]any) + if !isArr { + if *current != nil { + panic(runtimeError{message: fmt.Sprintf( + "cannot index into %T (expected array)", *current)}) + } + // Auto-create array from nil. + arr = make([]any, 0) + } + + // Handle negative indexing. + if i < 0 { + i += int64(len(arr)) + } + + if isLast && IsDeleted(value) { + // Delete array element: remove and shift. + if i < 0 || i >= int64(len(arr)) { + panic(runtimeError{message: "array index deletion: index out of bounds"}) + } + arr = append(arr[:i], arr[i+1:]...) + *current = arr + return + } + + // Grow array with null gaps if needed. 
+ for int64(len(arr)) <= i { + arr = append(arr, nil) + } + *current = arr + + if isLast { + arr[i] = value + return + } + + child := arr[i] + if child == nil { + child = make(map[string]any) + } + interp.assignPathRecursive(&child, path[1:], value) + arr[i] = child + } +} + +func (interp *Interpreter) setPath(root *any, path []syntax.PathSegment, value any) { + interp.assignPath(root, path, value) +} + +func (interp *Interpreter) indexValue(receiver, index any) any { + switch r := receiver.(type) { + case map[string]any: + key, ok := index.(string) + if !ok { + return NewError(fmt.Sprintf("non-string index on object: got %T", index)) + } + return r[key] + case []any: + return indexSequence(index, int64(len(r)), func(i int64) any { return r[i] }) + case string: + runes := []rune(r) + return indexSequence(index, int64(len(runes)), func(i int64) any { return int64(runes[i]) }) + case []byte: + return indexSequence(index, int64(len(r)), func(i int64) any { return int64(r[i]) }) + case nil: + return NewError("cannot index null value") + default: + return NewError(fmt.Sprintf("cannot index %T", receiver)) + } +} + +func indexSequence(index any, length int64, get func(int64) any) any { + idx, ok := toInt64(index) + if !ok { + // Distinguish non-numeric from non-whole-number float. 
+ if f, isFloat := index.(float64); isFloat { + if f != math.Trunc(f) { + return NewError("index must be a whole number, got float with fractional part") + } + } + if f, isFloat := index.(float32); isFloat { + if float64(f) != math.Trunc(float64(f)) { + return NewError("index must be a whole number, got float with fractional part") + } + } + return NewError(fmt.Sprintf("non-numeric index: got %T", index)) + } + if idx < 0 { + idx += length + } + if idx < 0 || idx >= length { + return NewError("index out of bounds") + } + return get(idx) +} + +func toInt64(v any) (int64, bool) { + switch n := v.(type) { + case int64: + return n, true + case int32: + return int64(n), true + case uint32: + return int64(n), true + case uint64: + if n > math.MaxInt64 { + return 0, false + } + return int64(n), true + case float64: + if n != math.Trunc(n) || math.IsNaN(n) || math.IsInf(n, 0) { + return 0, false + } + if n >= math.MaxInt64 || n < math.MinInt64 { // float64(math.MaxInt64) rounds up to 2^63, which overflows int64 + return 0, false + } + return int64(n), true + case float32: + f := float64(n) + if f != math.Trunc(f) || math.IsNaN(f) || math.IsInf(f, 0) { + return 0, false + } + if f >= math.MaxInt64 || f < math.MinInt64 { // same 2^63 boundary as the float64 case + return 0, false + } + return int64(f), true + default: + return 0, false + } +} + +func numericNegate(v any) any { + switch n := v.(type) { + case int64: + if n == math.MinInt64 { + return NewError("int64 overflow") + } + return -n + case int32: + if n == math.MinInt32 { + return NewError("int32 overflow") + } + return -n + case float64: + return -n + case float32: + return -n + case uint32: + return -int64(n) + case uint64: + if n > math.MaxInt64 { + return NewError("cannot negate uint64 value exceeding int64 range") + } + return -int64(n) + default: + return NewError(fmt.Sprintf("cannot negate %T", v)) + } +} + +// ----------------------------------------------------------------------- +// Error handling via panic/recover +// ----------------------------------------------------------------------- + +type runtimeError 
struct { + message string +} + +type recursionError struct{} + +// RunWithMessage is Run plus a bound MessageContext, used by callers +// that need the message-coupled stdlib (batch_index, content, error, +// ...). Input value and metadata are taken from the context; the +// context is cleared before returning so it is not retained in the +// pooled interpreter. +func (interp *Interpreter) RunWithMessage(ctx MessageContext) (output any, outputMeta map[string]any, deleted bool, err error) { + interp.messageCtx = ctx + defer func() { interp.messageCtx = nil }() + var meta map[string]any + if ctx != nil { + meta = ctx.Metadata() + } + var input any + if ctx != nil { + input = ctx.Input() + } + return interp.Run(input, meta) +} + +// Run executes the program with panic recovery, converting runtime panics +// to error returns. +func (interp *Interpreter) Run(input any, metadata map[string]any) (output any, outputMeta map[string]any, deleted bool, err error) { + defer func() { + if r := recover(); r != nil { + switch e := r.(type) { + case runtimeError: + err = fmt.Errorf("%s", e.message) + case recursionError: + err = errors.New("maximum recursion depth exceeded") + default: + panic(r) // re-panic for unexpected errors + } + } + }() + + // Inputs may have been decoded by a JSON library that uses json.Number to + // preserve precision (e.g. encoding/json with UseNumber). The interpreter + // only knows about native Go numeric types, so normalise once at the + // boundary instead of forcing every operator to special-case json.Number. 
+ input = normalizeJSONNumbers(input) + for k, v := range metadata { + metadata[k] = normalizeJSONNumbers(v) + } + + return interp.Exec(input, metadata) +} diff --git a/internal/bloblang2/go/pratt/eval/interp_test.go b/internal/bloblang2/go/pratt/eval/interp_test.go new file mode 100644 index 000000000..74cda4338 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/interp_test.go @@ -0,0 +1,615 @@ +// Tests in this file use run() which calls Parse → New → Run, bypassing +// Optimize and Resolve. This means AST nodes have no opcode IDs or stack +// slot indices — all execution goes through the scope-chain and map-lookup +// fallback paths. +// +// The optimized paths (opcode dispatch, variable stack) are exercised by +// the spec conformance suite in bloblang2_test.go, which uses the full +// compilation pipeline: Parse → Optimize → Resolve → NewWithStdlib → Run. +package eval + +import ( + "encoding/json" + "reflect" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +func run(t *testing.T, src string, input any) (any, map[string]any, bool) { + t.Helper() + prog, errs := syntax.Parse(src, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + interp := New(prog) + meta := map[string]any{} + output, outputMeta, deleted, err := interp.Run(input, meta) + if err != nil { + t.Fatalf("runtime error: %v", err) + } + return output, outputMeta, deleted +} + +func runExpectError(t *testing.T, src string, input any, substr string) { + t.Helper() + prog, errs := syntax.Parse(src, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + interp := New(prog) + meta := map[string]any{} + _, _, _, err := interp.Run(input, meta) + if err == nil { + t.Fatalf("expected runtime error containing %q, but execution succeeded", substr) + } + if !containsStr(err.Error(), substr) { + t.Fatalf("error %q does not contain %q", err.Error(), substr) + } +} + +func 
containsStr(s, sub string) bool { + for i := 0; i <= len(s)-len(sub); i++ { + if s[i:i+len(sub)] == sub { + return true + } + } + return false +} + +// ----------------------------------------------------------------------- +// Basic assignments +// ----------------------------------------------------------------------- + +func TestInterp_SimpleAssignment(t *testing.T) { + output, _, _ := run(t, `output.x = 42`, nil) + m := output.(map[string]any) + if m["x"] != int64(42) { + t.Fatalf("expected 42, got %v (%T)", m["x"], m["x"]) + } +} + +func TestInterp_MultipleAssignments(t *testing.T) { + output, _, _ := run(t, "output.a = 1\noutput.b = 2", nil) + m := output.(map[string]any) + if m["a"] != int64(1) || m["b"] != int64(2) { + t.Fatalf("expected {a:1, b:2}, got %v", m) + } +} + +func TestInterp_VarDeclarationAndUse(t *testing.T) { + output, _, _ := run(t, "$x = 42\noutput.v = $x", nil) + m := output.(map[string]any) + if m["v"] != int64(42) { + t.Fatalf("expected 42, got %v", m["v"]) + } +} + +func TestInterp_NestedFieldAssignment(t *testing.T) { + output, _, _ := run(t, `output.user.name = "Alice"`, nil) + m := output.(map[string]any) + user := m["user"].(map[string]any) + if user["name"] != "Alice" { + t.Fatalf("expected Alice, got %v", user["name"]) + } +} + +// ----------------------------------------------------------------------- +// Arithmetic +// ----------------------------------------------------------------------- + +func TestInterp_Addition(t *testing.T) { + output, _, _ := run(t, `output = 5 + 3`, nil) + if output != int64(8) { + t.Fatalf("expected 8, got %v (%T)", output, output) + } +} + +func TestInterp_Division(t *testing.T) { + output, _, _ := run(t, `output = 7 / 2`, nil) + if output != 3.5 { + t.Fatalf("expected 3.5, got %v (%T)", output, output) + } +} + +func TestInterp_DivisionByZero(t *testing.T) { + runExpectError(t, `output = 7 / 0`, nil, "division by zero") +} + +func TestInterp_IntegerOverflow(t *testing.T) { + runExpectError(t, `output 
= 9223372036854775807 + 1`, nil, "overflow") +} + +func TestInterp_Modulo(t *testing.T) { + output, _, _ := run(t, `output = 7 % 2`, nil) + if output != int64(1) { + t.Fatalf("expected 1, got %v (%T)", output, output) + } +} + +func TestInterp_StringConcat(t *testing.T) { + output, _, _ := run(t, `output = "hello" + " " + "world"`, nil) + if output != "hello world" { + t.Fatalf("expected 'hello world', got %v", output) + } +} + +func TestInterp_UnaryMinus(t *testing.T) { + output, _, _ := run(t, `output = -5`, nil) + if output != int64(-5) { + t.Fatalf("expected -5, got %v", output) + } +} + +func TestInterp_UnaryNot(t *testing.T) { + output, _, _ := run(t, `output = !true`, nil) + if output != false { + t.Fatalf("expected false, got %v", output) + } +} + +// ----------------------------------------------------------------------- +// Equality and comparison +// ----------------------------------------------------------------------- + +func TestInterp_Equality(t *testing.T) { + output, _, _ := run(t, `output = 5 == 5`, nil) + if output != true { + t.Fatalf("expected true, got %v", output) + } +} + +func TestInterp_CrossFamilyEquality(t *testing.T) { + output, _, _ := run(t, `output = 5 == "5"`, nil) + if output != false { + t.Fatalf("expected false, got %v", output) + } +} + +func TestInterp_Comparison(t *testing.T) { + output, _, _ := run(t, `output = 5 > 3`, nil) + if output != true { + t.Fatalf("expected true, got %v", output) + } +} + +// ----------------------------------------------------------------------- +// Logical operators +// ----------------------------------------------------------------------- + +func TestInterp_LogicalAnd(t *testing.T) { + output, _, _ := run(t, `output = true && false`, nil) + if output != false { + t.Fatalf("expected false, got %v", output) + } +} + +func TestInterp_LogicalOr(t *testing.T) { + output, _, _ := run(t, `output = false || true`, nil) + if output != true { + t.Fatalf("expected true, got %v", output) + } +} + +func 
TestInterp_LogicalRequiresBool(t *testing.T) { + runExpectError(t, `output = 5 && true`, nil, "boolean") +} + +// ----------------------------------------------------------------------- +// Input access +// ----------------------------------------------------------------------- + +func TestInterp_InputAccess(t *testing.T) { + input := map[string]any{"name": "Alice"} + output, _, _ := run(t, `output.v = input.name`, input) + m := output.(map[string]any) + if m["v"] != "Alice" { + t.Fatalf("expected Alice, got %v", m["v"]) + } +} + +func TestInterp_InputRoot(t *testing.T) { + output, _, _ := run(t, `output = input`, map[string]any{"x": int64(1)}) + m := output.(map[string]any) + if m["x"] != int64(1) { + t.Fatalf("expected {x:1}, got %v", m) + } +} + +// ----------------------------------------------------------------------- +// If expression +// ----------------------------------------------------------------------- + +func TestInterp_IfExprTrue(t *testing.T) { + output, _, _ := run(t, `output = if true { 1 } else { 2 }`, nil) + if output != int64(1) { + t.Fatalf("expected 1, got %v", output) + } +} + +func TestInterp_IfExprFalse(t *testing.T) { + output, _, _ := run(t, `output = if false { 1 } else { 2 }`, nil) + if output != int64(2) { + t.Fatalf("expected 2, got %v", output) + } +} + +func TestInterp_IfExprVoid(t *testing.T) { + // if without else when false → void → assignment skipped. + output, _, _ := run(t, "output.x = \"prior\"\noutput.x = if false { 1 }", nil) + m := output.(map[string]any) + if m["x"] != "prior" { + t.Fatalf("expected 'prior', got %v", m["x"]) + } +} + +// ----------------------------------------------------------------------- +// Match expression +// ----------------------------------------------------------------------- + +func TestInterp_MatchEquality(t *testing.T) { + output, _, _ := run(t, `output = match "cat" { "cat" => "meow", "dog" => "woof", _ => "?" 
}`, nil) + if output != "meow" { + t.Fatalf("expected meow, got %v", output) + } +} + +func TestInterp_MatchWildcard(t *testing.T) { + output, _, _ := run(t, `output = match "bird" { "cat" => "meow", _ => "unknown" }`, nil) + if output != "unknown" { + t.Fatalf("expected unknown, got %v", output) + } +} + +func TestInterp_MatchVoid(t *testing.T) { + // No matching case, no wildcard → void. + output, _, _ := run(t, "output.x = \"prior\"\noutput.x = match \"bird\" { \"cat\" => \"meow\" }", nil) + m := output.(map[string]any) + if m["x"] != "prior" { + t.Fatalf("expected 'prior', got %v", m["x"]) + } +} + +// ----------------------------------------------------------------------- +// Map declarations +// ----------------------------------------------------------------------- + +func TestInterp_MapCall(t *testing.T) { + output, _, _ := run(t, "map double(x) {\n x * 2\n}\noutput = double(21)", nil) + if output != int64(42) { + t.Fatalf("expected 42, got %v", output) + } +} + +func TestInterp_MapCallMultiParam(t *testing.T) { + output, _, _ := run(t, "map add(a, b) {\n a + b\n}\noutput = add(3, 7)", nil) + if output != int64(10) { + t.Fatalf("expected 10, got %v", output) + } +} + +func TestInterp_MapIsolation(t *testing.T) { + // Map cannot access top-level variables. 
+ runExpectError(t, "$x = 10\nmap bad(a) {\n a + $x\n}\noutput = bad(1)", nil, "undefined") +} + +// ----------------------------------------------------------------------- +// Deletion +// ----------------------------------------------------------------------- + +func TestInterp_OutputDeleted(t *testing.T) { + prog, _ := syntax.Parse("output = deleted()", "", nil) + interp := New(prog) + interp.RegisterFunction("deleted", FunctionSpec{Fn: func(_ []any) any { return Deleted }}) + _, _, deleted, err := interp.Run(nil, map[string]any{}) + if err != nil { + t.Fatal(err) + } + if !deleted { + t.Fatal("expected deleted") + } +} + +func TestInterp_FieldDeleted(t *testing.T) { + prog, _ := syntax.Parse("output.a = 1\noutput.b = 2\noutput.b = deleted()", "", nil) + interp := New(prog) + interp.RegisterFunction("deleted", FunctionSpec{Fn: func(_ []any) any { return Deleted }}) + output, _, _, err := interp.Run(nil, map[string]any{}) + if err != nil { + t.Fatal(err) + } + m := output.(map[string]any) + if _, exists := m["b"]; exists { + t.Fatal("expected b to be deleted") + } + if m["a"] != int64(1) { + t.Fatal("expected a to be 1") + } +} + +// ----------------------------------------------------------------------- +// Array and object literals +// ----------------------------------------------------------------------- + +func TestInterp_ArrayLiteral(t *testing.T) { + output, _, _ := run(t, `output = [1, 2, 3]`, nil) + arr := output.([]any) + if len(arr) != 3 || arr[0] != int64(1) { + t.Fatalf("expected [1,2,3], got %v", arr) + } +} + +func TestInterp_ObjectLiteral(t *testing.T) { + output, _, _ := run(t, `output = {"a": 1, "b": 2}`, nil) + m := output.(map[string]any) + if m["a"] != int64(1) || m["b"] != int64(2) { + t.Fatalf("expected {a:1,b:2}, got %v", m) + } +} + +func TestInterp_VoidInArrayIsError(t *testing.T) { + runExpectError(t, `output = [1, if false { 2 }, 3]`, nil, "void in array") +} + +func TestInterp_VoidInObjectIsError(t *testing.T) { + runExpectError(t, 
`output = {"a": if false { 1 }}`, nil, "void in object") +} + +// ----------------------------------------------------------------------- +// Recursion +// ----------------------------------------------------------------------- + +func TestInterp_Recursion(t *testing.T) { + src := "map factorial(n) {\n if n <= 1 { 1 } else { n * factorial(n - 1) }\n}\noutput = factorial(5)" + output, _, _ := run(t, src, nil) + if output != int64(120) { + t.Fatalf("expected 120, got %v", output) + } +} + +// ----------------------------------------------------------------------- +// Numeric promotion edge cases +// ----------------------------------------------------------------------- + +func TestInterp_IntPlusFloatPromotion(t *testing.T) { + output, _, _ := run(t, `output = 5 + 3.0`, nil) + if output != 8.0 { + t.Fatalf("expected 8.0, got %v (%T)", output, output) + } +} + +func TestInterp_IntPlusFloatExceedsPrecision(t *testing.T) { + runExpectError(t, `output = 9007199254740993 + 1.0`, nil, "float64 exact range") +} + +func TestInterp_Uint64ExceedsInt64Range(t *testing.T) { + // This would require uint64 values in the system — skip for now + // since literals are always int64. Tested when uint64 values + // enter through .uint64() conversion (stdlib phase). 
+} + +func TestInterp_VoidInVarDeclarationIsError(t *testing.T) { + runExpectError(t, `$x = if false { 42 }`, nil, "void") +} + +func TestInterp_VoidInVarReassignmentSkips(t *testing.T) { + output, _, _ := run(t, "$x = 10\n$x = if false { 42 }\noutput = $x", nil) + if output != int64(10) { + t.Fatalf("expected 10, got %v", output) + } +} + +// ----------------------------------------------------------------------- +// Copy-on-write +// ----------------------------------------------------------------------- + +func TestInterp_CopyOnWrite(t *testing.T) { + input := map[string]any{"x": int64(1)} + output, _, _ := run(t, "$v = input\n$v.x = 99\noutput.original = input.x\noutput.copy = $v.x", input) + m := output.(map[string]any) + if m["original"] != int64(1) { + t.Fatalf("expected original 1, got %v", m["original"]) + } + if m["copy"] != int64(99) { + t.Fatalf("expected copy 99, got %v", m["copy"]) + } +} + +// ----------------------------------------------------------------------- +// Stack path vs scope path equivalence +// ----------------------------------------------------------------------- + +// TestScopeAndStackPathsAgree runs the same program through both the +// unresolved (scope-based) and resolved (stack-based) paths and verifies +// they produce identical output. 
+func TestScopeAndStackPathsAgree(t *testing.T) { + cases := []struct { + name string + src string + input any + }{ + { + name: "variables and lambdas", + src: "$x = 10\n$y = 20\noutput.sum = $x + $y\noutput.mapped = [1, 2, 3].map(n -> n * $x)", + input: nil, + }, + { + name: "map call with params", + src: "map add(a, b) {\n a + b\n}\noutput.v = add(3, 7)", + input: nil, + }, + { + name: "nested field access", + src: "$data = input\noutput.name = $data.user.name", + input: map[string]any{"user": map[string]any{"name": "Alice"}}, + }, + { + name: "if expression with vars", + src: "$x = 5\noutput.v = if $x > 3 { \"big\" } else { \"small\" }", + input: nil, + }, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + prog, errs := syntax.Parse(tc.src, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + + // Scope path: no optimize, no resolve. + // Re-parse the source to get an independent program, since Optimize mutates in place. + scopeProg, _ := syntax.Parse(tc.src, "", nil) + scopeInterp := New(scopeProg) + scopeInterp.RegisterStdlib() + scopeInterp.lambdaMethods = make(map[string]MethodSpec, 16) + scopeInterp.RegisterLambdaMethods() + scopeOut, scopeMeta, scopeDel, scopeErr := scopeInterp.Run(tc.input, map[string]any{}) + + // Stack path: optimize + resolve with opcodes and slots. + syntax.Optimize(prog) + methods, functions := StdlibNames() + methodOpc, funcOpc := StdlibOpcodes() + syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, Functions: functions, + MethodOpcodes: methodOpc, FunctionOpcodes: funcOpc, + }) + stackInterp := NewWithStdlib(prog) + stackOut, stackMeta, stackDel, stackErr := stackInterp.Run(tc.input, map[string]any{}) + + // Compare. 
+ if (scopeErr == nil) != (stackErr == nil) { + t.Fatalf("error mismatch: scope=%v stack=%v", scopeErr, stackErr) + } + if scopeDel != stackDel { + t.Fatalf("deleted mismatch: scope=%v stack=%v", scopeDel, stackDel) + } + if !reflect.DeepEqual(scopeOut, stackOut) { + t.Fatalf("output mismatch:\n scope: %v\n stack: %v", scopeOut, stackOut) + } + if !reflect.DeepEqual(scopeMeta, stackMeta) { + t.Fatalf("metadata mismatch:\n scope: %v\n stack: %v", scopeMeta, stackMeta) + } + }) + } +} + +// ----------------------------------------------------------------------- +// json.Number normalisation at the Run boundary +// ----------------------------------------------------------------------- +// +// Inputs decoded with encoding/json's UseNumber arrive as json.Number. +// Run normalises these to int64 / float64 once at the entry point so +// downstream operators do not need to special-case the type. + +func TestInterp_JSONNumberInputComparison(t *testing.T) { + prog, errs := syntax.Parse(`output.over = input.score > 0.5`, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + interp := New(prog) + + input := map[string]any{"score": json.Number("0.89")} + out, _, _, err := interp.Run(input, map[string]any{}) + if err != nil { + t.Fatalf("runtime error: %v", err) + } + m, ok := out.(map[string]any) + if !ok { + t.Fatalf("expected map output, got %T", out) + } + if m["over"] != true { + t.Fatalf("expected over=true, got %v (%T)", m["over"], m["over"]) + } +} + +func TestInterp_JSONNumberInputArithmetic(t *testing.T) { + prog, errs := syntax.Parse(`output.sum = input.a + input.b`, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + interp := New(prog) + + input := map[string]any{ + "a": json.Number("3"), + "b": json.Number("4"), + } + out, _, _, err := interp.Run(input, map[string]any{}) + if err != nil { + t.Fatalf("runtime error: %v", err) + } + m := out.(map[string]any) + if m["sum"] != 
int64(7) { + t.Fatalf("expected sum=int64(7), got %v (%T)", m["sum"], m["sum"]) + } +} + +func TestInterp_JSONNumberInMetadata(t *testing.T) { + prog, errs := syntax.Parse(`output.retries = input@.retry_count + 1`, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + interp := New(prog) + + meta := map[string]any{"retry_count": json.Number("2")} + out, _, _, err := interp.Run(nil, meta) + if err != nil { + t.Fatalf("runtime error: %v", err) + } + m := out.(map[string]any) + if m["retries"] != int64(3) { + t.Fatalf("expected retries=int64(3), got %v (%T)", m["retries"], m["retries"]) + } +} + +func TestInterp_JSONNumberFractional(t *testing.T) { + // Fractional json.Number values arrive as float64 once normalised, so + // arithmetic against a float literal produces a float64 result. + prog, errs := syntax.Parse(`output.tax = input.amount * 0.1`, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + interp := New(prog) + + input := map[string]any{"amount": json.Number("19.99")} + out, _, _, err := interp.Run(input, map[string]any{}) + if err != nil { + t.Fatalf("runtime error: %v", err) + } + m := out.(map[string]any) + tax, ok := m["tax"].(float64) + if !ok { + t.Fatalf("tax should be float64, got %T", m["tax"]) + } + if tax < 1.998 || tax > 2.0 { + t.Fatalf("tax=%v, expected ~1.999", tax) + } +} + +func TestInterp_JSONNumberNestedInArray(t *testing.T) { + // Array elements containing json.Number should be normalised so that + // downstream stdlib calls (here .sum()) get native numerics. 
+ prog, errs := syntax.Parse(`output.total = input.scores.sum()`, "", nil) + if len(errs) > 0 { + t.Fatalf("parse errors:\n%s", syntax.FormatErrors(errs)) + } + methods, functions := StdlibNames() + methodOpc, funcOpc := StdlibOpcodes() + syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, Functions: functions, + MethodOpcodes: methodOpc, FunctionOpcodes: funcOpc, + }) + interp := NewWithStdlib(prog) + + input := map[string]any{ + "scores": []any{json.Number("1"), json.Number("2"), json.Number("3")}, + } + out, _, _, err := interp.Run(input, map[string]any{}) + if err != nil { + t.Fatalf("runtime error: %v", err) + } + m := out.(map[string]any) + if m["total"] != int64(6) { + t.Fatalf("total=%v (%T), expected int64(6)", m["total"], m["total"]) + } +} diff --git a/internal/bloblang2/go/pratt/eval/messagecontext.go b/internal/bloblang2/go/pratt/eval/messagecontext.go new file mode 100644 index 000000000..a51135ef9 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/messagecontext.go @@ -0,0 +1,58 @@ +package eval + +// MessageContext exposes per-message read access used by message-coupled +// stdlib functions (batch_index, batch_size, content, error, errored, +// tracing_id, tracing_span). The interface is intentionally small: it +// describes only the read surface batch-3 needs, so the eval package +// stays decoupled from public/service.Message and the executor remains +// reusable outside Benthos pipelines. +// +// Callers obtain the bound context inside a function implementation by +// receiving it as the first argument of MessageFunctionFunc. It is only +// non-nil while Interpreter.RunWithMessage is in flight; calls to a +// message-coupled function from a plain Run / Exec path produce a +// runtime error before the function body is invoked. +type MessageContext interface { + // Input returns the structured form of the message body that should + // be bound to the mapping's `input` keyword. May be []byte, a + // scalar, an array, an object, or nil. 
+ Input() any + + // Metadata returns a snapshot of the message metadata to be bound + // to `input@`. Returning nil is equivalent to an empty map. + Metadata() map[string]any + + // Bytes returns the raw byte form of the message body, used by + // content(). + Bytes() []byte + + // Error returns the error currently set on the message, or nil. Used + // by error() and errored(). + Error() error + + // BatchIndex is the 0-based position of the current message within + // its batch. Used by batch_index(). + BatchIndex() int + + // BatchSize is the total number of messages in the current batch. + // Used by batch_size(). + BatchSize() int + + // TraceID returns the OpenTelemetry trace ID associated with the + // message, or the empty string if none is set. Used by tracing_id(). + TraceID() string + + // Span returns the active tracing span for the message, or nil. + // Used by tracing_span(). + Span() any +} + +// MessageFunctionFunc is the implementation shape for message-coupled +// stdlib functions. Functions of this shape read from the bound +// MessageContext and ignore the interpreter's input value. +// +// Functions registered with MessageFunctionFunc bypass parse-time +// argument folding implicitly (folding requires Fn, which they leave +// nil). They are dispatched only when the interpreter is running with +// a MessageContext bound (i.e. via Interpreter.RunWithMessage). +type MessageFunctionFunc func(msg MessageContext, args []any) any diff --git a/internal/bloblang2/go/pratt/eval/opcodes.go b/internal/bloblang2/go/pratt/eval/opcodes.go new file mode 100644 index 000000000..caeba31ba --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/opcodes.go @@ -0,0 +1,80 @@ +package eval + +// MethodOpcode is a compile-time integer ID for a stdlib method, assigned +// dynamically at init time. Opcode 0 is reserved (unused). +type MethodOpcode = uint16 + +// FunctionOpcode is a compile-time integer ID for a stdlib function, assigned +// dynamically at init time. 
Opcode 0 is reserved (unused). +type FunctionOpcode = uint16 + +// Opcode tables and name-to-opcode mappings. Built once at init time by +// initSharedStdlib and shared read-only across all interpreters. +var ( + methodTable []MethodSpec // indexed by MethodOpcode + functionTable []FunctionSpec // indexed by FunctionOpcode + + methodNameToOpcode map[string]MethodOpcode + functionNameToOpcode map[string]FunctionOpcode + + // lambdaOpcodeBase is the first opcode assigned to lambda methods. + // Lambda opcodes are lambdaOpcodeBase, lambdaOpcodeBase+1, ... + // At runtime, interp.lambdaTable[opcode - lambdaOpcodeBase] resolves them. + lambdaOpcodeBase MethodOpcode + + // lambdaOpcodeOffsets maps lambda method names to their offset from + // lambdaOpcodeBase. Used during RegisterLambdaMethods to populate the + // per-interpreter lambdaTable. + lambdaOpcodeOffsets map[string]uint16 +) + +// nextMethodOpcode and nextFunctionOpcode are used during init to assign +// sequential opcodes. They start at 1 (0 is reserved). +var ( + nextMethodOpcode MethodOpcode = 1 + nextFunctionOpcode FunctionOpcode = 1 +) + +// registerMethodOpcode assigns an opcode to a static method during init. +func registerMethodOpcode(name string, spec MethodSpec) { + opcode := nextMethodOpcode + nextMethodOpcode++ + + methodNameToOpcode[name] = opcode + + // Grow table if needed. + for int(opcode) >= len(methodTable) { + methodTable = append(methodTable, MethodSpec{}) + } + methodTable[opcode] = spec +} + +// registerFunctionOpcode assigns an opcode to a stdlib function during init. +func registerFunctionOpcode(name string, spec FunctionSpec) { + opcode := nextFunctionOpcode + nextFunctionOpcode++ + + functionNameToOpcode[name] = opcode + + // Grow table if needed. + for int(opcode) >= len(functionTable) { + functionTable = append(functionTable, FunctionSpec{}) + } + functionTable[opcode] = spec +} + +// registerLambdaMethodOpcode assigns an opcode to a lambda method during init. 
+// Lambda opcodes start at lambdaOpcodeBase and are stored in a separate +// per-interpreter slice at runtime. +func registerLambdaMethodOpcode(name string) { + offset := uint16(len(lambdaOpcodeOffsets)) + lambdaOpcodeOffsets[name] = offset + methodNameToOpcode[name] = lambdaOpcodeBase + offset +} + +// StdlibOpcodes returns the name-to-opcode mappings for methods and functions. +// Used by the resolver to annotate AST nodes at compile time. Returns nil-safe +// maps (never nil). +func StdlibOpcodes() (methods map[string]uint16, functions map[string]uint16) { + return methodNameToOpcode, functionNameToOpcode +} diff --git a/internal/bloblang2/go/pratt/eval/scope.go b/internal/bloblang2/go/pratt/eval/scope.go new file mode 100644 index 000000000..b87d9a010 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/scope.go @@ -0,0 +1,54 @@ +package eval + +// scopeMode determines how variable assignment interacts with outer scopes. +type scopeMode int + +const ( + // scopeStatement: assigning to an existing outer variable modifies it. + // New variables are block-scoped. + scopeStatement scopeMode = iota + // scopeExpression: assigning to an existing outer variable shadows it. + // All variables are local to this scope. + scopeExpression +) + +// scope is a linked scope chain for variable resolution. +type scope struct { + parent *scope + mode scopeMode + vars map[string]any +} + +func newScope(parent *scope, mode scopeMode) *scope { + return &scope{ + parent: parent, + mode: mode, + vars: make(map[string]any), + } +} + +// get looks up a variable by walking the scope chain. +func (s *scope) get(name string) (any, bool) { + for cur := s; cur != nil; cur = cur.parent { + if v, ok := cur.vars[name]; ok { + return v, true + } + } + return nil, false +} + +// set assigns a variable, respecting the scope mode: +// - Expression mode: always writes locally (shadow). +// - Statement mode: if variable exists in an ancestor, update the ancestor. +// Otherwise, create locally. 
+func (s *scope) set(name string, value any) { + if s.mode == scopeStatement { + for cur := s.parent; cur != nil; cur = cur.parent { + if _, ok := cur.vars[name]; ok { + cur.vars[name] = value + return + } + } + } + s.vars[name] = value +} diff --git a/internal/bloblang2/go/pratt/eval/stdlib.go b/internal/bloblang2/go/pratt/eval/stdlib.go new file mode 100644 index 000000000..aa5216f78 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/stdlib.go @@ -0,0 +1,1894 @@ +package eval + +import ( + "bytes" + "encoding/base64" + "encoding/hex" + "encoding/json" + "fmt" + "math" + "math/rand/v2" + "regexp" + "sort" + "strconv" + "strings" + "sync" + "time" + "unicode/utf8" + + "github.com/google/uuid" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// sharedMethods and sharedFunctions hold the static (non-lambda) stdlib +// entries. They are built once and shared read-only across all interpreters. +var ( + sharedMethods map[string]MethodSpec + sharedFunctions map[string]FunctionSpec + sharedLambdaSpecs map[string]MethodSpec // lambda method specs (for MethodInfos) + stdlibOnce sync.Once +) + +func initSharedStdlib() { + stdlibOnce.Do(func() { + // Initialize opcode mapping tables. + methodNameToOpcode = make(map[string]MethodOpcode, 100) + functionNameToOpcode = make(map[string]FunctionOpcode, 20) + lambdaOpcodeOffsets = make(map[string]uint16, 16) + methodTable = make([]MethodSpec, 1, 100) // index 0 unused + functionTable = make([]FunctionSpec, 1, 20) // index 0 unused + + // Register static methods and functions, building opcode tables. + interp := New(nil) + interp.registerFunctions() + interp.registerMethods() + sharedMethods = interp.staticMethods + sharedFunctions = interp.staticFunctions + + // Assign opcode IDs to all registered methods and functions. 
+ for name, spec := range sharedMethods { + registerMethodOpcode(name, spec) + } + for name, spec := range sharedFunctions { + registerFunctionOpcode(name, spec) + } + + // Set lambda opcode base and register lambda method opcodes. + // Lambda methods are registered per-interpreter but their opcodes + // are global and stable. + lambdaOpcodeBase = nextMethodOpcode + tmpInterp := New(nil) + tmpInterp.lambdaMethods = make(map[string]MethodSpec, 16) + tmpInterp.RegisterLambdaMethods() + sharedLambdaSpecs = make(map[string]MethodSpec, len(tmpInterp.lambdaMethods)) + for name, spec := range tmpInterp.lambdaMethods { + sharedLambdaSpecs[name] = spec + registerLambdaMethodOpcode(name) + } + }) +} + +func init() { + initSharedStdlib() +} + +// RegisterStdlib registers all standard library functions and methods. +func (interp *Interpreter) RegisterStdlib() { + interp.registerFunctions() + interp.registerMethods() +} + +// MethodSpecToInfo converts a MethodSpec into the compile-time MethodInfo +// consumed by the resolver. Exported so external packages (notably the +// public plugin surface in public/bloblang2) can extend the compiler's +// view of known methods with their own registrations. +func MethodSpecToInfo(spec MethodSpec) syntax.MethodInfo { + return methodSpecToInfo(spec) +} + +// FunctionSpecToInfo is the FunctionSpec analogue of MethodSpecToInfo. 
+func FunctionSpecToInfo(spec FunctionSpec) syntax.FunctionInfo { + return functionSpecToInfo(spec) +} + +func methodSpecToInfo(spec MethodSpec) syntax.MethodInfo { + methodAcceptsLambda := spec.LambdaFn != nil || spec.AcceptsLambda + if spec.Params == nil { + return syntax.MethodInfo{ + Required: 0, + Total: -1, + AcceptsLambda: methodAcceptsLambda, + ArgFolder: spec.ArgFolder, + CallFolder: spec.CallFolder, + } + } + required, total := 0, 0 + params := make([]syntax.MethodParamInfo, len(spec.Params)) + for i, p := range spec.Params { + total++ + if !p.HasDefault { + required++ + } + params[i] = syntax.MethodParamInfo{ + Name: p.Name, + HasDefault: p.HasDefault, + AcceptsLambda: p.AcceptsLambda, + } + } + return syntax.MethodInfo{ + Required: required, + Total: total, + AcceptsLambda: methodAcceptsLambda, + Params: params, + ArgFolder: spec.ArgFolder, + CallFolder: spec.CallFolder, + } +} + +// functionSpecToInfo converts a FunctionSpec into the compile-time +// FunctionInfo the resolver consumes. Mirrors methodSpecToInfo. +func functionSpecToInfo(spec FunctionSpec) syntax.FunctionInfo { + required, total := 0, 0 + params := make([]syntax.FunctionParamInfo, len(spec.Params)) + for i, p := range spec.Params { + total++ + if !p.HasDefault { + required++ + } + params[i] = syntax.FunctionParamInfo{ + Name: p.Name, + HasDefault: p.HasDefault, + AcceptsLambda: p.AcceptsLambda, + } + } + return syntax.FunctionInfo{ + Required: required, + Total: total, + Params: params, + ArgFolder: spec.ArgFolder, + CallFolder: spec.CallFolder, + } +} + +// MethodInfos returns compile-time metadata for all registered methods, +// keyed by name. Used by the resolver for arity checking. 
+func (interp *Interpreter) MethodInfos() map[string]syntax.MethodInfo { + infos := make(map[string]syntax.MethodInfo, len(interp.staticMethods)+len(interp.lambdaMethods)) + for name, spec := range interp.staticMethods { + infos[name] = methodSpecToInfo(spec) + } + for name, spec := range interp.lambdaMethods { + infos[name] = methodSpecToInfo(spec) + } + return infos +} + +// FunctionInfos returns compile-time metadata for all registered functions, +// keyed by name. Used by the resolver for arity checking. +func (interp *Interpreter) FunctionInfos() map[string]syntax.FunctionInfo { + infos := make(map[string]syntax.FunctionInfo, len(interp.staticFunctions)) + for name, spec := range interp.staticFunctions { + infos[name] = functionSpecToInfo(spec) + } + return infos +} + +// StdlibNames returns compile-time method and function metadata from the +// global opcode tables. Used by the resolver for name validation and arity checking. +func StdlibNames() (methods map[string]syntax.MethodInfo, functions map[string]syntax.FunctionInfo) { + methods = make(map[string]syntax.MethodInfo, len(methodNameToOpcode)) + for name, opcode := range methodNameToOpcode { + var spec MethodSpec + if opcode >= lambdaOpcodeBase { + spec = sharedLambdaSpecs[name] + } else { + spec = methodTable[opcode] + } + methods[name] = methodSpecToInfo(spec) + } + functions = make(map[string]syntax.FunctionInfo, len(functionNameToOpcode)) + for name, opcode := range functionNameToOpcode { + functions[name] = functionSpecToInfo(functionTable[opcode]) + } + return +} + +func (interp *Interpreter) registerFunctions() { + f := func(fn FunctionFunc) FunctionSpec { return FunctionSpec{Fn: fn} } + + interp.RegisterFunction("deleted", FunctionSpec{Fn: func(_ []any) any { return Deleted }}) + interp.RegisterFunction("void", FunctionSpec{Fn: func(_ []any) any { return Void }}) + interp.RegisterFunction("throw", FunctionSpec{ + Fn: func(args []any) any { + if len(args) != 1 { + return NewError("throw() requires 
exactly one string argument") + } + msg, ok := args[0].(string) + if !ok { + return NewError(fmt.Sprintf("throw() requires a string argument, got %T", args[0])) + } + return NewError(msg) + }, + Params: []FunctionParam{{Name: "message"}}, + }) + interp.RegisterFunction("uuid_v4", f(func(_ []any) any { + return uuid.New().String() + })) + interp.RegisterFunction("now", f(func(_ []any) any { + return time.Now().UTC() + })) + interp.RegisterFunction("random_int", FunctionSpec{ + Fn: func(args []any) any { + if len(args) != 2 { + return NewError("random_int() requires min and max arguments") + } + minVal, ok1 := toInt64(args[0]) + maxVal, ok2 := toInt64(args[1]) + if !ok1 || !ok2 { + return NewError("random_int() requires integer arguments") + } + if minVal > maxVal { + return NewError("random_int(): min must be <= max") + } + return minVal + rand.Int64N(maxVal-minVal+1) + }, + Params: []FunctionParam{{Name: "min"}, {Name: "max"}}, + }) + interp.RegisterFunction("range", FunctionSpec{ + Fn: func(args []any) any { + if len(args) < 2 || len(args) > 3 { + return NewError("range() requires 2 or 3 arguments") + } + start, ok1 := toInt64(args[0]) + stop, ok2 := toInt64(args[1]) + if !ok1 || !ok2 { + return NewError("range() requires integer arguments") + } + var step int64 + if len(args) == 3 { + s, ok := toInt64(args[2]) + if !ok { + return NewError("range() step must be integer") + } + if s == 0 { + return NewError("range() step cannot be zero") + } + if (start < stop && s < 0) || (start > stop && s > 0) { + return NewError("range() step direction contradicts start/stop") + } + step = s + } else { + if start <= stop { + step = 1 + } else { + step = -1 + } + } + if start == stop { + return []any{} + } + var result []any + if step > 0 { + for i := start; i < stop; i += step { + result = append(result, i) + } + } else { + for i := start; i > stop; i += step { + result = append(result, i) + } + } + return result + }, + Params: []FunctionParam{{Name: "start"}, {Name: "stop"}, 
{Name: "step", HasDefault: true}}, + }) + interp.RegisterFunction("second", f(func(_ []any) any { return int64(1_000_000_000) })) + interp.RegisterFunction("minute", f(func(_ []any) any { return int64(60_000_000_000) })) + interp.RegisterFunction("hour", f(func(_ []any) any { return int64(3_600_000_000_000) })) + interp.RegisterFunction("day", f(func(_ []any) any { return int64(86_400_000_000_000) })) + + interp.RegisterFunction("timestamp", FunctionSpec{ + Fn: func(args []any) any { + if len(args) < 3 { + return NewError("timestamp() requires at least year, month, day") + } + year, ok1 := toInt64(args[0]) + month, ok2 := toInt64(args[1]) + day, ok3 := toInt64(args[2]) + if !ok1 || !ok2 || !ok3 { + return NewError("timestamp() requires integer year, month, day") + } + var hour, minute, sec, nano int64 + tz := "UTC" + if len(args) > 3 { + hour, _ = toInt64(args[3]) + } + if len(args) > 4 { + minute, _ = toInt64(args[4]) + } + if len(args) > 5 { + sec, _ = toInt64(args[5]) + } + if len(args) > 6 { + nano, _ = toInt64(args[6]) + } + if len(args) > 7 { + if s, ok := args[7].(string); ok { + tz = s + } + } + if month < 1 || month > 12 { + return NewError(fmt.Sprintf("timestamp(): month %d out of range (1-12)", month)) + } + if day < 1 || day > 31 { + return NewError(fmt.Sprintf("timestamp(): day %d out of range (1-31)", day)) + } + if hour < 0 || hour > 23 { + return NewError(fmt.Sprintf("timestamp(): hour %d out of range (0-23)", hour)) + } + if minute < 0 || minute > 59 { + return NewError(fmt.Sprintf("timestamp(): minute %d out of range (0-59)", minute)) + } + if sec < 0 || sec > 59 { + return NewError(fmt.Sprintf("timestamp(): second %d out of range (0-59)", sec)) + } + if nano < 0 || nano > 999999999 { + return NewError(fmt.Sprintf("timestamp(): nano %d out of range (0-999999999)", nano)) + } + loc, err := time.LoadLocation(tz) + if err != nil { + return NewError("timestamp(): unknown timezone " + tz) + } + return time.Date(int(year), time.Month(month), int(day), 
int(hour), int(minute), int(sec), int(nano), loc) + }, + Params: []FunctionParam{ + {Name: "year"}, + {Name: "month"}, + {Name: "day"}, + {Name: "hour", HasDefault: true, Default: int64(0)}, + {Name: "minute", HasDefault: true, Default: int64(0)}, + {Name: "second", HasDefault: true, Default: int64(0)}, + {Name: "nano", HasDefault: true, Default: int64(0)}, + {Name: "timezone", HasDefault: true, Default: "UTC"}, + }, + }) + + interp.registerMessageFunctions() +} + +func (interp *Interpreter) registerMethods() { + m := func(fn MethodFunc) MethodSpec { return MethodSpec{Fn: fn} } + + // Type conversion and introspection. + interp.RegisterMethod("type", MethodSpec{Fn: methodType, AcceptsNull: true}) + interp.RegisterMethod("string", MethodSpec{Fn: methodString, AcceptsNull: true}) + interp.RegisterMethod("int64", m(methodInt64)) + interp.RegisterMethod("int32", m(methodInt32)) + interp.RegisterMethod("uint32", m(methodUint32)) + interp.RegisterMethod("uint64", m(methodUint64)) + interp.RegisterMethod("float64", m(methodFloat64)) + interp.RegisterMethod("float32", m(methodFloat32)) + interp.RegisterMethod("bool", m(methodBool)) + interp.RegisterMethod("bytes", MethodSpec{Fn: methodBytes, AcceptsNull: true}) + interp.RegisterMethod("char", m(methodChar)) + + // Sequence methods. + interp.RegisterMethod("length", m(methodLength)) + interp.RegisterMethod("contains", m(methodContains)) + interp.RegisterMethod("reverse", m(methodReverse)) + + // String methods. 
+ interp.RegisterMethod("uppercase", m(methodUppercase)) + interp.RegisterMethod("lowercase", m(methodLowercase)) + interp.RegisterMethod("trim", m(methodTrim)) + interp.RegisterMethod("trim_prefix", m(methodTrimPrefix)) + interp.RegisterMethod("trim_suffix", m(methodTrimSuffix)) + interp.RegisterMethod("has_prefix", m(methodHasPrefix)) + interp.RegisterMethod("has_suffix", m(methodHasSuffix)) + interp.RegisterMethod("split", m(methodSplit)) + interp.RegisterMethod("replace_all", m(methodReplaceAll)) + interp.RegisterMethod("repeat", m(methodRepeat)) + interp.RegisterMethod("re_match", MethodSpec{Fn: methodReMatch, ArgFolder: foldRegexPattern}) + interp.RegisterMethod("re_find_all", MethodSpec{Fn: methodReFindAll, ArgFolder: foldRegexPattern}) + interp.RegisterMethod("re_replace_all", MethodSpec{Fn: methodReReplaceAll, ArgFolder: foldRegexPattern}) + + // Numeric methods. + interp.RegisterMethod("abs", m(methodAbs)) + interp.RegisterMethod("floor", m(methodFloor)) + interp.RegisterMethod("ceil", m(methodCeil)) + interp.RegisterMethod("round", m(methodRound)) + + // Array methods. + interp.RegisterMethod("append", m(methodAppend)) + interp.RegisterMethod("concat", m(methodConcat)) + interp.RegisterMethod("flatten", m(methodFlatten)) + interp.RegisterMethod("enumerate", m(methodEnumerate)) + interp.RegisterMethod("join", m(methodJoin)) + interp.RegisterMethod("sum", m(methodSum)) + interp.RegisterMethod("min", m(methodMin)) + interp.RegisterMethod("max", m(methodMax)) + + // Object methods. + interp.RegisterMethod("keys", m(methodKeys)) + interp.RegisterMethod("values", m(methodValues)) + interp.RegisterMethod("has_key", m(methodHasKey)) + interp.RegisterMethod("merge", m(methodMerge)) + interp.RegisterMethod("without", m(methodWithout)) + interp.RegisterMethod("iter", m(methodIter)) + interp.RegisterMethod("collect", m(methodCollect)) + + // Timestamp methods. 
+ interp.RegisterMethod("ts_unix", m(methodTsUnix)) + interp.RegisterMethod("ts_unix_milli", m(methodTsUnixMilli)) + interp.RegisterMethod("ts_unix_micro", m(methodTsUnixMicro)) + interp.RegisterMethod("ts_unix_nano", m(methodTsUnixNano)) + interp.RegisterMethod("ts_from_unix", m(methodTsFromUnix)) + interp.RegisterMethod("ts_from_unix_milli", m(methodTsFromUnixMilli)) + interp.RegisterMethod("ts_from_unix_micro", m(methodTsFromUnixMicro)) + interp.RegisterMethod("ts_from_unix_nano", m(methodTsFromUnixNano)) + interp.RegisterMethod("ts_parse", MethodSpec{Fn: methodTsParse, Params: []MethodParam{ + {Name: "format", Default: defaultTimestampFormat, HasDefault: true}, + }}) + interp.RegisterMethod("ts_format", MethodSpec{Fn: methodTsFormat, Params: []MethodParam{ + {Name: "format", Default: defaultTimestampFormat, HasDefault: true}, + }}) + interp.RegisterMethod("ts_add", MethodSpec{Fn: methodTsAdd, Params: []MethodParam{ + {Name: "nanos"}, + }}) + + // Encoding methods. + interp.RegisterMethod("parse_json", m(methodParseJSON)) + interp.RegisterMethod("format_json", MethodSpec{Fn: methodFormatJSON, AcceptsNull: true, Params: []MethodParam{ + {Name: "indent", Default: "", HasDefault: true}, + {Name: "no_indent", Default: false, HasDefault: true}, + {Name: "escape_html", Default: true, HasDefault: true}, + }}) + interp.RegisterMethod("encode", MethodSpec{Fn: methodEncode, Params: []MethodParam{ + {Name: "scheme"}, + }}) + interp.RegisterMethod("decode", MethodSpec{Fn: methodDecode, Params: []MethodParam{ + {Name: "scheme"}, + }}) + + // Error handling. + interp.RegisterMethod("not_null", MethodSpec{Fn: methodNotNull, AcceptsNull: true, Params: []MethodParam{ + {Name: "message", Default: "unexpected null value", HasDefault: true}, + }}) + + // Intrinsic methods — dispatch handled inline in evalMethodCall, + // registered here for name resolution only. 
+ interp.RegisterMethod("catch", MethodSpec{ + Intrinsic: true, + Params: []MethodParam{{Name: "fn", AcceptsLambda: true}}, + }) + interp.RegisterMethod("or", MethodSpec{ + Intrinsic: true, + Params: []MethodParam{{Name: "default"}}, + }) +} + +// ----------------------------------------------------------------------- +// Type introspection and conversion +// ----------------------------------------------------------------------- + +func methodType(receiver any, _ []any) any { + if receiver == nil { + return "null" + } + switch receiver.(type) { + case string: + return "string" + case int32: + return "int32" + case int64: + return "int64" + case uint32: + return "uint32" + case uint64: + return "uint64" + case float32: + return "float32" + case float64: + return "float64" + case bool: + return "bool" + case []byte: + return "bytes" + case time.Time: + return "timestamp" + case []any: + return "array" + case map[string]any: + return "object" + default: + return "unknown" + } +} + +func methodString(receiver any, _ []any) any { + if receiver == nil { + return "null" + } + switch v := receiver.(type) { + case string: + return v + case int32: + return strconv.FormatInt(int64(v), 10) + case int64: + return strconv.FormatInt(v, 10) + case uint32: + return strconv.FormatUint(uint64(v), 10) + case uint64: + return strconv.FormatUint(v, 10) + case float32: + return formatFloat(float64(v), 32) + case float64: + return formatFloat(v, 64) + case bool: + if v { + return "true" + } + return "false" + case time.Time: + return formatTimestamp(v) + case []byte: + if !utf8.Valid(v) { + return NewError("bytes are not valid UTF-8") + } + return string(v) + case []any: + if containsBytes(v) { + return NewError("cannot convert array to string: contains bytes value (convert bytes explicitly before embedding in containers)") + } + b, err := json.Marshal(sortedJSON(v)) + if err != nil { + return NewError("cannot convert array to string: " + err.Error()) + } + return string(b) + case 
map[string]any: + if containsBytes(v) { + return NewError("cannot convert object to string: contains bytes value (convert bytes explicitly before embedding in containers)") + } + b, err := json.Marshal(sortedJSON(v)) + if err != nil { + return NewError("cannot convert object to string: " + err.Error()) + } + return string(b) + default: + return NewError(fmt.Sprintf("cannot convert %T to string", receiver)) + } +} + +// containsBytes recursively checks whether a value tree contains any []byte values. +func containsBytes(v any) bool { + switch val := v.(type) { + case []byte: + return true + case []any: + for _, elem := range val { + if containsBytes(elem) { + return true + } + } + case map[string]any: + for _, elem := range val { + if containsBytes(elem) { + return true + } + } + } + return false +} + +func formatFloat(f float64, bitSize int) string { + if math.IsNaN(f) { + return "NaN" + } + if math.IsInf(f, 1) { + return "Infinity" + } + if math.IsInf(f, -1) { + return "-Infinity" + } + if f == 0 && math.Signbit(f) { + return "0.0" // negative zero + } + s := strconv.FormatFloat(f, 'g', -1, bitSize) + // Ensure the string contains a decimal point or exponent. 
+ if !strings.ContainsAny(s, ".eE") { + s += ".0" + } + return s +} + +func formatTimestamp(t time.Time) string { + return strftimeFormat(t, defaultTimestampFormat) +} + +func methodInt64(receiver any, _ []any) any { + switch v := receiver.(type) { + case int64: + return v + case int32: + return int64(v) + case uint32: + return int64(v) + case uint64: + if v > math.MaxInt64 { + return NewError("uint64 value exceeds int64 range") + } + return int64(v) + case float64: + return int64(v) // truncates toward zero + case float32: + return int64(v) + case string: + n, err := strconv.ParseInt(v, 10, 64) + if err != nil { + return NewError("cannot convert string to int64: " + err.Error()) + } + return n + case bool: + return NewError("cannot convert bool to int64") + default: + return NewError(fmt.Sprintf("cannot convert %T to int64", receiver)) + } +} + +func methodInt32(receiver any, _ []any) any { + i64 := methodInt64(receiver, nil) + if IsError(i64) { + return i64 + } + n := i64.(int64) + if n > math.MaxInt32 || n < math.MinInt32 { + return NewError("int32 overflow") + } + return int32(n) +} + +func methodUint32(receiver any, _ []any) any { + switch v := receiver.(type) { + case uint32: + return v + case int64: + if v < 0 || v > math.MaxUint32 { + return NewError("uint32 overflow") + } + return uint32(v) + case string: + n, err := strconv.ParseUint(v, 10, 32) + if err != nil { + return NewError("cannot convert string to uint32: " + err.Error()) + } + return uint32(n) + default: + i64 := methodInt64(receiver, nil) + if IsError(i64) { + return i64 + } + n := i64.(int64) + if n < 0 || n > math.MaxUint32 { + return NewError("uint32 overflow") + } + return uint32(n) + } +} + +func methodUint64(receiver any, _ []any) any { + switch v := receiver.(type) { + case uint64: + return v + case int64: + if v < 0 { + return NewError("uint64 overflow: negative value") + } + return uint64(v) + case string: + n, err := strconv.ParseUint(v, 10, 64) + if err != nil { + return 
NewError("cannot convert string to uint64: " + err.Error()) + } + return n + default: + i64 := methodInt64(receiver, nil) + if IsError(i64) { + return i64 + } + n := i64.(int64) + if n < 0 { + return NewError("uint64 overflow: negative value") + } + return uint64(n) + } +} + +func methodFloat64(receiver any, _ []any) any { + switch v := receiver.(type) { + case float64: + return v + case float32: + return float64(v) + case int64: + return float64(v) + case int32: + return float64(v) + case uint32: + return float64(v) + case uint64: + return float64(v) + case string: + f, err := strconv.ParseFloat(v, 64) + if err != nil { + return NewError("cannot convert string to float64: " + err.Error()) + } + return f + default: + return NewError(fmt.Sprintf("cannot convert %T to float64", receiver)) + } +} + +func methodFloat32(receiver any, _ []any) any { + f64 := methodFloat64(receiver, nil) + if IsError(f64) { + return f64 + } + return float32(f64.(float64)) +} + +func methodBool(receiver any, _ []any) any { + switch v := receiver.(type) { + case bool: + return v + case string: + switch v { + case "true": + return true + case "false": + return false + default: + return NewError("cannot convert string " + strconv.Quote(v) + " to bool") + } + case int64: + return v != 0 + case int32: + return v != 0 + case uint32: + return v != 0 + case uint64: + return v != 0 + case float64: + if math.IsNaN(v) { + return NewError("NaN cannot be converted to bool") + } + return v != 0 + case float32: + if math.IsNaN(float64(v)) { + return NewError("NaN cannot be converted to bool") + } + return v != 0 + default: + return NewError(fmt.Sprintf("cannot convert %T to bool", receiver)) + } +} + +func methodBytes(receiver any, _ []any) any { + switch v := receiver.(type) { + case []byte: + return v + case string: + return []byte(v) + default: + s := methodString(receiver, nil) + if IsError(s) { + return s + } + return []byte(s.(string)) + } +} + +func methodChar(receiver any, _ []any) any { + n, ok := 
toInt64(receiver) + if !ok { + return NewError(fmt.Sprintf("char() requires integer, got %T", receiver)) + } + if n < 0 || n > 0x10FFFF { + return NewError("codepoint out of valid Unicode range") + } + return string(rune(n)) +} + +func methodNotNull(receiver any, args []any) any { + if receiver != nil { + return receiver + } + msg := "unexpected null value" + if len(args) > 0 { + if s, ok := args[0].(string); ok { + msg = s + } + } + return NewError(msg) +} + +// ----------------------------------------------------------------------- +// Sequence methods +// ----------------------------------------------------------------------- + +func methodLength(receiver any, _ []any) any { + switch v := receiver.(type) { + case string: + return int64(utf8.RuneCountInString(v)) + case []any: + return int64(len(v)) + case []byte: + return int64(len(v)) + case map[string]any: + return int64(len(v)) + default: + return NewError(fmt.Sprintf("length() not supported on %T", receiver)) + } +} + +func methodContains(receiver any, args []any) any { + if len(args) != 1 { + return NewError("contains() requires exactly one argument") + } + switch v := receiver.(type) { + case string: + target, ok := args[0].(string) + if !ok { + return NewError("string contains() requires string argument") + } + return strings.Contains(v, target) + case []any: + for _, elem := range v { + if valuesEqual(elem, args[0]) { + return true + } + } + return false + case []byte: + target, ok := args[0].([]byte) + if !ok { + return NewError("bytes contains() requires bytes argument") + } + return bytes.Contains(v, target) + default: + return NewError(fmt.Sprintf("contains() not supported on %T", receiver)) + } +} + +func methodReverse(receiver any, _ []any) any { + switch v := receiver.(type) { + case string: + runes := []rune(v) + for i, j := 0, len(runes)-1; i < j; i, j = i+1, j-1 { + runes[i], runes[j] = runes[j], runes[i] + } + return string(runes) + case []any: + result := make([]any, len(v)) + for i, j := 0, 
len(v)-1; j >= 0; i, j = i+1, j-1 { + result[i] = v[j] + } + return result + case []byte: + result := make([]byte, len(v)) + for i, j := 0, len(v)-1; j >= 0; i, j = i+1, j-1 { + result[i] = v[j] + } + return result + default: + return NewError(fmt.Sprintf("reverse() not supported on %T", receiver)) + } +} + +// ----------------------------------------------------------------------- +// String methods +// ----------------------------------------------------------------------- + +func methodUppercase(receiver any, _ []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("uppercase() requires string, got %T", receiver)) + } + return strings.ToUpper(s) +} + +func methodLowercase(receiver any, _ []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("lowercase() requires string, got %T", receiver)) + } + return strings.ToLower(s) +} + +func methodTrim(receiver any, _ []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("trim() requires string, got %T", receiver)) + } + return strings.TrimSpace(s) +} + +func methodTrimPrefix(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("trim_prefix() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("trim_prefix() requires one argument") + } + prefix, ok := args[0].(string) + if !ok { + return NewError("trim_prefix() argument must be string") + } + return strings.TrimPrefix(s, prefix) +} + +func methodTrimSuffix(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("trim_suffix() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("trim_suffix() requires one argument") + } + suffix, ok := args[0].(string) + if !ok { + return NewError("trim_suffix() argument must be string") + } + return strings.TrimSuffix(s, suffix) +} + +func methodHasPrefix(receiver any, args []any) any { + s, ok := 
receiver.(string) + if !ok { + return NewError(fmt.Sprintf("has_prefix() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("has_prefix() requires one argument") + } + prefix, ok := args[0].(string) + if !ok { + return NewError("has_prefix() argument must be string") + } + return strings.HasPrefix(s, prefix) +} + +func methodHasSuffix(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("has_suffix() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("has_suffix() requires one argument") + } + suffix, ok := args[0].(string) + if !ok { + return NewError("has_suffix() argument must be string") + } + return strings.HasSuffix(s, suffix) +} + +func methodSplit(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("split() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("split() requires one argument") + } + delim, ok := args[0].(string) + if !ok { + return NewError("split() argument must be string") + } + if delim == "" { + if s == "" { + return []any{} + } + // Split by codepoint. 
+ runes := []rune(s) + result := make([]any, len(runes)) + for i, r := range runes { + result[i] = string(r) + } + return result + } + parts := strings.Split(s, delim) + result := make([]any, len(parts)) + for i, p := range parts { + result[i] = p + } + return result +} + +func methodReplaceAll(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("replace_all() requires string, got %T", receiver)) + } + if len(args) != 2 { + return NewError("replace_all() requires old and new arguments") + } + old, ok1 := args[0].(string) + new_, ok2 := args[1].(string) + if !ok1 || !ok2 { + return NewError("replace_all() arguments must be strings") + } + return strings.ReplaceAll(s, old, new_) +} + +func methodRepeat(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("repeat() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("repeat() requires one argument") + } + count, ok := toInt64(args[0]) + if !ok { + return NewError("repeat() argument must be integer") + } + if count < 0 { + return NewError("repeat() count must be non-negative") + } + return strings.Repeat(s, int(count)) +} + +// foldRegexPattern is the ArgFolder shared by re_match, re_find_all, +// and re_replace_all. If the first argument is a string literal it +// gets compiled at parse time; the resulting *regexp.Regexp flows +// through to the runtime method via CallArg.Folded. Dynamic patterns +// (e.g. `.re_match($pattern)`) are left untouched and compile on every +// call, matching the previous behaviour. 
+func foldRegexPattern(args []syntax.CallArg) ([]any, error) { + if len(args) == 0 { + return nil, nil + } + out := make([]any, len(args)) + lit, ok := args[0].Value.(*syntax.LiteralExpr) + if !ok { + return out, nil + } + if lit.TokenType != syntax.STRING && lit.TokenType != syntax.RAW_STRING { + return out, nil + } + re, err := regexp.Compile(lit.Value) + if err != nil { + return nil, fmt.Errorf("invalid regex pattern %q: %v", lit.Value, err) + } + out[0] = re + return out, nil +} + +// resolveRegex extracts a *regexp.Regexp from a pattern argument that +// may already be precompiled (via foldRegexPattern) or still be a raw +// string. Shared by all three re_* methods. +func resolveRegex(arg any, callerLabel string) (*regexp.Regexp, any) { + switch p := arg.(type) { + case *regexp.Regexp: + return p, nil + case string: + re, err := regexp.Compile(p) + if err != nil { + return nil, NewError(callerLabel + " invalid pattern: " + err.Error()) + } + return re, nil + default: + return nil, NewError(callerLabel + " argument must be string") + } +} + +func methodReMatch(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("re_match() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("re_match() requires one argument") + } + re, errV := resolveRegex(args[0], "re_match()") + if errV != nil { + return errV + } + return re.MatchString(s) +} + +func methodReFindAll(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("re_find_all() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("re_find_all() requires one argument") + } + re, errV := resolveRegex(args[0], "re_find_all()") + if errV != nil { + return errV + } + matches := re.FindAllString(s, -1) + result := make([]any, len(matches)) + for i, m := range matches { + result[i] = m + } + return result +} + +func methodReReplaceAll(receiver any, args []any) any { + s, ok 
:= receiver.(string) + if !ok { + return NewError(fmt.Sprintf("re_replace_all() requires string, got %T", receiver)) + } + if len(args) != 2 { + return NewError("re_replace_all() requires pattern and replacement arguments") + } + re, errV := resolveRegex(args[0], "re_replace_all()") + if errV != nil { + return errV + } + replacement, ok := args[1].(string) + if !ok { + return NewError("re_replace_all() replacement must be string") + } + return re.ReplaceAllString(s, replacement) +} + +// ----------------------------------------------------------------------- +// Numeric methods +// ----------------------------------------------------------------------- + +func methodAbs(receiver any, _ []any) any { + switch v := receiver.(type) { + case int64: + if v == math.MinInt64 { + return NewError("int64 overflow in abs()") + } + if v < 0 { + return -v + } + return v + case int32: + if v == math.MinInt32 { + return NewError("int32 overflow in abs()") + } + if v < 0 { + return -v + } + return v + case float64: + return math.Abs(v) + case float32: + return float32(math.Abs(float64(v))) + case uint32: + return v + case uint64: + return v + default: + return NewError(fmt.Sprintf("abs() requires numeric, got %T", receiver)) + } +} + +func methodFloor(receiver any, _ []any) any { + switch v := receiver.(type) { + case float64: + return math.Floor(v) + case float32: + return float32(math.Floor(float64(v))) + default: + return NewError(fmt.Sprintf("floor() requires float, got %T", receiver)) + } +} + +func methodCeil(receiver any, _ []any) any { + switch v := receiver.(type) { + case float64: + return math.Ceil(v) + case float32: + return float32(math.Ceil(float64(v))) + default: + return NewError(fmt.Sprintf("ceil() requires float, got %T", receiver)) + } +} + +func methodRound(receiver any, args []any) any { + var n int64 + if len(args) > 0 { + var ok bool + n, ok = toInt64(args[0]) + if !ok { + return NewError("round() argument must be integer") + } + } + + switch v := 
receiver.(type) { + case float64: + return roundFloat(v, n) + case float32: + return float32(roundFloat(float64(v), n)) + default: + return NewError(fmt.Sprintf("round() requires float, got %T", receiver)) + } +} + +func roundFloat(f float64, decimals int64) float64 { + shift := math.Pow(10, float64(decimals)) + return math.RoundToEven(f*shift) / shift +} + +// ----------------------------------------------------------------------- +// Array methods +// ----------------------------------------------------------------------- + +func methodAppend(receiver any, args []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("append() requires array, got %T", receiver)) + } + if len(args) != 1 { + return NewError("append() requires one argument") + } + result := make([]any, len(arr)+1) + copy(result, arr) + result[len(arr)] = args[0] + return result +} + +func methodConcat(receiver any, args []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("concat() requires array, got %T", receiver)) + } + if len(args) != 1 { + return NewError("concat() requires one argument") + } + other, ok := args[0].([]any) + if !ok { + return NewError("concat() argument must be array") + } + result := make([]any, len(arr)+len(other)) + copy(result, arr) + copy(result[len(arr):], other) + return result +} + +func methodFlatten(receiver any, _ []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("flatten() requires array, got %T", receiver)) + } + var result []any + for _, elem := range arr { + if inner, ok := elem.([]any); ok { + result = append(result, inner...) 
+ } else { + result = append(result, elem) + } + } + if result == nil { + result = []any{} + } + return result +} + +func methodEnumerate(receiver any, _ []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("enumerate() requires array, got %T", receiver)) + } + result := make([]any, len(arr)) + for i, v := range arr { + result[i] = map[string]any{"index": int64(i), "value": v} + } + return result +} + +func methodJoin(receiver any, args []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("join() requires array, got %T", receiver)) + } + if len(args) != 1 { + return NewError("join() requires one argument") + } + delim, ok := args[0].(string) + if !ok { + return NewError("join() delimiter must be string") + } + parts := make([]string, len(arr)) + for i, elem := range arr { + s, ok := elem.(string) + if !ok { + return NewError(fmt.Sprintf("join() requires all elements to be strings, element %d is %T", i, elem)) + } + parts[i] = s + } + return strings.Join(parts, delim) +} + +func methodSum(receiver any, _ []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("sum() requires array, got %T", receiver)) + } + if len(arr) == 0 { + return int64(0) + } + result := arr[0] + if !isNumeric(result) { + return NewError(fmt.Sprintf("sum() requires numeric elements, got %T", result)) + } + for _, elem := range arr[1:] { + result = evalAdd(result, elem) + if IsError(result) { + return result + } + } + return result +} + +func methodMin(receiver any, _ []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("min() requires array, got %T", receiver)) + } + if len(arr) == 0 { + return NewError("min() requires non-empty array") + } + result := arr[0] + // Track the widest type seen for final promotion. 
+ widest := arr[0] + for _, elem := range arr[1:] { + cmp := compareForSort(result, elem) + if IsError(cmp) { + return cmp + } + if cmp.(int64) > 0 { + result = elem + } + // Widen the type tracker so we know the common type at the end. + promoted, _, _, promErr := promoteChecked(widest, elem) + if promErr == "" { + widest = promoted + } + } + // Promote the result to the common type of all elements. + promoted, _, _, promErr := promoteChecked(result, widest) + if promErr == "" { + result = promoted + } + return result +} + +func methodMax(receiver any, _ []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("max() requires array, got %T", receiver)) + } + if len(arr) == 0 { + return NewError("max() requires non-empty array") + } + result := arr[0] + widest := arr[0] + for _, elem := range arr[1:] { + cmp := compareForSort(result, elem) + if IsError(cmp) { + return cmp + } + if cmp.(int64) < 0 { + result = elem + } + promoted, _, _, promErr := promoteChecked(widest, elem) + if promErr == "" { + widest = promoted + } + } + promoted, _, _, promErr := promoteChecked(result, widest) + if promErr == "" { + result = promoted + } + return result +} + +// ----------------------------------------------------------------------- +// Object methods +// ----------------------------------------------------------------------- + +func methodKeys(receiver any, _ []any) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("keys() requires object, got %T", receiver)) + } + result := make([]any, 0, len(obj)) + for k := range obj { + result = append(result, k) + } + sort.Slice(result, func(i, j int) bool { + return result[i].(string) < result[j].(string) + }) + return result +} + +func methodValues(receiver any, _ []any) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("values() requires object, got %T", receiver)) + } + // Sort by keys for deterministic order. 
+ keys := make([]string, 0, len(obj)) + for k := range obj { + keys = append(keys, k) + } + sort.Strings(keys) + result := make([]any, len(keys)) + for i, k := range keys { + result[i] = obj[k] + } + return result +} + +func methodHasKey(receiver any, args []any) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("has_key() requires object, got %T", receiver)) + } + if len(args) != 1 { + return NewError("has_key() requires one argument") + } + key, ok := args[0].(string) + if !ok { + return NewError("has_key() argument must be string") + } + _, exists := obj[key] + return exists +} + +func methodMerge(receiver any, args []any) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("merge() requires object, got %T", receiver)) + } + if len(args) != 1 { + return NewError("merge() requires one argument") + } + other, ok := args[0].(map[string]any) + if !ok { + return NewError("merge() argument must be object") + } + result := make(map[string]any, len(obj)+len(other)) + for k, v := range obj { + result[k] = v + } + for k, v := range other { + result[k] = v + } + return result +} + +func methodWithout(receiver any, args []any) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("without() requires object, got %T", receiver)) + } + if len(args) != 1 { + return NewError("without() requires one argument") + } + keys, ok := args[0].([]any) + if !ok { + return NewError("without() argument must be array of strings") + } + exclude := make(map[string]bool, len(keys)) + for i, k := range keys { + s, ok := k.(string) + if !ok { + return NewError(fmt.Sprintf("without() keys must be strings, element %d is %T", i, k)) + } + exclude[s] = true + } + result := make(map[string]any, len(obj)) + for k, v := range obj { + if !exclude[k] { + result[k] = v + } + } + return result +} + +func methodIter(receiver any, _ []any) any { + obj, ok := receiver.(map[string]any) + if !ok { + return 
NewError(fmt.Sprintf("iter() requires object, got %T", receiver)) + } + result := make([]any, 0, len(obj)) + for k, v := range obj { + result = append(result, map[string]any{"key": k, "value": v}) + } + return result +} + +func methodCollect(receiver any, _ []any) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("collect() requires array, got %T", receiver)) + } + result := make(map[string]any, len(arr)) + for _, elem := range arr { + entry, ok := elem.(map[string]any) + if !ok { + return NewError("collect() requires array of {key, value} objects") + } + key, ok := entry["key"].(string) + if !ok { + return NewError("collect() entry missing string 'key' field") + } + val, ok := entry["value"] + if !ok { + return NewError("collect() entry missing 'value' field") + } + result[key] = val + } + return result +} + +// ----------------------------------------------------------------------- +// Timestamp methods +// ----------------------------------------------------------------------- + +func methodTsUnix(receiver any, _ []any) any { + t, ok := receiver.(time.Time) + if !ok { + return NewError(fmt.Sprintf("ts_unix() requires timestamp, got %T", receiver)) + } + return t.Unix() +} + +func methodTsUnixMilli(receiver any, _ []any) any { + t, ok := receiver.(time.Time) + if !ok { + return NewError(fmt.Sprintf("ts_unix_milli() requires timestamp, got %T", receiver)) + } + return t.UnixMilli() +} + +func methodTsUnixMicro(receiver any, _ []any) any { + t, ok := receiver.(time.Time) + if !ok { + return NewError(fmt.Sprintf("ts_unix_micro() requires timestamp, got %T", receiver)) + } + return t.UnixMicro() +} + +func methodTsUnixNano(receiver any, _ []any) any { + t, ok := receiver.(time.Time) + if !ok { + return NewError(fmt.Sprintf("ts_unix_nano() requires timestamp, got %T", receiver)) + } + return t.UnixNano() +} + +func methodTsFromUnix(receiver any, _ []any) any { + if !isNumeric(receiver) { + return NewError(fmt.Sprintf("ts_from_unix() requires 
numeric, got %T", receiver)) + } + if v, ok := receiver.(uint64); ok && v > math.MaxInt64 { + return NewError("ts_from_unix(): uint64 value exceeds int64 range") + } + f := toFloat64(receiver) + sec := int64(f) + nsec := int64((f - float64(sec)) * 1e9) + return time.Unix(sec, nsec).UTC() +} + +func methodTsFromUnixMilli(receiver any, _ []any) any { + n, ok := toInt64(receiver) + if !ok { + return NewError(fmt.Sprintf("ts_from_unix_milli() requires integer, got %T", receiver)) + } + return time.UnixMilli(n).UTC() +} + +func methodTsFromUnixMicro(receiver any, _ []any) any { + n, ok := toInt64(receiver) + if !ok { + return NewError(fmt.Sprintf("ts_from_unix_micro() requires integer, got %T", receiver)) + } + return time.UnixMicro(n).UTC() +} + +func methodTsFromUnixNano(receiver any, _ []any) any { + n, ok := toInt64(receiver) + if !ok { + return NewError(fmt.Sprintf("ts_from_unix_nano() requires integer, got %T", receiver)) + } + return time.Unix(0, n).UTC() +} + +func methodTsAdd(receiver any, args []any) any { + t, ok := receiver.(time.Time) + if !ok { + return NewError(fmt.Sprintf("ts_add() requires timestamp, got %T", receiver)) + } + if len(args) != 1 { + return NewError("ts_add() requires one argument (nanoseconds)") + } + nanos, ok := toInt64(args[0]) + if !ok { + return NewError("ts_add() argument must be integer nanoseconds") + } + return t.Add(time.Duration(nanos)) +} + +const defaultTimestampFormat = "%Y-%m-%dT%H:%M:%S%f%z" + +func methodTsParse(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("ts_parse() requires string, got %T", receiver)) + } + + format := defaultTimestampFormat + if len(args) > 0 { + if f, ok := args[0].(string); ok { + format = f + } + } + + t, err := strftimeParse(s, format) + if err != nil { + return NewError("ts_parse() failed: " + err.Error()) + } + return t +} + +func methodTsFormat(receiver any, args []any) any { + t, ok := receiver.(time.Time) + if !ok { + return 
NewError(fmt.Sprintf("ts_format() requires timestamp, got %T", receiver)) + } + + format := defaultTimestampFormat + if len(args) > 0 { + if f, ok := args[0].(string); ok { + format = f + } + } + + return strftimeFormat(t, format) +} + +// ----------------------------------------------------------------------- +// Encoding methods +// ----------------------------------------------------------------------- + +func methodParseJSON(receiver any, _ []any) any { + var data []byte + switch v := receiver.(type) { + case string: + data = []byte(v) + case []byte: + data = v + default: + return NewError(fmt.Sprintf("parse_json() requires string or bytes, got %T", receiver)) + } + dec := json.NewDecoder(bytes.NewReader(data)) + dec.UseNumber() + var result any + if err := dec.Decode(&result); err != nil { + return NewError("parse_json() failed: " + err.Error()) + } + return normalizeJSONNumbers(result) +} + +func normalizeJSONNumbers(v any) any { + switch val := v.(type) { + case json.Number: + s := val.String() + // Spec: numbers with decimal or exponent → float64, else → int64. + if strings.ContainsAny(s, ".eE") { + f, err := val.Float64() + if err != nil { + return NewError("parse_json(): invalid number " + s) + } + return f + } + n, err := val.Int64() + if err != nil { + // Exceeds int64 range → float64 (may lose precision). 
+ f, err := val.Float64() + if err != nil { + return NewError("parse_json(): invalid number " + s) + } + return f + } + return n + case map[string]any: + for k, v := range val { + val[k] = normalizeJSONNumbers(v) + } + return val + case []any: + for i, v := range val { + val[i] = normalizeJSONNumbers(v) + } + return val + default: + return v + } +} + +func methodFormatJSON(receiver any, args []any) any { + // Args mapped by RegisterMethodWithParams: [indent, no_indent, escape_html] + indent := "" + escapeHTML := true + + if len(args) > 0 { + if s, ok := args[0].(string); ok { + indent = s + } + } + if len(args) > 1 { + if b, ok := args[1].(bool); ok && b { + indent = "" // no_indent overrides indent + } + } + if len(args) > 2 { + if b, ok := args[2].(bool); ok { + escapeHTML = b + } + } + + // Check for non-JSON-representable values. + if err := checkJSONSerializable(receiver); err != "" { + return NewError(err) + } + + // Use json.Encoder for escape_html control. + var buf strings.Builder + enc := json.NewEncoder(&buf) + enc.SetEscapeHTML(escapeHTML) + if indent != "" { + enc.SetIndent("", indent) + } + if err := enc.Encode(sortedJSON(receiver)); err != nil { + return NewError("format_json() failed: " + err.Error()) + } + // Encoder adds a trailing newline — remove it. 
+ result := buf.String() + if len(result) > 0 && result[len(result)-1] == '\n' { + result = result[:len(result)-1] + } + return result +} + +func checkJSONSerializable(v any) string { + switch val := v.(type) { + case float64: + if math.IsNaN(val) { + return "format_json(): NaN is not representable in JSON" + } + if math.IsInf(val, 0) { + return "format_json(): Infinity is not representable in JSON" + } + case float32: + f := float64(val) + if math.IsNaN(f) { + return "format_json(): NaN is not representable in JSON" + } + if math.IsInf(f, 0) { + return "format_json(): Infinity is not representable in JSON" + } + case []byte: + return "format_json(): bytes have no implicit JSON serialization" + case map[string]any: + for _, v := range val { + if err := checkJSONSerializable(v); err != "" { + return err + } + } + case []any: + for _, v := range val { + if err := checkJSONSerializable(v); err != "" { + return err + } + } + } + return "" +} + +// sortedJSON returns a value suitable for json.Marshal with sorted object keys +// and timestamps pre-formatted to the spec's shortest-precision RFC 3339. 
+func sortedJSON(v any) any { + switch val := v.(type) { + case map[string]any: + sorted := make(map[string]any, len(val)) + for k, v := range val { + sorted[k] = sortedJSON(v) + } + return sorted + case []any: + result := make([]any, len(val)) + for i, v := range val { + result[i] = sortedJSON(v) + } + return result + case time.Time: + return formatTimestamp(val) + default: + return v + } +} + +func methodEncode(receiver any, args []any) any { + if len(args) != 1 { + return NewError("encode() requires one argument (scheme)") + } + scheme, ok := args[0].(string) + if !ok { + return NewError("encode() scheme must be string") + } + var data []byte + switch v := receiver.(type) { + case string: + data = []byte(v) + case []byte: + data = v + default: + return NewError(fmt.Sprintf("encode() requires string or bytes, got %T", receiver)) + } + switch scheme { + case "base64": + return base64.StdEncoding.EncodeToString(data) + case "base64url": + return base64.URLEncoding.EncodeToString(data) + case "base64rawurl": + return base64.RawURLEncoding.EncodeToString(data) + case "hex": + return hex.EncodeToString(data) + default: + return NewError("encode(): unknown scheme " + scheme) + } +} + +func methodDecode(receiver any, args []any) any { + s, ok := receiver.(string) + if !ok { + return NewError(fmt.Sprintf("decode() requires string, got %T", receiver)) + } + if len(args) != 1 { + return NewError("decode() requires one argument (scheme)") + } + scheme, ok := args[0].(string) + if !ok { + return NewError("decode() scheme must be string") + } + switch scheme { + case "base64": + b, err := base64.StdEncoding.DecodeString(s) + if err != nil { + return NewError("decode() base64 failed: " + err.Error()) + } + return b + case "base64url": + b, err := base64.URLEncoding.DecodeString(s) + if err != nil { + return NewError("decode() base64url failed: " + err.Error()) + } + return b + case "base64rawurl": + b, err := base64.RawURLEncoding.DecodeString(s) + if err != nil { + return 
NewError("decode() base64rawurl failed: " + err.Error()) + } + return b + case "hex": + b, err := hex.DecodeString(s) + if err != nil { + return NewError("decode() hex failed: " + err.Error()) + } + return b + default: + return NewError("decode(): unknown scheme " + scheme) + } +} diff --git a/internal/bloblang2/go/pratt/eval/stdlib_lambda.go b/internal/bloblang2/go/pratt/eval/stdlib_lambda.go new file mode 100644 index 000000000..e71fa5819 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/stdlib_lambda.go @@ -0,0 +1,872 @@ +package eval + +import ( + "bytes" + "fmt" + "math" + "sort" + "time" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// RegisterLambdaMethods registers methods that take lambda arguments and +// need access to the interpreter for evaluation. +func (interp *Interpreter) RegisterLambdaMethods() { + lm := func(fn lambdaMethodFunc, params ...MethodParam) MethodSpec { + return MethodSpec{LambdaFn: fn, Params: params} + } + fnParam := MethodParam{Name: "fn", AcceptsLambda: true} + + interp.RegisterLambdaMethod("filter", lm(interp.methodFilter, fnParam)) + interp.RegisterLambdaMethod("map", lm(interp.methodMap, fnParam)) + interp.RegisterLambdaMethod("sort", MethodSpec{LambdaFn: interp.methodSort}) + interp.RegisterLambdaMethod("sort_by", lm(interp.methodSortBy, fnParam)) + interp.RegisterLambdaMethod("any", lm(interp.methodAny, fnParam)) + interp.RegisterLambdaMethod("all", lm(interp.methodAll, fnParam)) + interp.RegisterLambdaMethod("find", lm(interp.methodFind, fnParam)) + interp.RegisterLambdaMethod("fold", lm(interp.methodFold, MethodParam{Name: "initial"}, fnParam)) + interp.RegisterLambdaMethod("unique", lm(interp.methodUnique, MethodParam{Name: "fn", HasDefault: true, AcceptsLambda: true})) + interp.RegisterLambdaMethod("without_index", lm(interp.methodWithoutIndex, MethodParam{Name: "index"})) + interp.RegisterLambdaMethod("index_of", lm(interp.methodIndexOf, MethodParam{Name: "target"})) + 
interp.RegisterLambdaMethod("slice", lm(interp.methodSlice, MethodParam{Name: "low"}, MethodParam{Name: "high", HasDefault: true})) + interp.RegisterLambdaMethod("map_values", lm(interp.methodMapValues, fnParam)) + interp.RegisterLambdaMethod("map_keys", lm(interp.methodMapKeys, fnParam)) + interp.RegisterLambdaMethod("map_entries", lm(interp.methodMapEntries, fnParam)) + interp.RegisterLambdaMethod("filter_entries", lm(interp.methodFilterEntries, fnParam)) + // .into accepts any value type (including null) — the lambda sees + // the receiver verbatim and decides what to do with it. Only void/ + // deleted/error receivers are rejected, per spec §13.12. + interp.RegisterLambdaMethod("into", MethodSpec{ + LambdaFn: interp.methodInto, + Params: []MethodParam{fnParam}, + AcceptsNull: true, + AcceptsLambda: true, + }) +} + +// methodInto invokes the lambda with the receiver as its single argument +// and returns the lambda's result. Errors, void, and deleted() from the +// lambda propagate through unchanged — the calling context decides what +// to do with them. 
+func (interp *Interpreter) methodInto(receiver any, args []syntax.CallArg) any { + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("into() requires a lambda argument") + } + if len(lambda.Params) != 1 { + return NewError(fmt.Sprintf("into() requires a one-parameter lambda, got %d parameters", len(lambda.Params))) + } + argBuf := [1]any{receiver} + return interp.callLambda(lambda, argBuf[:]) +} + +func (interp *Interpreter) methodFilter(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("filter() requires array, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("filter() requires a lambda argument") + } + var argBuf [1]any + var result []any + for _, elem := range arr { + argBuf[0] = elem + val := interp.callLambda(lambda, argBuf[:]) + if IsError(val) { + return val + } + if IsVoid(val) { + return NewError("filter() lambda returned void") + } + b, ok := val.(bool) + if !ok { + return NewError(fmt.Sprintf("filter() lambda must return bool, got %T", val)) + } + if b { + result = append(result, elem) + } + } + if result == nil { + result = []any{} + } + return result +} + +func (interp *Interpreter) methodMap(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("map() requires array, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("map() requires a lambda argument") + } + var argBuf [1]any + var result []any + for _, elem := range arr { + argBuf[0] = elem + val := interp.callLambda(lambda, argBuf[:]) + if IsError(val) { + return val + } + if IsVoid(val) { + return NewError("map() lambda returned void (must return a value for every element)") + } + if IsDeleted(val) { + continue + } + result = append(result, val) + } + if result == nil { + result = []any{} + } + return result +} + +func 
(interp *Interpreter) methodSort(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("sort() requires array, got %T", receiver)) + } + if len(arr) == 0 { + return []any{} + } + if !isSortable(arr[0]) { + return NewError(fmt.Sprintf("sort(): %T is not a sortable type", arr[0])) + } + + sorted := make([]any, len(arr)) + copy(sorted, arr) + + var sortErr any + sort.SliceStable(sorted, func(i, j int) bool { + if sortErr != nil { + return false + } + cmp := compareForSort(sorted[i], sorted[j]) + if IsError(cmp) { + sortErr = cmp + return false + } + return cmp.(int64) < 0 + }) + if sortErr != nil { + return sortErr + } + return sorted +} + +func (interp *Interpreter) methodSortBy(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("sort_by() requires array, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("sort_by() requires a lambda argument") + } + + // Extract keys. 
+ var argBuf [1]any + keys := make([]any, len(arr)) + for i, elem := range arr { + argBuf[0] = elem + key := interp.callLambda(lambda, argBuf[:]) + if IsError(key) { + return key + } + keys[i] = key + } + + indices := make([]int, len(arr)) + for i := range indices { + indices[i] = i + } + + var sortErr any + sort.SliceStable(indices, func(i, j int) bool { + if sortErr != nil { + return false + } + cmp := compareForSort(keys[indices[i]], keys[indices[j]]) + if IsError(cmp) { + sortErr = cmp + return false + } + return cmp.(int64) < 0 + }) + if sortErr != nil { + return sortErr + } + + result := make([]any, len(arr)) + for i, idx := range indices { + result[i] = arr[idx] + } + return result +} + +func (interp *Interpreter) methodAny(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("any() requires array, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("any() requires a lambda argument") + } + var argBuf [1]any + for _, elem := range arr { + argBuf[0] = elem + val := interp.callLambda(lambda, argBuf[:]) + if IsError(val) { + return val + } + if IsVoid(val) { + return NewError("any() lambda returned void") + } + b, ok := val.(bool) + if !ok { + return NewError(fmt.Sprintf("any() lambda must return bool, got %T", val)) + } + if b { + return true // short-circuit + } + } + return false +} + +func (interp *Interpreter) methodAll(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("all() requires array, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("all() requires a lambda argument") + } + var argBuf [1]any + for _, elem := range arr { + argBuf[0] = elem + val := interp.callLambda(lambda, argBuf[:]) + if IsError(val) { + return val + } + if IsVoid(val) { + return NewError("all() lambda returned void") + } + b, ok := 
val.(bool) + if !ok { + return NewError(fmt.Sprintf("all() lambda must return bool, got %T", val)) + } + if !b { + return false // short-circuit + } + } + return true +} + +func (interp *Interpreter) methodFind(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("find() requires array, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("find() requires a lambda argument") + } + var argBuf [1]any + for _, elem := range arr { + argBuf[0] = elem + val := interp.callLambda(lambda, argBuf[:]) + if IsError(val) { + return val + } + if IsVoid(val) { + return NewError("find() lambda returned void") + } + b, ok := val.(bool) + if !ok { + return NewError(fmt.Sprintf("find() lambda must return bool, got %T", val)) + } + if b { + return elem // short-circuit + } + } + return Void +} + +func (interp *Interpreter) methodFold(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("fold() requires array, got %T", receiver)) + } + if len(args) != 2 { + return NewError("fold() requires initial value and lambda arguments") + } + initial := interp.evalExpr(args[0].Value) + if IsError(initial) { + return initial + } + lambda, ok := args[1].Value.(*syntax.LambdaExpr) + if !ok { + return NewError("fold() second argument must be a lambda") + } + + var argBuf2 [2]any + tally := initial + for _, elem := range arr { + argBuf2[0] = tally + argBuf2[1] = elem + tally = interp.callLambda(lambda, argBuf2[:]) + if IsError(tally) { + return tally + } + if IsVoid(tally) { + return NewError("fold() lambda returned void") + } + } + return tally +} + +func (interp *Interpreter) methodUnique(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("unique() requires array, got %T", receiver)) + } + + var keyFn *syntax.LambdaExpr + if len(args) > 0 { + keyFn = 
interp.extractLambdaOrMapRef(args) + } + + var seenList []any + seenNaN := false + contains := func(key any) bool { + // NaN values are considered equal for unique() per spec. + if isNaN(key) { + if seenNaN { + return true + } + seenNaN = true + return false + } + for _, s := range seenList { + if valuesEqual(s, key) { + return true + } + } + return false + } + + var argBuf [1]any + var result []any + for _, elem := range arr { + var key any + if keyFn != nil { + argBuf[0] = elem + key = interp.callLambda(keyFn, argBuf[:]) + if IsError(key) { + return key + } + } else { + key = elem + } + if !contains(key) { + seenList = append(seenList, key) + result = append(result, elem) + } + } + if result == nil { + result = []any{} + } + return result +} + +func (interp *Interpreter) methodWithoutIndex(receiver any, args []syntax.CallArg) any { + arr, ok := receiver.([]any) + if !ok { + return NewError(fmt.Sprintf("without_index() requires array, got %T", receiver)) + } + if len(args) != 1 { + return NewError("without_index() requires one argument") + } + idxVal := interp.evalExpr(args[0].Value) + if IsError(idxVal) { + return idxVal + } + idx, ok := toInt64(idxVal) + if !ok { + return NewError("without_index() argument must be integer") + } + if idx < 0 { + idx += int64(len(arr)) + } + if idx < 0 || idx >= int64(len(arr)) { + return NewError("without_index(): index out of bounds") + } + result := make([]any, 0, len(arr)-1) + result = append(result, arr[:idx]...) + result = append(result, arr[idx+1:]...) 
+ return result +} + +func (interp *Interpreter) methodIndexOf(receiver any, args []syntax.CallArg) any { + if len(args) != 1 { + return NewError("index_of() requires one argument") + } + target := interp.evalExpr(args[0].Value) + if IsError(target) { + return target + } + + switch v := receiver.(type) { + case string: + s, ok := target.(string) + if !ok { + return NewError("string index_of() requires string argument") + } + idx := -1 + runes := []rune(v) + targetRunes := []rune(s) + for i := 0; i <= len(runes)-len(targetRunes); i++ { + if string(runes[i:i+len(targetRunes)]) == s { + idx = i + break + } + } + return int64(idx) + case []any: + for i, elem := range v { + if valuesEqual(elem, target) { + return int64(i) + } + } + return int64(-1) + case []byte: + tb, ok := target.([]byte) + if !ok { + return NewError("bytes index_of() requires bytes argument") + } + return int64(bytes.Index(v, tb)) + default: + return NewError(fmt.Sprintf("index_of() not supported on %T", receiver)) + } +} + +func (interp *Interpreter) methodSlice(receiver any, args []syntax.CallArg) any { + if len(args) < 1 || len(args) > 2 { + return NewError("slice() requires 1 or 2 arguments") + } + lowVal := interp.evalExpr(args[0].Value) + if IsError(lowVal) { + return lowVal + } + low, ok := toInt64(lowVal) + if !ok { + return NewError("slice() low must be integer") + } + + switch v := receiver.(type) { + case string: + runes := []rune(v) + length := int64(len(runes)) + high := length + if len(args) == 2 { + hVal := interp.evalExpr(args[1].Value) + if IsError(hVal) { + return hVal + } + h, ok := toInt64(hVal) + if !ok { + return NewError("slice() high must be integer") + } + high = h + } + low, high = clampSlice(low, high, length) + return string(runes[low:high]) + case []any: + length := int64(len(v)) + high := length + if len(args) == 2 { + hVal := interp.evalExpr(args[1].Value) + if IsError(hVal) { + return hVal + } + h, ok := toInt64(hVal) + if !ok { + return NewError("slice() high must be 
integer") + } + high = h + } + low, high = clampSlice(low, high, length) + result := make([]any, high-low) + copy(result, v[low:high]) + return result + case []byte: + length := int64(len(v)) + high := length + if len(args) == 2 { + hVal := interp.evalExpr(args[1].Value) + if IsError(hVal) { + return hVal + } + h, ok := toInt64(hVal) + if !ok { + return NewError("slice() high must be integer") + } + high = h + } + low, high = clampSlice(low, high, length) + result := make([]byte, high-low) + copy(result, v[low:high]) + return result + default: + return NewError(fmt.Sprintf("slice() not supported on %T", receiver)) + } +} + +func clampSlice(low, high, length int64) (int64, int64) { + if low < 0 { + low += length + } + if high < 0 { + high += length + } + if low < 0 { + low = 0 + } + if high > length { + high = length + } + if low > high { + return 0, 0 // empty range; also covers a high that is still negative after wrapping + } + return low, high +} + +func (interp *Interpreter) methodMapValues(receiver any, args []syntax.CallArg) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("map_values() requires object, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("map_values() requires a lambda argument") + } + var argBuf [1]any + result := make(map[string]any, len(obj)) + for k, v := range obj { + argBuf[0] = v + val := interp.callLambda(lambda, argBuf[:]) + if IsError(val) { + return val + } + if IsVoid(val) { + return NewError("map_values() lambda returned void") + } + if IsDeleted(val) { + continue + } + result[k] = val + } + return result +} + +func (interp *Interpreter) methodMapKeys(receiver any, args []syntax.CallArg) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("map_keys() requires object, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("map_keys() requires a lambda argument") + } + var argBuf [1]any + result := make(map[string]any, 
len(obj)) + for k, v := range obj { + argBuf[0] = k + newKey := interp.callLambda(lambda, argBuf[:]) + if IsError(newKey) { + return newKey + } + if IsVoid(newKey) { + return NewError("map_keys() lambda returned void") + } + if IsDeleted(newKey) { + continue + } + s, ok := newKey.(string) + if !ok { + return NewError(fmt.Sprintf("map_keys() lambda must return string, got %T", newKey)) + } + result[s] = v + } + return result +} + +func (interp *Interpreter) methodMapEntries(receiver any, args []syntax.CallArg) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("map_entries() requires object, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("map_entries() requires a lambda argument") + } + var argBuf2 [2]any + result := make(map[string]any, len(obj)) + for k, v := range obj { + argBuf2[0] = k + argBuf2[1] = v + entry := interp.callLambda(lambda, argBuf2[:]) + if IsError(entry) { + return entry + } + if IsVoid(entry) { + return NewError("map_entries() lambda returned void") + } + if IsDeleted(entry) { + continue + } + entryMap, ok := entry.(map[string]any) + if !ok { + return NewError("map_entries() lambda must return {key, value} object") + } + key, ok := entryMap["key"].(string) + if !ok { + return NewError("map_entries() returned entry missing string 'key'") + } + val, exists := entryMap["value"] + if !exists { + return NewError("map_entries() returned entry missing 'value'") + } + result[key] = val + } + return result +} + +func (interp *Interpreter) methodFilterEntries(receiver any, args []syntax.CallArg) any { + obj, ok := receiver.(map[string]any) + if !ok { + return NewError(fmt.Sprintf("filter_entries() requires object, got %T", receiver)) + } + lambda := interp.extractLambdaOrMapRef(args) + if lambda == nil { + return NewError("filter_entries() requires a lambda argument") + } + var argBuf2 [2]any + result := make(map[string]any, len(obj)) + for k, v := range obj 
{ + argBuf2[0] = k + argBuf2[1] = v + val := interp.callLambda(lambda, argBuf2[:]) + if IsError(val) { + return val + } + if IsVoid(val) { + return NewError("filter_entries() lambda returned void") + } + b, ok := val.(bool) + if !ok { + return NewError(fmt.Sprintf("filter_entries() lambda must return bool, got %T", val)) + } + if b { + result[k] = v + } + } + return result +} + +// ExtractLambdaOrMapRef is the exported form of extractLambdaOrMapRef, used +// by the public plugin surface to translate a lambda or bare map reference +// argument into a callable LambdaExpr. +func (interp *Interpreter) ExtractLambdaOrMapRef(args []syntax.CallArg) *syntax.LambdaExpr { + return interp.extractLambdaOrMapRef(args) +} + +// extractLambdaOrMapRef gets the lambda expression from the first argument. +// If the argument is a bare identifier or qualified reference (map name), +// synthesizes a lambda that calls the map with a single parameter (Section 5.5). +func (interp *Interpreter) extractLambdaOrMapRef(args []syntax.CallArg) *syntax.LambdaExpr { + if len(args) == 0 { + return nil + } + + // Direct lambda. + if lambda, ok := args[0].Value.(*syntax.LambdaExpr); ok { + return lambda + } + + // Bare identifier or qualified reference → map name reference. + if ident, ok := args[0].Value.(*syntax.IdentExpr); ok { + if ident.Namespace != "" { + // Qualified reference: namespace::name + return interp.synthesizeNamespacedMapLambda(ident) + } + // Local map reference. + if m, exists := interp.maps[ident.Name]; exists { + return interp.synthesizeMapLambda(ident.TokenPos, ident.Name, "", m) + } + } + + return nil +} + +// synthesizeMapLambda creates a lambda that calls the given map with a single +// argument. Returns nil if the map doesn't accept exactly 1 required param. 
+func (interp *Interpreter) synthesizeMapLambda(pos syntax.Pos, name, namespace string, m *syntax.MapDecl) *syntax.LambdaExpr { + required := 0 + for _, p := range m.Params { + if p.Default == nil && !p.Discard { + required++ + } + } + if required != 1 { + return nil // will trigger "requires a lambda argument" error + } + return &syntax.LambdaExpr{ + TokenPos: pos, + Params: []syntax.Param{{Name: "__arg", Pos: pos}}, + Body: &syntax.ExprBody{ + Result: &syntax.CallExpr{ + TokenPos: pos, + Namespace: namespace, + Name: name, + Args: []syntax.CallArg{{Value: &syntax.IdentExpr{TokenPos: pos, Name: "__arg"}}}, + }, + }, + } +} + +// synthesizeNamespacedMapLambda looks up a qualified map reference and +// synthesizes a lambda for it. +func (interp *Interpreter) synthesizeNamespacedMapLambda(ident *syntax.IdentExpr) *syntax.LambdaExpr { + ns, ok := interp.namespaces[ident.Namespace] + if !ok { + return nil + } + m, ok := ns[ident.Name] + if !ok { + return nil + } + return interp.synthesizeMapLambda(ident.TokenPos, ident.Name, ident.Namespace, m) +} + +// compareForSort compares two values for sort ordering. Returns an int64 of -1, 0, or 1, or an error value when the two types cannot be compared. +func compareForSort(a, b any) any { + // Handle NaN: sorts after everything. + aNaN := isNaN(a) + bNaN := isNaN(b) + if aNaN && bNaN { + return int64(0) + } + if aNaN { + return int64(1) + } + if bNaN { + return int64(-1) + } + + // Numeric comparison with checked promotion. 
+ if isNumeric(a) && isNumeric(b) { + pl, pr, kind, promErr := promoteChecked(a, b) + if promErr != "" { + return NewError(promErr) + } + switch kind { + case promoteInt64: + return cmpOrdered(pl.(int64), pr.(int64)) + case promoteInt32: + return cmpOrdered(int64(pl.(int32)), int64(pr.(int32))) + case promoteUint32: + return cmpOrdered(uint64(pl.(uint32)), uint64(pr.(uint32))) + case promoteUint64: + return cmpOrdered(pl.(uint64), pr.(uint64)) + case promoteFloat64: + av, bv := pl.(float64), pr.(float64) + if av < bv { + return int64(-1) + } + if av > bv { + return int64(1) + } + return int64(0) + case promoteFloat32: + av, bv := float64(pl.(float32)), float64(pr.(float32)) + if av < bv { + return int64(-1) + } + if av > bv { + return int64(1) + } + return int64(0) + } + return int64(0) + } + + // String comparison. + if as, ok := a.(string); ok { + if bs, ok := b.(string); ok { + if as < bs { + return int64(-1) + } + if as > bs { + return int64(1) + } + return int64(0) + } + } + + // Timestamp comparison. 
+ if at, ok := a.(time.Time); ok { + if bt, ok := b.(time.Time); ok { + if at.Before(bt) { + return int64(-1) + } + if at.After(bt) { + return int64(1) + } + return int64(0) + } + } + + return NewError(fmt.Sprintf("cannot sort: incompatible types %T and %T", a, b)) +} + +type cmpOrderable interface { + ~int64 | ~uint64 +} + +func cmpOrdered[T cmpOrderable](a, b T) int64 { + if a < b { + return -1 + } + if a > b { + return 1 + } + return 0 +} + +func isSortable(v any) bool { + switch v.(type) { + case int32, int64, uint32, uint64, float32, float64, string, time.Time: + return true + default: + return false + } +} + +func isNaN(v any) bool { + switch n := v.(type) { + case float64: + return math.IsNaN(n) + case float32: + return math.IsNaN(float64(n)) + default: + return false + } +} diff --git a/internal/bloblang2/go/pratt/eval/stdlib_message.go b/internal/bloblang2/go/pratt/eval/stdlib_message.go new file mode 100644 index 000000000..4facf8062 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/stdlib_message.go @@ -0,0 +1,59 @@ +package eval + +// registerMessageFunctions registers stdlib functions that read from the +// bound MessageContext (batch_index, batch_size, content, error, +// errored, tracing_id, tracing_span). They are dispatched only when the +// interpreter is running with a MessageContext bound; otherwise the +// caller sees a runtime error of the form +// "function NAME requires a message context, but Run was called without +// one". +// +// Each function uses MessageFunctionFunc, which causes +// RegisterFunction to set RequiresMessageContext on the spec +// automatically. Folding bypass is implicit: MessageFn functions leave +// Fn nil, so the resolver has nothing to fold against. 
+func (interp *Interpreter) registerMessageFunctions() { + interp.RegisterFunction("batch_index", FunctionSpec{ + MessageFn: func(msg MessageContext, _ []any) any { + return int64(msg.BatchIndex()) + }, + }) + interp.RegisterFunction("batch_size", FunctionSpec{ + MessageFn: func(msg MessageContext, _ []any) any { + return int64(msg.BatchSize()) + }, + }) + interp.RegisterFunction("content", FunctionSpec{ + MessageFn: func(msg MessageContext, _ []any) any { + return msg.Bytes() + }, + }) + interp.RegisterFunction("error", FunctionSpec{ + MessageFn: func(msg MessageContext, _ []any) any { + err := msg.Error() + if err == nil { + return nil + } + // V2 error() returns a structured object. The minimal + // shape is {what: string}; future iterations may add + // source.* fields once the underlying MessageContext.Error + // surfaces them. + return map[string]any{"what": err.Error()} + }, + }) + interp.RegisterFunction("errored", FunctionSpec{ + MessageFn: func(msg MessageContext, _ []any) any { + return msg.Error() != nil + }, + }) + interp.RegisterFunction("tracing_id", FunctionSpec{ + MessageFn: func(msg MessageContext, _ []any) any { + return msg.TraceID() + }, + }) + interp.RegisterFunction("tracing_span", FunctionSpec{ + MessageFn: func(msg MessageContext, _ []any) any { + return msg.Span() + }, + }) +} diff --git a/internal/bloblang2/go/pratt/eval/strftime.go b/internal/bloblang2/go/pratt/eval/strftime.go new file mode 100644 index 000000000..6a8c96bf9 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/strftime.go @@ -0,0 +1,194 @@ +package eval + +import ( + "regexp" + "strconv" + "strings" + "time" + + "github.com/itchyny/timefmt-go" +) + +// strftimeParse parses a string using a strftime format. Handles %f and %z +// with spec-compliant semantics, delegating all other directives to timefmt-go. +// +// %f semantics (parsing): optional — consumes leading '.' and 1-9 fractional +// digits if present, otherwise matches zero characters. Pads to nanoseconds. 
+// +// %z semantics (parsing): accepts 'Z', '+HH:MM', '-HH:MM', '+HHMM', '-HHMM'. +func strftimeParse(input, format string) (time.Time, error) { + hasFrac := strings.Contains(format, "%f") + hasZone := strings.Contains(format, "%z") + + cleanFmt := format + cleanInput := input + var nanos int + + if hasFrac { + // Remove %f from format. Extract fractional seconds from input. + cleanFmt = strings.Replace(cleanFmt, "%f", "", 1) + cleanInput, nanos = extractFractionalSeconds(cleanInput, cleanFmt) + } + + if hasZone { + // Give normalizeTimezone a chance to rewrite the input for + // timefmt's %z parsing. It is currently a pass-through (see its + // doc comment), so input and format come back unchanged. + cleanInput, cleanFmt = normalizeTimezone(cleanInput, cleanFmt) + } + + t, err := timefmt.Parse(cleanInput, cleanFmt) + if err != nil { + return time.Time{}, err + } + + if nanos > 0 { + t = t.Add(time.Duration(nanos)) + } + + return t, nil +} + +// strftimeFormat formats a timestamp using a strftime format. Handles %f and +// %z with spec-compliant semantics. +// +// %f semantics (formatting): emits shortest fractional seconds with leading +// dot, trailing zeros trimmed. Omitted entirely (including dot) when zero. +// +// %z semantics (formatting): 'Z' for UTC, '±HH:MM' for all other offsets. +func strftimeFormat(t time.Time, format string) string { + hasFrac := strings.Contains(format, "%f") + hasZone := strings.Contains(format, "%z") + + workFmt := format + + // Replace %f with a sentinel for post-processing. + const fracSentinel = "\x00FRAC\x00" + if hasFrac { + workFmt = strings.Replace(workFmt, "%f", fracSentinel, 1) + } + + // Replace %z with a sentinel for post-processing. + const zoneSentinel = "\x00ZONE\x00" + if hasZone { + workFmt = strings.Replace(workFmt, "%z", zoneSentinel, 1) + } + + // Format with timefmt (sentinels pass through as literals). + result := timefmt.Format(t, workFmt) + + // Replace sentinels with spec-compliant values. 
+ if hasFrac { + result = strings.Replace(result, fracSentinel, formatFractionalSeconds(t), 1) + } + if hasZone { + result = strings.Replace(result, zoneSentinel, formatTimezone(t), 1) + } + + return result +} + +// extractFractionalSeconds finds and removes optional fractional seconds +// (a '.' followed by 1-9 digits) from the input string. Returns the +// cleaned input and the nanoseconds value. +// +// The position is determined by finding a '.' followed by digits that +// is NOT part of the format's literal text. We use the format (with %f +// already removed) to locate where the fractional seconds should appear. +func extractFractionalSeconds(input, formatWithoutF string) (string, int) { + // Strategy: the fractional seconds appear as '.\d{1,9}' at a position + // in the input that doesn't correspond to any format directive. Since + // %f was between other directives (typically %S and %z or end), we + // look for a '.' followed by digits that is not matched by the + // cleaned format. + // + // Pragmatic approach: find all occurrences of \.\d{1,9} in the input + // and try removing each one to see if the remaining string parses + // with the cleaned format. Use the first one that works. + // + // Simpler approach for the common case: find a '.' followed by digits + // that appears after the seconds portion. Since we can't easily + // determine the seconds position from the format, we try all matches. + re := regexp.MustCompile(`\.(\d{1,9})`) + matches := re.FindAllStringIndex(input, -1) + + // Try removing each match (last to first to preserve indices). + for i := len(matches) - 1; i >= 0; i-- { + loc := matches[i] + candidate := input[:loc[0]] + input[loc[1]:] + if _, err := timefmt.Parse(candidate, formatWithoutF); err == nil { + // This match is the fractional seconds. + digits := input[loc[0]+1 : loc[1]] // skip the '.' + nanos := parseFracDigits(digits) + return candidate, nanos + } + } + + // No fractional seconds found — that's OK, %f is optional. 
+ return input, 0 +} + +// parseFracDigits parses fractional second digits (1-9) into nanoseconds. +func parseFracDigits(digits string) int { + // Pad to 9 digits. + for len(digits) < 9 { + digits += "0" + } + if len(digits) > 9 { + digits = digits[:9] + } + n, _ := strconv.Atoi(digits) + return n +} + +// formatFractionalSeconds produces the spec-compliant %f output: +// shortest representation with leading dot, trailing zeros trimmed. +// Empty string when fractional seconds are zero. +func formatFractionalSeconds(t time.Time) string { + ns := t.Nanosecond() + if ns == 0 { + return "" + } + // Format as 9-digit string, then trim trailing zeros. + s := strconv.Itoa(ns) + for len(s) < 9 { + s = "0" + s + } + s = strings.TrimRight(s, "0") + return "." + s +} + +// normalizeTimezone adjusts the input string so that timezone offsets +// match what timefmt-go expects for %z parsing. +// timefmt-go's %z accepts: +HHMM, -HHMM, +HH:MM, -HH:MM, Z +func normalizeTimezone(input, format string) (string, string) { + // timefmt-go handles %z reasonably well for parsing. The main issue + // is that 'Z' needs to be handled. timefmt-go actually supports Z + // in %z parsing, so we can pass through. + return input, format +} + +// formatTimezone produces the spec-compliant %z output: +// 'Z' for UTC, '±HH:MM' for all other offsets. 
+func formatTimezone(t time.Time) string { + _, offset := t.Zone() + if offset == 0 && t.Location() == time.UTC { + return "Z" + } + sign := "+" + if offset < 0 { + sign = "-" + offset = -offset + } + hours := offset / 3600 + minutes := (offset % 3600) / 60 + return sign + padInt(hours, 2) + ":" + padInt(minutes, 2) +} + +func padInt(n, width int) string { + s := strconv.Itoa(n) + for len(s) < width { + s = "0" + s + } + return s +} diff --git a/internal/bloblang2/go/pratt/eval/strftime_test.go b/internal/bloblang2/go/pratt/eval/strftime_test.go new file mode 100644 index 000000000..c89de6e66 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/strftime_test.go @@ -0,0 +1,92 @@ +package eval + +import ( + "testing" + "time" +) + +func TestStrftimeParse_DefaultFormat(t *testing.T) { + tests := []struct { + name string + input string + want time.Time + }{ + {"no frac", "2024-01-15T10:30:00Z", time.Date(2024, 1, 15, 10, 30, 0, 0, time.UTC)}, + {"millis", "2024-01-15T10:30:00.123Z", time.Date(2024, 1, 15, 10, 30, 0, 123000000, time.UTC)}, + {"nanos", "2024-01-15T10:30:00.123456789Z", time.Date(2024, 1, 15, 10, 30, 0, 123456789, time.UTC)}, + {"positive offset", "2024-01-15T10:30:00+05:30", time.Date(2024, 1, 15, 10, 30, 0, 0, time.FixedZone("", 5*3600+30*60))}, + {"negative offset", "2024-01-15T10:30:00-08:00", time.Date(2024, 1, 15, 10, 30, 0, 0, time.FixedZone("", -8*3600))}, + {"frac with offset", "2024-01-15T10:30:00.5+01:00", time.Date(2024, 1, 15, 10, 30, 0, 500000000, time.FixedZone("", 3600))}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + got, err := strftimeParse(tt.input, defaultTimestampFormat) + if err != nil { + t.Fatalf("parse error: %v", err) + } + if !got.Equal(tt.want) { + t.Fatalf("expected %v, got %v", tt.want, got) + } + if got.Nanosecond() != tt.want.Nanosecond() { + t.Fatalf("nanos: expected %d, got %d", tt.want.Nanosecond(), got.Nanosecond()) + } + }) + } +} + +func TestStrftimeParse_CustomFormat(t *testing.T) { + 
got, err := strftimeParse("2024-01-15", "%Y-%m-%d") + if err != nil { + t.Fatalf("parse error: %v", err) + } + if got.Year() != 2024 || got.Month() != 1 || got.Day() != 15 { + t.Fatalf("expected 2024-01-15, got %v", got) + } +} + +func TestStrftimeFormat_DefaultFormat(t *testing.T) { + tests := []struct { + name string + t time.Time + want string + }{ + {"no frac", time.Date(2024, 1, 15, 10, 30, 0, 0, time.UTC), "2024-01-15T10:30:00Z"}, + {"millis", time.Date(2024, 1, 15, 10, 30, 0, 123000000, time.UTC), "2024-01-15T10:30:00.123Z"}, + {"nanos", time.Date(2024, 1, 15, 10, 30, 0, 123456789, time.UTC), "2024-01-15T10:30:00.123456789Z"}, + {"trailing zeros trimmed", time.Date(2024, 1, 15, 10, 30, 0, 500000000, time.UTC), "2024-01-15T10:30:00.5Z"}, + {"offset", time.Date(2024, 1, 15, 10, 30, 0, 0, time.FixedZone("EST", -5*3600)), "2024-01-15T10:30:00-05:00"}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + got := strftimeFormat(tt.t, defaultTimestampFormat) + if got != tt.want { + t.Fatalf("expected %q, got %q", tt.want, got) + } + }) + } +} + +func TestStrftimeFormat_CustomFormat(t *testing.T) { + ts := time.Date(2024, 1, 15, 0, 0, 0, 0, time.UTC) + got := strftimeFormat(ts, "%Y-%m-%d") + if got != "2024-01-15" { + t.Fatalf("expected %q, got %q", "2024-01-15", got) + } +} + +func TestStrftimeRoundTrip(t *testing.T) { + original := time.Date(2024, 3, 1, 12, 30, 45, 123456789, time.UTC) + formatted := strftimeFormat(original, defaultTimestampFormat) + parsed, err := strftimeParse(formatted, defaultTimestampFormat) + if err != nil { + t.Fatalf("round-trip parse error: %v", err) + } + if !parsed.Equal(original) { + t.Fatalf("round-trip failed: %v != %v", original, parsed) + } + if parsed.Nanosecond() != original.Nanosecond() { + t.Fatalf("nanos lost: %d != %d", original.Nanosecond(), parsed.Nanosecond()) + } +} diff --git a/internal/bloblang2/go/pratt/eval/value.go b/internal/bloblang2/go/pratt/eval/value.go new file mode 100644 index 
000000000..cc45fc764 --- /dev/null +++ b/internal/bloblang2/go/pratt/eval/value.go @@ -0,0 +1,56 @@ +package eval + +// Special sentinel values used internally by the interpreter. +// These are distinct from normal Bloblang values (which use native Go types). + +type ( + voidVal struct{} + deletedVal struct{} + errorVal struct{ message string } + uninitializedVal struct{} // stack slot not yet assigned +) + +// uninitialized is the sentinel for stack slots that haven't been written. +// Distinguishes "variable holds nil" from "variable not yet declared". +var uninitialized = uninitializedVal{} + +// Void is the singleton void value. It represents the absence of a value, +// produced by if-without-else when the condition is false, or by +// match-without-wildcard when no case matches. +var Void = voidVal{} + +// Deleted is the singleton deletion marker. When assigned to a field, +// it removes the field. When assigned to root output, it drops the message. +var Deleted = deletedVal{} + +// NewError creates a runtime error value that propagates through +// postfix chains until caught by .catch(). +func NewError(msg string) errorVal { + return errorVal{message: msg} +} + +// IsVoid reports whether v is the void sentinel. +func IsVoid(v any) bool { + _, ok := v.(voidVal) + return ok +} + +// IsDeleted reports whether v is the deletion sentinel. +func IsDeleted(v any) bool { + _, ok := v.(deletedVal) + return ok +} + +// IsError reports whether v is a runtime error value. +func IsError(v any) bool { + _, ok := v.(errorVal) + return ok +} + +// ErrorMessage returns the error message if v is an errorVal, or empty string. 
+func ErrorMessage(v any) string { + if e, ok := v.(errorVal); ok { + return e.message + } + return "" +} From c5d91f07366c7e2df9711758c1a3dd74697d9926 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Thu, 9 Apr 2026 10:31:02 +0100 Subject: [PATCH 04/20] bloblang(v2): Add spec conformance test runner Adds internal/bloblang2/go/spectest/, a runner that reads the YAML spec test corpus, executes each case against a configurable interpreter, and produces structured pass/fail reports. Provides the schema for spec tests, typed-value support so YAML can carry V2-typed inputs/outputs, and a compare layer that distinguishes exact-equality assertions from error-shape assertions. The runner is shared by both the Go interpreter tests and (via the TypeScript port) the TS interpreter tests. --- internal/bloblang2/go/spectest/compare.go | 197 +++ .../bloblang2/go/spectest/compare_test.go | 204 +++ internal/bloblang2/go/spectest/interpreter.go | 49 + internal/bloblang2/go/spectest/runner.go | 501 +++++++ internal/bloblang2/go/spectest/runner_test.go | 1183 +++++++++++++++++ internal/bloblang2/go/spectest/schema.go | 140 ++ internal/bloblang2/go/spectest/schema_test.go | 137 ++ internal/bloblang2/go/spectest/typedvalue.go | 158 +++ .../bloblang2/go/spectest/typedvalue_test.go | 228 ++++ 9 files changed, 2797 insertions(+) create mode 100644 internal/bloblang2/go/spectest/compare.go create mode 100644 internal/bloblang2/go/spectest/compare_test.go create mode 100644 internal/bloblang2/go/spectest/interpreter.go create mode 100644 internal/bloblang2/go/spectest/runner.go create mode 100644 internal/bloblang2/go/spectest/runner_test.go create mode 100644 internal/bloblang2/go/spectest/schema.go create mode 100644 internal/bloblang2/go/spectest/schema_test.go create mode 100644 internal/bloblang2/go/spectest/typedvalue.go create mode 100644 internal/bloblang2/go/spectest/typedvalue_test.go diff --git a/internal/bloblang2/go/spectest/compare.go 
b/internal/bloblang2/go/spectest/compare.go new file mode 100644 index 000000000..e924080d2 --- /dev/null +++ b/internal/bloblang2/go/spectest/compare.go @@ -0,0 +1,197 @@ +package spectest + +import ( + "bytes" + "fmt" + "math" + "reflect" + "sort" + "strings" + "time" +) + +// DeepEqual compares expected and actual values using spec-aware semantics. +// Returns true if they match, or false with a human-readable diff message. +// +// Differences from reflect.DeepEqual: +// - NaN == NaN is true (for test assertions) +// - float32 and float64 are compared within their own type (no cross-type promotion) +// - time.Time uses .Equal() for timezone-aware comparison +// - Produces a path-annotated diff message on mismatch +func DeepEqual(expected, actual any) (bool, string) { + return deepEqual(expected, actual, "") +} + +func deepEqual(expected, actual any, path string) (bool, string) { + if path == "" { + path = "root" + } + + // Both nil. + if expected == nil && actual == nil { + return true, "" + } + if expected == nil || actual == nil { + return false, fmt.Sprintf("%s: expected %v (%T), got %v (%T)", path, expected, expected, actual, actual) + } + + // Type must match exactly for typed values. 
+ et := reflect.TypeOf(expected) + at := reflect.TypeOf(actual) + if et != at { + return false, fmt.Sprintf("%s: type mismatch: expected %T, got %T (expected value: %v, actual value: %v)", path, expected, actual, expected, actual) + } + + switch ev := expected.(type) { + case map[string]any: + av := actual.(map[string]any) + return compareMaps(ev, av, path) + case []any: + av := actual.([]any) + return compareSlices(ev, av, path) + case float64: + av := actual.(float64) + return compareFloat64(ev, av, path) + case float32: + av := actual.(float32) + return compareFloat32(ev, av, path) + case time.Time: + av := actual.(time.Time) + if ev.Equal(av) { + return true, "" + } + return false, fmt.Sprintf("%s: timestamp mismatch: expected %s, got %s", path, ev.Format(time.RFC3339Nano), av.Format(time.RFC3339Nano)) + case []byte: + av := actual.([]byte) + if bytes.Equal(ev, av) { + return true, "" + } + return false, fmt.Sprintf("%s: bytes mismatch: expected %v, got %v", path, ev, av) + default: + if expected == actual { + return true, "" + } + return false, fmt.Sprintf("%s: expected %v (%T), got %v (%T)", path, expected, expected, actual, actual) + } +} + +func compareMaps(expected, actual map[string]any, path string) (bool, string) { + // Check for missing and extra keys. + var diffs []string + for k := range expected { + if _, ok := actual[k]; !ok { + diffs = append(diffs, fmt.Sprintf("%s: missing key %q", path, k)) + } + } + for k := range actual { + if _, ok := expected[k]; !ok { + diffs = append(diffs, fmt.Sprintf("%s: unexpected key %q", path, k)) + } + } + if len(diffs) > 0 { + sort.Strings(diffs) + return false, strings.Join(diffs, "\n") + } + + // Compare values for each key. 
+ keys := make([]string, 0, len(expected)) + for k := range expected { + keys = append(keys, k) + } + sort.Strings(keys) + + for _, k := range keys { + ok, diff := deepEqual(expected[k], actual[k], path+"."+k) + if !ok { + return false, diff + } + } + return true, "" +} + +func compareSlices(expected, actual []any, path string) (bool, string) { + if len(expected) != len(actual) { + return false, fmt.Sprintf("%s: array length mismatch: expected %d, got %d", path, len(expected), len(actual)) + } + for i := range expected { + ok, diff := deepEqual(expected[i], actual[i], fmt.Sprintf("%s[%d]", path, i)) + if !ok { + return false, diff + } + } + return true, "" +} + +func compareFloat64(expected, actual float64, path string) (bool, string) { + // NaN == NaN for test assertion purposes. + if math.IsNaN(expected) && math.IsNaN(actual) { + return true, "" + } + // -0.0 == 0.0 per spec (they are equal per IEEE 754). + if expected == 0 && actual == 0 { + return true, "" + } + // Bitwise comparison for exact values (handles ±Inf). + if math.Float64bits(expected) == math.Float64bits(actual) { + return true, "" + } + return false, fmt.Sprintf("%s: float64 mismatch: expected %v, got %v", path, expected, actual) +} + +func compareFloat32(expected, actual float32, path string) (bool, string) { + // NaN == NaN for test assertion purposes. + if math.IsNaN(float64(expected)) && math.IsNaN(float64(actual)) { + return true, "" + } + // -0.0 == 0.0 per spec. + if expected == 0 && actual == 0 { + return true, "" + } + if math.Float32bits(expected) == math.Float32bits(actual) { + return true, "" + } + return false, fmt.Sprintf("%s: float32 mismatch: expected %v, got %v", path, expected, actual) +} + +// CheckOutputType verifies that actual has the expected Bloblang type name. 
+func CheckOutputType(expectedType string, actual any) (bool, string) { + actualType := goTypeToBloblangType(actual) + if actualType == expectedType { + return true, "" + } + return false, fmt.Sprintf("output type: expected %q, got %q (%T)", expectedType, actualType, actual) +} + +func goTypeToBloblangType(v any) string { + if v == nil { + return "null" + } + switch v.(type) { + case string: + return "string" + case int32: + return "int32" + case int64: + return "int64" + case uint32: + return "uint32" + case uint64: + return "uint64" + case float32: + return "float32" + case float64: + return "float64" + case bool: + return "bool" + case []byte: + return "bytes" + case time.Time: + return "timestamp" + case []any: + return "array" + case map[string]any: + return "object" + default: + return fmt.Sprintf("unknown(%T)", v) + } +} diff --git a/internal/bloblang2/go/spectest/compare_test.go b/internal/bloblang2/go/spectest/compare_test.go new file mode 100644 index 000000000..a9d90d72e --- /dev/null +++ b/internal/bloblang2/go/spectest/compare_test.go @@ -0,0 +1,204 @@ +package spectest + +import ( + "math" + "testing" + "time" +) + +func TestDeepEqual(t *testing.T) { + tests := []struct { + name string + expected any + actual any + want bool + }{ + // Nil. + {"both nil", nil, nil, true}, + {"expected nil actual not", nil, "hello", false}, + {"actual nil expected not", "hello", nil, false}, + + // Strings. + {"equal strings", "hello", "hello", true}, + {"different strings", "hello", "world", false}, + + // Booleans. + {"equal bools", true, true, true}, + {"different bools", true, false, false}, + + // int64. + {"equal int64", int64(42), int64(42), true}, + {"different int64", int64(42), int64(43), false}, + + // int32. + {"equal int32", int32(10), int32(10), true}, + {"different int32", int32(10), int32(11), false}, + + // Type mismatch between int types. + {"int32 vs int64", int32(5), int64(5), false}, + {"int64 vs uint64", int64(5), uint64(5), false}, + + // uint32. 
+ {"equal uint32", uint32(100), uint32(100), true}, + + // uint64. + {"equal uint64", uint64(999), uint64(999), true}, + + // float64. + {"equal float64", 3.14, 3.14, true}, + {"different float64", 3.14, 3.15, false}, + {"float64 NaN both", math.NaN(), math.NaN(), true}, + {"float64 +Inf", math.Inf(1), math.Inf(1), true}, + {"float64 -Inf", math.Inf(-1), math.Inf(-1), true}, + {"float64 +Inf vs -Inf", math.Inf(1), math.Inf(-1), false}, + {"float64 NaN vs number", math.NaN(), 1.0, false}, + {"float64 -0 vs +0", math.Float64frombits(1 << 63), 0.0, true}, // spec: -0.0 == 0.0 + + // float32. + {"equal float32", float32(1.5), float32(1.5), true}, + {"different float32", float32(1.5), float32(2.5), false}, + {"float32 NaN both", float32(math.NaN()), float32(math.NaN()), true}, + + // float32 vs float64 type mismatch. + {"float32 vs float64", float32(1.0), float64(1.0), false}, + + // Bytes. + {"equal bytes", []byte("hello"), []byte("hello"), true}, + {"different bytes", []byte("hello"), []byte("world"), false}, + + // Time. + { + "equal timestamps", + time.Date(2024, 3, 1, 12, 0, 0, 0, time.UTC), + time.Date(2024, 3, 1, 12, 0, 0, 0, time.UTC), + true, + }, + { + "equal timestamps different timezone", + time.Date(2024, 3, 1, 12, 0, 0, 0, time.UTC), + time.Date(2024, 3, 1, 7, 0, 0, 0, time.FixedZone("EST", -5*3600)), + true, + }, + { + "different timestamps", + time.Date(2024, 3, 1, 12, 0, 0, 0, time.UTC), + time.Date(2024, 3, 1, 13, 0, 0, 0, time.UTC), + false, + }, + + // Maps. 
+ { + "equal maps", + map[string]any{"a": int64(1), "b": int64(2)}, + map[string]any{"b": int64(2), "a": int64(1)}, + true, + }, + { + "maps different values", + map[string]any{"a": int64(1)}, + map[string]any{"a": int64(2)}, + false, + }, + { + "maps missing key", + map[string]any{"a": int64(1), "b": int64(2)}, + map[string]any{"a": int64(1)}, + false, + }, + { + "maps extra key", + map[string]any{"a": int64(1)}, + map[string]any{"a": int64(1), "b": int64(2)}, + false, + }, + { + "empty maps", + map[string]any{}, + map[string]any{}, + true, + }, + + // Slices. + { + "equal slices", + []any{int64(1), "two", true}, + []any{int64(1), "two", true}, + true, + }, + { + "slices different length", + []any{int64(1), int64(2)}, + []any{int64(1)}, + false, + }, + { + "slices different element", + []any{int64(1), int64(2)}, + []any{int64(1), int64(3)}, + false, + }, + { + "empty slices", + []any{}, + []any{}, + true, + }, + + // Nested structures. + { + "nested equal", + map[string]any{"items": []any{map[string]any{"id": int64(1)}}}, + map[string]any{"items": []any{map[string]any{"id": int64(1)}}}, + true, + }, + { + "nested different", + map[string]any{"items": []any{map[string]any{"id": int64(1)}}}, + map[string]any{"items": []any{map[string]any{"id": int64(2)}}}, + false, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + got, diff := DeepEqual(tt.expected, tt.actual) + if got != tt.want { + if tt.want { + t.Fatalf("expected equal, got diff: %s", diff) + } else { + t.Fatalf("expected not equal, but got equal") + } + } + }) + } +} + +func TestCheckOutputType(t *testing.T) { + tests := []struct { + name string + expectedType string + actual any + want bool + }{ + {"string", "string", "hello", true}, + {"int64", "int64", int64(42), true}, + {"int32", "int32", int32(42), true}, + {"float64", "float64", 3.14, true}, + {"bool", "bool", true, true}, + {"null", "null", nil, true}, + {"bytes", "bytes", []byte{1, 2}, true}, + {"timestamp", "timestamp", 
time.Now(), true}, + {"array", "array", []any{1}, true}, + {"object", "object", map[string]any{}, true}, + {"wrong type", "string", int64(42), false}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + got, _ := CheckOutputType(tt.expectedType, tt.actual) + if got != tt.want { + t.Fatalf("CheckOutputType(%q, %T): expected %v, got %v", tt.expectedType, tt.actual, tt.want, got) + } + }) + } +} diff --git a/internal/bloblang2/go/spectest/interpreter.go b/internal/bloblang2/go/spectest/interpreter.go new file mode 100644 index 000000000..cfd1060f8 --- /dev/null +++ b/internal/bloblang2/go/spectest/interpreter.go @@ -0,0 +1,49 @@ +package spectest + +// Interpreter compiles and executes Bloblang V2 mappings. +type Interpreter interface { + // Compile parses a mapping string. files provides a virtual filesystem + // for import resolution (filename -> content). If compilation fails, + // return a *CompileError. + Compile(mapping string, files map[string]string) (Mapping, error) +} + +// Mapping is a compiled Bloblang mapping ready for execution. +type Mapping interface { + // Exec runs the mapping against the given input document and metadata. + // Returns the output document, output metadata, whether the message was + // deleted (output = deleted()), and any runtime error. + // + // Runtime errors must NOT be wrapped as *CompileError: the test runner + // uses errors.As to distinguish compile errors from runtime errors. + // + // Values passed as input and returned as output use native Go types: + // + // Bloblang type | Go type + // --------------|-------- + // string | string + // int32 | int32 + // int64 | int64 + // uint32 | uint32 + // uint64 | uint64 + // float32 | float32 + // float64 | float64 + // bool | bool + // null | nil + // bytes | []byte + // timestamp | time.Time + // array | []any + // object | map[string]any + // + // Metadata is always an object (map[string]any).
When the mapping does + // not modify metadata, return an empty map. + Exec(input any, metadata map[string]any) (output any, outputMeta map[string]any, deleted bool, err error) +} + +// CompileError indicates a compilation failure. The test runner uses +// errors.As to distinguish compile errors from runtime errors. +type CompileError struct { + Message string +} + +func (e *CompileError) Error() string { return e.Message } diff --git a/internal/bloblang2/go/spectest/runner.go b/internal/bloblang2/go/spectest/runner.go new file mode 100644 index 000000000..404c9e558 --- /dev/null +++ b/internal/bloblang2/go/spectest/runner.go @@ -0,0 +1,501 @@ +package spectest + +import ( + "errors" + "fmt" + "path/filepath" + "strings" + "testing" +) + +// ResultKind classifies the outcome of a test case. +type ResultKind int + +const ( + // KindPass indicates the test passed. + KindPass ResultKind = iota + // KindFail indicates the test produced incorrect results. + KindFail + // KindLoadError indicates the test file could not be loaded. + KindLoadError + // KindInvalidTest indicates the test specification is malformed. + KindInvalidTest +) + +// Result represents the outcome of a single test case execution. +type Result struct { + File string // path to the YAML test file + Test string // test case name + Case string // case name within a multi-case test (empty for single-case tests) + Kind ResultKind // classification of the outcome + Err error // nil if the test passed +} + +// Passed returns true if this test passed. +func (r Result) Passed() bool { return r.Err == nil } + +// String returns a human-readable summary of this result. +func (r Result) String() string { + name := r.Test + if r.Case != "" { + name += "/" + r.Case + } + if r.Err == nil { + return fmt.Sprintf("PASS %s / %s", r.File, name) + } + return fmt.Sprintf("FAIL %s / %s: %v", r.File, name, r.Err) +} + +// Run discovers and executes all spec tests in dir using the given +// interpreter. 
Returns a result for every test case. The error return +// is reserved for infrastructure failures (directory not found, etc.) — +// individual test failures are reported in the results slice. +func Run(dir string, interp Interpreter) ([]Result, error) { + files, err := DiscoverFiles(dir) + if err != nil { + return nil, fmt.Errorf("discovering test files: %w", err) + } + if len(files) == 0 { + return nil, fmt.Errorf("no test files found in %s", dir) + } + + var results []Result + for _, path := range files { + rel, err := filepath.Rel(dir, path) + if err != nil { + rel = path + } + tf, err := LoadFile(path) + if err != nil { + results = append(results, Result{ + File: rel, + Test: "(load)", + Kind: KindLoadError, + Err: fmt.Errorf("loading test file: %w", err), + }) + continue + } + results = append(results, RunFile(tf, rel, interp)...) + } + return results, nil +} + +// RunFile executes all tests from a single parsed TestFile and returns +// a result for each test case. +func RunFile(file *TestFile, filePath string, interp Interpreter) []Result { + results := make([]Result, 0, len(file.Tests)) + for i := range file.Tests { + tc := &file.Tests[i] + if len(tc.Cases) > 0 { + results = append(results, runMultiCaseTest(file, tc, filePath, interp)...) + } else { + kind, err := runTestCase(file, tc, interp) + results = append(results, Result{ + File: filePath, + Test: tc.Name, + Kind: kind, + Err: err, + }) + } + } + return results +} + +// RunT is a convenience that runs all spec tests and reports failures +// through testing.T with proper subtest hierarchy. 
+func RunT(t *testing.T, dir string, interp Interpreter) { + t.Helper() + + files, err := DiscoverFiles(dir) + if err != nil { + t.Fatalf("discovering test files: %v", err) + } + if len(files) == 0 { + t.Fatalf("no test files found in %s", dir) + } + + for _, path := range files { + rel, relErr := filepath.Rel(dir, path) + if relErr != nil { + rel = path + } + t.Run(rel, func(t *testing.T) { + tf, err := LoadFile(path) + if err != nil { + t.Fatalf("loading test file: %v", err) + } + for i := range tf.Tests { + tc := &tf.Tests[i] + if len(tc.Cases) > 0 { + t.Run(tc.Name, func(t *testing.T) { + for _, r := range runMultiCaseTest(tf, tc, rel, interp) { + t.Run(r.Case, func(t *testing.T) { + if r.Err != nil { + t.Fatal(r.Err) + } + }) + } + }) + } else { + kind, err := runTestCase(tf, tc, interp) + r := Result{File: rel, Test: tc.Name, Kind: kind, Err: err} + t.Run(r.Test, func(t *testing.T) { + if r.Err != nil { + t.Fatal(r.Err) + } + }) + } + } + }) + } +} + +// runTestCase executes a single test case and returns its result kind and +// an error if it failed. +func runTestCase(file *TestFile, tc *TestCase, interp Interpreter) (ResultKind, error) { + // 0. Validate that exactly one expectation is set. + if err := validateExpectations(tc); err != nil { + return KindInvalidTest, err + } + + // 1. Merge files: file-level + test-level (test wins). + mergedFiles := mergeFiles(file.Files, tc.Files) + + // 2. Decode inputs. + input, err := DecodeValue(tc.Input) + if err != nil { + return KindInvalidTest, fmt.Errorf("invalid test: decoding input: %w", err) + } + + inputMeta, err := decodeMetadata(tc.InputMetadata) + if err != nil { + return KindInvalidTest, fmt.Errorf("invalid test: decoding input_metadata: %w", err) + } + + // 3. Compile. 
+ mapping, compileErr := interp.Compile(tc.Mapping, mergedFiles) + if tc.CompileError != "" { + if err := checkCompileError(compileErr, tc.CompileError); err != nil { + return KindFail, err + } + return KindPass, nil + } + if compileErr != nil { + return KindFail, fmt.Errorf("unexpected compile error: %w", compileErr) + } + + // 4. Execute. + output, outputMeta, deleted, execErr := mapping.Exec(input, inputMeta) + if tc.Error != "" || tc.HasError { + if err := checkRuntimeError(execErr, tc.Error); err != nil { + return KindFail, err + } + return KindPass, nil + } + if tc.Deleted { + if execErr != nil { + return KindFail, fmt.Errorf("unexpected error (expected deleted): %w", execErr) + } + if !deleted { + return KindFail, errors.New("expected message to be deleted, but it was not") + } + return KindPass, nil + } + if execErr != nil { + return KindFail, fmt.Errorf("unexpected runtime error: %w", execErr) + } + if deleted { + return KindFail, errors.New("message was unexpectedly deleted") + } + + // 5. Compare output. + if err := checkOutput(tc, output); err != nil { + return KindFail, err + } + + // 6. Compare output metadata. + if err := checkMetadata(tc, outputMeta); err != nil { + return KindFail, err + } + return KindPass, nil +} + +// validateExpectations checks that a test case specifies exactly one +// expectation: output (or no_output_check), compile_error, error, or deleted. +// Also validates that output_type is only used with no_output_check.
+func validateExpectations(tc *TestCase) error { + count := 0 + if tc.CompileError != "" { + count++ + } + if tc.Error != "" || tc.HasError { + count++ + } + if tc.Deleted { + count++ + } + if tc.HasOutput || tc.Output != nil || tc.NoOutputCheck { + count++ + } + + if count == 0 { + return errors.New("invalid test: no expectation set (need output, compile_error, error, or deleted)") + } + if count > 1 { + return fmt.Errorf("invalid test: multiple expectations set (compile_error=%q, error=%q, deleted=%v, has_output=%v)", + tc.CompileError, tc.Error, tc.Deleted, tc.HasOutput || tc.Output != nil || tc.NoOutputCheck) + } + + if tc.OutputType != "" && !tc.NoOutputCheck { + return errors.New("invalid test: output_type requires no_output_check to be true") + } + + return nil +} + +func mergeFiles(fileLevel, testLevel map[string]string) map[string]string { + if len(fileLevel) == 0 && len(testLevel) == 0 { + return nil + } + merged := make(map[string]string, len(fileLevel)+len(testLevel)) + for k, v := range fileLevel { + merged[k] = v + } + for k, v := range testLevel { + merged[k] = v + } + return merged +} + +func decodeMetadata(raw any) (map[string]any, error) { + if raw == nil { + return map[string]any{}, nil + } + decoded, err := DecodeValue(raw) + if err != nil { + return nil, err + } + meta, ok := decoded.(map[string]any) + if !ok { + return nil, fmt.Errorf("input_metadata must be an object, got %T", decoded) + } + return meta, nil +} + +func checkCompileError(err error, expectedSubstring string) error { + if err == nil { + return fmt.Errorf("expected compile error containing %q, but compilation succeeded", expectedSubstring) + } + var ce *CompileError + if !errors.As(err, &ce) { + return fmt.Errorf("expected a *CompileError, got %T: %v", err, err) + } + if !strings.Contains(err.Error(), expectedSubstring) { + return fmt.Errorf("compile error %q does not contain expected substring %q", err.Error(), expectedSubstring) + } + return nil +} + +func checkRuntimeError(err error,
expectedSubstring string) error { + if err == nil { + return fmt.Errorf("expected runtime error containing %q, but execution succeeded", expectedSubstring) + } + var ce *CompileError + if errors.As(err, &ce) { + return fmt.Errorf("expected a runtime error, got *CompileError: %v", err) + } + if !strings.Contains(err.Error(), expectedSubstring) { + return fmt.Errorf("runtime error %q does not contain expected substring %q", err.Error(), expectedSubstring) + } + return nil +} + +func checkOutputFields(output any, outputType string, noOutputCheck bool, actual any) error { + if noOutputCheck { + if outputType != "" { + ok, diff := CheckOutputType(outputType, actual) + if !ok { + return fmt.Errorf("output type mismatch: %s", diff) + } + } + return nil + } + + expected, err := DecodeValue(output) + if err != nil { + return fmt.Errorf("invalid test: decoding expected output: %w", err) + } + + ok, diff := DeepEqual(expected, actual) + if !ok { + return fmt.Errorf("output mismatch:\n%s", diff) + } + return nil +} + +func checkOutput(tc *TestCase, actual any) error { + return checkOutputFields(tc.Output, tc.OutputType, tc.NoOutputCheck, actual) +} + +func checkMetadataFields(outputMetadata any, noMetadataCheck bool, actual map[string]any) error { + if noMetadataCheck { + return nil + } + + var expected map[string]any + if outputMetadata != nil { + decoded, err := DecodeValue(outputMetadata) + if err != nil { + return fmt.Errorf("invalid test: decoding expected output_metadata: %w", err) + } + var ok bool + expected, ok = decoded.(map[string]any) + if !ok { + return fmt.Errorf("invalid test: output_metadata must be an object, got %T", decoded) + } + } else { + expected = map[string]any{} + } + + if actual == nil { + actual = map[string]any{} + } + + ok, diff := DeepEqual(any(expected), any(actual)) + if !ok { + return fmt.Errorf("output metadata mismatch:\n%s", diff) + } + return nil +} + +func checkMetadata(tc *TestCase, actual map[string]any) error { + return 
checkMetadataFields(tc.OutputMetadata, tc.NoMetadataCheck, actual) +} + +// runMultiCaseTest executes a test that has multiple cases sharing one +// compiled mapping. +func runMultiCaseTest(file *TestFile, tc *TestCase, filePath string, interp Interpreter) []Result { + if err := validateMultiCase(tc); err != nil { + return []Result{{ + File: filePath, Test: tc.Name, + Kind: KindInvalidTest, Err: err, + }} + } + + mergedFiles := mergeFiles(file.Files, tc.Files) + + mapping, compileErr := interp.Compile(tc.Mapping, mergedFiles) + if compileErr != nil { + return []Result{{ + File: filePath, Test: tc.Name, + Kind: KindFail, + Err: fmt.Errorf("unexpected compile error: %w", compileErr), + }} + } + + results := make([]Result, 0, len(tc.Cases)) + for i := range tc.Cases { + c := &tc.Cases[i] + kind, err := runCase(mapping, c) + results = append(results, Result{ + File: filePath, + Test: tc.Name, + Case: c.Name, + Kind: kind, + Err: err, + }) + } + return results +} + +// runCase executes a single case against an already-compiled mapping. 
+func runCase(mapping Mapping, c *Case) (ResultKind, error) { + if err := validateCaseExpectations(c); err != nil { + return KindInvalidTest, err + } + + input, err := DecodeValue(c.Input) + if err != nil { + return KindInvalidTest, fmt.Errorf("invalid case: decoding input: %w", err) + } + + inputMeta, err := decodeMetadata(c.InputMetadata) + if err != nil { + return KindInvalidTest, fmt.Errorf("invalid case: decoding input_metadata: %w", err) + } + + output, outputMeta, deleted, execErr := mapping.Exec(input, inputMeta) + + if c.Error != "" || c.HasError { + if err := checkRuntimeError(execErr, c.Error); err != nil { + return KindFail, err + } + return KindPass, nil + } + if c.Deleted { + if execErr != nil { + return KindFail, fmt.Errorf("unexpected error (expected deleted): %w", execErr) + } + if !deleted { + return KindFail, errors.New("expected message to be deleted, but it was not") + } + return KindPass, nil + } + if execErr != nil { + return KindFail, fmt.Errorf("unexpected runtime error: %w", execErr) + } + if deleted { + return KindFail, errors.New("message was unexpectedly deleted") + } + + if err := checkOutputFields(c.Output, c.OutputType, c.NoOutputCheck, output); err != nil { + return KindFail, err + } + if err := checkMetadataFields(c.OutputMetadata, c.NoMetadataCheck, outputMeta); err != nil { + return KindFail, err + } + return KindPass, nil +} + +// validateMultiCase checks that a multi-case test is well-formed. +func validateMultiCase(tc *TestCase) error { + if len(tc.Cases) == 0 { + return errors.New("invalid test: cases array is empty") + } + + // Cases must not coexist with inline execution fields.
+ if tc.HasOutput || tc.Output != nil || tc.NoOutputCheck || + tc.Error != "" || tc.HasError || tc.Deleted || + tc.Input != nil || tc.InputMetadata != nil || tc.OutputMetadata != nil { + return errors.New("invalid test: cannot mix inline input/output fields with cases") + } + + if tc.CompileError != "" { + return errors.New("invalid test: compile_error cannot be combined with cases") + } + + for i := range tc.Cases { + if tc.Cases[i].Name == "" { + return fmt.Errorf("invalid test: case at index %d has no name", i) + } + } + return nil +} + +// validateCaseExpectations checks that a case specifies exactly one expectation. +func validateCaseExpectations(c *Case) error { + count := 0 + if c.Error != "" || c.HasError { + count++ + } + if c.Deleted { + count++ + } + if c.HasOutput || c.Output != nil || c.NoOutputCheck { + count++ + } + + if count == 0 { + return errors.New("invalid case: no expectation set (need output, error, or deleted)") + } + if count > 1 { + return fmt.Errorf("invalid case: multiple expectations set (error=%q, deleted=%v, has_output=%v)", + c.Error, c.Deleted, c.HasOutput || c.Output != nil || c.NoOutputCheck) + } + + if c.OutputType != "" && !c.NoOutputCheck { + return errors.New("invalid case: output_type requires no_output_check to be true") + } + return nil +} diff --git a/internal/bloblang2/go/spectest/runner_test.go b/internal/bloblang2/go/spectest/runner_test.go new file mode 100644 index 000000000..55d7d231f --- /dev/null +++ b/internal/bloblang2/go/spectest/runner_test.go @@ -0,0 +1,1183 @@ +package spectest + +import ( + "fmt" + "os" + "path/filepath" + "testing" +) + +// mockInterpreter implements Interpreter for testing the runner itself. +type mockInterpreter struct { + compileFunc func(mapping string, files map[string]string) (Mapping, error) +} + +func (m *mockInterpreter) Compile(mapping string, files map[string]string) (Mapping, error) { + return m.compileFunc(mapping, files) +} + +// mockMapping implements Mapping for testing.
+type mockMapping struct { + execFunc func(input any, metadata map[string]any) (any, map[string]any, bool, error) +} + +func (m *mockMapping) Exec(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return m.execFunc(input, metadata) +} + +func requirePass(t *testing.T, results []Result) { + t.Helper() + for _, r := range results { + if r.Err != nil { + t.Fatalf("expected all tests to pass, but %q failed: %v", r.Test, r.Err) + } + } +} + +func requireFail(t *testing.T, results []Result, testName string) { + t.Helper() + for _, r := range results { + if r.Test == testName { + if r.Err == nil { + t.Fatalf("expected test %q to fail, but it passed", testName) + } + return + } + } + t.Fatalf("test %q not found in results", testName) +} + +func TestRunFile_Success(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "basic output", + Mapping: "output.x = 42", + Output: map[string]any{"x": int(42)}, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{"x": int64(42)}, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_CompileError(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect compile error", + Mapping: "bad syntax", + CompileError: "syntax", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return nil, &CompileError{Message: "syntax error at line 1"} + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_CompileErrorWrongKind(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect compile error but get runtime", + Mapping: "bad", + CompileError: "syntax", + 
}, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + // Return a plain error, not *CompileError. + return nil, fmt.Errorf("syntax issue") + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "expect compile error but get runtime") +} + +func TestRunFile_RuntimeError(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect runtime error", + Mapping: "output = 5 / 0", + Error: "division by zero", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return nil, nil, false, fmt.Errorf("division by zero") + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_RuntimeErrorWrongKind(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect runtime error but get compile", + Mapping: "output = 5 / 0", + Error: "overflow", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + // Return a *CompileError when runtime error was expected. 
+ return nil, nil, false, &CompileError{Message: "overflow detected"} + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "expect runtime error but get compile") +} + +func TestRunFile_Deleted(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect deletion", + Mapping: "output = deleted()", + Deleted: true, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return nil, nil, true, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_FileMerge(t *testing.T) { + tf := &TestFile{ + Files: map[string]string{ + "lib.blobl": "map double(x) { x * 2 }", + }, + Tests: []TestCase{ + { + Name: "file-level files available", + Mapping: `import "lib.blobl" as l`, + Output: int64(42), + }, + { + Name: "test-level override", + Mapping: `import "lib.blobl" as l`, + Files: map[string]string{ + "lib.blobl": "map triple(x) { x * 3 }", + }, + Output: int64(42), + }, + }, + } + + var capturedFiles []map[string]string + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + cpy := make(map[string]string, len(files)) + for k, v := range files { + cpy[k] = v + } + capturedFiles = append(capturedFiles, cpy) + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return int64(42), map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) + + if len(capturedFiles) != 2 { + t.Fatalf("expected 2 compile calls, got %d", len(capturedFiles)) + } + if capturedFiles[0]["lib.blobl"] != "map double(x) { x * 2 }" { + t.Fatalf("first test should see file-level lib.blobl, got: %q", 
capturedFiles[0]["lib.blobl"]) + } + if capturedFiles[1]["lib.blobl"] != "map triple(x) { x * 3 }" { + t.Fatalf("second test should see test-level override, got: %q", capturedFiles[1]["lib.blobl"]) + } +} + +func TestRunFile_OutputMetadataDefault(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "metadata defaults to empty", + Mapping: "output.x = 1", + Output: map[string]any{"x": int(1)}, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{"x": int64(1)}, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_MetadataLeakDetected(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "leaked metadata caught", + Mapping: "output.x = 1", + Output: map[string]any{"x": int(1)}, + // No OutputMetadata — defaults to {}, so leaked metadata is caught. 
+ }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{"x": int64(1)}, map[string]any{"leaked": "value"}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "leaked metadata caught") +} + +func TestRunFile_NoMetadataCheck(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "skip metadata check", + Mapping: "output.x = 1", + Output: map[string]any{"x": int(1)}, + NoMetadataCheck: true, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{"x": int64(1)}, map[string]any{"leaked": "value"}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_NoOutputCheck(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "skip output check with type", + Mapping: "output = uuid_v4()", + NoOutputCheck: true, + NoMetadataCheck: true, + OutputType: "string", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return "some-uuid-value", map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_InputDecoding(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "typed input decoded", + Mapping: "output = input", + Input: map[string]any{"val": map[string]any{"_type": "float32", "value": "1.5"}}, + Output: map[string]any{"val": 
map[string]any{"_type": "float32", "value": "1.5"}}, + NoMetadataCheck: true, + }, + }, + } + + var capturedInput any + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + capturedInput = input + return map[string]any{"val": float32(1.5)}, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) + + inputMap, ok := capturedInput.(map[string]any) + if !ok { + t.Fatalf("expected input to be map, got %T", capturedInput) + } + val, ok := inputMap["val"].(float32) + if !ok { + t.Fatalf("expected input.val to be float32, got %T", inputMap["val"]) + } + if val != 1.5 { + t.Fatalf("expected input.val = 1.5, got %v", val) + } +} + +func TestRunFile_OutputMismatchFails(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "wrong output", + Mapping: "output.x = 1", + Output: map[string]any{"x": int(99)}, + NoMetadataCheck: true, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{"x": int64(1)}, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "wrong output") +} + +func TestRunFile_CompileErrorButSucceeds(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect compile error but succeeds", + Mapping: "output = 1", + CompileError: "syntax", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return int64(1), map[string]any{}, false, nil + }, 
+ }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "expect compile error but succeeds") +} + +func TestRunFile_RuntimeErrorButSucceeds(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect error but succeeds", + Mapping: "output = 1", + Error: "overflow", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return int64(1), map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "expect error but succeeds") +} + +func TestRunFile_CompileErrorWrongSubstring(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "wrong substring", + Mapping: "bad", + CompileError: "overflow", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return nil, &CompileError{Message: "syntax error"} + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "wrong substring") +} + +func TestRunFile_RuntimeErrorWrongSubstring(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "wrong substring", + Mapping: "output = 5 / 0", + Error: "overflow", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return nil, nil, false, fmt.Errorf("division by zero") + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "wrong substring") +} + +func TestRunFile_UnexpectedDeletion(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "not expecting deletion", + Mapping: "output = input", + Output: map[string]any{"x": int(1)}, 
+ }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return nil, nil, true, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "not expecting deletion") +} + +func TestRunFile_DeletedButGotError(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect deleted but got error", + Mapping: "output = deleted()", + Deleted: true, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return nil, nil, false, fmt.Errorf("something broke") + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "expect deleted but got error") +} + +func TestRunFile_DeletedNotSet(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "expect deleted but not deleted", + Mapping: "output = deleted()", + Deleted: true, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{}, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "expect deleted but not deleted") +} + +func TestRunFile_ExplicitOutputMetadata(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "explicit metadata match", + Mapping: "output@.topic = \"events\"", + Output: map[string]any{}, + OutputMetadata: map[string]any{"topic": "events"}, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) 
(Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{}, map[string]any{"topic": "events"}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_ExplicitOutputMetadataMismatch(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "metadata mismatch", + Mapping: "output@.topic = \"events\"", + Output: map[string]any{}, + OutputMetadata: map[string]any{"topic": "events"}, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return map[string]any{}, map[string]any{"topic": "wrong"}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "metadata mismatch") +} + +func TestRunFile_NullInputDefault(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "null input default", + Mapping: "output = input", + NoOutputCheck: true, + NoMetadataCheck: true, + // Input not set — defaults to nil. 
+ }, + }, + } + + var capturedInput any + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + capturedInput = input + return nil, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) + + if capturedInput != nil { + t.Fatalf("expected nil input, got %v (%T)", capturedInput, capturedInput) + } +} + +func TestRunFile_NoOutputCheckWithoutType(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "skip output entirely", + Mapping: "output = now()", + NoOutputCheck: true, + NoMetadataCheck: true, + // No OutputType — just skip output comparison entirely. + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return "anything at all", map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) +} + +func TestRunFile_NoOutputCheckWrongType(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "wrong output type", + Mapping: "output = now()", + NoOutputCheck: true, + NoMetadataCheck: true, + OutputType: "string", + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return int64(42), map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requireFail(t, results, "wrong output type") +} + +func TestRun_WithTempDir(t *testing.T) { + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "test.yaml"), []byte(` +description: 
"integration" +tests: + - name: "passthrough" + mapping: | + output = input + input: {"x": 1} + output: {"x": 1} +`), 0o644); err != nil { + t.Fatal(err) + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return input, map[string]any{}, false, nil + }, + }, nil + }, + } + + results, err := Run(dir, interp) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if len(results) != 1 { + t.Fatalf("expected 1 result, got %d", len(results)) + } + requirePass(t, results) +} + +func TestRun_EmptyDir(t *testing.T) { + dir := t.TempDir() + _, err := Run(dir, nil) + if err == nil { + t.Fatal("expected error for empty dir") + } +} + +func TestRunFile_ValidationRejectsMultipleExpectations(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "both error and compile_error", + Mapping: "output = 1", + Error: "overflow", + CompileError: "syntax", + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + requireFail(t, results, "both error and compile_error") + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", results[0].Kind) + } +} + +func TestRunFile_ValidationRejectsNoExpectation(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "no expectation", + Mapping: "output = 1", + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + requireFail(t, results, "no expectation") + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", results[0].Kind) + } +} + +func TestRunFile_ValidationRejectsOutputTypeWithoutNoOutputCheck(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "output_type without no_output_check", + Mapping: "output = 1", + Output: int64(1), + OutputType: "int64", + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + requireFail(t, results, "output_type 
without no_output_check") + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", results[0].Kind) + } +} + +func TestRunFile_KindLoadError(t *testing.T) { + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, "bad.yaml"), []byte("{{invalid"), 0o644); err != nil { + t.Fatal(err) + } + + results, err := Run(dir, nil) + if err != nil { + t.Fatalf("unexpected infrastructure error: %v", err) + } + if results[0].Kind != KindLoadError { + t.Fatalf("expected KindLoadError, got %v", results[0].Kind) + } +} + +func TestRunFile_KindPassOnSuccess(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "passes", + Mapping: "output = 1", + Output: int64(1), + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return int64(1), map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + if results[0].Kind != KindPass { + t.Fatalf("expected KindPass, got %v", results[0].Kind) + } +} + +// --- Multi-case tests --- + +func TestRunFile_MultiCase_AllPassing(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "doubler", + Mapping: "output.v = input.x * 2", + Cases: []Case{ + {Name: "positive", Input: map[string]any{"x": int(3)}, Output: map[string]any{"v": int(6)}}, + {Name: "zero", Input: map[string]any{"x": int(0)}, Output: map[string]any{"v": int(0)}}, + {Name: "negative", Input: map[string]any{"x": int(-5)}, Output: map[string]any{"v": int(-10)}}, + }, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + m := input.(map[string]any) + x := m["x"].(int64) + return map[string]any{"v": x * 2}, 
map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + if len(results) != 3 { + t.Fatalf("expected 3 results, got %d", len(results)) + } + requirePass(t, results) + + // Verify case names are populated. + for i, name := range []string{"positive", "zero", "negative"} { + if results[i].Case != name { + t.Fatalf("result[%d].Case = %q, want %q", i, results[i].Case, name) + } + if results[i].Test != "doubler" { + t.Fatalf("result[%d].Test = %q, want %q", i, results[i].Test, "doubler") + } + } +} + +func TestRunFile_MultiCase_OneFails(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "doubler", + Mapping: "output.v = input.x * 2", + Cases: []Case{ + {Name: "correct", Input: map[string]any{"x": int(3)}, Output: map[string]any{"v": int(6)}}, + {Name: "wrong", Input: map[string]any{"x": int(5)}, Output: map[string]any{"v": int(99)}}, + }, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + m := input.(map[string]any) + x := m["x"].(int64) + return map[string]any{"v": x * 2}, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + if len(results) != 2 { + t.Fatalf("expected 2 results, got %d", len(results)) + } + if results[0].Err != nil { + t.Fatalf("expected first case to pass, got: %v", results[0].Err) + } + if results[1].Err == nil { + t.Fatal("expected second case to fail") + } + if results[1].Case != "wrong" { + t.Fatalf("failed case = %q, want %q", results[1].Case, "wrong") + } +} + +func TestRunFile_MultiCase_MixedExpectations(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "mixed", + Mapping: "output = input", + Cases: []Case{ + {Name: "output case", Input: int(42), Output: int(42)}, + {Name: "error case", Input: "bad", Error: "kaboom"}, + {Name: 
"deleted case", Input: nil, Deleted: true}, + }, + }, + }, + } + + callIdx := 0 + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) (any, map[string]any, bool, error) { + callIdx++ + switch callIdx { + case 1: + return int64(42), map[string]any{}, false, nil + case 2: + return nil, nil, false, fmt.Errorf("kaboom: bad input") + case 3: + return nil, nil, true, nil + } + return nil, nil, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + if len(results) != 3 { + t.Fatalf("expected 3 results, got %d", len(results)) + } + requirePass(t, results) +} + +func TestRunFile_MultiCase_CompileError(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "bad mapping", + Mapping: "broken", + Cases: []Case{ + {Name: "a", Output: int(1)}, + }, + }, + }, + } + + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + return nil, &CompileError{Message: "syntax error"} + }, + } + + results := RunFile(tf, "test.yaml", interp) + if len(results) != 1 { + t.Fatalf("expected 1 result for compile error, got %d", len(results)) + } + if results[0].Err == nil { + t.Fatal("expected compile error to be reported") + } + if results[0].Kind != KindFail { + t.Fatalf("expected KindFail, got %v", results[0].Kind) + } +} + +func TestRunFile_MultiCase_CompilesOnce(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "shared mapping", + Mapping: "output = input", + Cases: []Case{ + {Name: "a", Input: int(1), Output: int(1)}, + {Name: "b", Input: int(2), Output: int(2)}, + {Name: "c", Input: int(3), Output: int(3)}, + }, + }, + }, + } + + compileCount := 0 + interp := &mockInterpreter{ + compileFunc: func(mapping string, files map[string]string) (Mapping, error) { + compileCount++ + return &mockMapping{ + execFunc: func(input any, metadata map[string]any) 
(any, map[string]any, bool, error) { + return input, map[string]any{}, false, nil + }, + }, nil + }, + } + + results := RunFile(tf, "test.yaml", interp) + requirePass(t, results) + if compileCount != 1 { + t.Fatalf("expected 1 compile call, got %d", compileCount) + } +} + +func TestRunFile_MultiCase_ValidationRejectsMixedInline(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "mixed inline and cases", + Mapping: "output = 1", + Output: int(1), // inline output + Cases: []Case{ + {Name: "a", Output: int(1)}, + }, + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + if len(results) != 1 { + t.Fatalf("expected 1 result, got %d", len(results)) + } + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", results[0].Kind) + } +} + +func TestRunFile_MultiCase_ValidationRejectsCompileErrorWithCases(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "compile_error with cases", + Mapping: "bad", + CompileError: "syntax", + Cases: []Case{ + {Name: "a", Output: int(1)}, + }, + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", results[0].Kind) + } +} + +func TestRunFile_MultiCase_ValidationRejectsEmptyCases(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "empty cases", + Mapping: "output = 1", + Cases: []Case{}, + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", results[0].Kind) + } +} + +func TestRunFile_MultiCase_ValidationRejectsCaseWithoutName(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "unnamed case", + Mapping: "output = 1", + Cases: []Case{ + {Name: "ok", Output: int(1)}, + {Output: int(2)}, // no name + }, + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", 
results[0].Kind) + } +} + +func TestResult_StringWithCase(t *testing.T) { + pass := Result{File: "types/int.yaml", Test: "doubler", Case: "positive"} + expected := "PASS types/int.yaml / doubler/positive" + if pass.String() != expected { + t.Fatalf("got %q, want %q", pass.String(), expected) + } + + fail := Result{File: "types/int.yaml", Test: "doubler", Case: "negative", Err: fmt.Errorf("mismatch")} + expected = "FAIL types/int.yaml / doubler/negative: mismatch" + if fail.String() != expected { + t.Fatalf("got %q, want %q", fail.String(), expected) + } +} + +func TestRunFile_InvalidInputMetadata(t *testing.T) { + tf := &TestFile{ + Tests: []TestCase{ + { + Name: "bad metadata type", + Mapping: "output = 1", + InputMetadata: "not an object", + Output: int64(1), + }, + }, + } + + results := RunFile(tf, "test.yaml", nil) + requireFail(t, results, "bad metadata type") + if results[0].Kind != KindInvalidTest { + t.Fatalf("expected KindInvalidTest, got %v", results[0].Kind) + } +} + +func TestResult_String(t *testing.T) { + pass := Result{File: "types/int.yaml", Test: "add ints"} + if pass.String() != "PASS types/int.yaml / add ints" { + t.Fatalf("unexpected: %s", pass.String()) + } + + fail := Result{File: "types/int.yaml", Test: "add ints", Err: fmt.Errorf("mismatch")} + if fail.String() != "FAIL types/int.yaml / add ints: mismatch" { + t.Fatalf("unexpected: %s", fail.String()) + } +} diff --git a/internal/bloblang2/go/spectest/schema.go b/internal/bloblang2/go/spectest/schema.go new file mode 100644 index 000000000..32ef5e853 --- /dev/null +++ b/internal/bloblang2/go/spectest/schema.go @@ -0,0 +1,140 @@ +package spectest + +import ( + "fmt" + "os" + "path/filepath" + "sort" + "strings" + + "gopkg.in/yaml.v3" +) + +// TestFile represents one YAML test file. +type TestFile struct { + Description string `yaml:"description"` + Files map[string]string `yaml:"files"` + Tests []TestCase `yaml:"tests"` +} + +// TestCase is a single test within a file. 
+type TestCase struct { + Name string `yaml:"name"` + Mapping string `yaml:"mapping"` + Input any `yaml:"input"` + InputMetadata any `yaml:"input_metadata"` + Output any `yaml:"output"` + OutputMetadata any `yaml:"output_metadata"` + Error string `yaml:"error"` + CompileError string `yaml:"compile_error"` + Deleted bool `yaml:"deleted"` + NoOutputCheck bool `yaml:"no_output_check"` + NoMetadataCheck bool `yaml:"no_metadata_check"` + OutputType string `yaml:"output_type"` + Files map[string]string `yaml:"files"` + Cases []Case `yaml:"cases"` + HasOutput bool `yaml:"-"` // set by custom unmarshaling; true when output field is present + HasError bool `yaml:"-"` // set by custom unmarshaling; true when error field is present +} + +// Case is a single input/output case within a multi-case test. The mapping +// is defined on the parent TestCase and compiled once; each Case provides +// a different input and expected result to execute against it. +type Case struct { + Name string `yaml:"name"` + Input any `yaml:"input"` + InputMetadata any `yaml:"input_metadata"` + Output any `yaml:"output"` + OutputMetadata any `yaml:"output_metadata"` + Error string `yaml:"error"` + Deleted bool `yaml:"deleted"` + NoOutputCheck bool `yaml:"no_output_check"` + NoMetadataCheck bool `yaml:"no_metadata_check"` + OutputType string `yaml:"output_type"` + HasOutput bool `yaml:"-"` + HasError bool `yaml:"-"` +} + +// UnmarshalYAML implements custom unmarshaling to detect when the output +// or error fields are explicitly set (including to null/empty). +func (tc *TestCase) UnmarshalYAML(value *yaml.Node) error { + // Use an alias type to avoid infinite recursion. + type rawTestCase TestCase + var raw rawTestCase + if err := value.Decode(&raw); err != nil { + return err + } + *tc = TestCase(raw) + + // Check whether the "output" or "error" keys are present in the YAML mapping. 
+ if value.Kind == yaml.MappingNode { + for i := 0; i < len(value.Content)-1; i += 2 { + switch value.Content[i].Value { + case "output": + tc.HasOutput = true + case "error": + tc.HasError = true + } + } + } + return nil +} + +// UnmarshalYAML implements custom unmarshaling to detect when the output +// or error fields are explicitly set (including to null/empty). +func (c *Case) UnmarshalYAML(value *yaml.Node) error { + type rawCase Case + var raw rawCase + if err := value.Decode(&raw); err != nil { + return err + } + *c = Case(raw) + + if value.Kind == yaml.MappingNode { + for i := 0; i < len(value.Content)-1; i += 2 { + switch value.Content[i].Value { + case "output": + c.HasOutput = true + case "error": + c.HasError = true + } + } + } + return nil +} + +// LoadFile reads and unmarshals a single YAML test file. +func LoadFile(path string) (*TestFile, error) { + data, err := os.ReadFile(path) + if err != nil { + return nil, fmt.Errorf("reading test file %s: %w", path, err) + } + var tf TestFile + if err := yaml.Unmarshal(data, &tf); err != nil { + return nil, fmt.Errorf("parsing test file %s: %w", path, err) + } + return &tf, nil +} + +// DiscoverFiles recursively finds all .yaml files under dir, returning +// paths sorted lexicographically for deterministic ordering. 
+func DiscoverFiles(dir string) ([]string, error) { + var files []string + err := filepath.Walk(dir, func(path string, info os.FileInfo, err error) error { + if err != nil { + return err + } + if info.IsDir() { + return nil + } + if strings.HasSuffix(info.Name(), ".yaml") { + files = append(files, path) + } + return nil + }) + if err != nil { + return nil, fmt.Errorf("discovering test files in %s: %w", dir, err) + } + sort.Strings(files) + return files, nil +} diff --git a/internal/bloblang2/go/spectest/schema_test.go b/internal/bloblang2/go/spectest/schema_test.go new file mode 100644 index 000000000..c483ec86b --- /dev/null +++ b/internal/bloblang2/go/spectest/schema_test.go @@ -0,0 +1,137 @@ +package spectest + +import ( + "os" + "path/filepath" + "testing" +) + +func TestLoadFile(t *testing.T) { + dir := t.TempDir() + path := filepath.Join(dir, "test.yaml") + content := `description: "test file" +files: + "helper.blobl": | + map double(x) { x * 2 } +tests: + - name: "basic test" + mapping: | + output.v = 42 + output: {"v": 42} + - name: "error test" + mapping: | + output = bad + compile_error: "bad" + - name: "with input" + input: {"x": 1} + input_metadata: {"key": "val"} + mapping: | + output = input.x + output: 1 +` + if err := os.WriteFile(path, []byte(content), 0o644); err != nil { + t.Fatal(err) + } + + tf, err := LoadFile(path) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + + if tf.Description != "test file" { + t.Fatalf("expected description %q, got %q", "test file", tf.Description) + } + if len(tf.Files) != 1 { + t.Fatalf("expected 1 file, got %d", len(tf.Files)) + } + if _, ok := tf.Files["helper.blobl"]; !ok { + t.Fatal("expected helper.blobl in files") + } + if len(tf.Tests) != 3 { + t.Fatalf("expected 3 tests, got %d", len(tf.Tests)) + } + if tf.Tests[0].Name != "basic test" { + t.Fatalf("expected first test name %q, got %q", "basic test", tf.Tests[0].Name) + } + if tf.Tests[1].CompileError != "bad" { + t.Fatalf("expected 
compile_error %q, got %q", "bad", tf.Tests[1].CompileError) + } +} + +func TestLoadFile_NotFound(t *testing.T) { + _, err := LoadFile("/nonexistent/path.yaml") + if err == nil { + t.Fatal("expected error for nonexistent file") + } +} + +func TestLoadFile_InvalidYAML(t *testing.T) { + dir := t.TempDir() + path := filepath.Join(dir, "bad.yaml") + if err := os.WriteFile(path, []byte("{{invalid yaml"), 0o644); err != nil { + t.Fatal(err) + } + + _, err := LoadFile(path) + if err == nil { + t.Fatal("expected error for invalid YAML") + } +} + +func TestDiscoverFiles(t *testing.T) { + dir := t.TempDir() + + // Create nested structure. + subdir := filepath.Join(dir, "sub") + if err := os.MkdirAll(subdir, 0o755); err != nil { + t.Fatal(err) + } + + for _, name := range []string{ + filepath.Join(dir, "b.yaml"), + filepath.Join(dir, "a.yaml"), + filepath.Join(subdir, "c.yaml"), + filepath.Join(dir, "skip.txt"), + } { + if err := os.WriteFile(name, []byte(""), 0o644); err != nil { + t.Fatal(err) + } + } + + files, err := DiscoverFiles(dir) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + + if len(files) != 3 { + t.Fatalf("expected 3 yaml files, got %d: %v", len(files), files) + } + + // Should be sorted. 
+ names := make([]string, len(files)) + for i, f := range files { + rel, _ := filepath.Rel(dir, f) + names[i] = rel + } + if names[0] != "a.yaml" || names[1] != "b.yaml" || names[2] != filepath.Join("sub", "c.yaml") { + t.Fatalf("unexpected order: %v", names) + } +} + +func TestDiscoverFiles_NonexistentDir(t *testing.T) { + _, err := DiscoverFiles("/nonexistent/dir") + if err == nil { + t.Fatal("expected error for nonexistent dir") + } +} + +func TestDiscoverFiles_EmptyDir(t *testing.T) { + dir := t.TempDir() + files, err := DiscoverFiles(dir) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + if len(files) != 0 { + t.Fatalf("expected 0 files, got %d", len(files)) + } +} diff --git a/internal/bloblang2/go/spectest/typedvalue.go b/internal/bloblang2/go/spectest/typedvalue.go new file mode 100644 index 000000000..ece4c5eb4 --- /dev/null +++ b/internal/bloblang2/go/spectest/typedvalue.go @@ -0,0 +1,158 @@ +package spectest + +import ( + "encoding/base64" + "fmt" + "math" + "strconv" + "time" +) + +// NormalizeYAMLValue recursively converts yaml.v3 decoded values to the +// canonical Go types expected by the rest of the package: +// - int → int64 (yaml.v3 decodes bare integers as int) +// - map[any]any keys are converted to strings with fmt.Sprintf("%v") +// - slices and maps are recursively normalized +func NormalizeYAMLValue(v any) any { + switch val := v.(type) { + case int: + return int64(val) + case map[string]any: + out := make(map[string]any, len(val)) + for k, v := range val { + out[k] = NormalizeYAMLValue(v) + } + return out + case map[any]any: + out := make(map[string]any, len(val)) + for k, v := range val { + out[fmt.Sprintf("%v", k)] = NormalizeYAMLValue(v) + } + return out + case []any: + out := make([]any, len(val)) + for i, v := range val { + out[i] = NormalizeYAMLValue(v) + } + return out + default: + return v + } +} + +// DecodeTypedValues recursively walks a value tree and decodes type +// annotations of the form {_type: "typename", value: "string_value"} +// into the
corresponding Go types. +// +// A map is treated as a type annotation only when it has exactly two +// keys: "_type" and "value", both with string values. +func DecodeTypedValues(v any) (any, error) { + switch val := v.(type) { + case map[string]any: + if len(val) == 2 { + typeName, hasType := val["_type"].(string) + valueStr, hasValue := val["value"].(string) + if hasType && hasValue { + return decodeTypedValue(typeName, valueStr) + } + } + out := make(map[string]any, len(val)) + for k, v := range val { + decoded, err := DecodeTypedValues(v) + if err != nil { + return nil, fmt.Errorf("key %q: %w", k, err) + } + out[k] = decoded + } + return out, nil + case []any: + out := make([]any, len(val)) + for i, v := range val { + decoded, err := DecodeTypedValues(v) + if err != nil { + return nil, fmt.Errorf("index %d: %w", i, err) + } + out[i] = decoded + } + return out, nil + default: + return v, nil + } +} + +func decodeTypedValue(typeName, valueStr string) (any, error) { + switch typeName { + case "int32": + n, err := strconv.ParseInt(valueStr, 10, 32) + if err != nil { + return nil, fmt.Errorf("decoding int32 %q: %w", valueStr, err) + } + return int32(n), nil + case "int64": + n, err := strconv.ParseInt(valueStr, 10, 64) + if err != nil { + return nil, fmt.Errorf("decoding int64 %q: %w", valueStr, err) + } + return n, nil + case "uint32": + n, err := strconv.ParseUint(valueStr, 10, 32) + if err != nil { + return nil, fmt.Errorf("decoding uint32 %q: %w", valueStr, err) + } + return uint32(n), nil + case "uint64": + n, err := strconv.ParseUint(valueStr, 10, 64) + if err != nil { + return nil, fmt.Errorf("decoding uint64 %q: %w", valueStr, err) + } + return n, nil + case "float32": + f, err := parseFloat(valueStr) + if err != nil { + return nil, fmt.Errorf("decoding float32 %q: %w", valueStr, err) + } + return float32(f), nil + case "float64": + f, err := parseFloat(valueStr) + if err != nil { + return nil, fmt.Errorf("decoding float64 %q: %w", valueStr, err) + } + 
return f, nil + case "bytes": + b, err := base64.StdEncoding.DecodeString(valueStr) + if err != nil { + return nil, fmt.Errorf("decoding bytes (base64) %q: %w", valueStr, err) + } + return b, nil + case "timestamp": + t, err := time.Parse(time.RFC3339Nano, valueStr) + if err != nil { + return nil, fmt.Errorf("decoding timestamp %q: %w", valueStr, err) + } + return t, nil + default: + return nil, fmt.Errorf("unknown _type %q", typeName) + } +} + +// parseFloat handles special float string values (NaN, Infinity, -0.0). +func parseFloat(s string) (float64, error) { + switch s { + case "NaN": + return math.NaN(), nil + case "Infinity": + return math.Inf(1), nil + case "-Infinity": + return math.Inf(-1), nil + case "-0.0": + return math.Float64frombits(1 << 63), nil // negative zero + default: + return strconv.ParseFloat(s, 64) + } +} + +// DecodeValue is a convenience that applies NormalizeYAMLValue then +// DecodeTypedValues in sequence. +func DecodeValue(v any) (any, error) { + return DecodeTypedValues(NormalizeYAMLValue(v)) +} diff --git a/internal/bloblang2/go/spectest/typedvalue_test.go b/internal/bloblang2/go/spectest/typedvalue_test.go new file mode 100644 index 000000000..33d9ab6fd --- /dev/null +++ b/internal/bloblang2/go/spectest/typedvalue_test.go @@ -0,0 +1,228 @@ +package spectest + +import ( + "math" + "testing" + "time" +) + +func TestNormalizeYAMLValue(t *testing.T) { + tests := []struct { + name string + input any + expected any + }{ + {"int to int64", int(42), int64(42)}, + {"int64 unchanged", int64(42), int64(42)}, + {"float64 unchanged", 3.14, 3.14}, + {"string unchanged", "hello", "hello"}, + {"bool unchanged", true, true}, + {"nil unchanged", nil, nil}, + { + "map with int values", + map[string]any{"a": int(1), "b": int(2)}, + map[string]any{"a": int64(1), "b": int64(2)}, + }, + { + "slice with int values", + []any{int(1), int(2), int(3)}, + []any{int64(1), int64(2), int64(3)}, + }, + { + "nested map", + map[string]any{"outer": 
map[string]any{"inner": int(5)}}, + map[string]any{"outer": map[string]any{"inner": int64(5)}}, + }, + { + "map[any]any to map[string]any", + map[any]any{"key": int(10)}, + map[string]any{"key": int64(10)}, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + result := NormalizeYAMLValue(tt.input) + ok, diff := DeepEqual(tt.expected, result) + if !ok { + t.Fatalf("NormalizeYAMLValue mismatch:\n%s", diff) + } + }) + } +} + +func TestDecodeTypedValues(t *testing.T) { + tests := []struct { + name string + input any + expected any + wantErr bool + }{ + { + "int32", + map[string]any{"_type": "int32", "value": "42"}, + int32(42), + false, + }, + { + "int64", + map[string]any{"_type": "int64", "value": "-100"}, + int64(-100), + false, + }, + { + "uint32", + map[string]any{"_type": "uint32", "value": "255"}, + uint32(255), + false, + }, + { + "uint64", + map[string]any{"_type": "uint64", "value": "18446744073709551615"}, + uint64(18446744073709551615), + false, + }, + { + "float32", + map[string]any{"_type": "float32", "value": "3.14"}, + float32(3.14), + false, + }, + { + "float64", + map[string]any{"_type": "float64", "value": "3.14"}, + float64(3.14), + false, + }, + { + "float64 NaN", + map[string]any{"_type": "float64", "value": "NaN"}, + math.NaN(), + false, + }, + { + "float64 Infinity", + map[string]any{"_type": "float64", "value": "Infinity"}, + math.Inf(1), + false, + }, + { + "float64 -Infinity", + map[string]any{"_type": "float64", "value": "-Infinity"}, + math.Inf(-1), + false, + }, + { + "float64 -0.0", + map[string]any{"_type": "float64", "value": "-0.0"}, + math.Float64frombits(1 << 63), + false, + }, + { + "bytes", + map[string]any{"_type": "bytes", "value": "aGVsbG8="}, + []byte("hello"), + false, + }, + { + "timestamp", + map[string]any{"_type": "timestamp", "value": "2024-03-01T12:00:00Z"}, + time.Date(2024, 3, 1, 12, 0, 0, 0, time.UTC), + false, + }, + { + "regular map not decoded", + map[string]any{"_type": "int32", "value": 
"42", "extra": "field"}, + map[string]any{"_type": "int32", "value": "42", "extra": "field"}, + false, + }, + { + "nested typed values in map", + map[string]any{"a": map[string]any{"_type": "int32", "value": "5"}}, + map[string]any{"a": int32(5)}, + false, + }, + { + "nested typed values in slice", + []any{map[string]any{"_type": "uint64", "value": "100"}, "plain"}, + []any{uint64(100), "plain"}, + false, + }, + { + "scalar passthrough", + "hello", + "hello", + false, + }, + { + "nil passthrough", + nil, + nil, + false, + }, + { + "unknown type errors", + map[string]any{"_type": "unknown", "value": "x"}, + nil, + true, + }, + { + "value not a string — treated as regular map", + map[string]any{"_type": "int32", "value": 42}, + map[string]any{"_type": "int32", "value": 42}, + false, + }, + { + "type not a string — treated as regular map", + map[string]any{"_type": 99, "value": "hello"}, + map[string]any{"_type": 99, "value": "hello"}, + false, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + result, err := DecodeTypedValues(tt.input) + if tt.wantErr { + if err == nil { + t.Fatalf("expected error, got nil") + } + return + } + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + + // No special case is needed for NaN: DeepEqual handles it. + ok, diff := DeepEqual(tt.expected, result) + if !ok { + t.Fatalf("DecodeTypedValues mismatch:\n%s", diff) + } + }) + } +} + +func TestDecodeValue(t *testing.T) { + // Tests the combined NormalizeYAMLValue + DecodeTypedValues pipeline. 
+ input := map[string]any{ + "count": int(42), + "typed": map[string]any{"_type": "float32", "value": "1.5"}, + "nested": []any{int(1), map[string]any{"_type": "int32", "value": "2"}}, + } + + result, err := DecodeValue(input) + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + + expected := map[string]any{ + "count": int64(42), + "typed": float32(1.5), + "nested": []any{int64(1), int32(2)}, + } + + ok, diff := DeepEqual(expected, result) + if !ok { + t.Fatalf("DecodeValue mismatch:\n%s", diff) + } +} From b2bb886c7d3c4206cfcadc8c4f1980bbb7040eac Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Thu, 23 Apr 2026 11:20:42 +0100 Subject: [PATCH 05/20] bloblang(v2): Add spec conformance test corpus Adds internal/bloblang2/spec/tests/, the YAML corpus of conformance cases that anchors both runtimes to the V2 spec. Tests are organised by topic: access, case_studies, control_flow, edge_cases, error_handling, imports, input_output, lambdas, maps, operators, optimizations, stdlib, types, variables. Each case is executed via the spectest runner; both the Go and TS runtimes are required to pass the full corpus. 
--- internal/bloblang2/spec/tests/README.md | 34 ++ .../spec/tests/access/dynamic_access.yaml | 185 +++++++++ .../spec/tests/access/field_access.yaml | 161 ++++++++ .../spec/tests/access/negative_indexing.yaml | 129 ++++++ .../spec/tests/access/null_safe.yaml | 149 +++++++ .../spec/tests/access/out_of_bounds.yaml | 222 +++++++++++ .../cloudformation_inventory.yaml | 152 ++++++++ .../spec/tests/case_studies/debezium_cdc.yaml | 138 +++++++ .../tests/case_studies/ecommerce_order.yaml | 158 ++++++++ .../tests/case_studies/ga4_clickstream.yaml | 213 ++++++++++ .../tests/case_studies/github_webhook.yaml | 95 +++++ .../tests/case_studies/kubernetes_pod.yaml | 157 ++++++++ .../tests/case_studies/nlp_enrichment.yaml | 83 ++++ .../spec/tests/case_studies/otel_traces.yaml | 195 +++++++++ .../tests/case_studies/stripe_invoice.yaml | 111 ++++++ .../case_studies/v2_feature_showcase.yaml | 198 ++++++++++ .../tests/case_studies/vpc_flow_logs.yaml | 159 ++++++++ .../tests/control_flow/block_scoping.yaml | 233 +++++++++++ .../tests/control_flow/if_else_chains.yaml | 119 ++++++ .../tests/control_flow/if_expression.yaml | 145 +++++++ .../spec/tests/control_flow/if_statement.yaml | 177 +++++++++ .../spec/tests/control_flow/match_as.yaml | 186 +++++++++ .../tests/control_flow/match_block_body.yaml | 216 ++++++++++ .../tests/control_flow/match_boolean.yaml | 157 ++++++++ .../tests/control_flow/match_edge_cases.yaml | 173 ++++++++ .../tests/control_flow/match_equality.yaml | 218 +++++++++++ .../spec/tests/control_flow/match_void.yaml | 139 +++++++ .../spec/tests/edge_cases/deeply_nested.yaml | 108 +++++ .../tests/edge_cases/empty_collections.yaml | 108 +++++ .../spec/tests/edge_cases/infinity.yaml | 126 ++++++ .../tests/edge_cases/integer_overflow.yaml | 114 ++++++ .../edge_cases/integer_overflow_ops.yaml | 117 ++++++ .../tests/edge_cases/interpreter_reuse.yaml | 39 ++ .../spec/tests/edge_cases/nan_behavior.yaml | 123 ++++++ .../spec/tests/edge_cases/precision_loss.yaml | 80 ++++ 
.../tests/edge_cases/string_codepoints.yaml | 139 +++++++ .../spec/tests/edge_cases/unicode.yaml | 103 +++++ .../tests/edge_cases/whitespace_newlines.yaml | 138 +++++++ .../spec/tests/error_handling/catch.yaml | 162 ++++++++ .../spec/tests/error_handling/not_null.yaml | 141 +++++++ .../spec/tests/error_handling/or.yaml | 166 ++++++++ .../error_handling/or_catch_composition.yaml | 122 ++++++ .../tests/error_handling/propagation.yaml | 141 +++++++ .../spec/tests/error_handling/throw.yaml | 135 +++++++ .../spec/tests/imports/basic_import.yaml | 141 +++++++ .../spec/tests/imports/circular_import.yaml | 92 +++++ .../tests/imports/duplicate_namespace.yaml | 81 ++++ .../spec/tests/imports/nested_import.yaml | 104 +++++ .../input_output/conditional_deletion.yaml | 194 +++++++++ .../spec/tests/input_output/deletion.yaml | 247 ++++++++++++ .../tests/input_output/dynamic_metadata.yaml | 95 +++++ .../spec/tests/input_output/input_access.yaml | 163 ++++++++ .../spec/tests/input_output/metadata.yaml | 187 +++++++++ .../tests/input_output/output_assignment.yaml | 159 ++++++++ .../spec/tests/input_output/output_root.yaml | 145 +++++++ .../bloblang2/spec/tests/lambdas/basic.yaml | 151 +++++++ .../spec/tests/lambdas/complex_iterators.yaml | 132 +++++++ .../spec/tests/lambdas/defaults.yaml | 83 ++++ .../spec/tests/lambdas/discard_params.yaml | 82 ++++ .../spec/tests/lambdas/fold_patterns.yaml | 114 ++++++ .../spec/tests/lambdas/outer_capture.yaml | 97 +++++ .../tests/lambdas/position_restriction.yaml | 128 ++++++ .../bloblang2/spec/tests/lambdas/purity.yaml | 141 +++++++ .../spec/tests/lambdas/return_values.yaml | 143 +++++++ internal/bloblang2/spec/tests/maps/basic.yaml | 192 +++++++++ .../bloblang2/spec/tests/maps/defaults.yaml | 176 +++++++++ .../spec/tests/maps/discard_params.yaml | 141 +++++++ .../spec/tests/maps/higher_order.yaml | 125 ++++++ .../bloblang2/spec/tests/maps/isolation.yaml | 158 ++++++++ .../bloblang2/spec/tests/maps/named_args.yaml | 141 +++++++ 
.../spec/tests/maps/parameter_shadowing.yaml | 106 +++++ .../bloblang2/spec/tests/maps/recursion.yaml | 142 +++++++ .../spec/tests/maps/recursion_advanced.yaml | 185 +++++++++ .../tests/maps/recursive_with_iterators.yaml | 103 +++++ .../spec/tests/maps/transitive_calls.yaml | 92 +++++ .../spec/tests/maps/void_returns.yaml | 114 ++++++ .../spec/tests/operators/arithmetic.yaml | 200 ++++++++++ .../spec/tests/operators/comparison.yaml | 209 ++++++++++ .../spec/tests/operators/division_modulo.yaml | 118 ++++++ .../spec/tests/operators/equality.yaml | 215 ++++++++++ .../spec/tests/operators/logical.yaml | 265 +++++++++++++ .../tests/operators/numeric_promotion.yaml | 186 +++++++++ .../operators/numeric_promotion_edge.yaml | 179 +++++++++ .../spec/tests/operators/precedence.yaml | 260 ++++++++++++ .../spec/tests/operators/string_concat.yaml | 201 ++++++++++ .../tests/optimizations/constant_folding.yaml | 235 +++++++++++ .../optimizations/dead_code_elimination.yaml | 150 +++++++ .../tests/optimizations/path_collapse.yaml | 167 ++++++++ .../spec/tests/stdlib/any_all_methods.yaml | 100 +++++ .../spec/tests/stdlib/array_modify.yaml | 196 ++++++++++ .../spec/tests/stdlib/array_query.yaml | 352 +++++++++++++++++ .../spec/tests/stdlib/array_transform.yaml | 292 ++++++++++++++ .../spec/tests/stdlib/collect_method.yaml | 74 ++++ .../spec/tests/stdlib/core_functions.yaml | 247 ++++++++++++ .../bloblang2/spec/tests/stdlib/encoding.yaml | 363 +++++++++++++++++ .../spec/tests/stdlib/enumerate_method.yaml | 139 +++++++ .../spec/tests/stdlib/find_method.yaml | 81 ++++ .../spec/tests/stdlib/into_method.yaml | 173 ++++++++ .../tests/stdlib/iter_chain_patterns.yaml | 108 +++++ .../spec/tests/stdlib/method_composition.yaml | 165 ++++++++ .../spec/tests/stdlib/numeric_methods.yaml | 325 +++++++++++++++ .../spec/tests/stdlib/object_methods.yaml | 224 +++++++++++ .../spec/tests/stdlib/object_transform.yaml | 192 +++++++++ .../spec/tests/stdlib/sequence_methods.yaml | 322 +++++++++++++++ 
.../spec/tests/stdlib/sort_edge_cases.yaml | 152 ++++++++ .../spec/tests/stdlib/string_methods.yaml | 267 +++++++++++++ .../spec/tests/stdlib/string_regex.yaml | 158 ++++++++ .../spec/tests/stdlib/timestamp_methods.yaml | 369 ++++++++++++++++++ .../spec/tests/stdlib/type_conversion.yaml | 357 +++++++++++++++++ .../spec/tests/stdlib/unique_flatten.yaml | 98 +++++ .../spec/tests/stdlib/void_function.yaml | 204 ++++++++++ .../bloblang2/spec/tests/types/array.yaml | 167 ++++++++ .../bloblang2/spec/tests/types/bool_null.yaml | 244 ++++++++++++ .../bloblang2/spec/tests/types/bytes.yaml | 263 +++++++++++++ .../bloblang2/spec/tests/types/floats.yaml | 251 ++++++++++++ .../bloblang2/spec/tests/types/integers.yaml | 278 +++++++++++++ .../bloblang2/spec/tests/types/object.yaml | 198 ++++++++++ .../bloblang2/spec/tests/types/string.yaml | 320 +++++++++++++++ .../bloblang2/spec/tests/types/timestamp.yaml | 244 ++++++++++++ .../tests/types/timestamp_arithmetic.yaml | 195 +++++++++ .../spec/tests/types/type_introspection.yaml | 184 +++++++++ internal/bloblang2/spec/tests/types/void.yaml | 184 +++++++++ .../variables/bare_ident_resolution.yaml | 89 +++++ .../spec/tests/variables/copy_on_write.yaml | 185 +++++++++ .../spec/tests/variables/declaration.yaml | 155 ++++++++ .../tests/variables/dynamic_assignment.yaml | 78 ++++ .../variables/expr_body_path_assign.yaml | 144 +++++++ .../variables/nested_scope_mutations.yaml | 167 ++++++++ .../spec/tests/variables/path_assignment.yaml | 235 +++++++++++ .../spec/tests/variables/reassignment.yaml | 209 ++++++++++ .../tests/variables/scope_boundaries.yaml | 138 +++++++ .../spec/tests/variables/shadowing.yaml | 222 +++++++++++ 132 files changed, 22211 insertions(+) create mode 100644 internal/bloblang2/spec/tests/README.md create mode 100644 internal/bloblang2/spec/tests/access/dynamic_access.yaml create mode 100644 internal/bloblang2/spec/tests/access/field_access.yaml create mode 100644 
internal/bloblang2/spec/tests/access/negative_indexing.yaml create mode 100644 internal/bloblang2/spec/tests/access/null_safe.yaml create mode 100644 internal/bloblang2/spec/tests/access/out_of_bounds.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/cloudformation_inventory.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/debezium_cdc.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/ecommerce_order.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/ga4_clickstream.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/github_webhook.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/kubernetes_pod.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/nlp_enrichment.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/otel_traces.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/stripe_invoice.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/v2_feature_showcase.yaml create mode 100644 internal/bloblang2/spec/tests/case_studies/vpc_flow_logs.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/block_scoping.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/if_else_chains.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/if_expression.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/if_statement.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/match_as.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/match_block_body.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/match_boolean.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/match_edge_cases.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/match_equality.yaml create mode 100644 internal/bloblang2/spec/tests/control_flow/match_void.yaml create mode 100644 
internal/bloblang2/spec/tests/edge_cases/deeply_nested.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/empty_collections.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/infinity.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/integer_overflow.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/integer_overflow_ops.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/interpreter_reuse.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/nan_behavior.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/precision_loss.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/string_codepoints.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/unicode.yaml create mode 100644 internal/bloblang2/spec/tests/edge_cases/whitespace_newlines.yaml create mode 100644 internal/bloblang2/spec/tests/error_handling/catch.yaml create mode 100644 internal/bloblang2/spec/tests/error_handling/not_null.yaml create mode 100644 internal/bloblang2/spec/tests/error_handling/or.yaml create mode 100644 internal/bloblang2/spec/tests/error_handling/or_catch_composition.yaml create mode 100644 internal/bloblang2/spec/tests/error_handling/propagation.yaml create mode 100644 internal/bloblang2/spec/tests/error_handling/throw.yaml create mode 100644 internal/bloblang2/spec/tests/imports/basic_import.yaml create mode 100644 internal/bloblang2/spec/tests/imports/circular_import.yaml create mode 100644 internal/bloblang2/spec/tests/imports/duplicate_namespace.yaml create mode 100644 internal/bloblang2/spec/tests/imports/nested_import.yaml create mode 100644 internal/bloblang2/spec/tests/input_output/conditional_deletion.yaml create mode 100644 internal/bloblang2/spec/tests/input_output/deletion.yaml create mode 100644 internal/bloblang2/spec/tests/input_output/dynamic_metadata.yaml create mode 100644 internal/bloblang2/spec/tests/input_output/input_access.yaml create mode 100644 
internal/bloblang2/spec/tests/input_output/metadata.yaml create mode 100644 internal/bloblang2/spec/tests/input_output/output_assignment.yaml create mode 100644 internal/bloblang2/spec/tests/input_output/output_root.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/basic.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/complex_iterators.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/defaults.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/discard_params.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/fold_patterns.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/outer_capture.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/position_restriction.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/purity.yaml create mode 100644 internal/bloblang2/spec/tests/lambdas/return_values.yaml create mode 100644 internal/bloblang2/spec/tests/maps/basic.yaml create mode 100644 internal/bloblang2/spec/tests/maps/defaults.yaml create mode 100644 internal/bloblang2/spec/tests/maps/discard_params.yaml create mode 100644 internal/bloblang2/spec/tests/maps/higher_order.yaml create mode 100644 internal/bloblang2/spec/tests/maps/isolation.yaml create mode 100644 internal/bloblang2/spec/tests/maps/named_args.yaml create mode 100644 internal/bloblang2/spec/tests/maps/parameter_shadowing.yaml create mode 100644 internal/bloblang2/spec/tests/maps/recursion.yaml create mode 100644 internal/bloblang2/spec/tests/maps/recursion_advanced.yaml create mode 100644 internal/bloblang2/spec/tests/maps/recursive_with_iterators.yaml create mode 100644 internal/bloblang2/spec/tests/maps/transitive_calls.yaml create mode 100644 internal/bloblang2/spec/tests/maps/void_returns.yaml create mode 100644 internal/bloblang2/spec/tests/operators/arithmetic.yaml create mode 100644 internal/bloblang2/spec/tests/operators/comparison.yaml create mode 100644 
internal/bloblang2/spec/tests/operators/division_modulo.yaml create mode 100644 internal/bloblang2/spec/tests/operators/equality.yaml create mode 100644 internal/bloblang2/spec/tests/operators/logical.yaml create mode 100644 internal/bloblang2/spec/tests/operators/numeric_promotion.yaml create mode 100644 internal/bloblang2/spec/tests/operators/numeric_promotion_edge.yaml create mode 100644 internal/bloblang2/spec/tests/operators/precedence.yaml create mode 100644 internal/bloblang2/spec/tests/operators/string_concat.yaml create mode 100644 internal/bloblang2/spec/tests/optimizations/constant_folding.yaml create mode 100644 internal/bloblang2/spec/tests/optimizations/dead_code_elimination.yaml create mode 100644 internal/bloblang2/spec/tests/optimizations/path_collapse.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/any_all_methods.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/array_modify.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/array_query.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/array_transform.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/collect_method.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/core_functions.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/encoding.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/enumerate_method.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/find_method.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/into_method.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/iter_chain_patterns.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/method_composition.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/numeric_methods.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/object_methods.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/object_transform.yaml create mode 100644 
internal/bloblang2/spec/tests/stdlib/sequence_methods.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/sort_edge_cases.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/string_methods.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/string_regex.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/timestamp_methods.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/type_conversion.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/unique_flatten.yaml create mode 100644 internal/bloblang2/spec/tests/stdlib/void_function.yaml create mode 100644 internal/bloblang2/spec/tests/types/array.yaml create mode 100644 internal/bloblang2/spec/tests/types/bool_null.yaml create mode 100644 internal/bloblang2/spec/tests/types/bytes.yaml create mode 100644 internal/bloblang2/spec/tests/types/floats.yaml create mode 100644 internal/bloblang2/spec/tests/types/integers.yaml create mode 100644 internal/bloblang2/spec/tests/types/object.yaml create mode 100644 internal/bloblang2/spec/tests/types/string.yaml create mode 100644 internal/bloblang2/spec/tests/types/timestamp.yaml create mode 100644 internal/bloblang2/spec/tests/types/timestamp_arithmetic.yaml create mode 100644 internal/bloblang2/spec/tests/types/type_introspection.yaml create mode 100644 internal/bloblang2/spec/tests/types/void.yaml create mode 100644 internal/bloblang2/spec/tests/variables/bare_ident_resolution.yaml create mode 100644 internal/bloblang2/spec/tests/variables/copy_on_write.yaml create mode 100644 internal/bloblang2/spec/tests/variables/declaration.yaml create mode 100644 internal/bloblang2/spec/tests/variables/dynamic_assignment.yaml create mode 100644 internal/bloblang2/spec/tests/variables/expr_body_path_assign.yaml create mode 100644 internal/bloblang2/spec/tests/variables/nested_scope_mutations.yaml create mode 100644 internal/bloblang2/spec/tests/variables/path_assignment.yaml create mode 100644 
internal/bloblang2/spec/tests/variables/reassignment.yaml create mode 100644 internal/bloblang2/spec/tests/variables/scope_boundaries.yaml create mode 100644 internal/bloblang2/spec/tests/variables/shadowing.yaml diff --git a/internal/bloblang2/spec/tests/README.md b/internal/bloblang2/spec/tests/README.md new file mode 100644 index 000000000..53775b5dd --- /dev/null +++ b/internal/bloblang2/spec/tests/README.md @@ -0,0 +1,34 @@ +# Bloblang V2 Test Suite + +Machine-readable test suite for Bloblang V2 implementations. See `../TEST_PLAN.md` for the full schema documentation. + +## Quick Reference + +Each YAML file contains a `tests` array. Each test has: + +- `name` — unique identifier +- `mapping` — the Bloblang mapping to execute +- `input` — input document (default: `null`) +- `input_metadata` — input metadata (default: `{}`) +- Exactly one expectation: + - `output` — expected output (order-independent deep equality for objects) + - `deleted: true` — expect message deletion + - `error` — expect runtime error (substring match) + - `compile_error` — expect compile error (substring match) +- Optional: `output_metadata`, `no_output_check`, `output_type` + +## Type Annotations + +Use `{_type: "typename", value: "string_value"}` for precise types: + +- `int32`, `int64`, `uint32`, `uint64`, `float32`, `float64` +- `bytes` (base64-encoded value) +- `timestamp` (RFC 3339 value) + +All `value` fields are strings. Plain YAML integers default to int64, floats to float64. 
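The annotation rule described above can be sketched in Go. This is an illustrative sketch only, loosely modeled on the `decodeTypedValue` function in `typedvalue.go` from this patch series. The helper name `decodeAnnotation` and its three-value return are assumptions made for the example, and only a subset of the listed types is handled.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
	"time"
)

// decodeAnnotation is a hypothetical helper (not part of the patch). It
// treats v as a type annotation only when it is a map with exactly the two
// keys "_type" and "value", both holding strings. The second return value
// reports whether v was recognised as an annotation.
func decodeAnnotation(v any) (any, bool, error) {
	m, isMap := v.(map[string]any)
	if !isMap || len(m) != 2 {
		return v, false, nil
	}
	typeName, hasType := m["_type"].(string)
	value, hasValue := m["value"].(string)
	if !hasType || !hasValue {
		// A non-string "_type" or "value" means this is a plain object.
		return v, false, nil
	}

	switch typeName {
	case "int32":
		n, err := strconv.ParseInt(value, 10, 32)
		if err != nil {
			return nil, true, fmt.Errorf("decoding int32 %q: %w", value, err)
		}
		return int32(n), true, nil
	case "uint64":
		n, err := strconv.ParseUint(value, 10, 64)
		if err != nil {
			return nil, true, fmt.Errorf("decoding uint64 %q: %w", value, err)
		}
		return n, true, nil
	case "bytes":
		// The value of a bytes annotation is base64-encoded.
		b, err := base64.StdEncoding.DecodeString(value)
		if err != nil {
			return nil, true, fmt.Errorf("decoding bytes %q: %w", value, err)
		}
		return b, true, nil
	case "timestamp":
		// The value of a timestamp annotation is an RFC 3339 string.
		ts, err := time.Parse(time.RFC3339Nano, value)
		if err != nil {
			return nil, true, fmt.Errorf("decoding timestamp %q: %w", value, err)
		}
		return ts, true, nil
	default:
		return nil, true, fmt.Errorf("unknown _type %q", typeName)
	}
}

func main() {
	inputs := []any{
		map[string]any{"_type": "int32", "value": "42"},
		map[string]any{"_type": "bytes", "value": "aGVsbG8="},
		map[string]any{"_type": "int32", "value": 42}, // value is not a string
	}
	for _, in := range inputs {
		out, annotated, err := decodeAnnotation(in)
		fmt.Printf("%T annotated=%t err=%v\n", out, annotated, err)
	}
}
```

Returning a separate "was this an annotation" flag keeps the plain-object case distinct from a decoding failure, so a caller could fall back to recursing into ordinary maps.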
+ +## Output Semantics + +- Output starts as `{}` before mapping runs +- Object comparison is order-independent +- `output_metadata` defaults to `{}` when not specified diff --git a/internal/bloblang2/spec/tests/access/dynamic_access.yaml b/internal/bloblang2/spec/tests/access/dynamic_access.yaml new file mode 100644 index 000000000..512e9ca6d --- /dev/null +++ b/internal/bloblang2/spec/tests/access/dynamic_access.yaml @@ -0,0 +1,185 @@ +description: "Dynamic access with [expr]: arrays, objects, strings, bytes; type errors for wrong index types" + +tests: + # --- Object dynamic access --- + + - name: "object dynamic access with string literal" + mapping: | + $obj = {"name": "Alice"} + output.v = $obj["name"] + output: {"v": "Alice"} + + - name: "object dynamic access with variable" + mapping: | + $obj = {"color": "blue"} + $key = "color" + output.v = $obj[$key] + output: {"v": "blue"} + + - name: "object dynamic access with expression" + mapping: | + $obj = {"key_a": "found"} + $suffix = "a" + output.v = $obj["key_" + $suffix] + output: {"v": "found"} + + - name: "object dynamic access non-existent key returns null" + mapping: | + $obj = {"a": 1} + output.v = $obj["missing"] + output: {"v": null} + + - name: "object dynamic access with integer key is error" + mapping: | + $obj = {"name": "Alice"} + output.v = $obj[0] + error: "non-string" + + - name: "object dynamic access with bool key is error" + mapping: | + $obj = {"name": "Alice"} + output.v = $obj[true] + error: "non-string" + + - name: "object dynamic access with null key is error" + mapping: | + $obj = {"name": "Alice"} + output.v = $obj[null] + error: "non-string" + + # --- Array dynamic access --- + + - name: "array index zero" + mapping: | + $arr = ["a", "b", "c"] + output.v = $arr[0] + output: {"v": "a"} + + - name: "array index middle" + mapping: | + $arr = ["a", "b", "c"] + output.v = $arr[1] + output: {"v": "b"} + + - name: "array index last" + mapping: | + $arr = ["a", "b", "c"] + output.v = $arr[2] 
+ output: {"v": "c"} + + - name: "array index with float whole number accepted" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[2.0] + output: {"v": 30} + + - name: "array index with non-whole float is error" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[1.5] + error: "whole number" + + - name: "array index with string is error" + mapping: | + $arr = [10, 20, 30] + output.v = $arr["0"] + error: "non-numeric" + + - name: "array index with bool is error" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[true] + error: "non-numeric" + + - name: "array index with variable" + mapping: | + $arr = [10, 20, 30] + $i = 1 + output.v = $arr[$i] + output: {"v": 20} + + - name: "nested array and object dynamic access" + input: {"users": [{"name": "Alice"}, {"name": "Bob"}]} + mapping: | + output.v = input.users[1]["name"] + output: {"v": "Bob"} + + # --- String dynamic access (codepoint) --- + + - name: "string index returns codepoint value" + mapping: | + output.v = "hello"[0] + output: {"v": 104} + + - name: "string index with float whole number accepted" + mapping: | + output.v = "hello"[2.0] + output: {"v": 108} + + - name: "string index with non-whole float is error" + mapping: | + output.v = "hello"[1.5] + error: "whole number" + + - name: "string index with string key is error" + mapping: | + output.v = "hello"["0"] + error: "non-numeric" + + - name: "string index non-ascii codepoint value" + mapping: | + output.v = "caf\u00E9"[3] + output: {"v": 233} + + # --- Bytes dynamic access --- + + - name: "bytes index returns byte value" + mapping: | + output.v = "hello".bytes()[0] + output: {"v": 104} + + - name: "bytes index with float whole number accepted" + mapping: | + output.v = "hello".bytes()[4.0] + output: {"v": 111} + + - name: "bytes index with non-whole float is error" + mapping: | + output.v = "hello".bytes()[1.5] + error: "whole number" + + # --- Indexing non-indexable types --- + + - name: "index on boolean is error" + mapping: | + output.v = 
true[0] + error: "index" + + - name: "index on integer is error" + mapping: | + output.v = 42[0] + error: "index" + + - name: "index on float is error" + mapping: | + output.v = 3.14[0] + error: "index" + + - name: "index on null is error" + mapping: | + output.v = null[0] + error: "index" + + # --- Chained dynamic access --- + + - name: "chained bracket access on nested object" + mapping: | + $data = {"a": {"b": {"c": "deep"}}} + output.v = $data["a"]["b"]["c"] + output: {"v": "deep"} + + - name: "mixed dot and bracket access" + input: {"users": [{"name": "Alice"}]} + mapping: | + output.v = input.users[0].name + output: {"v": "Alice"} diff --git a/internal/bloblang2/spec/tests/access/field_access.yaml b/internal/bloblang2/spec/tests/access/field_access.yaml new file mode 100644 index 000000000..5f4492665 --- /dev/null +++ b/internal/bloblang2/spec/tests/access/field_access.yaml @@ -0,0 +1,161 @@ +description: "Static field access: dot notation, keywords as fields, quoted fields, nested access, null for missing, errors on non-objects" + +tests: + # --- Basic dot notation --- + + - name: "access input field" + input: {"name": "Alice"} + mapping: | + output.v = input.name + output: {"v": "Alice"} + + - name: "access nested input field" + input: {"user": {"name": "Alice"}} + mapping: | + output.v = input.user.name + output: {"v": "Alice"} + + - name: "deeply nested field access" + mapping: | + output.v = input.a.b.c.d + cases: + - name: "all present" + input: {"a": {"b": {"c": {"d": 42}}}} + output: {"v": 42} + - name: "null intermediate is error" + input: {"a": {}} + error: "null" + + - name: "access output field after assignment" + mapping: | + output.x = 10 + output.y = output.x + 5 + output: {"x": 10, "y": 15} + + - name: "access variable field" + mapping: | + $obj = {"name": "Bob", "age": 25} + output.name = $obj.name + output.age = $obj.age + output: {"name": "Bob", "age": 25} + + # --- Non-existent fields return null --- + + - name: "non-existent field on 
input returns null" + input: {"name": "Alice"} + mapping: | + output.v = input.missing + output: {"v": null} + + - name: "non-existent nested field returns null" + input: {"user": {"name": "Alice"}} + mapping: | + output.v = input.user.email + output: {"v": null} + + - name: "non-existent field on variable returns null" + mapping: | + $obj = {"x": 1} + output.v = $obj.y + output: {"v": null} + + - name: "non-existent field on empty object returns null" + input: {} + mapping: | + output.v = input.anything + output: {"v": null} + + # --- Keywords as valid field names --- + + - name: "keyword map as field name" + input: {"map": "value"} + mapping: | + output.v = input.map + output: {"v": "value"} + + - name: "keyword if as field name" + mapping: | + output.if = "conditional" + output.v = output.if + output: {"if": "conditional", "v": "conditional"} + + - name: "keyword match as field name" + input: {"match": 42} + mapping: | + output.v = input.match + output: {"v": 42} + + - name: "keyword true as field name" + input: {"true": "yes"} + mapping: | + output.v = input.true + output: {"v": "yes"} + + - name: "keyword null as field name" + input: {"null": "not actually null"} + mapping: | + output.v = input.null + output: {"v": "not actually null"} + + # --- Quoted field names --- + + - name: "quoted field with spaces" + input: {"field with spaces": "hello"} + mapping: | + output.v = input."field with spaces" + output: {"v": "hello"} + + - name: "quoted field with special characters" + input: {"special-chars!@#": "value"} + mapping: | + output.v = input."special-chars!@#" + output: {"v": "value"} + + - name: "quoted field starting with digit" + input: {"123": "numeric"} + mapping: | + output.v = input."123" + output: {"v": "numeric"} + + - name: "quoted field with dots" + input: {"a.b.c": "dotted"} + mapping: | + output.v = input."a.b.c" + output: {"v": "dotted"} + + - name: "quoted field on output" + mapping: | + output."my-field" = "works" + output: {"my-field": "works"} 
+ + - name: "quoted field nested access" + input: {"top level": {"inner-key": "found"}} + mapping: | + output.v = input."top level"."inner-key" + output: {"v": "found"} + + # --- Field access on non-object types is error --- + + - name: "field access on string is error" + mapping: | + $s = "hello" + output.v = $s.field + error: "field" + + - name: "field access on integer is error" + mapping: | + $n = 42 + output.v = $n.field + error: "field" + + - name: "field access on boolean is error" + mapping: | + $b = true + output.v = $b.field + error: "field" + + - name: "field access on array is error" + mapping: | + $arr = [1, 2, 3] + output.v = $arr.field + error: "field" diff --git a/internal/bloblang2/spec/tests/access/negative_indexing.yaml b/internal/bloblang2/spec/tests/access/negative_indexing.yaml new file mode 100644 index 000000000..a61eeda87 --- /dev/null +++ b/internal/bloblang2/spec/tests/access/negative_indexing.yaml @@ -0,0 +1,129 @@ +description: "Negative indexing for arrays, strings, and bytes: -1 is last, -2 is second-to-last, out-of-bounds errors" + +tests: + # --- Array negative indexing --- + + - name: "array negative index -1 is last element" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-1] + output: {"v": 30} + + - name: "array negative index -2 is second to last" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-2] + output: {"v": 20} + + - name: "array negative index -3 is first element" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-3] + output: {"v": 10} + + - name: "array negative index on single element" + mapping: | + $arr = [42] + output.v = $arr[-1] + output: {"v": 42} + + - name: "array negative index out of bounds" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-4] + error: "out of bounds" + + - name: "array negative index far out of bounds" + mapping: | + $arr = [1, 2] + output.v = $arr[-100] + error: "out of bounds" + + - name: "array negative index on single element -2 is error" + mapping: | + $arr = [42] + 
output.v = $arr[-2] + error: "out of bounds" + + - name: "array negative index with float whole number" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-1.0] + output: {"v": 30} + + # --- String negative indexing (codepoint) --- + + - name: "string negative index -1 is last codepoint" + mapping: | + output.v = "hello"[-1] + output: {"v": 111} + + - name: "string negative index -2 is second to last" + mapping: | + output.v = "hello"[-2] + output: {"v": 108} + + - name: "string negative index -5 is first codepoint" + mapping: | + output.v = "hello"[-5] + output: {"v": 104} + + - name: "string negative index on single char" + mapping: | + output.v = "A"[-1] + output: {"v": 65} + + - name: "string negative index out of bounds" + mapping: | + output.v = "hello"[-6] + error: "out of bounds" + + - name: "string negative index on non-ascii" + mapping: | + output.v = "caf\u00E9"[-1] + output: {"v": 233} + + - name: "string negative index round trip with char" + mapping: | + output.v = "hello"[-1].char() + output: {"v": "o"} + + # --- Bytes negative indexing --- + + - name: "bytes negative index -1 is last byte" + mapping: | + output.v = "hello".bytes()[-1] + output: {"v": 111} + + - name: "bytes negative index -2 is second to last byte" + mapping: | + output.v = "hello".bytes()[-2] + output: {"v": 108} + + - name: "bytes negative index -5 is first byte" + mapping: | + output.v = "hello".bytes()[-5] + output: {"v": 104} + + - name: "bytes negative index out of bounds" + mapping: | + output.v = "hello".bytes()[-6] + error: "out of bounds" + + - name: "bytes negative index on multibyte utf8" + mapping: | + output.v = "\u00E9".bytes()[-1] + output: {"v": 169} + + - name: "bytes negative index on single byte" + mapping: | + output.v = "X".bytes()[-1] + output: {"v": 88} + + # --- Negative indexing from input --- + + - name: "array from input with negative index" + input: {"items": ["first", "middle", "last"]} + mapping: | + output.v = input.items[-1] + output: {"v": "last"} 
diff --git a/internal/bloblang2/spec/tests/access/null_safe.yaml b/internal/bloblang2/spec/tests/access/null_safe.yaml new file mode 100644 index 000000000..aa1aa7bbe --- /dev/null +++ b/internal/bloblang2/spec/tests/access/null_safe.yaml @@ -0,0 +1,149 @@ +description: "Null-safe navigation with ?. and ?[: short-circuits on null, type errors on non-null wrong type" + +tests: + # --- Basic ?. on null --- + + - name: "null-safe field access on null returns null" + mapping: | + $v = null + output.v = $v?.name + output: {"v": null} + + - name: "null-safe field access on non-null object works" + mapping: | + $v = {"name": "Alice"} + output.v = $v?.name + output: {"v": "Alice"} + + - name: "null-safe on null input field" + input: {"user": null} + mapping: | + output.v = input.user?.name + output: {"v": null} + + - name: "null-safe on missing input field returns null" + input: {} + mapping: | + output.v = input.missing?.name + output: {"v": null} + + # --- Chained ?. --- + + - name: "chained null-safe field access" + mapping: | + output.v = input.user?.address?.city + cases: + - name: "all present" + input: {"user": {"address": {"city": "London"}}} + output: {"v": "London"} + - name: "middle is null" + input: {"user": {"address": null}} + output: {"v": null} + - name: "first is null" + input: {"user": null} + output: {"v": null} + - name: "first missing" + input: {} + output: {"v": null} + + # --- ?[ on null --- + + - name: "null-safe bracket on null returns null" + mapping: | + $v = null + output.v = $v?["key"] + output: {"v": null} + + - name: "null-safe bracket on object works" + mapping: | + $v = {"key": "value"} + output.v = $v?["key"] + output: {"v": "value"} + + - name: "null-safe bracket on null array returns null" + mapping: | + $v = null + output.v = $v?[0] + output: {"v": null} + + - name: "null-safe bracket on array works" + mapping: | + $v = [10, 20, 30] + output.v = $v?[1] + output: {"v": 20} + + # --- Null-safe method call --- + + - name: "null-safe method 
on null returns null" + mapping: | + $v = null + output.v = $v?.length() + output: {"v": null} + + - name: "null-safe method on non-null works" + mapping: | + $v = "hello" + output.v = $v?.length() + output: {"v": 5} + + - name: "null-safe chained method on null" + mapping: | + $v = null + output.v = $v?.trim() + output: {"v": null} + + # --- Type errors still throw on non-null wrong type --- + + - name: "null-safe field on integer is error" + mapping: | + output.v = 5?.name + error: "field" + + - name: "null-safe field on string is error" + mapping: | + output.v = "hello"?.name + error: "field" + + - name: "null-safe field on boolean is error" + mapping: | + output.v = true?.name + error: "field" + + - name: "null-safe field on array is error" + mapping: | + output.v = [1, 2]?.name + error: "field" + + - name: "non-null wrong type in chain is error" + input: {"value": "hello"} + mapping: | + output.v = input.value?.nonfield?.trim() + error: "field" + + # --- Mixed null-safe and regular access --- + + - name: "regular then null-safe access" + input: {"user": {"profile": null}} + mapping: | + output.v = input.user.profile?.bio + output: {"v": null} + + - name: "null-safe then regular access on present value" + input: {"user": {"profile": {"bio": "hello"}}} + mapping: | + output.v = input.user?.profile.bio + output: {"v": "hello"} + + # --- Null-safe with dynamic access chains --- + + - name: "null-safe bracket then dot" + mapping: | + $data = null + output.v = $data?["key"]?.name + output: {"v": null} + + - name: "null-safe bracket on nested null" + input: {"items": null} + mapping: | + output.v = input.items?[0] + output: {"v": null} diff --git a/internal/bloblang2/spec/tests/access/out_of_bounds.yaml b/internal/bloblang2/spec/tests/access/out_of_bounds.yaml new file mode 100644 index 000000000..93be4875d --- /dev/null +++ b/internal/bloblang2/spec/tests/access/out_of_bounds.yaml @@ -0,0 +1,222 @@ +description: "Out of bounds index errors for arrays, strings, and 
bytes; empty collections; boundary indices" + +tests: + # --- Array out of bounds --- + + - name: "array index beyond length" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[3] + error: "out of bounds" + + - name: "array index far beyond length" + mapping: | + $arr = [1, 2] + output.v = $arr[100] + error: "out of bounds" + + - name: "empty array index 0 is error" + mapping: | + $arr = [] + output.v = $arr[0] + error: "out of bounds" + + - name: "empty array negative index is error" + mapping: | + $arr = [] + output.v = $arr[-1] + error: "out of bounds" + + - name: "array last valid positive index" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[2] + output: {"v": 30} + + - name: "array first invalid positive index" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[3] + error: "out of bounds" + + - name: "array last valid negative index" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-3] + output: {"v": 10} + + - name: "array first invalid negative index" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-4] + error: "out of bounds" + + - name: "single element array valid indices" + mapping: | + $arr = [42] + output.a = $arr[0] + output.b = $arr[-1] + output: {"a": 42, "b": 42} + + - name: "single element array invalid positive" + mapping: | + $arr = [42] + output.v = $arr[1] + error: "out of bounds" + + # --- String out of bounds --- + + - name: "string index beyond length" + mapping: | + output.v = "hello"[5] + error: "out of bounds" + + - name: "string index far beyond length" + mapping: | + output.v = "hi"[100] + error: "out of bounds" + + - name: "empty string index 0 is error" + mapping: | + output.v = ""[0] + error: "out of bounds" + + - name: "empty string negative index is error" + mapping: | + output.v = ""[-1] + error: "out of bounds" + + - name: "string last valid positive index" + mapping: | + output.v = "abc"[2] + output: {"v": 99} + + - name: "string first invalid positive index" + mapping: | + output.v = "abc"[3] + error: "out 
of bounds" + + - name: "string last valid negative index" + mapping: | + output.v = "abc"[-3] + output: {"v": 97} + + - name: "string first invalid negative index" + mapping: | + output.v = "abc"[-4] + error: "out of bounds" + + - name: "single char string boundary" + mapping: | + output.a = "X"[0] + output.b = "X"[-1] + output: {"a": 88, "b": 88} + + - name: "single char string invalid positive" + mapping: | + output.v = "X"[1] + error: "out of bounds" + + # --- Bytes out of bounds --- + + - name: "bytes index beyond length" + mapping: | + output.v = "hello".bytes()[5] + error: "out of bounds" + + - name: "bytes index far beyond length" + mapping: | + output.v = "hi".bytes()[100] + error: "out of bounds" + + - name: "empty bytes index 0 is error" + mapping: | + output.v = "".bytes()[0] + error: "out of bounds" + + - name: "empty bytes negative index is error" + mapping: | + output.v = "".bytes()[-1] + error: "out of bounds" + + - name: "bytes last valid positive index" + mapping: | + output.v = "abc".bytes()[2] + output: {"v": 99} + + - name: "bytes first invalid positive index" + mapping: | + output.v = "abc".bytes()[3] + error: "out of bounds" + + - name: "bytes last valid negative index" + mapping: | + output.v = "abc".bytes()[-3] + output: {"v": 97} + + - name: "bytes first invalid negative index" + mapping: | + output.v = "abc".bytes()[-4] + error: "out of bounds" + + - name: "single byte boundary" + mapping: | + output.a = "X".bytes()[0] + output.b = "X".bytes()[-1] + output: {"a": 88, "b": 88} + + - name: "single byte invalid positive" + mapping: | + output.v = "X".bytes()[1] + error: "out of bounds" + + # --- Multibyte boundary --- + + - name: "multibyte string codepoint boundary" + mapping: | + $s = "caf\u00E9" + output.last = $s[3] + output.len = $s.length() + output: {"last": 233, "len": 4} + + - name: "multibyte string codepoint out of bounds" + mapping: | + output.v = "caf\u00E9"[4] + error: "out of bounds" + + # --- Non-integer and extreme float 
indices --- + + - name: "array index with NaN is error" + input: {"nan": {_type: "float64", value: "NaN"}} + mapping: | + output.v = [10, 20, 30][input.nan] + error: "whole number" + + - name: "array index with Infinity is error" + input: {"inf": {_type: "float64", value: "Infinity"}} + mapping: | + output.v = [10, 20, 30][input.inf] + error: "index" + + - name: "array index with large float exceeding int64 range is error" + input: {"big": {_type: "float64", value: "1e19"}} + mapping: | + output.v = [10, 20, 30][input.big] + error: "index" + + - name: "array index with whole-number float 2.0 is accepted" + mapping: | + output.v = [10, 20, 30][2.0] + output: {"v": 30} + + - name: "array index with fractional float 1.5 is error" + mapping: | + output.v = [10, 20, 30][1.5] + error: "whole number" + + - name: "multibyte bytes boundary differs from string" + mapping: | + $s = "caf\u00E9" + output.string_len = $s.length() + output.bytes_len = $s.bytes().length() + output: {"string_len": 4, "bytes_len": 5} diff --git a/internal/bloblang2/spec/tests/case_studies/cloudformation_inventory.yaml b/internal/bloblang2/spec/tests/case_studies/cloudformation_inventory.yaml new file mode 100644 index 000000000..d1ac0fb69 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/cloudformation_inventory.yaml @@ -0,0 +1,152 @@ +description: > + AWS CloudFormation resource inventory — transform a stack description and its + resources into a CMDB-friendly record. Parses account and region from the stack + ARN, converts KV-pair arrays (Parameters, Tags, Outputs) into objects, coerces + numeric string parameters to integers, groups resources by AWS service + category, and checks overall health status. 
+ +tests: + - name: "build inventory record from CloudFormation stack" + input: + stack: + StackName: "prod-api-stack" + StackId: "arn:aws:cloudformation:us-east-1:123456789012:stack/prod-api-stack/guid" + StackStatus: "UPDATE_COMPLETE" + LastUpdatedTime: "2024-01-14T22:15:00Z" + Parameters: + - {ParameterKey: "environment", ParameterValue: "production"} + - {ParameterKey: "instance_type", ParameterValue: "m5.xlarge"} + - {ParameterKey: "min_capacity", ParameterValue: "3"} + - {ParameterKey: "max_capacity", ParameterValue: "12"} + Tags: + - {Key: "team", Value: "platform"} + - {Key: "cost-center", Value: "eng-1234"} + Outputs: + - {OutputKey: "LoadBalancerDNS", OutputValue: "prod-api-lb-123.us-east-1.elb.amazonaws.com"} + - {OutputKey: "ApiEndpoint", OutputValue: "https://api.example.com"} + resources: + - LogicalResourceId: "WebLoadBalancer" + PhysicalResourceId: "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/prod-api-lb/abc123" + ResourceType: "AWS::ElasticLoadBalancingV2::LoadBalancer" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "WebTargetGroup" + PhysicalResourceId: "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/prod-api-tg/def456" + ResourceType: "AWS::ElasticLoadBalancingV2::TargetGroup" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "AppAutoScalingGroup" + PhysicalResourceId: "prod-api-asg-XYZ789" + ResourceType: "AWS::AutoScaling::AutoScalingGroup" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "AppSecurityGroup" + PhysicalResourceId: "sg-0a1b2c3d" + ResourceType: "AWS::EC2::SecurityGroup" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "AppLogGroup" + PhysicalResourceId: "/aws/ecs/prod-api" + ResourceType: "AWS::Logs::LogGroup" + ResourceStatus: "CREATE_COMPLETE" + mapping: | + # Map AWS resource type to a friendly service category. 
+ map service_category(resource_type) { + match resource_type.split("::")[1] { + "ElasticLoadBalancingV2" => "ELB", + "AutoScaling" => "AutoScaling", + "EC2" => "EC2", + "Logs" => "CloudWatch", + _ => resource_type.split("::")[1], + } + } + + # Extract the short resource type (e.g. "LoadBalancer" from + # "AWS::ElasticLoadBalancingV2::LoadBalancer"). + map short_type(resource_type) { + resource_type.split("::")[2] + } + + $stack = input.stack + + # Parse region and account from the stack ARN + $arn_parts = $stack.StackId.split(":") + $region = $arn_parts[3] + $account = $arn_parts[4] + + # Convert Parameters KV array to object, parsing pure-numeric values + $config = $stack.Parameters.map(p -> { + $val = if p.ParameterValue.re_match("^[0-9]+$") { + p.ParameterValue.int64() + } else { + p.ParameterValue + } + {"key": p.ParameterKey, "value": $val} + }).collect() + + # Convert Tags and Outputs KV arrays to objects + $tags = $stack.Tags + .map(t -> {"key": t.Key, "value": t.Value}).collect() + $endpoints = $stack.Outputs + .map(o -> {"key": o.OutputKey, "value": o.OutputValue}).collect() + + # Group resources by service category + $services = input.resources.map(r -> service_category(r.ResourceType)).unique() + $resources_by_service = $services.map(svc -> { + "key": svc, + "value": input.resources + .filter(r -> service_category(r.ResourceType) == svc) + .map(r -> { + "logical_id": r.LogicalResourceId, + "physical_id": r.PhysicalResourceId, + "type": short_type(r.ResourceType), + }), + }).collect() + + output.stack = $stack.StackName + output.region = $region + output.account = $account + output.status = $stack.StackStatus + output.last_updated = $stack.LastUpdatedTime + output.team = $tags.team + output.cost_center = $tags."cost-center" + output.config = $config + output.endpoints = $endpoints + output.resources_by_service = $resources_by_service + output.resource_count = input.resources.length() + output.all_healthy = input.resources + .all(r -> 
r.ResourceStatus.has_suffix("COMPLETE")) + output: + stack: "prod-api-stack" + region: "us-east-1" + account: "123456789012" + status: "UPDATE_COMPLETE" + last_updated: "2024-01-14T22:15:00Z" + team: "platform" + cost_center: "eng-1234" + config: + environment: "production" + instance_type: "m5.xlarge" + min_capacity: 3 + max_capacity: 12 + endpoints: + LoadBalancerDNS: "prod-api-lb-123.us-east-1.elb.amazonaws.com" + ApiEndpoint: "https://api.example.com" + resources_by_service: + ELB: + - logical_id: "WebLoadBalancer" + physical_id: "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/prod-api-lb/abc123" + type: "LoadBalancer" + - logical_id: "WebTargetGroup" + physical_id: "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/prod-api-tg/def456" + type: "TargetGroup" + AutoScaling: + - logical_id: "AppAutoScalingGroup" + physical_id: "prod-api-asg-XYZ789" + type: "AutoScalingGroup" + EC2: + - logical_id: "AppSecurityGroup" + physical_id: "sg-0a1b2c3d" + type: "SecurityGroup" + CloudWatch: + - logical_id: "AppLogGroup" + physical_id: "/aws/ecs/prod-api" + type: "LogGroup" + resource_count: 5 + all_healthy: true diff --git a/internal/bloblang2/spec/tests/case_studies/debezium_cdc.yaml b/internal/bloblang2/spec/tests/case_studies/debezium_cdc.yaml new file mode 100644 index 000000000..0f0553a15 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/debezium_cdc.yaml @@ -0,0 +1,138 @@ +description: > + Debezium CDC change event processing — diff the before/after snapshots to + identify changed fields, parse embedded JSON columns (shipping address, line + items), convert Debezium epoch-day dates, convert cents to dollars, extract + provenance metadata, and set output metadata for downstream CDC routing. 
+ +tests: + - name: "process Debezium MySQL update event with embedded JSON" + input: + payload: + before: + id: 10042 + customer_id: 1007 + status: "pending" + order_date: 19797 + shipping_address: '{"street":"742 Evergreen Terrace","city":"Springfield","state":"IL","zip":"62704"}' + line_items_json: '[{"sku":"WIDGET-A","qty":3,"unit_price_cents":1500},{"sku":"GADGET-B","qty":1,"unit_price_cents":4200}]' + total_cents: 8700 + currency: "USD" + updated_at: "2024-03-15T10:30:00Z" + after: + id: 10042 + customer_id: 1007 + status: "shipped" + order_date: 19797 + shipping_address: '{"street":"742 Evergreen Terrace","city":"Springfield","state":"IL","zip":"62704"}' + line_items_json: '[{"sku":"WIDGET-A","qty":3,"unit_price_cents":1500},{"sku":"GADGET-B","qty":1,"unit_price_cents":4200}]' + total_cents: 8700 + currency: "USD" + updated_at: "2024-03-15T14:22:17Z" + source: + connector: "mysql" + name: "dbserver1" + db: "inventory" + table: "orders" + ts_ms: 1710509137000 + gtid: "3e11fa47-71ca-11e1-9e33-c80aa9429562:58" + file: "mysql-bin.000003" + pos: 484 + version: "2.5.0.Final" + op: "u" + ts_ms: 1710509137425 + mapping: | + $before = input.payload.before + $after = input.payload.after + $src = input.payload.source + + # Diff: compare selected fields between before and after + $diff_fields = ["status", "updated_at", "total_cents", "currency"] + $changed = $diff_fields + .filter(f -> $before[f] != $after[f]) + .map(f -> {"key": f, "value": {"before": $before[f], "after": $after[f]}}) + .collect() + + # Parse embedded JSON string columns from the after image + $address = $after.shipping_address.parse_json() + $line_items = $after.line_items_json.parse_json() + .map(li -> li.merge({"subtotal_cents": li.qty * li.unit_price_cents})) + + # Convert Debezium epoch-day (days since 1970-01-01) to date string + $order_date = ($after.order_date * 86400) + .ts_from_unix().ts_format("%Y-%m-%d") + + # Event timestamp from Debezium milliseconds + $event_ts = input.payload.ts_ms + 
.ts_from_unix_milli().string() + + # Map operation code to name + $op_name = match input.payload.op { + "c" => "create", + "u" => "update", + "d" => "delete", + "r" => "snapshot", + _ => "unknown", + } + + output.table = $src.db + "." + $src.table + output.operation = $op_name + output.key = {"id": $after.id} + output.timestamp = $event_ts + output.changed_fields = $changed + output.current = { + "id": $after.id, + "customer_id": $after.customer_id, + "status": $after.status, + "order_date": $order_date, + "shipping_address": $address, + "line_items": $line_items, + "total_dollars": $after.total_cents.float64() / 100.0, + "currency": $after.currency, + } + output.provenance = { + "connector": $src.connector, + "server": $src.name, + "gtid": $src.gtid, + "binlog_file": $src.file, + "binlog_pos": $src.pos, + "version": $src.version, + } + + # Set CDC routing metadata + output@.cdc_table = $src.db + "." + $src.table + output@.cdc_operation = $op_name + output@.cdc_gtid = $src.gtid + output: + table: "inventory.orders" + operation: "update" + key: {id: 10042} + timestamp: "2024-03-15T13:25:37.425Z" + changed_fields: + status: {before: "pending", after: "shipped"} + updated_at: {before: "2024-03-15T10:30:00Z", after: "2024-03-15T14:22:17Z"} + current: + id: 10042 + customer_id: 1007 + status: "shipped" + order_date: "2024-03-15" + shipping_address: + street: "742 Evergreen Terrace" + city: "Springfield" + state: "IL" + zip: "62704" + line_items: + - {sku: "WIDGET-A", qty: 3, unit_price_cents: 1500, subtotal_cents: 4500} + - {sku: "GADGET-B", qty: 1, unit_price_cents: 4200, subtotal_cents: 4200} + total_dollars: 87.0 + currency: "USD" + provenance: + connector: "mysql" + server: "dbserver1" + gtid: "3e11fa47-71ca-11e1-9e33-c80aa9429562:58" + binlog_file: "mysql-bin.000003" + binlog_pos: 484 + version: "2.5.0.Final" + output_metadata: + cdc_table: "inventory.orders" + cdc_operation: "update" + cdc_gtid: "3e11fa47-71ca-11e1-9e33-c80aa9429562:58" diff --git 
a/internal/bloblang2/spec/tests/case_studies/ecommerce_order.yaml b/internal/bloblang2/spec/tests/case_studies/ecommerce_order.yaml new file mode 100644 index 000000000..2310737a6 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/ecommerce_order.yaml @@ -0,0 +1,158 @@ +description: > + E-commerce order normalization — flatten a Shopify-style order into a + warehouse-friendly format. Joins fulfillment tracking back to line items, + computes net prices from string-encoded decimals, converts units, formats + addresses, and detects billing/shipping mismatches. + +tests: + - name: "normalize Shopify order for warehouse ERP" + input: + order: + id: 5765806342 + email: "jane@example.com" + created_at: "2024-01-15T10:30:00-05:00" + currency: "USD" + billing_address: + first_name: "John" + last_name: "Smith" + address1: "123 Fake Street" + city: "Faketown" + province_code: "ON" + country_code: "CA" + zip: "K2P 1L4" + shipping_address: + first_name: "Jane" + last_name: "Smith" + address1: "123 Fake Street" + city: "Faketown" + province_code: "ON" + country_code: "CA" + zip: "K2P 1L4" + line_items: + - id: 1001 + title: "Red Leather Coat" + sku: "RLC-001" + quantity: 1 + price: "129.99" + grams: 1700 + tax_lines: + - {price: "7.80"} + discount_allocations: + - {amount: "13.00"} + - id: 1002 + title: "Blue Suede Shoes" + sku: "BSS-001" + quantity: 1 + price: "85.95" + grams: 750 + tax_lines: [] + discount_allocations: [] + - id: 1003 + title: "Raspberry Beret" + sku: "RB-001" + quantity: 2 + price: "19.99" + grams: 320 + tax_lines: + - {price: "2.40"} + discount_allocations: [] + fulfillments: + - id: 1 + tracking_numbers: ["1Z999AA10123456784"] + line_items: + - {id: 1001} + - {id: 1002} + discount_codes: + - {code: "FAKE30"} + mapping: | + map format_address(addr) { + addr.first_name + " " + addr.last_name + ", " + + addr.address1 + ", " + addr.city + " " + + addr.province_code + " " + addr.zip + ", " + addr.country_code + } + + $order = input.order + 
$fulfillments = $order.fulfillments + $bill = $order.billing_address + $ship = $order.shipping_address + + $items = $order.line_items.map(item -> { + $unit_price = item.price.float64() + $discount = item.discount_allocations + .map(d -> d.amount.float64()).sum().float64() + $tax = item.tax_lines + .map(t -> t.price.float64()).sum().float64() + $net = ($unit_price * item.quantity - $discount + $tax).round(2) + $weight_kg = (item.grams * item.quantity).float64() / 1000.0 + $ful = $fulfillments.find(f -> + f.line_items.any(li -> li.id == item.id) + ).or(null) + { + "sku": item.sku, + "title": item.title, + "quantity": item.quantity, + "unit_price": $unit_price, + "discount": $discount, + "tax_total": $tax, + "net_price": $net, + "weight_kg": $weight_kg, + "fulfilled": $ful != null, + "tracking": $ful?.tracking_numbers?[0].or(null), + } + }) + + output.order_id = $order.id + output.order_date = $order.created_at.slice(0, 10) + output.customer_email = $order.email + output.currency = $order.currency + output.shipping_differs_from_billing = + $bill.first_name != $ship.first_name || + $bill.last_name != $ship.last_name || + $bill.address1 != $ship.address1 || + $bill.city != $ship.city + output.ship_to = format_address($ship) + output.items = $items + output.unfulfilled_count = $items.filter(i -> !i.fulfilled).length() + output.total_weight_kg = $items.map(i -> i.weight_kg).sum().round(2) + output.discount_codes_used = $order.discount_codes.map(d -> d.code) + output: + order_id: 5765806342 + order_date: "2024-01-15" + customer_email: "jane@example.com" + currency: "USD" + shipping_differs_from_billing: true + ship_to: "Jane Smith, 123 Fake Street, Faketown ON K2P 1L4, CA" + items: + - sku: "RLC-001" + title: "Red Leather Coat" + quantity: 1 + unit_price: 129.99 + discount: 13.0 + tax_total: 7.8 + net_price: 124.79 + weight_kg: 1.7 + fulfilled: true + tracking: "1Z999AA10123456784" + - sku: "BSS-001" + title: "Blue Suede Shoes" + quantity: 1 + unit_price: 85.95 + discount: 
0.0 + tax_total: 0.0 + net_price: 85.95 + weight_kg: 0.75 + fulfilled: true + tracking: "1Z999AA10123456784" + - sku: "RB-001" + title: "Raspberry Beret" + quantity: 2 + unit_price: 19.99 + discount: 0.0 + tax_total: 2.4 + net_price: 42.38 + weight_kg: 0.64 + fulfilled: false + tracking: null + unfulfilled_count: 1 + total_weight_kg: 3.09 + discount_codes_used: ["FAKE30"] diff --git a/internal/bloblang2/spec/tests/case_studies/ga4_clickstream.yaml b/internal/bloblang2/spec/tests/case_studies/ga4_clickstream.yaml new file mode 100644 index 000000000..9faad9d96 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/ga4_clickstream.yaml @@ -0,0 +1,213 @@ +description: > + Google Analytics 4 clickstream normalization — flatten the BigQuery export + format's typed-value union event_params and user_properties into plain + objects, parse microsecond timestamps, build item category hierarchies, + and compute per-item discount percentages and subtotals. + +tests: + - name: "flatten GA4 purchase event from BigQuery export" + input: + event_date: "20240315" + event_name: "purchase" + event_timestamp: "1710505200000000" + event_params: + - key: "session_id" + value: {string_value: null, int_value: 1710504000, double_value: null} + - key: "page_location" + value: {string_value: "https://shop.example.com/checkout", int_value: null, double_value: null} + - key: "transaction_id" + value: {string_value: "TXN-8A3F", int_value: null, double_value: null} + - key: "value" + value: {string_value: null, int_value: null, double_value: 127.5} + - key: "currency" + value: {string_value: "USD", int_value: null, double_value: null} + - key: "shipping" + value: {string_value: null, int_value: null, double_value: 7.5} + - key: "tax" + value: {string_value: null, int_value: null, double_value: 10.0} + - key: "coupon" + value: {string_value: "SPRING24", int_value: null, double_value: null} + user_id: "user-9f8e7d6c" + user_properties: + - key: "membership_tier" + value: {string_value: 
"gold", double_value: null} + - key: "lifetime_value" + value: {string_value: null, double_value: 1842.3} + device: + category: "mobile" + mobile_brand_name: "Apple" + mobile_model_name: "iPhone 15" + operating_system: "iOS" + operating_system_version: "17.3.1" + web_info: + browser: "Safari" + browser_version: "17.3" + geo: + country: "United States" + region: "California" + city: "San Francisco" + traffic_source: + source: "instagram" + medium: "paid_social" + name: "spring_collection_2024" + items: + - item_id: "SKU-W-1042" + item_name: "Merino Wool Cardigan" + item_brand: "HouseLabel" + item_category: "Apparel" + item_category2: "Women" + item_category3: "Knitwear" + item_variant: "Oatmeal / M" + price: 89.0 + quantity: 1 + discount: 13.35 + item_list_name: "Spring Collection" + - item_id: "SKU-A-2087" + item_name: "Leather Crossbody Bag" + item_brand: "HouseLabel" + item_category: "Accessories" + item_category2: "Bags" + item_category3: "Crossbody" + item_variant: "Tan" + price: 65.0 + quantity: 1 + discount: 0.0 + item_list_name: "Recommended For You" + mapping: | + # Extract the single non-null typed value from a GA4 value object. 
+ map ga4_value(v) { + match { + v.string_value != null => v.string_value, + v.int_value != null => v.int_value, + v.double_value != null => v.double_value, + _ => null, + } + } + + # Collapse KV arrays with typed-value unions into plain objects + $params = input.event_params + .map(p -> {"key": p.key, "value": ga4_value(p.value)}) + .collect() + $user_props = input.user_properties + .map(p -> {"key": p.key, "value": ga4_value(p.value)}) + .collect() + + # Parse event timestamp (microseconds since epoch) + $event_ts = (input.event_timestamp.int64() / 1000000) + .ts_from_unix().string() + + # Build item records with category hierarchy and computed fields + $items = input.items.map(item -> { + $cats = [item.item_category, item.item_category2, item.item_category3] + .filter(c -> c != null && c != "") + $subtotal = (item.price * item.quantity - item.discount).round(2) + $disc_pct = if item.price > 0.0 && item.discount > 0.0 { + (item.discount / (item.price * item.quantity) * 100.0).round(1) + } else { + 0.0 + } + { + "sku": item.item_id, + "name": item.item_name, + "brand": item.item_brand, + "categories": $cats, + "variant": item.item_variant, + "price": item.price, + "quantity": item.quantity, + "discount": item.discount, + "discount_pct": $disc_pct, + "subtotal": $subtotal, + "source_list": item.item_list_name, + } + }) + + output.event = input.event_name + output.timestamp = $event_ts + output.session_id = $params.session_id + output.user = { + "id": input.user_id, + "properties": $user_props, + } + output.page_url = $params.page_location + output.transaction = { + "id": $params.transaction_id, + "revenue": $params.value, + "shipping": $params.shipping, + "tax": $params.tax, + "coupon": $params.coupon, + "item_total": $items.map(i -> i.subtotal).sum().round(2), + "total_discount": $items.map(i -> i.discount).sum().round(2), + } + output.items = $items + output.device = { + "type": input.device.category, + "brand": input.device.mobile_brand_name, + "model": 
input.device.mobile_model_name, + "os": input.device.operating_system + " " + input.device.operating_system_version, + "browser": input.device.web_info.browser + " " + input.device.web_info.browser_version, + } + output.geo = { + "country": input.geo.country, + "region": input.geo.region, + "city": input.geo.city, + } + output.attribution = { + "source": input.traffic_source.source, + "medium": input.traffic_source.medium, + "campaign": input.traffic_source.name, + } + output: + event: "purchase" + timestamp: "2024-03-15T12:20:00Z" + session_id: 1710504000 + user: + id: "user-9f8e7d6c" + properties: + membership_tier: "gold" + lifetime_value: 1842.3 + page_url: "https://shop.example.com/checkout" + transaction: + id: "TXN-8A3F" + revenue: 127.5 + shipping: 7.5 + tax: 10.0 + coupon: "SPRING24" + item_total: 140.65 + total_discount: 13.35 + items: + - sku: "SKU-W-1042" + name: "Merino Wool Cardigan" + brand: "HouseLabel" + categories: ["Apparel", "Women", "Knitwear"] + variant: "Oatmeal / M" + price: 89.0 + quantity: 1 + discount: 13.35 + discount_pct: 15.0 + subtotal: 75.65 + source_list: "Spring Collection" + - sku: "SKU-A-2087" + name: "Leather Crossbody Bag" + brand: "HouseLabel" + categories: ["Accessories", "Bags", "Crossbody"] + variant: "Tan" + price: 65.0 + quantity: 1 + discount: 0.0 + discount_pct: 0.0 + subtotal: 65.0 + source_list: "Recommended For You" + device: + type: "mobile" + brand: "Apple" + model: "iPhone 15" + os: "iOS 17.3.1" + browser: "Safari 17.3" + geo: + country: "United States" + region: "California" + city: "San Francisco" + attribution: + source: "instagram" + medium: "paid_social" + campaign: "spring_collection_2024" diff --git a/internal/bloblang2/spec/tests/case_studies/github_webhook.yaml b/internal/bloblang2/spec/tests/case_studies/github_webhook.yaml new file mode 100644 index 000000000..fd120ac6a --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/github_webhook.yaml @@ -0,0 +1,95 @@ +description: > + GitHub pull 
request webhook normalization — extract fields from a deeply nested + PR event payload, parse issue references from the body with regex, merge + reviewer lists, categorize PR size, and build a notification-ready summary. + +tests: + - name: "normalize PR opened webhook into notification event" + input: + action: "opened" + number: 42 + pull_request: + title: "feat: add retry logic to payment processor" + body: "## Summary\nAdds exponential backoff.\n\nCloses #38\nRelated: #35, #40" + html_url: "https://github.com/acme/payments/pull/42" + state: "open" + draft: false + additions: 347 + deletions: 42 + changed_files: 8 + created_at: "2024-01-15T14:30:00Z" + user: + login: "alice-dev" + head: + ref: "feat/payment-retry" + base: + ref: "main" + labels: + - {name: "enhancement"} + - {name: "payments"} + - {name: "needs-review"} + requested_reviewers: + - {login: "bob-reviewer"} + - {login: "carol-lead"} + requested_teams: + - {name: "platform-team"} + mapping: | + $pr = input.pull_request + $url_parts = $pr.html_url.split("/") + $repo = $url_parts[3] + "/" + $url_parts[4] + + # Categorize by total lines changed + $total_changes = $pr.additions + $pr.deletions + $size_category = match { + $total_changes > 300 => "large", + $total_changes > 100 => "medium", + _ => "small", + } + + # Parse issue references (#NNN) from PR body, deduplicate and sort + $issue_refs = $pr.body.re_find_all("#\\d+") + .map(ref -> ref.trim_prefix("#").int64()) + .sort() + .unique() + + # Merge individual reviewers and team reviewers into one sorted list + $reviewers = $pr.requested_reviewers.map(r -> r.login) + .concat($pr.requested_teams.map(t -> t.name)) + .sort() + + output.event_type = "pr_" + input.action + output.repo = $repo + output.pr_number = input.number + output.title = $pr.title + output.author = $pr.user.login + output.url = $pr.html_url + output.branch = $pr.head.ref + " -> " + $pr.base.ref + output.labels = $pr.labels.map(l -> l.name).sort() + output.reviewers = $reviewers + 
output.size = { + "additions": $pr.additions, + "deletions": $pr.deletions, + "files": $pr.changed_files, + "category": $size_category, + } + output.referenced_issues = $issue_refs + output.is_feature = $pr.title.has_prefix("feat") + output.summary = "[" + $repo + "] " + $pr.user.login + " " + input.action + " #" + input.number.string() + ": " + $pr.title + " (" + $size_category + ", " + $pr.changed_files.string() + " files)" + output: + event_type: "pr_opened" + repo: "acme/payments" + pr_number: 42 + title: "feat: add retry logic to payment processor" + author: "alice-dev" + url: "https://github.com/acme/payments/pull/42" + branch: "feat/payment-retry -> main" + labels: ["enhancement", "needs-review", "payments"] + reviewers: ["bob-reviewer", "carol-lead", "platform-team"] + size: + additions: 347 + deletions: 42 + files: 8 + category: "large" + referenced_issues: [35, 38, 40] + is_feature: true + summary: "[acme/payments] alice-dev opened #42: feat: add retry logic to payment processor (large, 8 files)" diff --git a/internal/bloblang2/spec/tests/case_studies/kubernetes_pod.yaml b/internal/bloblang2/spec/tests/case_studies/kubernetes_pod.yaml new file mode 100644 index 000000000..b5bc5468f --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/kubernetes_pod.yaml @@ -0,0 +1,157 @@ +description: > + Kubernetes pod health alert — zip container spec with status by name to + correlate resource limits with failure reasons, detect OOMKilled containers, + build a condition summary from the conditions array, and set output metadata + for alert routing. 
+ +tests: + - name: "build alert from unhealthy pod with OOMKilled container" + input: + metadata: + name: "order-processor-7b4f8d6c9-xk2lp" + namespace: "production" + labels: + app.kubernetes.io/name: "order-processor" + app.kubernetes.io/version: "2.4.1" + creationTimestamp: "2024-03-15T08:22:05Z" + spec: + nodeName: "ip-10-0-47-132.ec2.internal" + containers: + - name: "app" + image: "registry.internal/order-processor:2.4.1" + resources: + limits: {cpu: "1000m", memory: "1Gi"} + - name: "sidecar" + image: "envoyproxy/envoy:v1.28.0" + resources: + limits: {cpu: "500m", memory: "256Mi"} + status: + phase: "Running" + conditions: + - {type: "Initialized", status: "True"} + - {type: "ContainersReady", status: "False", reason: "ContainersNotReady"} + - {type: "Ready", status: "False", reason: "ContainersNotReady"} + startTime: "2024-03-15T08:22:05Z" + containerStatuses: + - name: "app" + ready: false + restartCount: 4 + started: false + state: + waiting: {reason: "CrashLoopBackOff"} + lastState: + terminated: + exitCode: 137 + reason: "OOMKilled" + startedAt: "2024-03-15T09:11:45Z" + finishedAt: "2024-03-15T09:14:32Z" + - name: "sidecar" + ready: true + restartCount: 0 + started: true + state: + running: {startedAt: "2024-03-15T08:22:20Z"} + lastState: {} + mapping: | + $meta = input.metadata + $spec = input.spec + $status = input.status + + # For each unhealthy container, look up its resource limits from spec + $unhealthy = $status.containerStatuses + .filter(cs -> !cs.ready) + .map(cs -> { + $spec_c = $spec.containers + .find(c -> c.name == cs.name).or(null) + $last_exit = if cs.lastState.has_key("terminated") { + cs.lastState.terminated + } else { + null + } + $current_state = match { + cs.state.has_key("waiting") => cs.state.waiting.reason, + cs.state.has_key("terminated") => cs.state.terminated.reason, + _ => "Running", + } + { + "name": cs.name, + "state": $current_state, + "last_exit_code": $last_exit?.exitCode.or(null), + "last_exit_reason": 
$last_exit?.reason.or(null), + "restart_count": cs.restartCount, + "memory_limit": $spec_c?.resources?.limits?.memory.or(null), + "cpu_limit": $spec_c?.resources?.limits?.cpu.or(null), + "oom_suspected": $last_exit != null && $last_exit?.reason.or("") == "OOMKilled", + } + }) + + # Convert conditions array to a boolean summary object + $condition_summary = $status.conditions + .map(c -> {"key": c.type, "value": c.status == "True"}) + .collect() + + # Severity: critical if any OOM or high restart count + $severity = if $unhealthy.any(c -> c.oom_suspected || c.restart_count > 3) { + "critical" + } else if $unhealthy.length() > 0 { + "warning" + } else { + "ok" + } + + output.alert = { + "severity": $severity, + "source": "k8s-pod-monitor", + } + output.pod = { + "name": $meta.name, + "namespace": $meta.namespace, + "node": $spec.nodeName, + "app": $meta.labels."app.kubernetes.io/name", + "version": $meta.labels."app.kubernetes.io/version", + "phase": $status.phase, + } + output.unhealthy_containers = $unhealthy + output.condition_summary = $condition_summary + output.not_ready_reason = $status.conditions + .find(c -> c.type == "Ready" && c.status != "True") + .or(null)?.reason.or(null) + output.healthy_container_count = $status.containerStatuses + .filter(cs -> cs.ready).length() + output.total_container_count = $status.containerStatuses.length() + + # Set metadata for alert routing + output@.alert_severity = $severity + output@.k8s_namespace = $meta.namespace + output@.k8s_pod = $meta.name + output: + alert: + severity: "critical" + source: "k8s-pod-monitor" + pod: + name: "order-processor-7b4f8d6c9-xk2lp" + namespace: "production" + node: "ip-10-0-47-132.ec2.internal" + app: "order-processor" + version: "2.4.1" + phase: "Running" + unhealthy_containers: + - name: "app" + state: "CrashLoopBackOff" + last_exit_code: 137 + last_exit_reason: "OOMKilled" + restart_count: 4 + memory_limit: "1Gi" + cpu_limit: "1000m" + oom_suspected: true + condition_summary: + 
Initialized: true + ContainersReady: false + Ready: false + not_ready_reason: "ContainersNotReady" + healthy_container_count: 1 + total_container_count: 2 + output_metadata: + alert_severity: "critical" + k8s_namespace: "production" + k8s_pod: "order-processor-7b4f8d6c9-xk2lp" diff --git a/internal/bloblang2/spec/tests/case_studies/nlp_enrichment.yaml b/internal/bloblang2/spec/tests/case_studies/nlp_enrichment.yaml new file mode 100644 index 000000000..dee296d29 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/nlp_enrichment.yaml @@ -0,0 +1,83 @@ +description: > + NLP enrichment merge — combine entity recognition and key phrase extraction + results into a unified structure. Groups entities by type, correlates key + phrases with overlapping entities by character offset, and ranks entities + by confidence score. + +tests: + - name: "merge NER entities and key phrases into enriched document" + input: + source_text: "Bob ordered two sandwiches from Seattle Deli on January 5th for $24.99" + entities: + - {text: "Bob", score: 0.997, type: "PERSON", begin_offset: 0, end_offset: 3} + - {text: "two", score: 0.95, type: "QUANTITY", begin_offset: 12, end_offset: 15} + - {text: "Seattle Deli", score: 0.988, type: "ORGANIZATION", begin_offset: 33, end_offset: 45} + - {text: "January 5th", score: 0.999, type: "DATE", begin_offset: 49, end_offset: 60} + - {text: "$24.99", score: 0.93, type: "QUANTITY", begin_offset: 65, end_offset: 71} + sentiment: + label: "NEUTRAL" + scores: {positive: 0.12, negative: 0.03, neutral: 0.84, mixed: 0.01} + key_phrases: + - {text: "two sandwiches", score: 0.99, begin_offset: 12, end_offset: 27} + - {text: "Seattle Deli", score: 0.98, begin_offset: 33, end_offset: 45} + - {text: "January 5th", score: 0.97, begin_offset: 49, end_offset: 60} + mapping: | + $entities = input.entities + + # Group entities by type using unique types + filter + $entity_types = $entities.map(e -> e.type).unique() + $entities_by_type = $entity_types + .map(t 
-> { + "key": t, + "value": $entities.filter(e -> e.type == t).map(e -> e.text), + }) + .collect() + + # For each key phrase, find the first entity whose character span overlaps + $enriched_phrases = input.key_phrases.map(kp -> { + $overlap = $entities.find(e -> + e.begin_offset < kp.end_offset && e.end_offset > kp.begin_offset + ).or(null) + { + "phrase": kp.text, + "overlapping_entity_type": $overlap?.type.or(null), + } + }) + + # High-confidence entities (>= 0.95), sorted by score descending + $high_conf = $entities + .filter(e -> e.score >= 0.95) + .sort_by(e -> e.score) + .reverse() + .map(e -> {"text": e.text, "type": e.type, "score": e.score}) + + # Pick the dominant sentiment score + $sent = input.sentiment + + output.text = input.source_text + output.sentiment = $sent.label + output.sentiment_confidence = $sent.scores.values().max() + output.entities_by_type = $entities_by_type + output.enriched_phrases = $enriched_phrases + output.high_confidence_entities = $high_conf + output: + text: "Bob ordered two sandwiches from Seattle Deli on January 5th for $24.99" + sentiment: "NEUTRAL" + sentiment_confidence: 0.84 + entities_by_type: + PERSON: ["Bob"] + QUANTITY: ["two", "$24.99"] + ORGANIZATION: ["Seattle Deli"] + DATE: ["January 5th"] + enriched_phrases: + - phrase: "two sandwiches" + overlapping_entity_type: "QUANTITY" + - phrase: "Seattle Deli" + overlapping_entity_type: "ORGANIZATION" + - phrase: "January 5th" + overlapping_entity_type: "DATE" + high_confidence_entities: + - {text: "January 5th", type: "DATE", score: 0.999} + - {text: "Bob", type: "PERSON", score: 0.997} + - {text: "Seattle Deli", type: "ORGANIZATION", score: 0.988} + - {text: "two", type: "QUANTITY", score: 0.95} diff --git a/internal/bloblang2/spec/tests/case_studies/otel_traces.yaml b/internal/bloblang2/spec/tests/case_studies/otel_traces.yaml new file mode 100644 index 000000000..428306713 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/otel_traces.yaml @@ -0,0 +1,195 @@ 
+description: > + OpenTelemetry trace flattening — collapse the deeply nested OTLP export + format (resourceSpans -> scopeSpans -> spans) into flat span records. + Denormalizes resource attributes, converts the OTLP key-value attribute + model to plain objects, maps numeric span kinds to names, computes + durations from nanosecond timestamps, and extracts error info from events. + +tests: + - name: "flatten OTLP trace export into span records" + input: + resourceSpans: + - resource: + attributes: + - key: "service.name" + value: {stringValue: "api-gateway"} + - key: "service.version" + value: {stringValue: "2.4.1"} + - key: "deployment.environment" + value: {stringValue: "production"} + scopeSpans: + - spans: + - traceId: "5B8EFFF798038103D269B633813FC60C" + spanId: "EEE19B7EC3C1B174" + parentSpanId: "" + name: "POST /api/v1/orders" + kind: 2 + startTimeUnixNano: "1705312260000000000" + endTimeUnixNano: "1705312260345000000" + attributes: + - key: "http.method" + value: {stringValue: "POST"} + - key: "http.url" + value: {stringValue: "/api/v1/orders"} + - key: "http.status_code" + value: {intValue: "201"} + events: + - name: "auth.verified" + attributes: + - key: "auth.method" + value: {stringValue: "jwt"} + - name: "exception" + attributes: + - key: "exception.message" + value: {stringValue: "Deprecated field used"} + status: {code: 1} + - resource: + attributes: + - key: "service.name" + value: {stringValue: "order-service"} + - key: "service.version" + value: {stringValue: "3.1.0"} + - key: "deployment.environment" + value: {stringValue: "production"} + scopeSpans: + - spans: + - traceId: "5B8EFFF798038103D269B633813FC60C" + spanId: "AAA19B7EC3C1B175" + parentSpanId: "EEE19B7EC3C1B174" + name: "OrderService/CreateOrder" + kind: 2 + startTimeUnixNano: "1705312260050000000" + endTimeUnixNano: "1705312260300000000" + attributes: + - key: "rpc.system" + value: {stringValue: "grpc"} + - key: "rpc.method" + value: {stringValue: "CreateOrder"} + events: [] + status: 
{code: 1} + - traceId: "5B8EFFF798038103D269B633813FC60C" + spanId: "BBB19B7EC3C1B176" + parentSpanId: "AAA19B7EC3C1B175" + name: "postgres.query" + kind: 3 + startTimeUnixNano: "1705312260100000000" + endTimeUnixNano: "1705312260250000000" + attributes: + - key: "db.system" + value: {stringValue: "postgresql"} + - key: "db.statement" + value: {stringValue: "INSERT INTO orders (id, customer_id, total) VALUES ($1, $2, $3)"} + events: [] + status: {code: 1} + mapping: | + # Extract the typed value from an OTLP attribute value object. + map extract_value(v) { + match { + v.has_key("stringValue") => v.stringValue, + v.has_key("intValue") => v.intValue.int64(), + _ => null, + } + } + + # Convert an OTLP attribute array [{key, value}, ...] to a plain object. + map attrs_to_object(attrs) { + attrs.map(a -> {"key": a.key, "value": extract_value(a.value)}).collect() + } + + # Map OTLP numeric span kind to its name. + map span_kind_name(kind) { + match kind { + 1 => "INTERNAL", + 2 => "SERVER", + 3 => "CLIENT", + 4 => "PRODUCER", + 5 => "CONSUMER", + _ => "UNSPECIFIED", + } + } + + # Flatten: resourceSpans -> scopeSpans -> spans, denormalizing resource + # attributes onto each span. 
+ output = input.resourceSpans.map(rs -> { + $res_attrs = attrs_to_object(rs.resource.attributes) + rs.scopeSpans.map(ss -> + ss.spans.map(span -> { + $attrs = attrs_to_object(span.attributes) + $duration_ns = span.endTimeUnixNano.int64() - span.startTimeUnixNano.int64() + $error_event = span.events + .find(e -> e.name == "exception").or(null) + $error_msg = if $error_event != null { + $err_attrs = attrs_to_object($error_event.attributes) + $err_attrs."exception.message".or(null) + } else { + null + } + { + "trace_id": span.traceId, + "span_id": span.spanId, + "parent_span_id": if span.parentSpanId == "" { null } else { span.parentSpanId }, + "service": $res_attrs."service.name", + "service_version": $res_attrs."service.version", + "environment": $res_attrs."deployment.environment", + "operation": span.name, + "span_kind": span_kind_name(span.kind), + "duration_ms": $duration_ns.float64() / 1000000.0, + "attributes": $attrs, + "event_names": span.events.map(e -> e.name), + "has_error_event": span.events.any(e -> e.name == "exception"), + "error_message": $error_msg, + "status_ok": span.status.code == 1, + } + }) + ).flatten() + }).flatten() + output: + - trace_id: "5B8EFFF798038103D269B633813FC60C" + span_id: "EEE19B7EC3C1B174" + parent_span_id: null + service: "api-gateway" + service_version: "2.4.1" + environment: "production" + operation: "POST /api/v1/orders" + span_kind: "SERVER" + duration_ms: 345.0 + attributes: + http.method: "POST" + http.url: "/api/v1/orders" + http.status_code: 201 + event_names: ["auth.verified", "exception"] + has_error_event: true + error_message: "Deprecated field used" + status_ok: true + - trace_id: "5B8EFFF798038103D269B633813FC60C" + span_id: "AAA19B7EC3C1B175" + parent_span_id: "EEE19B7EC3C1B174" + service: "order-service" + service_version: "3.1.0" + environment: "production" + operation: "OrderService/CreateOrder" + span_kind: "SERVER" + duration_ms: 250.0 + attributes: + rpc.system: "grpc" + rpc.method: "CreateOrder" + 
event_names: [] + has_error_event: false + error_message: null + status_ok: true + - trace_id: "5B8EFFF798038103D269B633813FC60C" + span_id: "BBB19B7EC3C1B176" + parent_span_id: "AAA19B7EC3C1B175" + service: "order-service" + service_version: "3.1.0" + environment: "production" + operation: "postgres.query" + span_kind: "CLIENT" + duration_ms: 150.0 + attributes: + db.system: "postgresql" + db.statement: "INSERT INTO orders (id, customer_id, total) VALUES ($1, $2, $3)" + event_names: [] + has_error_event: false + error_message: null + status_ok: true diff --git a/internal/bloblang2/spec/tests/case_studies/stripe_invoice.yaml b/internal/bloblang2/spec/tests/case_studies/stripe_invoice.yaml new file mode 100644 index 000000000..6f489a720 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/stripe_invoice.yaml @@ -0,0 +1,111 @@ +description: > + Stripe invoice webhook normalization — parse embedded JSON from metadata, + convert Unix timestamps to RFC 3339, convert amounts from cents to dollars, + transform object keys, and restructure for a billing data warehouse. 
+ +tests: + - name: "normalize Stripe invoice.paid event" + input: + type: "invoice.paid" + created: 1709251200 + data: + object: + id: "in_1OqR3m" + number: "INV-2024-0218" + customer: "cus_PaB3xK" + customer_email: "ops@megacorp.io" + customer_name: "MegaCorp Engineering" + metadata: + internal_account_id: "acct-00482" + provisioning: '{"tier":"growth","seats":25,"features":["sso","audit_log"]}' + salesforce_opp_id: "006Dn000002XLPQ" + status: "paid" + subscription: "sub_1NrT7a" + currency: "usd" + subtotal: 14900 + tax: 1200 + total: 16100 + status_transitions: + paid_at: 1709251200 + voided_at: null + lines: + data: + - amount: 9900 + description: "Growth Plan" + quantity: 1 + product: "prod_Growth" + - amount: 5000 + description: "Extra seats" + quantity: 5 + product: "prod_Seats" + mapping: | + $inv = input.data.object + + # Parse embedded JSON from the provisioning metadata field + $provisioning = $inv.metadata.provisioning.parse_json() + + # Convert each line item: cents to dollars + $line_items = $inv.lines.data.map(item -> { + "description": item.description, + "amount_dollars": item.amount.float64() / 100.0, + "quantity": item.quantity, + "product_id": item.product, + }) + + output.invoice_id = $inv.id + output.invoice_number = $inv.number + output.event_type = input.type + output.customer = { + "id": $inv.customer, + "name": $inv.customer_name, + "email": $inv.customer_email, + } + output.provisioning = $provisioning + output.currency = $inv.currency.uppercase() + output.subtotal_dollars = $inv.subtotal.float64() / 100.0 + output.tax_dollars = $inv.tax.float64() / 100.0 + output.total_dollars = $inv.total.float64() / 100.0 + output.line_items = $line_items + output.status = $inv.status + + # Convert Unix epoch to RFC 3339 timestamp + output.paid_at = $inv.status_transitions.paid_at + .ts_from_unix().string() + + output.subscription_id = $inv.subscription + + # Strip the parsed provisioning field, then kebab-case the remaining keys + output.external_refs 
= $inv.metadata + .without(["provisioning"]) + .map_keys(k -> k.replace_all("_", "-")) + output: + invoice_id: "in_1OqR3m" + invoice_number: "INV-2024-0218" + event_type: "invoice.paid" + customer: + id: "cus_PaB3xK" + name: "MegaCorp Engineering" + email: "ops@megacorp.io" + provisioning: + tier: "growth" + seats: 25 + features: ["sso", "audit_log"] + currency: "USD" + subtotal_dollars: 149.0 + tax_dollars: 12.0 + total_dollars: 161.0 + line_items: + - description: "Growth Plan" + amount_dollars: 99.0 + quantity: 1 + product_id: "prod_Growth" + - description: "Extra seats" + amount_dollars: 50.0 + quantity: 5 + product_id: "prod_Seats" + status: "paid" + paid_at: "2024-03-01T00:00:00Z" + subscription_id: "sub_1NrT7a" + external_refs: + internal-account-id: "acct-00482" + salesforce-opp-id: "006Dn000002XLPQ" diff --git a/internal/bloblang2/spec/tests/case_studies/v2_feature_showcase.yaml b/internal/bloblang2/spec/tests/case_studies/v2_feature_showcase.yaml new file mode 100644 index 000000000..fbf41b379 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/v2_feature_showcase.yaml @@ -0,0 +1,198 @@ +description: > + Bloblang V2 feature showcase — demonstrates the major syntax and semantic + improvements over V1 in a single mapping: null-safe navigation, lambda + expressions, parameterized maps with defaults and named arguments, all three + match expression forms, separated error handling (.catch vs .or), metadata + access, and variable scoping rules. 
+ +tests: + - name: "V2 feature showcase: SaaS user event processing" + input: + event_type: "subscription_renewed" + tenant_id: "tenant-42" + user: + id: "usr_8fK2p" + name: "Alice Chen" + email: "alice@megacorp.io" + address: + city: "Portland" + state: "OR" + billing_address: null + preferences: + theme: "dark" + subscription: + plan: "enterprise" + seats: 50 + annual_spend_cents: 5990000 + transactions: + - id: "txn_003" + amount: 7500 + currency: "usd" + status: "completed" + - id: "txn_002" + amount: -200 + currency: "usd" + status: "refunded" + - id: "txn_001" + amount: 500 + currency: "usd" + status: "completed" + referral_code: null + raw_enrichment: '{"risk_score": 12, "segment": "enterprise"}' + bad_json_field: "{{not valid" + mapping: | + # ================================================================ + # Bloblang V2 Feature Showcase + # ================================================================ + + # -- 1. Null-safe navigation (?.) --------------------------- + + output.city = input.user?.address?.city + # v1: root.city = this.user.address.city | null + + output.billing_city = input.user?.billing_address?.city + # v1: root.billing_city = this.user.billing_address.city | null + + output.referral = input.referral_code?.uppercase() + # v1: root.referral = this.referral_code.uppercase() | null + + # -- 2. 
Parameterized maps with defaults -------------------- + + map cents_to_dollars(cents, currency = "USD") { + { + "amount": cents.float64() / 100.0, + "currency": currency.uppercase(), + } + } + # v1: map cents_to_dollars { + # root.amount = this.float64() / 100.0 + # root.currency = "USD" + # } + # (no parameters, no defaults, currency not configurable) + + map classify(amount) { + match { + amount >= 5000 => "high", + amount >= 1000 => "medium", + _ => "low", + } + } + + output.annual_spend = cents_to_dollars(cents: input.subscription.annual_spend_cents) + # v1: root.annual_spend = this.subscription.annual_spend_cents.apply("cents_to_dollars") + # (no named arguments, no way to pass currency) + + # -- 3. Lambdas & sort_by ---------------------------------- + + output.completed_txns = input.transactions + .filter(t -> t.status == "completed") + .sort_by(t -> t.amount) + .map(t -> { + "id": t.id, + "dollars": cents_to_dollars(t.amount, t.currency), + "tier": classify(t.amount), + }) + # v1: root.completed_txns = this.transactions + # .filter(this.status == "completed") + # .sort_by(this.amount) + # .map_each({ + # "id": this.id, + # "dollars": this.amount.apply("cents_to_dollars"), + # }) + # ("this" silently rebound to each element — no explicit params) + + # -- 4. 
Match expressions (three forms) --------------------- + + # Form A — equality match: + output.plan_label = match input.subscription.plan { + "free" => "Free Tier", + "pro" => "Professional", + "enterprise" => "Enterprise", + _ => "Unknown", + } + # v1: root.plan_label = match this.subscription.plan { + # "enterprise" => "Enterprise" + # _ => "Unknown" + # } + + # Form B — boolean match with "as" binding: + output.seat_tier = match input.subscription.seats as s { + s >= 100 => "unlimited", + s >= 25 => "large", + s >= 5 => "small", + _ => "individual", + } + # v1: no equivalent — required an if/else if/else chain + + # Form C — boolean match (no subject): + $spend = input.subscription.annual_spend_cents + output.account_tier = match { + $spend >= 10000000 => "platinum", + $spend >= 1000000 => "gold", + _ => "silver", + } + # v1: no equivalent — required an if/else if/else chain + + # -- 5. Separated error handling ---------------------------- + + output.enrichment = input.raw_enrichment.parse_json() + .catch(err -> {"parse_error": err.what}) + # v1: root.enrichment = this.raw_enrichment.parse_json().catch("unknown") + # (no access to the error message, catch was a static value) + + output.safe_parse = input.bad_json_field.parse_json() + .catch(err -> null) + # v1: root.safe_parse = this.bad_json_field.parse_json() | null + + output.referral_display = input.referral_code.or("none") + # v1: root.referral_display = this.referral_code | "none" + # (v1 | conflated null coalescing and error catching) + + # -- 6. Metadata access (input@, output@) ------------------- + + output@.routing_key = input.tenant_id + # v1: meta routing_key = this.tenant_id + + output@.event_type = input.event_type + # v1: meta event_type = this.event_type + + # -- 7. 
Variable scoping ----------------------------------- + + $label = "default" + output.scoped_demo = if true { + $label = "inner" + $label + } + output.outer_label = $label + # v1: $label would be mutated in-place — outer_label would be "inner" + output_metadata: + routing_key: "tenant-42" + event_type: "subscription_renewed" + output: + city: "Portland" + billing_city: null + referral: null + annual_spend: + amount: 59900.0 + currency: "USD" + completed_txns: + - id: "txn_001" + dollars: + amount: 5.0 + currency: "USD" + tier: "low" + - id: "txn_003" + dollars: + amount: 75.0 + currency: "USD" + tier: "high" + plan_label: "Enterprise" + seat_tier: "large" + account_tier: "gold" + enrichment: + risk_score: 12 + segment: "enterprise" + safe_parse: null + referral_display: "none" + scoped_demo: "inner" + outer_label: "default" diff --git a/internal/bloblang2/spec/tests/case_studies/vpc_flow_logs.yaml b/internal/bloblang2/spec/tests/case_studies/vpc_flow_logs.yaml new file mode 100644 index 000000000..19217e9b0 --- /dev/null +++ b/internal/bloblang2/spec/tests/case_studies/vpc_flow_logs.yaml @@ -0,0 +1,159 @@ +description: > + VPC Flow Logs parsing and enrichment — split space-delimited log records into + structured objects, map protocol numbers to names, classify IPs by address + range, identify well-known port services, detect anomalies (rejected SSH), + compute per-flow throughput, filter NODATA records, and aggregate a summary. 
+ +tests: + - name: "parse and enrich VPC flow log batch" + input: + owner: "123456789012" + logGroup: "/aws/vpc/flowlogs/vpc-0a1b2c3d" + logStream: "eni-0f1a2b3c-all" + logEvents: + - message: "2 123456789012 eni-0f1a2b3c 10.0.1.47 203.0.113.50 44832 443 6 12 1680 1710505200 1710505260 ACCEPT OK" + - message: "2 123456789012 eni-0f1a2b3c 198.51.100.22 10.0.1.47 55912 22 6 847 52140 1710505200 1710505260 REJECT OK" + - message: "2 123456789012 eni-0f1a2b3c 10.0.1.47 10.0.2.83 38210 5432 6 245 89200 1710505200 1710505260 ACCEPT OK" + - message: "2 123456789012 eni-0f1a2b3c - - - - - - - 1710505260 1710505320 - NODATA" + mapping: | + map protocol_name(num) { + match num { + "6" => "TCP", + "17" => "UDP", + "1" => "ICMP", + _ => "proto-" + num, + } + } + + map port_service(port) { + match port { + "443" => "HTTPS", + "80" => "HTTP", + "22" => "SSH", + "5432" => "PostgreSQL", + "3306" => "MySQL", + _ => null, + } + } + + map classify_ip(addr) { + match { + addr.has_prefix("10.") => "internal", + addr.has_prefix("172.") => "internal", + addr.has_prefix("192.168.") => "internal", + addr.has_prefix("169.254.") => "link-local", + _ => "external", + } + } + + $nodata_count = input.logEvents + .filter(e -> e.message.has_suffix("NODATA")).length() + + # Parse space-delimited records, dropping NODATA + $flows = input.logEvents + .filter(e -> !e.message.has_suffix("NODATA")) + .map(e -> { + $f = e.message.split(" ") + $srcaddr = $f[3] + $dstaddr = $f[4] + $srcport = $f[5] + $dstport = $f[6] + $proto = protocol_name($f[7]) + $service = port_service($dstport).or(port_service($srcport).or(null)) + $src_class = classify_ip($srcaddr) + $dst_class = classify_ip($dstaddr) + $direction = match { + $src_class == "internal" && $dst_class == "external" => "outbound", + $src_class == "external" && $dst_class == "internal" => "inbound", + _ => "internal", + } + $action = $f[12] + $packets = $f[8].int64() + $bytes = $f[9].int64() + $duration = $f[11].int64() - $f[10].int64() + $bps = 
($bytes.float64() / $duration.float64()).round(1) + $is_anomaly = $action == "REJECT" && $service == "SSH" + { + "protocol": $proto, + "src_addr": $srcaddr, + "dst_addr": $dstaddr, + "src_port": $srcport.int64(), + "dst_port": $dstport.int64(), + "service": $service, + "direction": $direction, + "traffic_class": match $direction { + "outbound" => $dst_class, + "inbound" => $src_class, + _ => "internal", + }, + "action": $action, + "packets": $packets, + "bytes": $bytes, + "bytes_per_sec": $bps, + "anomaly": $is_anomaly, + } + }) + + output.vpc_id = input.logGroup.split("/")[-1] + output.account_id = input.owner + output.interface_id = input.logStream.split("-all")[0] + output.flows = $flows + output.summary = { + "total_flows": $flows.length(), + "accepted": $flows.filter(r -> r.action == "ACCEPT").length(), + "rejected": $flows.filter(r -> r.action == "REJECT").length(), + "anomalies": $flows.filter(r -> r.anomaly).length(), + "total_bytes": $flows.map(r -> r.bytes).sum(), + "nodata_dropped": $nodata_count, + } + output: + vpc_id: "vpc-0a1b2c3d" + account_id: "123456789012" + interface_id: "eni-0f1a2b3c" + flows: + - protocol: "TCP" + src_addr: "10.0.1.47" + dst_addr: "203.0.113.50" + src_port: 44832 + dst_port: 443 + service: "HTTPS" + direction: "outbound" + traffic_class: "external" + action: "ACCEPT" + packets: 12 + bytes: 1680 + bytes_per_sec: 28.0 + anomaly: false + - protocol: "TCP" + src_addr: "198.51.100.22" + dst_addr: "10.0.1.47" + src_port: 55912 + dst_port: 22 + service: "SSH" + direction: "inbound" + traffic_class: "external" + action: "REJECT" + packets: 847 + bytes: 52140 + bytes_per_sec: 869.0 + anomaly: true + - protocol: "TCP" + src_addr: "10.0.1.47" + dst_addr: "10.0.2.83" + src_port: 38210 + dst_port: 5432 + service: "PostgreSQL" + direction: "internal" + traffic_class: "internal" + action: "ACCEPT" + packets: 245 + bytes: 89200 + bytes_per_sec: 1486.7 + anomaly: false + summary: + total_flows: 3 + accepted: 2 + rejected: 1 + anomalies: 1 + 
total_bytes: 143020 + nodata_dropped: 1 diff --git a/internal/bloblang2/spec/tests/control_flow/block_scoping.yaml b/internal/bloblang2/spec/tests/control_flow/block_scoping.yaml new file mode 100644 index 000000000..e5fc45a71 --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/block_scoping.yaml @@ -0,0 +1,233 @@ +description: "Block scoping — statement vs expression contexts, variable visibility, shadowing vs mutation" + +tests: + # --- Expression context: shadowing (if expression) --- + + - name: "if expression shadows outer variable" + mapping: | + $value = 10 + output.inner = if true { + $value = 20 + $value + } + output.outer = $value + output: {"inner": 20, "outer": 10} + + - name: "if expression shadow does not leak to else branch" + mapping: | + $value = 10 + output.result = if false { + $value = 20 + $value + } else { + $value + } + output: {"result": 10} + + - name: "if expression shadow in else branch" + mapping: | + $value = 10 + output.inner = if false { + $value + } else { + $value = 30 + $value + } + output.outer = $value + output: {"inner": 30, "outer": 10} + + # --- Statement context: mutation (if statement) --- + + - name: "if statement mutates outer variable" + mapping: | + $value = 10 + if true { + $value = 20 + } + output.result = $value + output: {"result": 20} + + - name: "if statement mutation persists after block" + mapping: | + $x = 1 + $y = 2 + if true { + $x = 10 + $y = 20 + } + output.x = $x + output.y = $y + output: {"x": 10, "y": 20} + + - name: "if statement mutation only happens if condition is true" + mapping: | + $value = "original" + if false { + $value = "changed" + } + output.result = $value + output: {"result": "original"} + + # --- New variables in statement context are block-scoped --- + + - name: "new variable in if statement not visible outside" + mapping: | + if true { + $new_var = 42 + } + output.result = $new_var + compile_error: "new_var" + + - name: "new variable in else branch not visible outside" + 
mapping: | + if false { + $a = 1 + } else { + $b = 2 + } + output.result = $b + compile_error: "b" + + - name: "variable in one branch not visible in sibling branch" + mapping: | + if false { + $x = 10 + } else { + output.result = $x + } + compile_error: "x" + + # --- Expression context: match expression shadows --- + + - name: "match expression shadows outer variable" + mapping: | + $x = "outer" + output.inner = match { + true => { + $x = "inner" + $x + }, + } + output.outer = $x + output: {"inner": "inner", "outer": "outer"} + + # --- Statement context: match statement mutates --- + + - name: "match statement mutates outer variable" + mapping: | + $x = "before" + match { + true => { + $x = "after" + }, + } + output.result = $x + output: {"result": "after"} + + # --- Nested scopes --- + + - name: "nested if expressions each shadow independently" + mapping: | + $x = 1 + output.result = if true { + $x = 10 + $inner = if true { + $x = 100 + $x + } + $inner + $x + } + output.outer = $x + output: {"result": 110, "outer": 1} + + - name: "if statement inside expression body is compile error" + mapping: | + $x = 1 + output.result = if true { + $x = 10 + if true { + $x = 20 + } + $x + } + compile_error: "expression" + + # --- Variables declared in block not accessible after --- + + - name: "variable from match case not visible after match" + mapping: | + match { + true => { + $local = 42 + }, + } + output.result = $local + compile_error: "local" + + # --- Else-if scoping --- + + - name: "else-if branches have independent scopes" + mapping: | + $result = "none" + if false { + $result = "first" + } else if true { + $result = "second" + $local = "only here" + } else { + $result = "third" + } + output.result = $result + output.local = $local + compile_error: "local" + + - name: "else-if statement mutates outer variable from middle branch" + mapping: | + $result = "none" + if false { + $result = "first" + } else if true { + $result = "second" + } else { + $result = "third" + } 
+ output.result = $result + output: {"result": "second"} + + # --- Deep nesting --- + + - name: "three levels of nesting with correct scoping" + mapping: | + $x = "L0" + output.level0 = $x + output.inner = if true { + $x = "L1" + $result = if true { + $x = "L2" + $x + } + $result + " from " + $x + } + output.after = $x + output: {"level0": "L0", "inner": "L2 from L1", "after": "L0"} + + # --- Match-as binding scope --- + + - name: "match-as binding not visible after match expression" + mapping: | + output.result = match 42 as val { + val > 0 => val, + _ => 0, + } + output.binding = val + compile_error: "val" + + - name: "match-as binding does not conflict with variable of different naming" + mapping: | + $val = "outer" + output.inner = match 42 as val { + val > 0 => val, + _ => 0, + } + output.outer = $val + output: {"inner": 42, "outer": "outer"} diff --git a/internal/bloblang2/spec/tests/control_flow/if_else_chains.yaml b/internal/bloblang2/spec/tests/control_flow/if_else_chains.yaml new file mode 100644 index 000000000..385563c7f --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/if_else_chains.yaml @@ -0,0 +1,119 @@ +description: > + If-else chains, nested conditionals, if-expression in various contexts, + and empty body handling. 
+ +tests: + # --- if-else if-else chains --- + + - name: "if-else if-else first branch" + mapping: | + output.v = if true { "first" } else if true { "second" } else { "third" } + output: {"v": "first"} + + - name: "if-else if-else second branch" + mapping: | + output.v = if false { "first" } else if true { "second" } else { "third" } + output: {"v": "second"} + + - name: "if-else if-else third branch" + mapping: | + output.v = if false { "first" } else if false { "second" } else { "third" } + output: {"v": "third"} + + - name: "long if-else if chain" + mapping: | + $x = 3 + output.v = if $x == 1 { "one" } else if $x == 2 { "two" } else if $x == 3 { "three" } else { "other" } + output: {"v": "three"} + + # --- if-else if without final else produces void --- + + - name: "if-else if without else — void when none match" + mapping: | + output.v = "default" + output.v = if false { "a" } else if false { "b" } + output: {"v": "default"} + + - name: "if-else if without else — first matches" + mapping: | + output.v = if true { "a" } else if false { "b" } + output: {"v": "a"} + + - name: "if-else if without else — second matches" + mapping: | + output.v = if false { "a" } else if true { "b" } + output: {"v": "b"} + + # --- Nested if expressions --- + + - name: "nested if in then branch" + mapping: | + $x = 5 + output.v = if $x > 0 { + if $x > 10 { "big" } else { "small" } + } else { "negative" } + output: {"v": "small"} + + - name: "nested if in else branch" + mapping: | + $x = -5 + output.v = if $x > 0 { "positive" } else { + if $x == 0 { "zero" } else { "negative" } + } + output: {"v": "negative"} + + # --- If-expression in various contexts --- + + - name: "if-expression as map argument" + mapping: | + map greet(name) { "hello " + name } + output.v = greet(if true { "Alice" } else { "Bob" }) + output: {"v": "hello Alice"} + + - name: "if-expression in array literal" + mapping: | + output.v = [1, if true { 2 } else { 20 }, 3] + output: {"v": [1, 2, 3]} + + - name: 
"if-expression in object literal" + mapping: | + output.v = {"status": if true { "ok" } else { "error" }} + output: {"v": {"status": "ok"}} + + - name: "if-expression in method argument" + mapping: | + output.v = [3, 1, 2].sort_by(x -> if true { x } else { -x }) + output: {"v": [1, 2, 3]} + + # --- If-statement with empty body --- + + - name: "if-statement with empty body is no-op" + mapping: | + output.v = "unchanged" + if true { } + output: {"v": "unchanged"} + + - name: "if-statement empty body, else has content" + mapping: | + output.v = "init" + if false { } else { + output.v = "else ran" + } + output: {"v": "else ran"} + + # --- Boolean coercion not implicit --- + + - name: "non-boolean condition is error" + mapping: | + output.v = if "truthy" { "yes" } else { "no" } + error: "bool" + + - name: "integer condition is error" + mapping: | + output.v = if 1 { "yes" } else { "no" } + error: "bool" + + - name: "null condition is error" + mapping: | + output.v = if null { "yes" } else { "no" } + error: "bool" diff --git a/internal/bloblang2/spec/tests/control_flow/if_expression.yaml b/internal/bloblang2/spec/tests/control_flow/if_expression.yaml new file mode 100644 index 000000000..eee0f3c95 --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/if_expression.yaml @@ -0,0 +1,145 @@ +description: "If as expression — returns value, with/without else, else-if chains, void behavior" + +tests: + # --- Basic if-else expression --- + + - name: "if true with else returns then branch" + mapping: | + output.result = if true { "yes" } else { "no" } + output: {"result": "yes"} + + - name: "if false with else returns else branch" + mapping: | + output.result = if true == false { "yes" } else { "no" } + output: {"result": "no"} + + - name: "if expression returns computed value" + input: {"score": 95} + mapping: | + output.grade = if input.score >= 90 { "A" } else { "B" } + output: {"grade": "A"} + + # --- Without else: void behavior --- + + - name: "if without else 
produces void when false — assignment skipped, field absent" + mapping: | + output.x = if false { "hello" } + output: {} + + - name: "if without else produces value when true" + mapping: | + output.x = if true { "hello" } + output: {"x": "hello"} + + - name: "void preserves prior output value" + mapping: | + output.status = "pending" + output.status = if false { "override" } + output: {"status": "pending"} + + # --- Else-if chains --- + + - name: "else-if tier classification" + mapping: | + output.tier = if input.score >= 90 { + "gold" + } else if input.score >= 70 { + "silver" + } else { + "bronze" + } + cases: + - name: "selects middle branch" + input: {"score": 75} + output: {"tier": "silver"} + - name: "falls through to else" + input: {"score": 30} + output: {"tier": "bronze"} + - name: "selects first branch" + input: {"score": 95} + output: {"tier": "gold"} + + - name: "else-if without final else produces void when no branch matches" + input: {"score": 30} + mapping: | + output.tier = "default" + output.tier = if input.score >= 90 { "gold" } else if input.score >= 70 { "silver" } + output: {"tier": "default"} + + - name: "else-if without final else — branch matches" + mapping: | + output.tier = if input.score >= 90 { "gold" } else if input.score >= 70 { "silver" } + cases: + - name: "first branch" + input: {"score": 95} + output: {"tier": "gold"} + - name: "second branch" + input: {"score": 75} + output: {"tier": "silver"} + + # --- Variable declaration with if expression --- + + - name: "if expression assigned to variable" + mapping: | + $x = if true { 42 } else { 0 } + output.result = $x + output: {"result": 42} + + - name: "void in variable declaration is runtime error" + mapping: | + $x = if false { 42 } + error: "void" + + - name: "void in variable reassignment is skipped" + mapping: | + $x = 10 + $x = if false { 42 } + output.result = $x + output: {"result": 10} + + # --- Variables inside if expression body --- + + - name: "variables declared inside if 
expression body" + mapping: | + output.result = if true { + $a = 10 + $b = 20 + $a + $b + } else { + 0 + } + output: {"result": 30} + + # --- Cannot contain output assignments (expression context) --- + + - name: "output assignment inside if expression is compile error" + mapping: | + output.x = if true { + output.y = 10 + 42 + } + compile_error: "output" + + # --- Nested if expressions --- + + - name: "nested if expression" + input: {"a": true, "b": false} + mapping: | + output.result = if input.a { + if input.b { "both" } else { "only a" } + } else { + "neither" + } + output: {"result": "only a"} + + # --- If expression with non-boolean condition --- + + - name: "non-boolean condition is error" + mapping: | + output.x = if "hello" { 1 } else { 2 } + error: "bool" + + - name: "null condition is error" + mapping: | + output.x = if null { 1 } else { 2 } + error: "bool" diff --git a/internal/bloblang2/spec/tests/control_flow/if_statement.yaml b/internal/bloblang2/spec/tests/control_flow/if_statement.yaml new file mode 100644 index 000000000..c3633ccf7 --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/if_statement.yaml @@ -0,0 +1,177 @@ +description: "If as statement — standalone with output assignments, empty body, else-if, trailing expression error" + +tests: + # --- Basic if statement with output assignment --- + + - name: "if statement conditional on input type" + mapping: | + if input.type == "user" { + output.role = "member" + } + cases: + - name: "assigns when true" + input: {"type": "user"} + output: {"role": "member"} + - name: "skips body when false" + input: {"type": "guest"} + output: {} + + - name: "if statement with multiple output assignments" + input: {"type": "admin"} + mapping: | + if input.type == "admin" { + output.role = "admin" + output.level = 10 + } + output: {"role": "admin", "level": 10} + + # --- Empty body is valid no-op --- + + - name: "empty if body is valid no-op" + mapping: | + if true { } + output.x = "after" + output: 
{"x": "after"} + + - name: "empty else body is valid no-op" + mapping: | + if false { + output.x = "then" + } else { } + output: {} + + # --- If-else statement --- + + - name: "if-else statement branch selection" + mapping: | + if input.type == "admin" { + output.role = "admin" + } else { + output.role = "user" + } + cases: + - name: "selects then branch" + input: {"type": "admin"} + output: {"role": "admin"} + - name: "takes else branch" + input: {"type": "guest"} + output: {"role": "user"} + + # --- Else-if chains --- + + - name: "else-if chain selects middle branch" + input: {"type": "mod"} + mapping: | + if input.type == "admin" { + output.role = "admin" + output.permissions = ["read", "write", "delete"] + } else if input.type == "mod" { + output.role = "moderator" + output.permissions = ["read", "write"] + } else { + output.role = "user" + output.permissions = ["read"] + } + output: {"role": "moderator", "permissions": ["read", "write"]} + + - name: "else-if chain falls to else" + input: {"type": "visitor"} + mapping: | + if input.type == "admin" { + output.role = "admin" + } else if input.type == "mod" { + output.role = "moderator" + } else { + output.role = "user" + } + output: {"role": "user"} + + - name: "else-if without final else — no branch matches is no-op" + input: {"type": "visitor"} + mapping: | + if input.type == "admin" { + output.role = "admin" + } else if input.type == "mod" { + output.role = "moderator" + } + output: {} + + # --- Statement context modifies outer variables --- + + - name: "if statement modifies outer variable" + mapping: | + $x = 10 + if true { + $x = 20 + } + output.result = $x + output: {"result": 20} + + - name: "if statement does not modify outer variable when false" + mapping: | + $x = 10 + if false { + $x = 20 + } + output.result = $x + output: {"result": 10} + + # --- New variables in if statement are block-scoped --- + + - name: "new variable in if statement body not visible outside" + mapping: | + if true { + $local = 
42 + } + output.result = $local + compile_error: "local" + + # --- Trailing expression in statement body is parse error --- + + - name: "trailing expression in if statement is compile error" + mapping: | + if true { + $x = 10 + $x + 5 + } + compile_error: "expression" + + - name: "bare expression as only content in if statement is compile error" + mapping: | + if true { + 42 + } + compile_error: "expression" + + # --- Nested if statements --- + + - name: "nested if statements" + input: {"type": "admin", "active": true} + mapping: | + if input.type == "admin" { + if input.active { + output.status = "active admin" + } else { + output.status = "inactive admin" + } + } + output: {"status": "active admin"} + + # --- If statement preserves prior output --- + + - name: "if statement preserves output from before" + mapping: | + output.before = "exists" + if true { + output.added = "new" + } + output: {"before": "exists", "added": "new"} + + # --- Non-boolean condition error --- + + - name: "non-boolean condition in if statement is error" + mapping: | + if 42 { + output.x = "yes" + } + error: "bool" diff --git a/internal/bloblang2/spec/tests/control_flow/match_as.yaml b/internal/bloblang2/spec/tests/control_flow/match_as.yaml new file mode 100644 index 000000000..7f6476bea --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/match_as.yaml @@ -0,0 +1,186 @@ +description: "Match with 'as' binding — binding in conditions and results, block scoping of binding" + +tests: + # --- Basic match with as --- + + - name: "match as score tier classification" + mapping: | + output.tier = match input.score as s { + s >= 100 => "gold", + s >= 50 => "silver", + _ => "bronze", + } + cases: + - name: "middle tier" + input: {"score": 85} + output: {"tier": "silver"} + - name: "top tier" + input: {"score": 150} + output: {"tier": "gold"} + - name: "falls to wildcard" + input: {"score": 10} + output: {"tier": "bronze"} + + # --- Binding used in result expression --- + + - name: "as 
binding used in result expression" + input: {"value": 42} + mapping: | + output.result = match input.value as v { + v > 0 => v * 2, + _ => 0, + } + output: {"result": 84} + + - name: "as binding used in both condition and result" + input: {"name": "Alice"} + mapping: | + output.greeting = match input.name as n { + n == "Alice" => "Hello, " + n + "!", + n == "Bob" => "Hey " + n, + _ => "Hi " + n, + } + output: {"greeting": "Hello, Alice!"} + + # --- Expression evaluated once --- + + - name: "matched expression evaluated once" + mapping: | + $counter = 0 + $counter = $counter + 1 + output.result = match $counter as c { + c == 1 => "one", + c == 2 => "two", + _ => "other", + } + output: {"result": "one"} + + # --- Non-boolean case is error --- + + - name: "non-boolean case in match-as is runtime error" + mapping: | + output.result = match 42 as x { + "hello" => "yes", + _ => "no", + } + error: "bool" + + - name: "integer case in match-as is runtime error" + mapping: | + output.result = match "test" as x { + 42 => "yes", + _ => "no", + } + error: "bool" + + # --- Wildcard exempt from boolean requirement --- + + - name: "wildcard in match-as is not checked as boolean" + mapping: | + output.result = match 42 as x { + x > 100 => "big", + _ => "small", + } + output: {"result": "small"} + + # --- Binding is block-scoped to the match --- + + - name: "as binding not accessible after match" + mapping: | + output.result = match 42 as x { + x > 0 => "positive", + _ => "other", + } + output.leaked = x + compile_error: "x" + + - name: "as binding does not shadow outer variable — it is a new scope" + mapping: | + $x = "outer" + output.inner = match 42 as x { + x > 0 => x, + _ => 0, + } + output.outer = $x + output: {"inner": 42, "outer": "outer"} + + # --- Match-as with braced case body --- + + - name: "match-as with braced body using binding and local vars" + input: {"price": 120} + mapping: | + output.label = match input.price as p { + p >= 100 => { + $discount = p * 0.1 + 
"expensive (save " + $discount.string() + ")" + }, + _ => "affordable", + } + output: {"label": "expensive (save 12.0)"} + + # --- Match-as with string matched expression --- + + - name: "match-as on string value" + input: {"name": "test_file.csv"} + mapping: | + output.type = match input.name as n { + n.has_suffix(".csv") => "csv", + n.has_suffix(".json") => "json", + _ => "unknown", + } + output: {"type": "csv"} + + # --- Non-exhaustive match-as produces void --- + + - name: "non-exhaustive match-as produces void" + mapping: | + output.x = "prior" + output.x = match 5 as v { + v > 100 => "big", + v > 50 => "medium", + } + output: {"x": "prior"} + + # --- Match-as as statement --- + + - name: "match-as as statement with output assignments" + input: {"score": 75} + mapping: | + match input.score as s { + s >= 90 => { + output.grade = "A" + output.pass = true + }, + s >= 60 => { + output.grade = "B" + output.pass = true + }, + _ => { + output.grade = "F" + output.pass = false + }, + } + output: {"grade": "B", "pass": true} + + # --- Cases after first true are not evaluated --- + + - name: "cases after first true not evaluated in match-as" + mapping: | + output.result = match 10 as x { + x > 0 => "positive", + throw("should not evaluate") => "never", + _ => "default", + } + output: {"result": "positive"} + + # --- Binding with computed expression --- + + - name: "match-as with computed matched expression" + input: {"a": 10, "b": 20} + mapping: | + output.result = match (input.a + input.b) as total { + total >= 50 => "high", + total >= 20 => "medium", + _ => "low", + } + output: {"result": "medium"} diff --git a/internal/bloblang2/spec/tests/control_flow/match_block_body.yaml b/internal/bloblang2/spec/tests/control_flow/match_block_body.yaml new file mode 100644 index 000000000..ec28b544e --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/match_block_body.yaml @@ -0,0 +1,216 @@ +description: > + Match expression block bodies — braced expression bodies with 
variable + assignments in match arms. Object literals require parentheses to + distinguish from block bodies. + +tests: + # --- Block body with assignments --- + + - name: "block body with variable and final expression" + mapping: | + output.v = match "hello" { + "hello" => { + $greeting = "hi there" + $greeting + }, + _ => "unknown", + } + output: {"v": "hi there"} + + - name: "block body with multiple assignments" + mapping: | + output.v = match "format" { + "format" => { + $prefix = "[" + $suffix = "]" + $prefix + "done" + $suffix + }, + _ => "nope", + } + output: {"v": "[done]"} + + - name: "block body with path assignment" + mapping: | + output.v = match "build" { + "build" => { + $result = {"action": "build"} + $result.status = "done" + $result + }, + _ => ({}), + } + output: {"v": {"action": "build", "status": "done"}} + + - name: "block body with dynamic key assignment" + mapping: | + output.v = match "set" { + "set" => { + $obj = {} + $key = "dynamic" + $obj[$key] = 42 + $obj + }, + _ => ({}), + } + output: {"v": {"dynamic": 42}} + + # --- Object literals via parentheses --- + + - name: "empty object via parentheses" + mapping: | + output.v = match "miss" { + "hit" => "found", + _ => ({}), + } + output: {"v": {}} + + - name: "object literal via parentheses" + mapping: | + output.v = match "fallback" { + "match" => ({"status": "matched"}), + _ => ({"status": "default"}), + } + output: {"v": {"status": "default"}} + + - name: "bare string result needs no parens" + mapping: | + output.v = match "a" { + "a" => "alpha", + _ => "other", + } + output: {"v": "alpha"} + + - name: "bare expression result no block needed" + mapping: | + output.v = match 5 { + 5 => 5 * 10, + _ => 0, + } + output: {"v": 50} + + # --- Block body on wildcard arm --- + + - name: "wildcard arm with block body" + mapping: | + output.v = match "xyz" { + "a" => "alpha", + _ => { + $fallback = "unknown" + $fallback + " value" + }, + } + output: {"v": "unknown value"} + + # --- Block body with 
outer variable capture --- + + - name: "block body captures outer variable" + mapping: | + $multiplier = 10 + output.v = match "scale" { + "scale" => { + $base = 5 + $base * $multiplier + }, + _ => 0, + } + output: {"v": 50} + + # --- Block body with match-as binding --- + + - name: "match-as block body name truncation" + mapping: | + output.v = match input.name as n { + n.length() > 5 => { + $short = n.slice(0, 5) + $short + "..." + }, + _ => n, + } + cases: + - name: "long name truncated" + input: {"name": "Alexander"} + output: {"v": "Alexa..."} + - name: "short name passthrough" + input: {"name": "Bob"} + output: {"v": "Bob"} + + # --- Block body returning deleted --- + + - name: "block body returning deleted removes field" + mapping: | + output.keep = "yes" + output.maybe = match "remove" { + "remove" => { + $check = true + if $check { deleted() } else { "kept" } + }, + _ => "default", + } + output: {"keep": "yes"} + + # --- Block body with void from if-without-else --- + + - name: "block body with void skips assignment" + mapping: | + output.v = "prior" + output.v = match "miss" { + "miss" => { + $x = 42 + if false { $x } + }, + _ => "fallback", + } + output: {"v": "prior"} + + # --- Nested match with block bodies --- + + - name: "nested match both with block bodies" + mapping: | + output.v = match input.type { + "user" => { + $role = match input.role { + "admin" => { + $level = "full" + $level + " access" + }, + _ => "limited", + } + $role + }, + _ => "unknown", + } + input: {"type": "user", "role": "admin"} + output: {"v": "full access"} + + # --- Block body in boolean match --- + + - name: "boolean match with block body" + mapping: | + output.v = match { + input.score > 90 => { + $grade = "A" + $grade + "+" + }, + input.score > 80 => "B", + _ => "C", + } + input: {"score": 95} + output: {"v": "A+"} + + # --- Multiple arms with block bodies --- + + - name: "multiple arms all with block bodies" + mapping: | + output.v = match input.op { + "add" => { + 
$result = input.a + input.b + {"op": "add", "result": $result} + }, + "mul" => { + $result = input.a * input.b + {"op": "mul", "result": $result} + }, + _ => ({"op": "unknown", "result": 0}), + } + input: {"op": "mul", "a": 3, "b": 4} + output: {"v": {"op": "mul", "result": 12}} diff --git a/internal/bloblang2/spec/tests/control_flow/match_boolean.yaml b/internal/bloblang2/spec/tests/control_flow/match_boolean.yaml new file mode 100644 index 000000000..3e8ff9237 --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/match_boolean.yaml @@ -0,0 +1,157 @@ +description: "Match boolean form — with and without matched expression, non-boolean case error, first true wins" + +tests: + # --- Boolean match without expression --- + + - name: "boolean match grade classification" + mapping: | + output.grade = match { + input.score >= 90 => "A", + input.score >= 80 => "B", + input.score >= 70 => "C", + _ => "F", + } + cases: + - name: "selects B for 85" + input: {"score": 85} + output: {"grade": "B"} + - name: "selects A for 95" + input: {"score": 95} + output: {"grade": "A"} + - name: "falls to wildcard for 30" + input: {"score": 30} + output: {"grade": "F"} + + - name: "boolean match first true wins even if later also true" + input: {"score": 95} + mapping: | + output.result = match { + input.score >= 80 => "eighty plus", + input.score >= 90 => "ninety plus", + _ => "other", + } + output: {"result": "eighty plus"} + + # --- Non-boolean case is error --- + + - name: "non-boolean case in boolean match is runtime error" + mapping: | + output.result = match { + "hello" => "yes", + _ => "no", + } + error: "bool" + + - name: "integer case in boolean match is runtime error" + mapping: | + output.result = match { + 42 => "yes", + _ => "no", + } + error: "bool" + + - name: "null case in boolean match is runtime error" + mapping: | + output.result = match { + null => "yes", + _ => "no", + } + error: "bool" + + # --- Wildcard exempt from boolean requirement --- + + - name: 
"wildcard is not checked as boolean" + mapping: | + output.result = match { + false => "no", + _ => "default", + } + output: {"result": "default"} + + # --- Cases evaluated in order, short-circuit --- + + - name: "cases after first true are not evaluated" + mapping: | + output.result = match { + true => "first", + throw("should not evaluate") => "never", + _ => "default", + } + output: {"result": "first"} + + # --- Non-exhaustive boolean match produces void --- + + - name: "non-exhaustive boolean match produces void" + mapping: | + output.x = "prior" + output.x = match { + false => "nope", + } + output: {"x": "prior"} + + # --- Boolean match with complex conditions --- + + - name: "boolean match with compound conditions" + input: {"age": 25, "member": true} + mapping: | + output.discount = match { + input.age < 18 => "youth", + input.age >= 65 => "senior", + input.member && input.age >= 21 => "member", + _ => "none", + } + output: {"discount": "member"} + + # --- Boolean match with variables --- + + - name: "boolean match using variables in conditions" + mapping: | + $threshold = 50 + $value = 75 + output.result = match { + $value >= $threshold => "above", + _ => "below", + } + output: {"result": "above"} + + # --- Boolean match with braced case body --- + + - name: "boolean match case with braced body" + input: {"score": 85} + mapping: | + output.result = match { + input.score >= 80 => { + $label = "high" + $label + " score" + }, + _ => "low score", + } + output: {"result": "high score"} + + # --- Boolean match as statement --- + + - name: "boolean match as statement assigns output" + input: {"level": 5} + mapping: | + match { + input.level >= 10 => { + output.rank = "expert" + }, + input.level >= 5 => { + output.rank = "intermediate" + }, + _ => { + output.rank = "beginner" + }, + } + output: {"rank": "intermediate"} + + # --- Non-boolean skipped by earlier true case does not error --- + + - name: "non-boolean case after true case is never evaluated" + mapping: | 
+ output.result = match { + true => "found", + "not a bool" => "never", + } + output: {"result": "found"} diff --git a/internal/bloblang2/spec/tests/control_flow/match_edge_cases.yaml b/internal/bloblang2/spec/tests/control_flow/match_edge_cases.yaml new file mode 100644 index 000000000..30b9fc345 --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/match_edge_cases.yaml @@ -0,0 +1,173 @@ +description: > + Match expression edge cases — equality match with various types, + match-as with complex expressions, wildcard semantics, and + interactions with void and deleted. + +tests: + # --- Equality match with various types --- + + - name: "equality match on integer" + mapping: | + output.v = match 2 { + 1 => "one", + 2 => "two", + 3 => "three", + _ => "other", + } + output: {"v": "two"} + + - name: "equality match on string" + mapping: | + output.v = match "hello" { + "hi" => "informal", + "hello" => "formal", + _ => "unknown", + } + output: {"v": "formal"} + + - name: "equality match on null" + mapping: | + output.v = match null { + null => "is null", + _ => "not null", + } + output: {"v": "is null"} + + - name: "equality match falls through to wildcard" + mapping: | + output.v = match "xyz" { + "a" => 1, + "b" => 2, + _ => 0, + } + output: {"v": 0} + + - name: "equality match with float" + mapping: | + output.v = match 3.14 { + 3.14 => "pi", + 2.71 => "e", + _ => "other", + } + output: {"v": "pi"} + + # --- match subject evaluated once --- + + - name: "match subject expression evaluated once" + mapping: | + $counter = 0 + $counter = $counter + 1 + output.v = match $counter { + 1 => "one", + 2 => "two", + _ => "many", + } + output: {"v": "one"} + + # --- match-as binding --- + + - name: "match-as binds value to variable" + mapping: | + output.v = match input.data as d { + d > 10 => "big", + d > 0 => "small", + _ => "zero or negative", + } + input: {"data": 5} + output: {"v": "small"} + + - name: "match-as variable used in result expression" + mapping: | + 
output.v = match input.name as n { + n.length() > 5 => n.uppercase(), + _ => n.lowercase(), + } + cases: + - name: "long name uppercased" + input: {"name": "Alexander"} + output: {"v": "ALEXANDER"} + - name: "short name lowercased" + input: {"name": "Bob"} + output: {"v": "bob"} + + - name: "match-as variable not accessible outside match" + mapping: | + output.v = match "hello" as s { + true => s, + } + output.leaked = $s + compile_error: "undeclared" + + # --- Wildcard catches all --- + + - name: "wildcard matches any value" + mapping: | + output.v = match "anything" { + _ => "caught", + } + output: {"v": "caught"} + + # --- Match with deleted in result --- + + - name: "match result is deleted — field removed" + mapping: | + output.keep = "yes" + output.maybe = "testing" + output.maybe = match "remove" { + "remove" => deleted(), + _ => output.maybe, + } + output: {"keep": "yes"} + + # --- Match with void from if-without-else in arm --- + + - name: "match arm contains if-without-else producing void" + mapping: | + output.v = "default" + output.v = match "a" { + "a" => if false { "override" }, + _ => "fallback", + } + output: {"v": "default"} + + # --- Nested match --- + + - name: "nested match expressions" + mapping: | + output.v = match input.type { + "user" => match input.role { + "admin" => "full access", + "viewer" => "read only", + _ => "limited", + }, + _ => "unknown type", + } + cases: + - name: "user admin gets full access" + input: {"type": "user", "role": "admin"} + output: {"v": "full access"} + - name: "non-user type falls through" + input: {"type": "device", "role": "admin"} + output: {"v": "unknown type"} + + # --- Match in statement context --- + + - name: "match statement modifies output conditionally" + mapping: | + output.status = "unknown" + match input.code { + 200 => { output.status = "ok" }, + 404 => { output.status = "not found" }, + _ => { output.status = "error" }, + } + input: {"code": 404} + output: {"status": "not found"} + + - name: 
"match statement with no matching arm is no-op" + mapping: | + output.status = "default" + match "nope" { + "a" => { output.status = "a" }, + "b" => { output.status = "b" }, + } + output: {"status": "default"} diff --git a/internal/bloblang2/spec/tests/control_flow/match_equality.yaml b/internal/bloblang2/spec/tests/control_flow/match_equality.yaml new file mode 100644 index 000000000..af37a4bbd --- /dev/null +++ b/internal/bloblang2/spec/tests/control_flow/match_equality.yaml @@ -0,0 +1,218 @@ +description: "Match equality form — expression evaluated once, == comparison, first match wins, boolean case error" + +tests: + # --- Basic equality match --- + + - name: "match equality on animal sound" + mapping: | + output.sound = match input.animal { + "cat" => "meow", + "dog" => "woof", + _ => "unknown", + } + cases: + - name: "selects first case" + input: {"animal": "cat"} + output: {"sound": "meow"} + - name: "selects second case" + input: {"animal": "dog"} + output: {"sound": "woof"} + - name: "falls to wildcard" + input: {"animal": "bird"} + output: {"sound": "unknown"} + + # --- First match wins --- + + - name: "first matching case wins when multiple could match" + mapping: | + $x = "hello" + output.result = match $x { + "hello" => "first", + "hello" => "second", + _ => "default", + } + output: {"result": "first"} + + # --- Numeric equality match --- + + - name: "match on integer value" + input: {"code": 200} + mapping: | + output.status = match input.code { + 200 => "ok", + 404 => "not found", + 500 => "error", + _ => "other", + } + output: {"status": "ok"} + + # --- Null matching --- + + - name: "match null case" + input: {"val": null} + mapping: | + output.result = match input.val { + null => "is null", + _ => "not null", + } + output: {"result": "is null"} + + # --- Case expressions are evaluated (not just literals) --- + + - name: "case expression uses variable" + mapping: | + $target = "hello" + output.result = match "hello" { + $target => "matched 
variable", + _ => "no match", + } + output: {"result": "matched variable"} + + - name: "case expression uses concatenation" + mapping: | + output.result = match "foobar" { + "foo" + "bar" => "matched concat", + _ => "no match", + } + output: {"result": "matched concat"} + + # --- Subsequent cases not evaluated after match --- + + - name: "cases after match are not evaluated" + mapping: | + output.result = match "a" { + "a" => "found", + throw("should not evaluate") => "never", + _ => "default", + } + output: {"result": "found"} + + # --- Boolean case values are errors --- + + - name: "boolean literal true as case is compile error" + mapping: | + output.result = match "hello" { + true => "yes", + _ => "no", + } + compile_error: "boolean" + + - name: "boolean literal false as case is compile error" + mapping: | + output.result = match 42 { + false => "no", + _ => "yes", + } + compile_error: "boolean" + + - name: "dynamic case evaluating to boolean is runtime error" + input: {"threshold": 100, "score": 150} + mapping: | + output.result = match input.score { + input.score >= input.threshold => "high", + _ => "low", + } + error: "boolean" + + - name: "dynamic boolean case error is catchable" + input: {"threshold": 100, "score": 150} + mapping: | + output.result = (match input.score { + input.score >= input.threshold => "high", + _ => "low", + }).catch(err -> "caught: " + err.what) + output: {"result": "caught: boolean case value in equality match (use 'as' for boolean conditions)"} + + # --- Expression evaluated once --- + + - name: "matched expression evaluated once (side effects)" + mapping: | + $counter = 0 + $counter = $counter + 1 + output.result = match $counter { + 1 => "one", + 2 => "two", + _ => "other", + } + output: {"result": "one"} + + # --- Match with braced case bodies --- + + - name: "match case with braced body containing variables" + input: {"currency": "USD", "amount": 42} + mapping: | + output.formatted = match input.currency { + "USD" => { + $symbol 
= "$" + $symbol + input.amount.string() + }, + "EUR" => { + input.amount.string() + " EUR" + }, + _ => input.currency + " " + input.amount.string(), + } + output: {"formatted": "$42"} + + # --- Wildcard is only catch-all (no else keyword) --- + + - name: "wildcard catches all unmatched values" + mapping: | + output.result = match 999 { + 1 => "one", + 2 => "two", + _ => "catch all", + } + output: {"result": "catch all"} + + # --- Non-exhaustive match produces void --- + + - name: "non-exhaustive match produces void — assignment skipped" + mapping: | + output.sound = "default" + output.sound = match "bird" { + "cat" => "meow", + "dog" => "woof", + } + output: {"sound": "default"} + + # --- Boolean-case runtime check is lazy (Section 4.2) --- + # Dynamic boolean cases in an equality match error only if actually + # evaluated. A preceding case that matches short-circuits and subsequent + # (potentially boolean) cases are never checked. + + - name: "boolean case not evaluated when prior case matches (no error)" + mapping: | + $b = true + output.v = match "x" { + "x" => "hit", + $b => "boom", + } + output: {"v": "hit"} + + - name: "boolean case evaluated when prior cases miss is a runtime error" + mapping: | + $b = true + output.v = match "y" { + "x" => "hit", + $b => "boom", + } + error: "boolean" + + - name: "boolean case order matters — earlier non-boolean hit saves the match" + mapping: | + $b = false + output.v = match 5 { + 5 => "matched five", + $b => "never reached", + } + output: {"v": "matched five"} + + - name: "wildcard after boolean case short-circuits before the boolean is reached" + mapping: | + $b = true + output.v = match "anything" { + "anything" => "direct hit", + $b => "never reached", + _ => "wildcard", + } + output: {"v": "direct hit"} diff --git a/internal/bloblang2/spec/tests/control_flow/match_void.yaml b/internal/bloblang2/spec/tests/control_flow/match_void.yaml new file mode 100644 index 000000000..9376c4a66 --- /dev/null +++ 
b/internal/bloblang2/spec/tests/control_flow/match_void.yaml @@ -0,0 +1,139 @@ +description: "Non-exhaustive match producing void — void in assignment, declaration, collection, and rescue with .or()" + +tests: + # --- Void in output assignment: skipped --- + + - name: "non-exhaustive equality match produces void — field absent" + mapping: | + output.sound = match "bird" { + "cat" => "meow", + "dog" => "woof", + } + output: {} + + - name: "void from equality match preserves prior value" + mapping: | + output.sound = "chirp" + output.sound = match "bird" { + "cat" => "meow", + "dog" => "woof", + } + output: {"sound": "chirp"} + + - name: "non-exhaustive boolean match produces void — field absent" + mapping: | + output.x = match { + false => "nope", + false => "also nope", + } + output: {} + + - name: "non-exhaustive match-as produces void — field absent" + mapping: | + output.x = match 5 as v { + v > 100 => "big", + } + output: {} + + # --- Void in variable declaration: runtime error --- + + - name: "void from equality match in variable declaration is error" + mapping: | + $x = match "nope" { + "a" => 1, + "b" => 2, + } + error: "void" + + - name: "void from boolean match in variable declaration is error" + mapping: | + $x = match { + false => 1, + } + error: "void" + + - name: "void from match-as in variable declaration is error" + mapping: | + $x = match 0 as v { + v > 10 => "big", + } + error: "void" + + # --- Void in variable reassignment: skipped --- + + - name: "void from match skips variable reassignment" + mapping: | + $x = "original" + $x = match "nope" { + "a" => "found", + } + output.result = $x + output: {"result": "original"} + + # --- Void in collection literal: error --- + + - name: "void from match in array literal is error" + mapping: | + output.arr = [1, match "x" { "y" => 2 }, 3] + error: "void" + + - name: "void from match in object literal is error" + mapping: | + output.obj = {"key": match "x" { "y" => "val" }} + error: "void" + + # --- Void as 
function/map argument: error --- + + - name: "void from match as map argument is error" + mapping: | + map double(val) { val * 2 } + output.result = double(match "x" { "y" => 42 }) + error: "void" + + # --- Void rescued with .or() --- + + - name: "or rescues void from non-exhaustive equality match" + mapping: | + output.result = (match "bird" { "cat" => "meow" }).or("unknown") + output: {"result": "unknown"} + + - name: "or rescues void from non-exhaustive boolean match" + mapping: | + output.result = (match { false => "nope" }).or("default") + output: {"result": "default"} + + - name: "or does not trigger when match produces a value" + mapping: | + output.result = (match "cat" { "cat" => "meow", }).or("unknown") + output: {"result": "meow"} + + - name: "or rescues void for variable declaration" + mapping: | + $x = (match "bird" { "cat" => "meow" }).or("unknown") + output.result = $x + output: {"result": "unknown"} + + # --- Wildcard prevents void --- + + - name: "wildcard ensures match is exhaustive" + mapping: | + output.result = match "anything" { + "a" => 1, + _ => 0, + } + output: {"result": 0} + + # --- Void in expression context: error --- + + - name: "void from match in addition is error" + mapping: | + output.result = (match "x" { "y" => 1 }) + 10 + error: "void" + + # --- .catch() does not rescue void --- + + - name: "catch does not trigger on void from match" + mapping: | + output.x = "prior" + output.x = (match "x" { "y" => 1 }).catch(err -> 0) + output: {"x": "prior"} diff --git a/internal/bloblang2/spec/tests/edge_cases/deeply_nested.yaml b/internal/bloblang2/spec/tests/edge_cases/deeply_nested.yaml new file mode 100644 index 000000000..600f2684b --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/deeply_nested.yaml @@ -0,0 +1,108 @@ +description: "Edge cases: deeply nested objects, arrays, and expression chains" + +tests: + # --- Deep object nesting --- + + - name: "access deeply nested object field" + mapping: | + output = input.a.b.c.d.e + 
input: {"a": {"b": {"c": {"d": {"e": "deep"}}}}} + output: "deep" + + - name: "assign deeply nested output path" + mapping: | + output.a.b.c.d.e = "deep" + output: {"a": {"b": {"c": {"d": {"e": "deep"}}}}} + + - name: "deeply nested object with mixed types" + mapping: | + output.a.b.c = {"x": [1, 2, {"y": true}]} + output: {"a": {"b": {"c": {"x": [1, 2, {"y": true}]}}}} + + - name: "multiple deeply nested assignments" + mapping: | + output.a.b.c = 1 + output.a.b.d = 2 + output.a.e = 3 + output: {"a": {"b": {"c": 1, "d": 2}, "e": 3}} + + # --- Deep array nesting --- + + - name: "access deeply nested array element" + mapping: | + output = input.arr[0][0][0] + input: {"arr": [[[42]]]} + output: 42 + + - name: "deeply nested array literal" + mapping: | + output = [[[[1, 2], [3, 4]], [[5, 6]]]] + output: [[[[1, 2], [3, 4]], [[5, 6]]]] + + - name: "mixed deep nesting — object in array in object" + mapping: | + output = input.data[0].items[1].value + input: {"data": [{"items": [{"value": "a"}, {"value": "b"}]}]} + output: "b" + + # --- Deep expression chains --- + + - name: "long method chain on string" + mapping: | + output = " Hello World ".trim().lowercase().replace_all("hello", "hi").uppercase() + output: "HI WORLD" + + - name: "long method chain on array" + mapping: | + output = [3, 1, 4, 1, 5, 9, 2, 6].unique().sort().reverse() + output: [9, 6, 5, 4, 3, 2, 1] + + - name: "chained map calls" + mapping: | + map inc(x) { x + 1 } + map double(x) { x * 2 } + output = double(inc(double(inc(1)))) + output: 10 + + - name: "nested ternary-style if expressions" + mapping: | + $x = 5 + output = if $x > 10 { "big" } else { if $x > 3 { "medium" } else { "small" } } + output: "medium" + + - name: "nested match expressions" + mapping: | + $x = "b" + output = match $x { + "a" => match 1 { 1 => "a1", _ => "a?" }, + "b" => match 2 { 1 => "b1", 2 => "b2", _ => "b?" 
}, + _ => "other", + } + output: "b2" + + # --- Deep object construction --- + + - name: "object literal with nested objects and arrays" + mapping: | + output = { + "users": [ + {"name": "Alice", "tags": ["admin", "user"]}, + {"name": "Bob", "tags": ["user"]}, + ], + "meta": {"count": 2, "active": true}, + } + output: {"users": [{"name": "Alice", "tags": ["admin", "user"]}, {"name": "Bob", "tags": ["user"]}], "meta": {"count": 2, "active": true}} + + # --- Deep null-safe access --- + + - name: "null-safe chain through missing nested fields" + mapping: | + output = input.a?.b?.c?.d + input: {"a": null} + output: null + + - name: "null-safe chain where middle is present" + mapping: | + output = input.a?.b?.c + input: {"a": {"b": {"c": 42}}} + output: 42 diff --git a/internal/bloblang2/spec/tests/edge_cases/empty_collections.yaml b/internal/bloblang2/spec/tests/edge_cases/empty_collections.yaml new file mode 100644 index 000000000..f6f5aefbf --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/empty_collections.yaml @@ -0,0 +1,108 @@ +description: "Edge cases: empty array and empty object with all applicable methods" + +tests: + # --- Empty array methods --- + + - name: "empty array sort returns empty array" + mapping: | + output = [].sort() + output: [] + + - name: "empty array sum returns 0" + mapping: | + output = [].sum() + output: 0 + + - name: "empty array min is error" + mapping: | + output = [].min() + error: "empty" + + - name: "empty array max is error" + mapping: | + output = [].max() + error: "empty" + + - name: "empty array fold returns initial value" + mapping: | + output = [].fold(42, (tally, x) -> tally + x) + output: 42 + + - name: "empty array fold returns initial string" + mapping: | + output = [].fold("start", (tally, x) -> tally + x) + output: "start" + + - name: "empty array length is 0" + mapping: | + output = [].length() + output: 0 + + - name: "empty array contains returns false" + mapping: | + output = [].contains(1) + output: false + 
+ - name: "empty array unique returns empty array" + mapping: | + output = [].unique() + output: [] + + - name: "empty array filter returns empty array" + mapping: | + output = [].filter(x -> x > 0) + output: [] + + - name: "empty array map returns empty array" + mapping: | + output = [].map(x -> x * 2) + output: [] + + - name: "empty array reverse returns empty array" + mapping: | + output = [].reverse() + output: [] + + # --- Empty object methods --- + + - name: "empty object keys returns empty array" + mapping: | + output = {}.keys() + output: [] + + - name: "empty object values returns empty array" + mapping: | + output = {}.values() + output: [] + + - name: "empty object length is 0" + mapping: | + output = {}.length() + output: 0 + + - name: "empty object type is object" + mapping: | + output = {}.type() + output: "object" + + - name: "empty array type is array" + mapping: | + output = [].type() + output: "array" + + # --- Single element edge cases --- + + - name: "single element array sort returns same" + mapping: | + output = [42].sort() + output: [42] + + - name: "single element array min returns element" + mapping: | + output = [42].min() + output: 42 + + - name: "single element array max returns element" + mapping: | + output = [42].max() + output: 42 diff --git a/internal/bloblang2/spec/tests/edge_cases/infinity.yaml b/internal/bloblang2/spec/tests/edge_cases/infinity.yaml new file mode 100644 index 000000000..4122038e4 --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/infinity.yaml @@ -0,0 +1,126 @@ +description: "Edge cases: Infinity comparisons, arithmetic, equality, bool conversion" + +tests: + # --- Infinity comparisons --- + + - name: "Infinity > any finite number" + mapping: | + output = input.inf > 999999999999.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + + - name: "Infinity >= any finite number" + mapping: | + output = input.inf >= 0.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: 
true + + - name: "-Infinity < any finite number" + mapping: | + output = input.ninf < -999999999999.0 + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + + - name: "-Infinity <= any finite number" + mapping: | + output = input.ninf <= 0.0 + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + + - name: "finite < Infinity" + mapping: | + output = 1000000.0 < input.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + + - name: "finite > -Infinity" + mapping: | + output = -1000000.0 > input.ninf + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + + # --- Infinity equality --- + + - name: "Infinity == Infinity is true" + mapping: | + output = input.inf == input.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + + - name: "-Infinity == -Infinity is true" + mapping: | + output = input.ninf == input.ninf + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + + - name: "Infinity != -Infinity is true" + mapping: | + output = input.inf != input.ninf + input: {"inf": {_type: "float64", value: "Infinity"}, "ninf": {_type: "float64", value: "-Infinity"}} + output: true + + - name: "Infinity > -Infinity" + mapping: | + output = input.inf > input.ninf + input: {"inf": {_type: "float64", value: "Infinity"}, "ninf": {_type: "float64", value: "-Infinity"}} + output: true + + # --- Infinity arithmetic --- + + - name: "Infinity + Infinity = Infinity" + mapping: | + output = input.inf + input.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "Infinity"} + + - name: "Infinity - Infinity = NaN" + mapping: | + output = input.inf - input.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "NaN"} + + - name: "Infinity * 2 = Infinity" + mapping: | + output = input.inf * 2.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: 
"Infinity"} + + - name: "Infinity * -1 = -Infinity" + mapping: | + output = input.inf * -1.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "-Infinity"} + + - name: "-Infinity + -Infinity = -Infinity" + mapping: | + output = input.ninf + input.ninf + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: {_type: "float64", value: "-Infinity"} + + - name: "Infinity * 0.0 = NaN" + mapping: | + output = input.inf * 0.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "NaN"} + + # --- Infinity bool conversion --- + + - name: "Infinity.bool() is true" + mapping: | + output = input.inf.bool() + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + + - name: "-Infinity.bool() is true" + mapping: | + output = input.ninf.bool() + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + + # --- Infinity type --- + + - name: "Infinity type is float64" + mapping: | + output = input.inf.type() + input: {"inf": {_type: "float64", value: "Infinity"}} + output: "float64" diff --git a/internal/bloblang2/spec/tests/edge_cases/integer_overflow.yaml b/internal/bloblang2/spec/tests/edge_cases/integer_overflow.yaml new file mode 100644 index 000000000..677a4bf87 --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/integer_overflow.yaml @@ -0,0 +1,114 @@ +description: "Edge cases: integer overflow for int32, int64, uint32, uint64 across add, sub, mul" + +tests: + # --- int64 overflow --- + + - name: "int64 max + 1 overflows" + mapping: | + output = 9223372036854775807 + 1 + error: "overflow" + + - name: "int64 min literal is compile error" + mapping: | + output = -9223372036854775808 - 1 + compile_error: "exceeds" + + - name: "int64 min - 1 overflows" + mapping: | + output = (-9223372036854775807 - 1) - 1 + error: "overflow" + + - name: "int64 max * 2 overflows" + mapping: | + output = 9223372036854775807 * 2 + error: "overflow" + + - name: "int64 
large positive multiplication overflows" + mapping: | + output = 4611686018427387904 * 3 + error: "overflow" + + - name: "int64 max value is representable" + mapping: | + output = 9223372036854775807 + output: 9223372036854775807 + + - name: "int64 min value via arithmetic is representable" + mapping: | + output = -9223372036854775807 - 1 + output: -9223372036854775808 + + # --- int32 overflow via conversion --- + + - name: "int32 max + 1 via conversion overflows" + mapping: | + output = 2147483648.int32() + error: "overflow" + + - name: "int32 min - 1 via conversion overflows" + mapping: | + output = (-2147483649).int32() + error: "overflow" + + - name: "int32 max is representable" + mapping: | + output = 2147483647.int32() + output: {_type: "int32", value: "2147483647"} + + - name: "int32 min is representable" + mapping: | + output = (-2147483648).int32() + output: {_type: "int32", value: "-2147483648"} + + # --- uint32 overflow via conversion --- + + - name: "uint32 max + 1 via conversion overflows" + mapping: | + output = 4294967296.uint32() + error: "overflow" + + - name: "uint32 negative via conversion overflows" + mapping: | + output = (-1).uint32() + error: "overflow" + + - name: "uint32 max is representable" + mapping: | + output = 4294967295.uint32() + output: {_type: "uint32", value: "4294967295"} + + # --- uint64 overflow --- + + - name: "uint64 max + 1 from string overflows" + mapping: | + output = "18446744073709551616".uint64() + error: "overflow" + + - name: "uint64 negative is error" + mapping: | + output = (-1).uint64() + error: "overflow" + + - name: "uint64 max from string is representable" + mapping: | + output = "18446744073709551615".uint64() + output: {_type: "uint64", value: "18446744073709551615"} + + - name: "uint64 max as bare literal is compile error" + mapping: | + output = 18446744073709551615.uint64() + compile_error: "exceeds" + + # --- Overflow at boundary --- + + - name: "int64 max minus 1 plus 2 overflows" + mapping: | + $v = 
9223372036854775806 + output = $v + 2 + error: "overflow" + + - name: "int64 min plus 1 minus 2 overflows" + mapping: | + $v = -9223372036854775807 + output = $v - 2 + error: "overflow" diff --git a/internal/bloblang2/spec/tests/edge_cases/integer_overflow_ops.yaml b/internal/bloblang2/spec/tests/edge_cases/integer_overflow_ops.yaml new file mode 100644 index 000000000..1c67ee904 --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/integer_overflow_ops.yaml @@ -0,0 +1,117 @@ +description: > + Integer overflow across all operations and integer types — addition, + subtraction, multiplication for int32, int64, uint32, uint64. Also + tests that overflow is always a runtime error, never wraps. + +tests: + # --- int64 overflow: subtraction --- + + - name: "int64 min minus 1 via subtraction overflows" + mapping: | + $min = -9223372036854775807 - 1 + output = $min - 1 + error: "overflow" + + - name: "int64 large negative subtraction overflows" + mapping: | + output = (-9223372036854775807) - 9223372036854775807 + error: "overflow" + + # --- int32 overflow: arithmetic --- + + - name: "int32 max + 1 overflows" + mapping: | + output = 2147483647.int32() + 1.int32() + error: "overflow" + + - name: "int32 min - 1 overflows" + mapping: | + output = (-2147483648).int32() - 1.int32() + error: "overflow" + + - name: "int32 max * 2 overflows" + mapping: | + output = 2147483647.int32() * 2.int32() + error: "overflow" + + - name: "int32 large negative multiplication overflows" + mapping: | + output = (-2147483648).int32() * (-1).int32() + error: "overflow" + + # --- uint32 overflow --- + + - name: "uint32 max + 1 overflows" + mapping: | + output = 4294967295.uint32() + 1.uint32() + error: "overflow" + + - name: "uint32 zero minus 1 overflows" + mapping: | + output = 0.uint32() - 1.uint32() + error: "overflow" + + - name: "uint32 max * 2 overflows" + mapping: | + output = 4294967295.uint32() * 2.uint32() + error: "overflow" + + # --- uint64 overflow --- + + - name: "uint64 max 
+ 1 overflows" + mapping: | + output = "18446744073709551615".uint64() + 1.uint64() + error: "overflow" + + - name: "uint64 zero minus 1 overflows" + mapping: | + output = 0.uint64() - 1.uint64() + error: "overflow" + + - name: "uint64 max * 2 overflows" + mapping: | + output = "18446744073709551615".uint64() * 2.uint64() + error: "overflow" + + # --- int64 multiplication near boundaries --- + + - name: "int64 max / 2 * 3 overflows" + mapping: | + $half = 4611686018427387903 + output = $half * 3 + error: "overflow" + + - name: "int64 negative overflow from multiplication" + mapping: | + output = 9223372036854775807 * (-2) + error: "overflow" + + # --- Non-overflow boundary values work --- + + - name: "int64 max - 1 + 1 is exactly max" + mapping: | + output = 9223372036854775806 + 1 + output: 9223372036854775807 + + - name: "int32 max representable after subtraction" + mapping: | + output = (2147483647.int32() - 1.int32()) + 1.int32() + output: {_type: "int32", value: "2147483647"} + + - name: "uint64 large value arithmetic within range" + mapping: | + output = "18446744073709551614".uint64() + 1.uint64() + output: {_type: "uint64", value: "18446744073709551615"} + + - name: "uint32 max minus 1 plus 1 is exactly max" + mapping: | + output = (4294967295.uint32() - 1.uint32()) + 1.uint32() + output: {_type: "uint32", value: "4294967295"} + + # --- Overflow in modulo (abs(min) overflows for signed) --- + + - name: "int64 min modulo -1 overflows" + mapping: | + $min = -9223372036854775807 - 1 + output = $min % (-1) + error: "overflow" diff --git a/internal/bloblang2/spec/tests/edge_cases/interpreter_reuse.yaml b/internal/bloblang2/spec/tests/edge_cases/interpreter_reuse.yaml new file mode 100644 index 000000000..fb7e5d5ed --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/interpreter_reuse.yaml @@ -0,0 +1,39 @@ +description: > + Interpreter reuse correctness — the same compiled mapping must produce + independent results when executed multiple times with 
different inputs. + Verifies no state leakage between executions. + +tests: + - name: "second execution sees fresh input" + input: {"x": 2} + mapping: | + output.doubled = input.x * 2 + output: {"doubled": 4} + + - name: "variables do not persist between executions" + mapping: | + $x = 42 + output.v = $x + output: {"v": 42} + + - name: "output starts empty on each execution" + mapping: | + output.fresh = true + output: {"fresh": true} + + - name: "map local variables are independent per execution" + mapping: | + map counter(n) { + $local = n * 10 + $local + 1 + } + output.v = counter(3) + output: {"v": 31} + + - name: "recursive map frames do not leak between executions" + mapping: | + map factorial(n) { + if n <= 1 { 1 } else { n * factorial(n - 1) } + } + output.v = factorial(5) + output: {"v": 120} diff --git a/internal/bloblang2/spec/tests/edge_cases/nan_behavior.yaml b/internal/bloblang2/spec/tests/edge_cases/nan_behavior.yaml new file mode 100644 index 000000000..607f6a321 --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/nan_behavior.yaml @@ -0,0 +1,123 @@ +description: "Edge cases: NaN equality, comparison, arithmetic, sort ordering, unique dedup, bool error" + +tests: + # --- NaN equality --- + + - name: "NaN == NaN is false" + mapping: | + output = input.n == input.n + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + - name: "NaN != NaN is true" + mapping: | + output = input.n != input.n + input: {"n": {_type: "float64", value: "NaN"}} + output: true + + - name: "NaN == 0.0 is false" + mapping: | + output = input.n == 0.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + - name: "NaN != 0.0 is true" + mapping: | + output = input.n != 0.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: true + + # --- NaN comparison --- + + - name: "NaN < 1.0 is false" + mapping: | + output = input.n < 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + - name: "NaN > 1.0 is false" + mapping: | 
+ output = input.n > 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + - name: "NaN <= 1.0 is false" + mapping: | + output = input.n <= 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + - name: "NaN >= 1.0 is false" + mapping: | + output = input.n >= 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + - name: "1.0 < NaN is false" + mapping: | + output = 1.0 < input.n + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + - name: "1.0 > NaN is false" + mapping: | + output = 1.0 > input.n + input: {"n": {_type: "float64", value: "NaN"}} + output: false + + # --- NaN arithmetic --- + + - name: "NaN + 1.0 is NaN" + mapping: | + output = input.n + 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: {_type: "float64", value: "NaN"} + + - name: "NaN * 0.0 is NaN" + mapping: | + output = input.n * 0.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: {_type: "float64", value: "NaN"} + + - name: "NaN - NaN is NaN" + mapping: | + output = input.n - input.n + input: {"n": {_type: "float64", value: "NaN"}} + output: {_type: "float64", value: "NaN"} + + # --- NaN in sort (total ordering: NaN after all values) --- + + - name: "sort ordering with NaN values" + mapping: | + output = input.arr.sort() + cases: + - name: "NaN after all finite values" + input: {"arr": [3.0, {_type: "float64", value: "NaN"}, 1.0, 2.0]} + output: [1.0, 2.0, 3.0, {_type: "float64", value: "NaN"}] + - name: "multiple NaN values kept at end" + input: {"arr": [{_type: "float64", value: "NaN"}, 2.0, {_type: "float64", value: "NaN"}, 1.0]} + output: [1.0, 2.0, {_type: "float64", value: "NaN"}, {_type: "float64", value: "NaN"}] + + # --- NaN in unique (NaN treated as equal) --- + + - name: "unique treats multiple NaN as equal — keeps first" + mapping: | + output = input.arr.unique() + input: {"arr": [{_type: "float64", value: "NaN"}, 1.0, {_type: "float64", value: "NaN"}, 2.0]} + output: [{_type: "float64", 
value: "NaN"}, 1.0, 2.0] + + # --- NaN bool conversion --- + + - name: "NaN.bool() is error" + mapping: | + output = input.n.bool() + input: {"n": {_type: "float64", value: "NaN"}} + error: "NaN" + + # --- NaN type --- + + - name: "NaN type is float64" + mapping: | + output = input.n.type() + input: {"n": {_type: "float64", value: "NaN"}} + output: "float64" diff --git a/internal/bloblang2/spec/tests/edge_cases/precision_loss.yaml b/internal/bloblang2/spec/tests/edge_cases/precision_loss.yaml new file mode 100644 index 000000000..6d9c6d2f3 --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/precision_loss.yaml @@ -0,0 +1,80 @@ +description: "Edge cases: precision loss when promoting large int64 to float64, explicit conversion unchecked" + +tests: + # --- int64 > 2^53 + float arithmetic errors --- + + - name: "int64 just above 2^53 plus float is error" + mapping: | + output = 9007199254740993 + 1.0 + error: "exact" + + - name: "int64 at 2^53 plus float is ok" + mapping: | + output = 9007199254740992 + 1.0 + output: 9007199254740993.0 + + - name: "int64 well above 2^53 plus float is error" + mapping: | + output = 9223372036854775807 + 0.0 + error: "exact" + + - name: "large int64 minus float is error" + mapping: | + output = 9007199254740993 - 1.0 + error: "exact" + + - name: "large int64 times float is error" + mapping: | + output = 9007199254740993 * 2.0 + error: "exact" + + - name: "negative large int64 plus float is error" + mapping: | + output = -9007199254740993 + 1.0 + error: "exact" + + # --- Small int64 with float is fine --- + + - name: "small int64 plus float is ok" + mapping: | + output = 42 + 1.5 + output: 43.5 + + - name: "int64 at 2^53 minus 1 plus float is ok" + mapping: | + output = 9007199254740991 + 1.0 + output: 9007199254740992.0 + + # --- Explicit conversion (.float64()) is unchecked --- + + - name: "explicit float64 conversion of large int64 is unchecked" + mapping: | + output = 9007199254740993.float64() + output: 9007199254740992.0 + 
+ - name: "explicit float64 conversion of int64 max is unchecked" + mapping: | + output = 9223372036854775807.float64().type() + output: "float64" + + # --- uint64 > int64 max + int is error --- + + - name: "uint64 above int64 max plus int is error" + mapping: | + $big = "18446744073709551615".uint64() + output = $big + 1 + error: "uint64 value exceeds int64 range" + + # --- Boundary: exactly 2^53 --- + + - name: "int64 exactly 2^53 float64 roundtrip is exact" + mapping: | + $v = 9007199254740992 + output = $v + 0.5 + output: 9007199254740992.5 + + - name: "int64 exactly 2^53 plus 1 float operation errors" + mapping: | + $v = 9007199254740993 + output = $v + 0.5 + error: "exact" diff --git a/internal/bloblang2/spec/tests/edge_cases/string_codepoints.yaml b/internal/bloblang2/spec/tests/edge_cases/string_codepoints.yaml new file mode 100644 index 000000000..5bd9739f4 --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/string_codepoints.yaml @@ -0,0 +1,139 @@ +description: > + String codepoint operations — indexing returns codepoints (int64), + .char() converts back, .length() counts codepoints, .split("") splits + by codepoint, and .reverse() reverses by codepoint. 
+ +tests: + # --- String indexing returns codepoint values --- + + - name: "ASCII character index returns codepoint" + mapping: | + output.v = "hello"[0] + output: {"v": 104} + + - name: "second ASCII character" + mapping: | + output.v = "hello"[1] + output: {"v": 101} + + - name: "space codepoint" + mapping: | + output.v = "a b"[1] + output: {"v": 32} + + - name: "digit character codepoint" + mapping: | + output.v = "0"[0] + output: {"v": 48} + + # --- .char() round-trip --- + + - name: "char converts codepoint back to string" + mapping: | + output.v = 104.char() + output: {"v": "h"} + + - name: "char round-trip from indexing" + mapping: | + output.v = "hello"[0].char() + output: {"v": "h"} + + - name: "char with emoji codepoint" + mapping: | + output.v = 128075.char() + output: {"v": "\U0001F44B"} + + # --- .length() counts codepoints --- + + - name: "ASCII string length" + mapping: | + output.v = "hello".length() + output: {"v": 5} + + - name: "empty string length" + mapping: | + output.v = "".length() + output: {"v": 0} + + - name: "unicode string length counts codepoints" + mapping: | + output.v = "café".length() + output: {"v": 4} + + # --- .split("") splits by codepoint --- + + - name: "split empty delimiter splits by codepoint" + mapping: | + output.v = "abc".split("") + output: {"v": ["a", "b", "c"]} + + - name: "split empty delimiter on empty string" + mapping: | + output.v = "".split("") + output: {"v": []} + + - name: "split on normal delimiter" + mapping: | + output.v = "a,b,c".split(",") + output: {"v": ["a", "b", "c"]} + + - name: "split on multi-char delimiter" + mapping: | + output.v = "a::b::c".split("::") + output: {"v": ["a", "b", "c"]} + + # --- .reverse() reverses by codepoint --- + + - name: "reverse ASCII string" + mapping: | + output.v = "hello".reverse() + output: {"v": "olleh"} + + - name: "reverse empty string" + mapping: | + output.v = "".reverse() + output: {"v": ""} + + - name: "reverse single character" + mapping: | + output.v = 
"a".reverse() + output: {"v": "a"} + + # --- Negative indexing --- + + - name: "negative index -1 is last codepoint" + mapping: | + output.v = "hello"[-1] + output: {"v": 111} + + - name: "negative index -1 char round-trip" + mapping: | + output.v = "hello"[-1].char() + output: {"v": "o"} + + # --- Slice on strings --- + + - name: "string slice basic" + mapping: | + output.v = "hello world".slice(0, 5) + output: {"v": "hello"} + + - name: "string slice from middle" + mapping: | + output.v = "hello world".slice(6, 11) + output: {"v": "world"} + + - name: "string slice clamped to length" + mapping: | + output.v = "hi".slice(0, 100) + output: {"v": "hi"} + + - name: "string slice with negative start" + mapping: | + output.v = "hello".slice(-3, 5) + output: {"v": "llo"} + + - name: "string slice empty range" + mapping: | + output.v = "hello".slice(3, 3) + output: {"v": ""} diff --git a/internal/bloblang2/spec/tests/edge_cases/unicode.yaml b/internal/bloblang2/spec/tests/edge_cases/unicode.yaml new file mode 100644 index 000000000..37afe234d --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/unicode.yaml @@ -0,0 +1,103 @@ +description: "Edge cases: multi-codepoint emoji, combining characters, no normalization" + +tests: + # --- Multi-codepoint emoji --- + + - name: "skin-tone emoji has length 2" + mapping: | + output = "\u{1F44B}\u{1F3FD}".length() + output: 2 + + - name: "simple emoji has length 1" + mapping: | + output = "\u{1F600}".length() + output: 1 + + - name: "flag emoji has length 2 (two regional indicators)" + mapping: | + output = "\u{1F1FA}\u{1F1F8}".length() + output: 2 + + - name: "family emoji (ZWJ sequence) has multiple codepoints" + mapping: | + output = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}".length() + output: 5 + + # --- Codepoint indexing on emoji --- + + - name: "index first codepoint of multi-codepoint emoji" + mapping: | + output = "\u{1F44B}\u{1F3FD}"[0] + output: 128075 + + - name: "index second codepoint of multi-codepoint 
emoji" + mapping: | + output = "\u{1F44B}\u{1F3FD}"[1] + output: 127997 + + # --- Combining characters --- + + - name: "precomposed e-acute has length 1" + mapping: | + output = "\u00E9".length() + output: 1 + + - name: "decomposed e plus combining acute has length 2" + mapping: | + output = "e\u0301".length() + output: 2 + + - name: "precomposed and decomposed are not equal" + mapping: | + output = "\u00E9" == "e\u0301" + output: false + + - name: "precomposed and decomposed have different lengths" + mapping: | + $a = "\u00E9".length() + $b = "e\u0301".length() + output = $a == $b + output: false + + # --- String comparison is codepoint-by-codepoint --- + + - name: "string comparison is codepoint-based" + mapping: | + output = "a\u0301" < "\u00E9" + output: true + + # --- Unicode in object keys --- + + - name: "unicode string as object key" + mapping: | + output = {"\u00E9": "accent"} + output: {"\u00E9": "accent"} + + - name: "emoji as object key" + mapping: | + output = {"\u{1F600}": "smile"} + output: {"😀": "smile"} + + # --- Unicode string methods --- + + - name: "contains with unicode" + mapping: | + output = "caf\u00E9".contains("\u00E9") + output: true + + - name: "uppercase with unicode" + mapping: | + output = "caf\u00E9".uppercase() + output: "CAF\u00C9" + + # --- Mixed ASCII and multi-byte --- + + - name: "length of mixed ASCII and multi-byte string" + mapping: | + output = "a\u{1F600}b".length() + output: 3 + + - name: "index past emoji in mixed string" + mapping: | + output = "a\u{1F600}b"[2] + output: 98 diff --git a/internal/bloblang2/spec/tests/edge_cases/whitespace_newlines.yaml b/internal/bloblang2/spec/tests/edge_cases/whitespace_newlines.yaml new file mode 100644 index 000000000..d9841f117 --- /dev/null +++ b/internal/bloblang2/spec/tests/edge_cases/whitespace_newlines.yaml @@ -0,0 +1,138 @@ +description: "Edge cases: whitespace handling, raw string newlines, multi-line mappings" + +tests: + # --- Whitespace in expressions --- + + - name: 
"extra spaces around operators" + mapping: | + output = 1 + 2 + output: 3 + + - name: "no spaces around operators" + mapping: | + output = 1+2 + output: 3 + + - name: "tabs in expressions" + mapping: "output\t=\t1\t+\t2" + output: 3 + + - name: "spaces in array literal" + mapping: | + output = [ 1 , 2 , 3 ] + output: [1, 2, 3] + + - name: "spaces in object literal" + mapping: | + output = { "a" : 1 , "b" : 2 } + output: {"a": 1, "b": 2} + + # --- Multi-line mappings --- + + - name: "multi-line array literal" + mapping: | + output = [ + 1, + 2, + 3, + ] + output: [1, 2, 3] + + - name: "multi-line object literal" + mapping: | + output = { + "a": 1, + "b": 2, + "c": 3, + } + output: {"a": 1, "b": 2, "c": 3} + + - name: "multi-line method chain" + mapping: | + output = [3, 1, 2] + .sort() + .reverse() + output: [3, 2, 1] + + - name: "multi-line if expression" + mapping: | + output = if true { + "yes" + } else { + "no" + } + output: "yes" + + - name: "multi-line match expression" + mapping: | + output = match 2 { + 1 => "one", + 2 => "two", + _ => "other", + } + output: "two" + + # --- Raw strings preserve newlines --- + + - name: "raw string preserves literal newline" + mapping: | + output = `line1 + line2` + output: "line1\nline2" + + - name: "raw string preserves multiple newlines" + mapping: | + output = `a + + b` + output: "a\n\nb" + + - name: "raw string preserves tabs" + mapping: | + output = `col1 col2` + output: "col1\tcol2" + + - name: "raw string does not process escape sequences" + mapping: | + output = `hello\nworld` + output: "hello\\nworld" + + # --- Escaped newlines in regular strings --- + + - name: "escaped newline in regular string" + mapping: | + output = "line1\nline2" + output: "line1\nline2" + + - name: "escaped tab in regular string" + mapping: | + output = "col1\tcol2" + output: "col1\tcol2" + + # --- Blank lines between statements --- + + - name: "blank lines between statements are ignored" + mapping: | + output.a = 1 + + output.b = 2 + + 
output.c = 3 + output: {"a": 1, "b": 2, "c": 3} + + # --- Leading/trailing whitespace in mapping --- + + - name: "mapping with leading blank lines" + mapping: | + + output = 42 + output: 42 + + - name: "multiple assignments separated by blank lines" + mapping: | + $x = 10 + + $y = 20 + + output = $x + $y + output: 30 diff --git a/internal/bloblang2/spec/tests/error_handling/catch.yaml b/internal/bloblang2/spec/tests/error_handling/catch.yaml new file mode 100644 index 000000000..6ad9ab3a2 --- /dev/null +++ b/internal/bloblang2/spec/tests/error_handling/catch.yaml @@ -0,0 +1,162 @@ +description: ".catch() — intercepts errors, error object access, passthrough for non-errors, chaining, scope" + +tests: + # --- Basic catch usage --- + + - name: "catch intercepts division by zero" + mapping: | + output.result = (5 / 0).catch(err -> -1) + output: {"result": -1} + + - name: "catch intercepts throw" + mapping: | + output.result = throw("boom").catch(err -> "caught") + output: {"result": "caught"} + + - name: "catch intercepts out of bounds" + mapping: | + $arr = [1, 2, 3] + output.result = $arr[10].catch(err -> 0) + output: {"result": 0} + + - name: "catch intercepts type mismatch" + mapping: | + output.result = (5 + "nope").catch(err -> 0) + output: {"result": 0} + + # --- Error object access --- + + - name: "error object has what field with message" + mapping: | + output.result = throw("something broke").catch(err -> err.what) + output: {"result": "something broke"} + + - name: "error what field from division by zero" + mapping: | + output.result = (5 / 0).catch(err -> err.what) + output: {"result": "division by zero"} + + - name: "error what field used in concatenation" + mapping: | + output.result = throw("oops").catch(err -> "error: " + err.what) + output: {"result": "error: oops"} + + # --- Directly on input/output/variable (no intermediate field access) --- + + - name: "catch on input directly when error" + input: null + mapping: | + output = 
input.not_null().catch(err -> "was null") + output: "was null" + + - name: "catch on variable directly when no error" + mapping: | + $v = "hello" + output = $v.catch(err -> "caught") + output: "hello" + + - name: "catch on input directly when no error" + input: "hello" + mapping: | + output = input.catch(err -> "fallback") + output: "hello" + + # --- No-error passthrough --- + + - name: "catch returns value unchanged when no error" + mapping: | + output.result = "hello".catch(err -> "fallback") + output: {"result": "hello"} + + - name: "catch returns int unchanged when no error" + mapping: | + output.result = 42.catch(err -> 0) + output: {"result": 42} + + - name: "catch returns null unchanged when no error" + mapping: | + output.result = null.catch(err -> "fallback") + output: {"result": null} + + - name: "catch lambda not invoked when no error" + mapping: | + output.result = "ok".catch(err -> throw("should not run")) + output: {"result": "ok"} + + # --- Void passes through catch unchanged --- + + - name: "void passes through catch — assignment skipped" + mapping: | + output.x = "prior" + output.x = (if false { 1 }).catch(err -> 99) + output: {"x": "prior"} + + - name: "void passes through catch then subsequent method errors" + mapping: | + output.result = (if false { 1 }).catch(err -> 0).string().catch(err -> "caught void method") + output: {"result": "caught void method"} + + # --- Deleted passes through catch unchanged --- + + - name: "deleted passes through catch — field removed" + mapping: | + output.x = "prior" + output.x = deleted().catch(err -> "rescued") + output: {} + + # --- Parentheses define catch scope --- + + - name: "catch scoped to parenthesized expression" + mapping: | + output.result = (5 / 0).string().catch(err -> "0") + output: {"result": "0"} + + - name: "catch scoped to inner parens catches inner error only" + mapping: | + output.result = ((5 / 0).catch(err -> 42)).string() + output: {"result": "42"} + + - name: "catch on method chain 
catches entire chain" + mapping: | + output.result = "hello".int64().abs().catch(err -> -1) + output: {"result": -1} + + # --- Chained catches --- + + - name: "first catch handles error, second catch not invoked" + mapping: | + output.result = throw("x").catch(err -> "first").catch(err -> "second") + output: {"result": "first"} + + - name: "first catch re-throws, second catch handles" + mapping: | + output.result = throw("x").catch(err -> throw("re-thrown")).catch(err -> "final") + output: {"result": "final"} + + - name: "chained catch with error in first handler" + mapping: | + output.result = throw("x").catch(err -> 5 / 0).catch(err -> "safe") + output: {"result": "safe"} + + # --- Handler returning deleted --- + + - name: "catch handler returns deleted — field removed" + mapping: | + output.x = "prior" + output.x = throw("err").catch(err -> deleted()) + output: {} + + # --- Handler returning void --- + + - name: "catch handler returns void — assignment skipped" + mapping: | + output.x = "prior" + output.x = throw("err").catch(err -> if false { "nope" }) + output: {"x": "prior"} + + # --- Catch after method that errors --- + + - name: "catch after string method on non-stringable" + mapping: | + output.result = (5 / 0).string().catch(err -> "zero div") + output: {"result": "zero div"} diff --git a/internal/bloblang2/spec/tests/error_handling/not_null.yaml b/internal/bloblang2/spec/tests/error_handling/not_null.yaml new file mode 100644 index 000000000..62313486c --- /dev/null +++ b/internal/bloblang2/spec/tests/error_handling/not_null.yaml @@ -0,0 +1,141 @@ +description: ".not_null() — returns value if not null, throws error on null, optional custom message" + +tests: + # --- Passthrough for non-null values --- + + - name: "not_null passes through string" + mapping: | + output.result = "hello".not_null() + output: {"result": "hello"} + + - name: "not_null passes through integer" + mapping: | + output.result = 42.not_null() + output: {"result": 42} + + - name: 
"not_null passes through zero" + mapping: | + output.result = 0.not_null() + output: {"result": 0} + + - name: "not_null passes through false" + mapping: | + output.result = false.not_null() + output: {"result": false} + + - name: "not_null passes through empty string" + mapping: | + output.result = "".not_null() + output: {"result": ""} + + - name: "not_null passes through empty array" + mapping: | + output.result = [].not_null() + output: {"result": []} + + - name: "not_null passes through empty object" + mapping: | + output.result = {}.not_null() + output: {"result": {}} + + # --- Error on null with default message --- + + - name: "not_null on null produces default error" + mapping: | + output.result = null.not_null() + error: "unexpected null value" + + - name: "not_null on null input field produces default error" + input: {} + mapping: | + output.result = input.missing_field.not_null() + error: "unexpected null value" + + # --- Error on null with custom message --- + + - name: "not_null with custom message on null" + mapping: | + output.result = null.not_null("name required") + error: "name required" + + - name: "not_null with custom message on missing input field" + input: {} + mapping: | + output.result = input.email.not_null("email is required") + error: "email is required" + + - name: "not_null custom message ignored when value present" + mapping: | + output.result = "exists".not_null("should not appear") + output: {"result": "exists"} + + # --- Caught by catch --- + + - name: "not_null error caught by catch — default message" + mapping: | + output.result = null.not_null().catch(err -> "was null") + output: {"result": "was null"} + + - name: "not_null error caught by catch — custom message accessible" + mapping: | + output.result = null.not_null("name required").catch(err -> err.what) + output: {"result": "name required"} + + - name: "not_null error caught by catch with fallback value" + input: {} + mapping: | + output.result = 
input.name.not_null("missing").catch(err -> "anonymous") + output: {"result": "anonymous"} + + # --- Chained with or --- + + - name: "or rescues null before not_null is reached" + input: {} + mapping: | + output.result = input.name.or("default").not_null() + output: {"result": "default"} + + - name: "not_null then or — error propagates through or" + mapping: | + output.result = null.not_null().or("default") + error: "unexpected null value" + + # --- In postfix chain --- + + - name: "not_null in chain with subsequent method" + mapping: | + output.result = "hello".not_null().uppercase() + output: {"result": "HELLO"} + + - name: "not_null error skips subsequent methods" + mapping: | + output.result = null.not_null().uppercase() + error: "unexpected null value" + + - name: "not_null on result of field access" + input: {"user": {"name": "alice"}} + mapping: | + output.result = input.user.name.not_null("user name required") + output: {"result": "alice"} + + # --- Used in conditional patterns --- + + - name: "not_null in match arm with null check" + mapping: | + output.result = match input.val as v { + v == null => throw("got null"), + _ => v.not_null(), + } + cases: + - name: "non-null passes through" + input: {"val": "hello"} + output: {"result": "hello"} + - name: "null triggers throw" + input: {"val": null} + error: "got null" + + - name: "not_null with catch as validation pattern" + input: {"name": null} + mapping: | + output.result = input.name.not_null("name required").catch(err -> "unknown") + output: {"result": "unknown"} diff --git a/internal/bloblang2/spec/tests/error_handling/or.yaml b/internal/bloblang2/spec/tests/error_handling/or.yaml new file mode 100644 index 000000000..f55dac66f --- /dev/null +++ b/internal/bloblang2/spec/tests/error_handling/or.yaml @@ -0,0 +1,166 @@ +description: ".or() — rescues null, void, and deleted; does NOT catch errors; short-circuit evaluation" + +tests: + # --- Rescues null --- + + - name: "or rescues null with string default" 
+ mapping: | + output.result = null.or("default") + output: {"result": "default"} + + - name: "or rescues null with integer default" + mapping: | + output.result = null.or(42) + output: {"result": 42} + + - name: "or rescues null from missing input field" + input: {} + mapping: | + output.result = input.name.or("anonymous") + output: {"result": "anonymous"} + + - name: "or rescues null with null (returns null)" + mapping: | + output.result = null.or(null) + output: {"result": null} + + # --- Rescues void --- + + - name: "or rescues void from if-without-else" + mapping: | + output.result = (if false { "hello" }).or("default") + output: {"result": "default"} + + - name: "or rescues void from non-exhaustive match" + mapping: | + output.result = (match "bird" { "cat" => "meow" }).or("unknown") + output: {"result": "unknown"} + + # --- Rescues deleted --- + + - name: "or rescues deleted with string" + mapping: | + output.result = deleted().or("fallback") + output: {"result": "fallback"} + + - name: "or rescues deleted with integer" + mapping: | + output.result = deleted().or(0) + output: {"result": 0} + + # --- Directly on input/output/variable (no intermediate field access) --- + + - name: "or on input directly" + mapping: | + output = input.or("fallback") + cases: + - name: "null input returns fallback" + input: null + output: "fallback" + - name: "present input returned unchanged" + input: "hello" + output: "hello" + + - name: "or on variable directly" + mapping: | + $v = null + output = $v.or("default") + output: "default" + + - name: "or on variable directly when present" + mapping: | + $v = "value" + output = $v.or("default") + output: "value" + + # --- Non-null/void/deleted: returns value unchanged --- + + - name: "or returns string unchanged" + mapping: | + output.result = "hello".or("default") + output: {"result": "hello"} + + - name: "or returns zero unchanged (not null)" + mapping: | + output.result = 0.or(42) + output: {"result": 0} + + - name: "or returns 
false unchanged (not null)" + mapping: | + output.result = false.or(true) + output: {"result": false} + + - name: "or returns empty string unchanged" + mapping: | + output.result = "".or("default") + output: {"result": ""} + + - name: "or returns empty array unchanged" + mapping: | + output.result = [].or([1, 2, 3]) + output: {"result": []} + + # --- Short-circuit: argument not evaluated when value present --- + + - name: "or short-circuits on non-null (throw not evaluated)" + mapping: | + output.result = "hello".or(throw("should not run")) + output: {"result": "hello"} + + - name: "or short-circuits on zero (throw not evaluated)" + mapping: | + output.result = 0.or(throw("should not run")) + output: {"result": 0} + + - name: "or short-circuits on false (throw not evaluated)" + mapping: | + output.result = false.or(throw("should not run")) + output: {"result": false} + + # --- Does NOT catch errors --- + + - name: "or does not catch division by zero" + mapping: | + output.result = (5 / 0).or("default") + error: "division by zero" + + - name: "or does not catch throw" + mapping: | + output.result = throw("boom").or("default") + error: "boom" + + - name: "or does not catch type mismatch" + mapping: | + output.result = (5 + "hello").or(0) + error: "cannot add" + + - name: "or does not catch out of bounds" + mapping: | + $arr = [1, 2] + output.result = $arr[5].or(0) + error: "out of bounds" + + # --- Composing .or() and .catch() --- + + - name: "catch then or — catch handles error, or not needed" + mapping: | + output.result = throw("x").catch(err -> "caught").or("default") + output: {"result": "caught"} + + - name: "catch then or — catch returns null, or rescues" + mapping: | + output.result = throw("x").catch(err -> null).or("rescued") + output: {"result": "rescued"} + + - name: "or then catch — or does not catch error, catch does" + mapping: | + output.result = (5 / 0).or("ignored").catch(err -> "caught") + output: {"result": "caught"} + + # --- or can itself return 
deleted --- + + - name: "or returns deleted — field removed" + mapping: | + output.x = "prior" + output.x = null.or(deleted()) + output: {} diff --git a/internal/bloblang2/spec/tests/error_handling/or_catch_composition.yaml b/internal/bloblang2/spec/tests/error_handling/or_catch_composition.yaml new file mode 100644 index 000000000..b5b92a9cd --- /dev/null +++ b/internal/bloblang2/spec/tests/error_handling/or_catch_composition.yaml @@ -0,0 +1,122 @@ +description: > + Composing .or() and .catch() — order matters, short-circuit semantics, + null-safe operator interactions, and complex rescue chains. + +tests: + # --- .catch() first, .or() second --- + + - name: "catch handles error, or not needed" + mapping: | + output.v = throw("x").catch(err -> "caught").or("default") + output: {"v": "caught"} + + - name: "catch returns null, or rescues null" + mapping: | + output.v = throw("x").catch(err -> null).or("default") + output: {"v": "default"} + + - name: "catch returns value, or passes through" + mapping: | + output.v = throw("x").catch(err -> 42).or(0) + output: {"v": 42} + + # --- .or() first, .catch() second --- + + - name: "or does not handle error, catch does" + mapping: | + output.v = (5 / 0).or("ignored").catch(err -> "caught") + output: {"v": "caught"} + + - name: "or rescues null, catch not needed" + mapping: | + output.v = null.or("default").catch(err -> "error") + output: {"v": "default"} + + # --- .or() default can itself error, caught by .catch() --- + + - name: "or default errors, caught by subsequent catch" + mapping: | + output.v = null.or(5 / 0).catch(err -> "safe") + output: {"v": "safe"} + + # --- Chained rescue patterns --- + + - name: "null-safe then or for nested field" + mapping: | + output.v = input.user?.name.or("anonymous") + cases: + - name: "null user returns fallback" + input: {"user": null} + output: {"v": "anonymous"} + - name: "present user returns name" + input: {"user": {"name": "Alice"}} + output: {"v": "Alice"} + + - name: "null-safe 
chain to null then catch for method error" + mapping: | + $v = null + output.v = $v?.trim().or("empty") + output: {"v": "empty"} + + # --- Directly on input/variable with chaining --- + + - name: "or on input then method chain" + input: null + mapping: | + output.v = input.or("hello").uppercase() + output: {"v": "HELLO"} + + - name: "catch on input then or" + input: null + mapping: | + output.v = input.not_null().catch(err -> null).or("rescued") + output: {"v": "rescued"} + + - name: "or on variable then catch" + mapping: | + $v = null + output.v = $v.or(5 / 0).catch(err -> "safe") + output: {"v": "safe"} + + # --- .or() with deleted --- + + - name: "or rescues deleted from conditional" + mapping: | + output.v = (if false { deleted() } else { deleted() }).or("rescued") + output: {"v": "rescued"} + + # --- Error in or argument is only evaluated when needed --- + + - name: "or short-circuits — error in argument not evaluated" + mapping: | + output.v = "present".or(throw("boom")) + output: {"v": "present"} + + - name: "or evaluates argument on null — error propagates" + mapping: | + output.v = null.or(throw("boom")) + error: "boom" + + # --- Triple chain: or, catch, or --- + + - name: "triple rescue chain" + mapping: | + output.v = null.or(5 / 0).catch(err -> null).or("final") + output: {"v": "final"} + + # --- not_null with catch --- + + - name: "not_null error caught by catch" + mapping: | + output.v = null.not_null().catch(err -> "was null") + output: {"v": "was null"} + + - name: "not_null passes through non-null" + mapping: | + output.v = "hello".not_null() + output: {"v": "hello"} + + - name: "not_null error message contains context" + mapping: | + output.v = null.not_null().catch(err -> err.what) + output: {"v": "unexpected null value"} diff --git a/internal/bloblang2/spec/tests/error_handling/propagation.yaml b/internal/bloblang2/spec/tests/error_handling/propagation.yaml new file mode 100644 index 000000000..622acbb96 --- /dev/null +++ 
b/internal/bloblang2/spec/tests/error_handling/propagation.yaml @@ -0,0 +1,141 @@ +description: "Error propagation through expressions, postfix chains, and multiple error sources" + +tests: + # --- Errors propagate through arithmetic --- + + - name: "error in left operand propagates through addition" + mapping: | + output.result = (5 / 0) + 1 + error: "division by zero" + + - name: "error in right operand propagates through addition" + mapping: | + output.result = 1 + (5 / 0) + error: "division by zero" + + - name: "error in left operand propagates through multiplication" + mapping: | + output.result = (5 / 0) * 3 + error: "division by zero" + + - name: "error propagates through nested arithmetic" + mapping: | + output.result = (1 + (5 / 0)) * 2 + error: "division by zero" + + # --- Errors propagate through comparison and logical operators --- + + - name: "error propagates through equality check" + mapping: | + output.result = (5 / 0) == 0 + error: "division by zero" + + - name: "error propagates through comparison" + mapping: | + output.result = (5 / 0) > 0 + error: "division by zero" + + - name: "error propagates through logical and" + mapping: | + output.result = true && ((5 / 0) > 0) + error: "division by zero" + + - name: "error propagates through negation" + mapping: | + output.result = !throw("bad") + error: "bad" + + # --- Postfix chain: subsequent operations skipped --- + + - name: "error in method receiver skips subsequent method" + mapping: | + output.result = (5 / 0).string() + error: "division by zero" + + - name: "error skips multiple chained methods" + mapping: | + output.result = (5 / 0).string().uppercase().length() + error: "division by zero" + + - name: "error from method skips subsequent methods" + mapping: | + output.result = "hello".int64().abs() + error: "int64" + + - name: "error in field access skips subsequent field" + input: {} + mapping: | + output.result = throw("no data").name.length() + error: "no data" + + - name: "error in index 
access skips subsequent operations" + mapping: | + $arr = [1, 2] + output.result = $arr[5].string() + error: "out of bounds" + + # --- Error propagation through string interpolation --- + + - name: "error in string concatenation propagates" + mapping: | + output.result = "value is: " + (5 / 0).string() + error: "division by zero" + + # --- Error propagates into variable assignment --- + + - name: "error assigned to variable propagates on use" + mapping: | + $x = 5 / 0 + output.result = $x + error: "division by zero" + + # --- Error propagates through collection literal --- + + - name: "error in array element propagates" + mapping: | + output.result = [1, 5 / 0, 3] + error: "division by zero" + + - name: "error in object value propagates" + mapping: | + output.result = {"a": 5 / 0} + error: "division by zero" + + # --- Error propagates through if condition --- + + - name: "error in if condition propagates" + mapping: | + output.result = if (5 / 0) > 0 { "positive" } else { "negative" } + error: "division by zero" + + # --- Error propagates through match subject --- + + - name: "error in match subject propagates" + mapping: | + output.result = match (5 / 0) { + 0 => "zero", + _ => "other", + } + error: "division by zero" + + # --- Error propagates through map arguments --- + + - name: "error in map argument propagates" + mapping: | + map double(val) { val * 2 } + output.result = double(5 / 0) + error: "division by zero" + + # --- Type mismatch errors propagate --- + + - name: "type mismatch error propagates through chain" + mapping: | + output.result = (5 + "hello").string() + error: "cannot add" + + # --- Multiple error sources: first error wins --- + + - name: "left operand error reported when both sides error" + mapping: | + output.result = (5 / 0) + (3 / 0) + error: "division by zero" diff --git a/internal/bloblang2/spec/tests/error_handling/throw.yaml b/internal/bloblang2/spec/tests/error_handling/throw.yaml new file mode 100644 index 000000000..0609b32c6 --- 
/dev/null +++ b/internal/bloblang2/spec/tests/error_handling/throw.yaml @@ -0,0 +1,135 @@ +description: "throw() — produces catchable errors, compile errors for bad args, conditional throw" + +tests: + # --- Basic throw --- + + - name: "throw with string message produces error" + mapping: | + output.result = throw("something went wrong") + error: "something went wrong" + + - name: "throw with empty string" + mapping: | + output.result = throw("") + error: "" + + - name: "throw error propagates to output" + mapping: | + output.result = throw("halt") + error: "halt" + + # --- Caught by catch --- + + - name: "throw caught by catch returns fallback" + mapping: | + output.result = throw("x").catch(err -> "fallback") + output: {"result": "fallback"} + + - name: "throw caught by catch — error message accessible" + mapping: | + output.result = throw("details here").catch(err -> err.what) + output: {"result": "details here"} + + - name: "throw in chain caught by catch at end" + mapping: | + output.result = throw("fail").string().catch(err -> "ok") + output: {"result": "ok"} + + # --- Uncaught throw halts mapping --- + + - name: "uncaught throw halts mapping — subsequent assignment not reached" + mapping: | + output.a = throw("halt") + output.b = "should not appear" + error: "halt" + + # --- Compile errors for bad arguments --- + + - name: "throw with zero args is compile error" + mapping: | + output.result = throw() + compile_error: "throw" + + - name: "throw with integer literal is compile error" + mapping: | + output.result = throw(42) + compile_error: "throw" + + - name: "throw with boolean literal is compile error" + mapping: | + output.result = throw(true) + compile_error: "throw" + + - name: "throw with null literal is compile error" + mapping: | + output.result = throw(null) + compile_error: "throw" + + - name: "throw with two arguments is compile error" + mapping: | + output.result = throw("a", "b") + compile_error: "throw" + + # --- Non-string dynamic expression is 
runtime error --- + + - name: "throw with dynamic int expression is runtime error" + mapping: | + $x = 42 + output.result = throw($x) + error: "throw" + + - name: "throw with dynamic bool expression is runtime error" + mapping: | + $x = true + output.result = throw($x) + error: "throw" + + # --- Dynamic string expression works --- + + - name: "throw with dynamic string expression" + mapping: | + $msg = "dynamic error" + output.result = throw($msg) + error: "dynamic error" + + - name: "throw with dynamic string concatenation" + mapping: | + $code = 404 + output.result = throw("error code: " + $code.string()) + error: "error code: 404" + + # --- Conditional throw --- + + - name: "conditional throw in if — condition true" + mapping: | + $x = -1 + output.result = if $x < 0 { throw("negative value") } else { $x } + error: "negative value" + + - name: "conditional throw in if — condition false" + mapping: | + $x = 5 + output.result = if $x < 0 { throw("negative value") } else { $x } + output: {"result": 5} + + - name: "conditional throw in statement assignment" + mapping: | + $valid = false + if !$valid { output.err = throw("invalid") } + error: "invalid" + + - name: "conditional throw in match arm" + mapping: | + $status = "error" + output.result = match $status { + "ok" => "success", + "error" => throw("status was error"), + _ => "unknown", + } + error: "status was error" + + - name: "conditional throw caught by catch" + mapping: | + $x = -1 + output.result = (if $x < 0 { throw("negative") } else { $x }).catch(err -> 0) + output: {"result": 0} diff --git a/internal/bloblang2/spec/tests/imports/basic_import.yaml b/internal/bloblang2/spec/tests/imports/basic_import.yaml new file mode 100644 index 000000000..3fd4d40b7 --- /dev/null +++ b/internal/bloblang2/spec/tests/imports/basic_import.yaml @@ -0,0 +1,141 @@ +description: "Imports: basic namespace import, calling imported maps, passing arguments" + +files: + "helpers.blobl": | + map double(x) { x * 2 } + map greet(name) 
{ "hello " + name } + map add(a, b) { a + b } + map constant() { 42 } + +tests: + # --- Basic namespace-qualified calls --- + + - name: "import and call zero-param map" + mapping: | + import "helpers.blobl" as h + output.v = h::constant() + output: {"v": 42} + + - name: "import and call single-param map" + mapping: | + import "helpers.blobl" as h + output.v = h::double(21) + output: {"v": 42} + + - name: "import and call two-param map" + mapping: | + import "helpers.blobl" as h + output.v = h::add(3, 7) + output: {"v": 10} + + - name: "import and call map with string arg" + mapping: | + import "helpers.blobl" as h + output.v = h::greet("world") + output: {"v": "hello world"} + + - name: "call imported map multiple times" + mapping: | + import "helpers.blobl" as h + output.a = h::double(5) + output.b = h::double(10) + output: {"a": 10, "b": 20} + + - name: "call multiple imported maps" + mapping: | + import "helpers.blobl" as h + output.sum = h::add(h::double(3), h::constant()) + output: {"sum": 48} + + # --- Import with local maps --- + + - name: "imported maps coexist with local maps" + mapping: | + import "helpers.blobl" as h + map triple(x) { x * 3 } + output.v = h::double(5) + triple(2) + output: {"v": 16} + + - name: "local map can call imported map" + mapping: | + import "helpers.blobl" as h + map quad(x) { h::double(h::double(x)) } + output.v = quad(3) + output: {"v": 12} + + # --- Import with input data --- + + - name: "imported map processes input data" + mapping: | + import "helpers.blobl" as h + output.v = h::double(input.x) + input: {"x": 7} + output: {"v": 14} + + # --- Error cases --- + + - name: "calling non-existent map in namespace is error" + mapping: | + import "helpers.blobl" as h + output.v = h::nonexistent(1) + compile_error: "nonexistent" + + - name: "file not found is error" + mapping: | + import "missing.blobl" as m + output.v = m::foo(1) + compile_error: "missing" + + - name: "statements in imported file are compile error" + files: + 
"bad_lib.blobl": | + $x = 42 + map foo(a) { a + $x } + mapping: | + import "bad_lib.blobl" as lib + output.v = lib::foo(1) + compile_error: "statement" + + - name: "calling imported map without namespace is compile error" + mapping: | + import "helpers.blobl" as h + output.v = double(5) + compile_error: "double" + + # --- Qualified map references in higher-order methods --- + + - name: "qualified map reference in .map()" + mapping: | + import "helpers.blobl" as h + output.v = [1, 2, 3].map(h::double) + output: {"v": [2, 4, 6]} + + - name: "qualified map reference in .filter()" + files: + "predicates.blobl": | + map is_positive(x) { x > 0 } + mapping: | + import "predicates.blobl" as p + output.v = [-1, 2, -3, 4].filter(p::is_positive) + output: {"v": [2, 4]} + + - name: "qualified map reference in .sort_by()" + files: + "keys.blobl": | + map name(item) { item.name } + mapping: | + import "keys.blobl" as k + $items = [{"name": "Charlie"}, {"name": "Alice"}, {"name": "Bob"}] + output.v = $items.sort_by(k::name).map(x -> x.name) + output: {"v": ["Alice", "Bob", "Charlie"]} + + - name: "qualified reference to non-existent namespace is compile error" + mapping: | + output.v = [1, 2].map(bad::double) + compile_error: "namespace" + + - name: "qualified reference to non-existent map is compile error" + mapping: | + import "helpers.blobl" as h + output.v = [1, 2].map(h::nonexistent) + compile_error: "nonexistent" diff --git a/internal/bloblang2/spec/tests/imports/circular_import.yaml b/internal/bloblang2/spec/tests/imports/circular_import.yaml new file mode 100644 index 000000000..0af964070 --- /dev/null +++ b/internal/bloblang2/spec/tests/imports/circular_import.yaml @@ -0,0 +1,92 @@ +description: "Imports: circular import detection — compile-time error" + +files: + "a.blobl": | + import "b.blobl" as b + map foo(x) { b::bar(x) } + + "b.blobl": | + import "a.blobl" as a + map bar(x) { a::foo(x) } + + "c_start.blobl": | + import "c_mid.blobl" as mid + map start(x) { 
mid::middle(x) } + + "c_mid.blobl": | + import "c_end.blobl" as e + map middle(x) { e::finish(x) } + + "c_end.blobl": | + import "c_start.blobl" as s + map finish(x) { s::start(x) } + + "self.blobl": | + import "self.blobl" as me + map echo(x) { me::echo(x) } + +tests: + # --- Direct circular import (A -> B -> A) --- + + - name: "direct circular import is compile error" + mapping: | + import "a.blobl" as a + output.v = a::foo(1) + compile_error: "circular" + + - name: "circular import detected from other entry point" + mapping: | + import "b.blobl" as b + output.v = b::bar(1) + compile_error: "circular" + + # --- Transitive circular import (A -> B -> C -> A) --- + + - name: "transitive circular import is compile error" + mapping: | + import "c_start.blobl" as cs + output.v = cs::start(1) + compile_error: "circular" + + - name: "transitive circular from middle entry point" + mapping: | + import "c_mid.blobl" as cm + output.v = cm::middle(1) + compile_error: "circular" + + # --- Self-import --- + + - name: "self-import is compile error" + mapping: | + import "self.blobl" as me + output.v = me::echo(1) + compile_error: "circular" + + # --- Main file importing itself --- + + - name: "main file importing file that imports back is circular" + files: + "back.blobl": | + import "entry.blobl" as e + map bounce(x) { e::go(x) } + "entry.blobl": | + import "back.blobl" as b + map go(x) { b::bounce(x) } + mapping: | + import "entry.blobl" as e + output.v = e::go(1) + compile_error: "circular" + + # --- Non-circular is fine (control test) --- + + - name: "non-circular chain compiles successfully" + files: + "leaf.blobl": | + map leaf_fn(x) { x * 2 } + "mid.blobl": | + import "leaf.blobl" as leaf + map mid_fn(x) { leaf::leaf_fn(x) + 1 } + mapping: | + import "mid.blobl" as m + output.v = m::mid_fn(5) + output: {"v": 11} diff --git a/internal/bloblang2/spec/tests/imports/duplicate_namespace.yaml b/internal/bloblang2/spec/tests/imports/duplicate_namespace.yaml new file mode 100644 
index 000000000..ba2d7d4e7 --- /dev/null +++ b/internal/bloblang2/spec/tests/imports/duplicate_namespace.yaml @@ -0,0 +1,81 @@ +description: "Imports: duplicate namespace name — compile-time error" + +files: + "helpers.blobl": | + map double(x) { x * 2 } + + "utils.blobl": | + map triple(x) { x * 3 } + + "more_helpers.blobl": | + map quad(x) { x * 4 } + +tests: + # --- Same namespace name for different files --- + + - name: "duplicate namespace from two different files is compile error" + mapping: | + import "helpers.blobl" as lib + import "utils.blobl" as lib + output.v = lib::double(5) + compile_error: "duplicate" + + - name: "duplicate namespace with three imports" + mapping: | + import "helpers.blobl" as a + import "utils.blobl" as b + import "more_helpers.blobl" as a + output.v = a::double(1) + compile_error: "duplicate" + + # --- Same file imported twice with same namespace --- + + - name: "same file imported twice with same namespace is compile error" + mapping: | + import "helpers.blobl" as h + import "helpers.blobl" as h + output.v = h::double(5) + compile_error: "duplicate" + + # --- Same file imported twice with different namespaces is ok --- + + - name: "same file imported with different namespaces is valid" + mapping: | + import "helpers.blobl" as h1 + import "helpers.blobl" as h2 + output.a = h1::double(3) + output.b = h2::double(4) + output: {"a": 6, "b": 8} + + # --- Different files with distinct namespaces is ok --- + + - name: "different files with distinct namespaces is valid" + mapping: | + import "helpers.blobl" as h + import "utils.blobl" as u + output.a = h::double(5) + output.b = u::triple(5) + output: {"a": 10, "b": 15} + + # --- Duplicate namespace in nested import --- + + - name: "duplicate namespace within imported file is compile error" + files: + "bad_imports.blobl": | + import "helpers.blobl" as x + import "utils.blobl" as x + map wrapper(v) { x::double(v) } + mapping: | + import "bad_imports.blobl" as bi + output.v = bi::wrapper(5) + 
compile_error: "duplicate" + + # --- Namespace coexists with local map names (still valid: different resolution) --- + + - name: "namespace name can differ from local map names" + mapping: | + import "helpers.blobl" as helpers + map local_double(x) { x * 2 } + output.a = helpers::double(5) + output.b = local_double(5) + output: {"a": 10, "b": 10} diff --git a/internal/bloblang2/spec/tests/imports/nested_import.yaml b/internal/bloblang2/spec/tests/imports/nested_import.yaml new file mode 100644 index 000000000..4fc7ed5bb --- /dev/null +++ b/internal/bloblang2/spec/tests/imports/nested_import.yaml @@ -0,0 +1,104 @@ +description: "Imports: nested import chains (A imports B, B imports C)" + +files: + "math_core.blobl": | + map square(x) { x * x } + map inc(x) { x + 1 } + + "math_utils.blobl": | + import "math_core.blobl" as core + map square_plus_one(x) { core::inc(core::square(x)) } + map double_square(x) { core::square(x) * 2 } + + "app_helpers.blobl": | + import "math_utils.blobl" as utils + import "math_core.blobl" as core + map transform(x) { utils::square_plus_one(x) + core::inc(x) } + +tests: + # --- Two-level chain --- + + - name: "import file that imports another file" + mapping: | + import "math_utils.blobl" as mu + output.v = mu::square_plus_one(5) + output: {"v": 26} + + - name: "nested import calls inner map through wrapper" + mapping: | + import "math_utils.blobl" as mu + output.v = mu::double_square(3) + output: {"v": 18} + + # --- Three-level chain --- + + - name: "three-level import chain" + mapping: | + import "app_helpers.blobl" as app + output.v = app::transform(4) + output: {"v": 22} + + # --- Diamond import (A imports B and C, B imports C) --- + + - name: "diamond import — same file imported by two paths" + mapping: | + import "math_utils.blobl" as mu + import "math_core.blobl" as mc + output.a = mu::square_plus_one(3) + output.b = mc::square(3) + output: {"a": 10, "b": 9} + + # --- Nested import with multiple maps --- + + - name: "call multiple maps 
from nested import" + mapping: | + import "math_utils.blobl" as mu + output.a = mu::square_plus_one(2) + output.b = mu::double_square(2) + output: {"a": 5, "b": 8} + + # --- Cannot access transitive namespace --- + + - name: "cannot access transitively imported namespace" + mapping: | + import "math_utils.blobl" as mu + output.v = core::square(5) + compile_error: "core" + + # --- Nested import with local maps --- + + - name: "local map wraps nested-imported map" + mapping: | + import "math_utils.blobl" as mu + map process(x) { mu::square_plus_one(x) * 2 } + output.v = process(3) + output: {"v": 20} + + # --- Deep chain with input --- + + - name: "nested import processes input data" + mapping: | + import "math_utils.blobl" as mu + output.v = mu::square_plus_one(input.n) + input: {"n": 6} + output: {"v": 37} + + # --- Error: non-existent map in nested import --- + + - name: "non-existent map in nested namespace is compile error" + mapping: | + import "math_utils.blobl" as mu + output.v = mu::nonexistent(1) + compile_error: "nonexistent" + + # --- Error: nested file not found --- + + - name: "nested import with missing file is error" + files: + "bad_chain.blobl": | + import "nonexistent.blobl" as missing + map foo(x) { missing::bar(x) } + mapping: | + import "bad_chain.blobl" as bc + output.v = bc::foo(1) + compile_error: "nonexistent" diff --git a/internal/bloblang2/spec/tests/input_output/conditional_deletion.yaml b/internal/bloblang2/spec/tests/input_output/conditional_deletion.yaml new file mode 100644 index 000000000..51686d51e --- /dev/null +++ b/internal/bloblang2/spec/tests/input_output/conditional_deletion.yaml @@ -0,0 +1,194 @@ +description: > + Conditional deletion — using if/match to conditionally delete fields, + array elements, and metadata keys. Also tests deletion in iterator + lambdas and the interaction between deleted() and void in assignments. 
+ +tests: + # --- Conditional field deletion via if --- + + - name: "if-true deletes field, if-false preserves via void" + mapping: | + output.a = "keep" + output.b = "drop" + output.b = if true { deleted() } + output: {"a": "keep"} + + - name: "if-false produces void — field preserved" + mapping: | + output.a = "keep" + output.a = if false { deleted() } + output: {"a": "keep"} + + - name: "if-else conditional field deletion" + mapping: | + output.a = 1 + output.b = 2 + output.b = if input.remove_b { deleted() } else { output.b } + cases: + - name: "deletes field when true" + input: {"remove_b": true} + output: {"a": 1} + - name: "keeps field when false" + input: {"remove_b": false} + output: {"a": 1, "b": 2} + + # --- Conditional field deletion via match --- + + - name: "match deletes field on matching case" + mapping: | + output.a = "keep" + output.b = "conditional" + output.b = match "remove" { + "remove" => deleted(), + _ => output.b, + } + output: {"a": "keep"} + + - name: "match preserves field on non-matching case" + mapping: | + output.a = "keep" + output.b = "conditional" + output.b = match "keep" { + "remove" => deleted(), + _ => output.b, + } + output: {"a": "keep", "b": "conditional"} + + - name: "match without wildcard — void preserves field" + mapping: | + output.status = "active" + output.status = match "nope" { + "remove" => deleted(), + } + output: {"status": "active"} + + # --- Conditional deletion in array literals --- + + - name: "if-expression conditionally omits array element" + mapping: | + $include = false + output.v = [1, if $include { 2 } else { deleted() }, 3] + output: {"v": [1, 3]} + + - name: "if-expression includes array element when true" + mapping: | + $include = true + output.v = [1, if $include { 2 } else { deleted() }, 3] + output: {"v": [1, 2, 3]} + + - name: "match conditionally omits array element" + mapping: | + $mode = "sparse" + output.v = [ + 1, + match $mode { "full" => 2, _ => deleted() }, + 3, + ] + output: {"v": [1, 
3]} + + # --- Conditional deletion in object literals --- + + - name: "if-expression conditionally omits object field" + mapping: | + $include_debug = false + output.v = { + "name": "Alice", + "debug": if $include_debug { "trace-123" } else { deleted() }, + } + output: {"v": {"name": "Alice"}} + + - name: "if-expression includes object field when true" + mapping: | + $include_debug = true + output.v = { + "name": "Alice", + "debug": if $include_debug { "trace-123" } else { deleted() }, + } + output: {"v": {"name": "Alice", "debug": "trace-123"}} + + # --- Deletion in .map() iterator --- + + - name: "map lambda returning deleted omits element" + mapping: | + output.v = [1, 2, 3, 4, 5].map(x -> + if x % 2 == 0 { deleted() } else { x } + ) + output: {"v": [1, 3, 5]} + + - name: "map lambda deleting all elements produces empty array" + mapping: | + output.v = [1, 2, 3].map(x -> deleted()) + output: {"v": []} + + - name: "map lambda conditionally transforms or deletes" + mapping: | + output.v = [10, -5, 20, -3, 15].map(x -> + if x > 0 { x * 2 } else { deleted() } + ) + output: {"v": [20, 40, 30]} + + # --- Array element deletion with negative indices --- + + - name: "delete last array element with negative index" + mapping: | + $arr = [10, 20, 30, 40] + $arr[-1] = deleted() + output.v = $arr + output: {"v": [10, 20, 30]} + + - name: "delete second-to-last array element" + mapping: | + $arr = [10, 20, 30, 40] + $arr[-2] = deleted() + output.v = $arr + output: {"v": [10, 20, 40]} + + - name: "delete first element via negative index" + mapping: | + $arr = ["a", "b", "c"] + $arr[-3] = deleted() + output.v = $arr + output: {"v": ["b", "c"]} + + # --- Deletion of nested variable fields --- + + - name: "delete deeply nested variable field" + mapping: | + $data = {"a": {"b": {"c": 1, "d": 2}}} + $data.a.b.c = deleted() + output.v = $data + output: {"v": {"a": {"b": {"d": 2}}}} + + - name: "delete all fields of nested object" + mapping: | + $data = {"inner": {"x": 1, "y": 2}} + 
$data.inner.x = deleted() + $data.inner.y = deleted() + output.v = $data + output: {"v": {"inner": {}}} + + # --- Conditional metadata deletion --- + + - name: "conditionally delete metadata key" + input: {} + input_metadata: {"source": "kafka", "debug": "true"} + mapping: | + output@ = input@ + $remove_debug = true + if $remove_debug { + output@.debug = deleted() + } + output: {} + output_metadata: {"source": "kafka"} + + - name: "conditionally keep metadata key" + input: {} + input_metadata: {"source": "kafka", "debug": "true"} + mapping: | + output@ = input@ + $remove_debug = false + if $remove_debug { + output@.debug = deleted() + } + output: {} + output_metadata: {"source": "kafka", "debug": "true"} diff --git a/internal/bloblang2/spec/tests/input_output/deletion.yaml b/internal/bloblang2/spec/tests/input_output/deletion.yaml new file mode 100644 index 000000000..0566bdadb --- /dev/null +++ b/internal/bloblang2/spec/tests/input_output/deletion.yaml @@ -0,0 +1,247 @@ +description: "Deletion: output = deleted() drops message, field deletion, array element deletion, deleted in literals, nested deletion, operations on deleted error" + +tests: + # --- output = deleted() drops entire message --- + + - name: "output = deleted() drops message" + mapping: | + output = deleted() + deleted: true + + - name: "output = deleted() discards prior assignments" + mapping: | + output.name = "Alice" + output.age = 30 + output = deleted() + deleted: true + + - name: "output = deleted() stops execution immediately" + mapping: | + output = deleted() + output.should_not_exist = "never reached" + deleted: true + + - name: "output = deleted() after root assignment still drops" + mapping: | + output = {"complex": "structure"} + output = deleted() + deleted: true + + # --- Field deletion --- + + - name: "delete output field" + mapping: | + output.a = 1 + output.b = 2 + output.c = 3 + output.b = deleted() + output: {"a": 1, "c": 3} + + - name: "delete nested output field" + mapping: | + 
output.user.name = "Alice" + output.user.age = 30 + output.user.age = deleted() + output: {"user": {"name": "Alice"}} + + - name: "delete non-existent output field is no-op" + mapping: | + output.a = 1 + output.missing = deleted() + output: {"a": 1} + + - name: "delete deeply nested field" + mapping: | + output.a.b.c = "deep" + output.a.b.d = "also deep" + output.a.b.c = deleted() + output: {"a": {"b": {"d": "also deep"}}} + + # --- Array element deletion --- + + - name: "delete array element shifts remaining" + mapping: | + output.items = [10, 20, 30, 40] + output.items[1] = deleted() + output: {"items": [10, 30, 40]} + + - name: "delete first array element" + mapping: | + output.items = [10, 20, 30] + output.items[0] = deleted() + output: {"items": [20, 30]} + + - name: "delete last array element" + mapping: | + output.items = [10, 20, 30] + output.items[2] = deleted() + output: {"items": [10, 20]} + + # --- Variable deletion errors and behavior --- + + - name: "assign deleted() to variable is runtime error" + mapping: | + $var = deleted() + error: "deleted" + + - name: "delete field from variable object" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3} + $obj.b = deleted() + output.v = $obj + output: {"v": {"a": 1, "c": 3}} + + - name: "delete element from variable array shifts remaining" + mapping: | + $arr = [10, 20, 30, 40] + $arr[0] = deleted() + output.v = $arr + output: {"v": [20, 30, 40]} + + # --- Deleted in array literals --- + + - name: "deleted() in array literal omits element" + mapping: | + output.v = [1, deleted(), 3] + output: {"v": [1, 3]} + + - name: "multiple deleted() in array literal" + mapping: | + output.v = [deleted(), 1, deleted(), 2, deleted()] + output: {"v": [1, 2]} + + - name: "all deleted() in array literal produces empty array" + mapping: | + output.v = [deleted(), deleted(), deleted()] + output: {"v": []} + + - name: "deleted() at beginning of array literal" + mapping: | + output.v = [deleted(), "a", "b"] + output: {"v": ["a", "b"]} 
+ + - name: "deleted() at end of array literal" + mapping: | + output.v = ["a", "b", deleted()] + output: {"v": ["a", "b"]} + + # --- Deleted in object literals --- + + - name: "deleted() value in object literal omits field" + mapping: | + output.v = {"a": 1, "b": deleted(), "c": 3} + output: {"v": {"a": 1, "c": 3}} + + - name: "all deleted() values in object literal produces empty object" + mapping: | + output.v = {"a": deleted(), "b": deleted()} + output: {"v": {}} + + - name: "deleted() in nested object literal" + mapping: | + output.v = {"outer": {"keep": 1, "drop": deleted()}} + output: {"v": {"outer": {"keep": 1}}} + + # --- Deletion propagates at each level independently --- + + - name: "deleted in nested array within object" + mapping: | + output.v = {"items": [1, deleted(), 3]} + output: {"v": {"items": [1, 3]}} + + - name: "deleted in object within array" + mapping: | + output.v = [{"a": 1, "b": deleted()}, {"c": 3}] + output: {"v": [{"a": 1}, {"c": 3}]} + + # --- Operations on deleted() are errors --- + + - name: "arithmetic on deleted() is error" + mapping: | + output.v = deleted() + 1 + error: "deleted" + + - name: "comparison on deleted() is error" + mapping: | + output.v = deleted() == null + error: "deleted" + + - name: "method call on deleted() is error" + mapping: | + output.v = deleted().string() + error: "deleted" + + - name: "field access on deleted() is error" + mapping: | + output.v = deleted().field + error: "deleted" + + # --- .or() rescues deleted(); .catch() passes it through --- + + - name: "or rescues deleted()" + mapping: | + output.v = deleted().or("fallback") + output: {"v": "fallback"} + + - name: "catch passes through deleted()" + mapping: | + output.v = deleted().catch(err -> "caught") + output: {} + + # --- Deletion through a non-existent path (Section 9.2 auto-creation) --- + + - name: "delete missing field auto-creates intermediates then no-ops" + mapping: | + output.a.b.c = deleted() + output: {"a": {"b": {}}} + + - name: "delete 
missing 
intermediate field is a no-op at depth" + mapping: | + output.x.y = deleted() + output: {"x": {}} + + - name: "delete existing field still removes it" + mapping: | + output.a = {"b": {"c": 1, "d": 2}} + output.a.b.c = deleted() + output: {"a": {"b": {"d": 2}}} + + - name: "delete on variable auto-creates intermediates then no-ops" + mapping: | + $v = {} + $v.x.y = deleted() + output = $v + output: {"x": {}} + + - name: "delete on variable existing field removes it" + mapping: | + $v = {"x": {"y": 1, "z": 2}} + $v.x.y = deleted() + output = $v + output: {"x": {"z": 2}} + + - name: "delete index on auto-created empty array is out of bounds" + mapping: | + output.xs[0] = deleted() + error: "" + + - name: "delete through a type collision is a runtime error" + mapping: | + output.a = "string" + output.a.b = deleted() + error: "" + + - name: "delete missing metadata key auto-creates intermediate object" + mapping: | + output@.routing.region = deleted() + output.done = true + output: {"done": true} + output_metadata: {"routing": {}} + + - name: "delete existing nested metadata key removes it" + mapping: | + output@ = {"routing": {"region": "us-west", "priority": 10}} + output@.routing.region = deleted() + output.done = true + output: {"done": true} + output_metadata: {"routing": {"priority": 10}} diff --git a/internal/bloblang2/spec/tests/input_output/dynamic_metadata.yaml b/internal/bloblang2/spec/tests/input_output/dynamic_metadata.yaml new file mode 100644 index 000000000..f2b24d7cf --- /dev/null +++ b/internal/bloblang2/spec/tests/input_output/dynamic_metadata.yaml @@ -0,0 +1,95 @@ +description: > + Dynamic metadata access — computed keys for reading and writing metadata, + metadata in expressions, and metadata COW with variables. 
+ +tests: + # --- Dynamic metadata write with variable key --- + + - name: "write metadata with variable key" + mapping: | + $key = "source" + output@[$key] = "kafka" + output: {} + output_metadata: {"source": "kafka"} + + - name: "write multiple metadata keys from loop data" + mapping: | + $keys = ["env", "region"] + $vals = ["prod", "us-east"] + output@[$keys[0]] = $vals[0] + output@[$keys[1]] = $vals[1] + output: {} + output_metadata: {"env": "prod", "region": "us-east"} + + # --- Dynamic metadata read with variable key --- + + - name: "read metadata with variable key" + input: {} + input_metadata: {"source": "kafka", "topic": "events"} + mapping: | + $key = "topic" + output.v = input@[$key] + output: {"v": "events"} + + - name: "read missing metadata with variable key returns null" + input: {} + input_metadata: {"source": "kafka"} + mapping: | + $key = "missing" + output.v = input@[$key] + output: {"v": null} + + # --- Dynamic metadata delete --- + + - name: "delete metadata key with variable" + input: {} + input_metadata: {"keep": "yes", "drop": "no"} + mapping: | + output@ = input@ + $remove_key = "drop" + output@[$remove_key] = deleted() + output: {} + output_metadata: {"keep": "yes"} + + # --- Metadata values used in expressions --- + + - name: "metadata value in arithmetic" + input: {} + input_metadata: {"count": 5} + mapping: | + output.doubled = input@.count * 2 + output: {"doubled": 10} + + - name: "metadata value in string concatenation" + input: {} + input_metadata: {"prefix": "hello"} + mapping: | + output.greeting = input@.prefix + " world" + output: {"greeting": "hello world"} + + - name: "metadata value as condition" + input: {} + input_metadata: {"debug": true} + mapping: | + output.level = if input@.debug { "trace" } else { "info" } + output: {"level": "trace"} + + # --- Metadata round-trip through variable --- + + - name: "metadata to variable to output metadata" + input: {} + input_metadata: {"trace_id": "abc-123"} + mapping: | + $trace = 
input@.trace_id + output@.trace_id = $trace + output: {} + output_metadata: {"trace_id": "abc-123"} + + # --- Metadata from computed expression key --- + + - name: "write metadata with expression key" + mapping: | + $prefix = "x" + output@[$prefix + "_header"] = "value" + output: {} + output_metadata: {"x_header": "value"} diff --git a/internal/bloblang2/spec/tests/input_output/input_access.yaml b/internal/bloblang2/spec/tests/input_output/input_access.yaml new file mode 100644 index 000000000..d908e0103 --- /dev/null +++ b/internal/bloblang2/spec/tests/input_output/input_access.yaml @@ -0,0 +1,163 @@ +description: "Input access: reading fields, metadata, various input types, immutability guarantees, non-existent fields return null" + +tests: + # --- Basic input field access --- + + - name: "access top-level input field" + input: {"name": "Alice"} + mapping: | + output.v = input.name + output: {"v": "Alice"} + + - name: "access nested input field" + input: {"user": {"name": "Bob", "age": 30}} + mapping: | + output.name = input.user.name + output.age = input.user.age + output: {"name": "Bob", "age": 30} + + - name: "access deeply nested input field" + input: {"a": {"b": {"c": {"d": "deep"}}}} + mapping: | + output.v = input.a.b.c.d + output: {"v": "deep"} + + - name: "access input array element by index" + input: [10, 20, 30] + mapping: | + output.first = input[0] + output.second = input[1] + output.third = input[2] + output: {"first": 10, "second": 20, "third": 30} + + - name: "access nested array in input object" + input: {"items": ["a", "b", "c"]} + mapping: | + output.v = input.items[1] + output: {"v": "b"} + + - name: "access object inside input array" + input: [{"name": "Alice"}, {"name": "Bob"}] + mapping: | + output.v = input[1].name + output: {"v": "Bob"} + + # --- Input types --- + + - name: "input passthrough for various types" + mapping: | + output.v = input + cases: + - name: "string" + input: "hello world" + output: {"v": "hello world"} + - name: 
"number" + input: 42 + output: {"v": 42} + - name: "float" + input: 3.14 + output: {"v": 3.14} + - name: "boolean" + input: true + output: {"v": true} + - name: "null" + input: null + output: {"v": null} + - name: "empty object" + input: {} + output: {"v": {}} + - name: "empty array" + input: [] + output: {"v": []} + - name: "defaults to null when not specified" + output: {"v": null} + + # --- Non-existent fields return null --- + + - name: "non-existent top-level field returns null" + input: {"name": "Alice"} + mapping: | + output.v = input.missing + output: {"v": null} + + - name: "non-existent nested field returns null" + input: {"user": {}} + mapping: | + output.v = input.user.name + output: {"v": null} + + - name: "deep path through null intermediate is error" + input: {"a": 1} + mapping: | + output.v = input.x.y.z + error: "null" + + - name: "out of bounds array index is error" + input: [1, 2, 3] + mapping: | + output.v = input[10] + error: "out of bounds" + + # --- Input metadata access --- + + - name: "read single input metadata key" + input: {"data": 1} + input_metadata: {"source": "kafka"} + mapping: | + output.v = input@.source + output: {"v": "kafka"} + + - name: "read all input metadata as object" + input: {"data": 1} + input_metadata: {"source": "kafka", "topic": "events"} + mapping: | + output.v = input@ + output: {"v": {"source": "kafka", "topic": "events"}} + + - name: "undefined metadata key returns null" + input: {"data": 1} + input_metadata: {"source": "kafka"} + mapping: | + output.v = input@.missing + output: {"v": null} + + - name: "metadata with no input_metadata is empty object" + input: {"data": 1} + mapping: | + output.v = input@ + output: {"v": {}} + + - name: "metadata value can be any type" + input: {} + input_metadata: {"count": 42, "active": true, "tags": ["a", "b"]} + mapping: | + output.count = input@.count + output.active = input@.active + output.tags = input@.tags + output: {"count": 42, "active": true, "tags": ["a", "b"]} + + - 
name: "nested metadata path access" + input: {} + input_metadata: {"routing": {"region": "us-west", "zone": "a"}} + mapping: | + output.v = input@.routing.region + output: {"v": "us-west"} + + # --- Input immutability --- + + - name: "input is not modified by output assignment from input" + input: {"name": "Alice", "age": 30} + mapping: | + output = input + output.name = "Bob" + output.original = input.name + output: {"name": "Bob", "age": 30, "original": "Alice"} + + - name: "input array is not modified by variable mutation" + input: {"items": [1, 2, 3]} + mapping: | + $copy = input.items + $copy[0] = 99 + output.input_first = input.items[0] + output.copy_first = $copy[0] + output: {"input_first": 1, "copy_first": 99} diff --git a/internal/bloblang2/spec/tests/input_output/metadata.yaml b/internal/bloblang2/spec/tests/input_output/metadata.yaml new file mode 100644 index 000000000..469d7e09e --- /dev/null +++ b/internal/bloblang2/spec/tests/input_output/metadata.yaml @@ -0,0 +1,187 @@ +description: "Metadata: read/write/delete metadata keys, clear all, copy from input, nested paths, type restrictions, input metadata access" + +tests: + # --- Write metadata --- + + - name: "write single metadata key" + mapping: | + output@.source = "kafka" + output: {} + output_metadata: {"source": "kafka"} + + - name: "write multiple metadata keys" + mapping: | + output@.source = "kafka" + output@.topic = "events" + output@.partition = 3 + output: {} + output_metadata: {"source": "kafka", "topic": "events", "partition": 3} + + - name: "metadata value can be any type" + mapping: | + output@.str = "hello" + output@.num = 42 + output@.flag = true + output@.arr = [1, 2, 3] + output@.obj = {"nested": "value"} + output@.nothing = null + output: {} + output_metadata: {"str": "hello", "num": 42, "flag": true, "arr": [1, 2, 3], "obj": {"nested": "value"}, "nothing": null} + + - name: "overwrite metadata key" + mapping: | + output@.key = "first" + output@.key = "second" + output: {} + 
output_metadata: {"key": "second"} + + # --- Nested metadata paths with auto-creation --- + + - name: "nested metadata path auto-creates intermediate objects" + mapping: | + output@.routing.region = "us-west" + output: {} + output_metadata: {"routing": {"region": "us-west"}} + + - name: "deeply nested metadata path" + mapping: | + output@.a.b.c = "deep" + output: {} + output_metadata: {"a": {"b": {"c": "deep"}}} + + - name: "sibling nested metadata paths" + mapping: | + output@.routing.region = "us-west" + output@.routing.zone = "a" + output: {} + output_metadata: {"routing": {"region": "us-west", "zone": "a"}} + + # --- Delete metadata key --- + + - name: "delete metadata key with deleted()" + mapping: | + output@.keep = "yes" + output@.remove = "no" + output@.remove = deleted() + output: {} + output_metadata: {"keep": "yes"} + + - name: "delete non-existent metadata key is no-op" + mapping: | + output@.key = "value" + output@.missing = deleted() + output: {} + output_metadata: {"key": "value"} + + # --- Clear all metadata --- + + - name: "clear all metadata with empty object" + mapping: | + output@.a = 1 + output@.b = 2 + output@ = {} + output: {} + output_metadata: {} + + - name: "clear then set new metadata" + mapping: | + output@.old = "stale" + output@ = {} + output@.fresh = "new" + output: {} + output_metadata: {"fresh": "new"} + + # --- Copy all metadata from input --- + + - name: "copy all metadata from input" + input: {} + input_metadata: {"source": "kafka", "topic": "events"} + mapping: | + output@ = input@ + output: {} + output_metadata: {"source": "kafka", "topic": "events"} + + - name: "copy metadata from input then add more" + input: {} + input_metadata: {"source": "kafka"} + mapping: | + output@ = input@ + output@.extra = "added" + output: {} + output_metadata: {"source": "kafka", "extra": "added"} + + - name: "copy metadata from input then overwrite key" + input: {} + input_metadata: {"source": "kafka"} + mapping: | + output@ = input@ + 
output@.source = "http" + output: {} + output_metadata: {"source": "http"} + + - name: "copy metadata from input is COW" + input: {} + input_metadata: {"key": "original"} + mapping: | + output@ = input@ + output@.key = "modified" + output.input_meta = input@.key + output: {"input_meta": "original"} + output_metadata: {"key": "modified"} + + # --- Type restrictions --- + + - name: "output@ = deleted() is error" + mapping: | + output@ = deleted() + error: "metadata" + + - name: "output@ = string is error" + mapping: | + output@ = "not an object" + error: "metadata" + + - name: "output@ = integer is error" + mapping: | + output@ = 42 + error: "metadata" + + - name: "output@ = array is error" + mapping: | + output@ = [1, 2, 3] + error: "metadata" + + - name: "output@ = boolean is error" + mapping: | + output@ = true + error: "metadata" + + # --- Read input metadata --- + + - name: "read input metadata key" + input: {} + input_metadata: {"source": "kafka"} + mapping: | + output.v = input@.source + output: {"v": "kafka"} + + - name: "read all input metadata" + input: {} + input_metadata: {"a": 1, "b": 2} + mapping: | + output.v = input@ + output: {"v": {"a": 1, "b": 2}} + + - name: "undefined input metadata key returns null" + input: {} + input_metadata: {} + mapping: | + output.v = input@.missing + output: {"v": null} + + - name: "read nested input metadata value" + input: {} + input_metadata: {"config": {"timeout": 30}} + mapping: | + output.v = input@.config.timeout + output: {"v": 30} diff --git a/internal/bloblang2/spec/tests/input_output/output_assignment.yaml b/internal/bloblang2/spec/tests/input_output/output_assignment.yaml new file mode 100644 index 000000000..c02aaf6c0 --- /dev/null +++ b/internal/bloblang2/spec/tests/input_output/output_assignment.yaml @@ -0,0 +1,159 @@ +description: "Output assignment: building output incrementally, auto-creation of intermediate objects/arrays, collision errors, gap filling, sequential references" + +tests: + # --- Basic 
incremental building --- + + - name: "assign single field to output" + mapping: | + output.name = "Alice" + output: {"name": "Alice"} + + - name: "assign multiple fields to output" + mapping: | + output.name = "Alice" + output.age = 30 + output.active = true + output: {"name": "Alice", "age": 30, "active": true} + + - name: "output starts as empty object" + mapping: | + output.v = output + output: {"v": {}} + + # --- Auto-creation of intermediate objects --- + + - name: "auto-create nested object" + mapping: | + output.user.name = "Alice" + output: {"user": {"name": "Alice"}} + + - name: "auto-create deeply nested object" + mapping: | + output.user.address.city = "London" + output: {"user": {"address": {"city": "London"}}} + + - name: "auto-create very deeply nested object" + mapping: | + output.a.b.c.d.e = 42 + output: {"a": {"b": {"c": {"d": {"e": 42}}}}} + + - name: "auto-create and add sibling fields" + mapping: | + output.user.name = "Alice" + output.user.age = 30 + output: {"user": {"name": "Alice", "age": 30}} + + # --- Auto-creation with array index --- + + - name: "auto-create array with index zero" + mapping: | + output.items[0] = "first" + output: {"items": ["first"]} + + - name: "auto-create array with object elements" + mapping: | + output.items[0].name = "first" + output: {"items": [{"name": "first"}]} + + - name: "auto-create array then add more elements" + mapping: | + output.items[0] = "a" + output.items[1] = "b" + output.items[2] = "c" + output: {"items": ["a", "b", "c"]} + + # --- Dynamic index: string creates object, int creates array --- + + - name: "dynamic string index creates object field" + mapping: | + $key = "name" + output.data[$key] = "Alice" + output: {"data": {"name": "Alice"}} + + - name: "dynamic int index creates array element" + mapping: | + $idx = 0 + output.data[$idx] = "first" + output: {"data": ["first"]} + + # --- Array gap filling --- + + - name: "gap filling with null" + mapping: | + output.items[2] = "x" + output: 
{"items": [null, null, "x"]} + + - name: "gap filling after existing elements" + mapping: | + output.items[0] = "a" + output.items[3] = "d" + output: {"items": ["a", null, null, "d"]} + + - name: "gap filling with zero-based first element then gap" + mapping: | + output.arr[0] = 10 + output.arr[5] = 50 + output: {"arr": [10, null, null, null, null, 50]} + + # --- Sequential references to earlier output --- + + - name: "reference previously assigned output field" + mapping: | + output.x = 10 + output.y = output.x + 5 + output: {"x": 10, "y": 15} + + - name: "reference nested output field" + mapping: | + output.user.name = "Alice" + output.greeting = "Hello, " + output.user.name + output: {"user": {"name": "Alice"}, "greeting": "Hello, Alice"} + + - name: "reference output array element" + mapping: | + output.items[0] = 100 + output.items[1] = output.items[0] * 2 + output: {"items": [100, 200]} + + - name: "overwrite previously assigned field" + mapping: | + output.status = "pending" + output.status = "done" + output: {"status": "done"} + + # --- Collision errors --- + + - name: "collision: field access on string value" + mapping: | + output.user = "Alice" + output.user.name = "Alice" + error: "field" + + - name: "collision: field access on integer value" + mapping: | + output.count = 42 + output.count.value = 42 + error: "field" + + - name: "collision: field access on boolean value" + mapping: | + output.flag = true + output.flag.sub = false + error: "field" + + - name: "collision: index access on string value" + mapping: | + output.data = "hello" + output.data[0] = "H" + error: "index" + + # --- Mixed nesting --- + + - name: "build complex nested structure incrementally" + mapping: | + output.users[0].name = "Alice" + output.users[0].age = 30 + output.users[1].name = "Bob" + output.users[1].age = 25 + output.count = 2 + output: {"users": [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}], "count": 2} diff --git 
a/internal/bloblang2/spec/tests/input_output/output_root.yaml b/internal/bloblang2/spec/tests/input_output/output_root.yaml new file mode 100644 index 000000000..18754a1db --- /dev/null +++ b/internal/bloblang2/spec/tests/input_output/output_root.yaml @@ -0,0 +1,145 @@ +description: "Output root assignment: output = expr replaces entire output, any type, COW from input, continued building after root assignment" + +tests: + # --- Root assignment replaces entire output --- + + - name: "root assignment replaces empty output with string" + mapping: | + output = "hello" + output: "hello" + + - name: "root assignment replaces empty output with integer" + mapping: | + output = 42 + output: 42 + + - name: "root assignment replaces empty output with float" + mapping: | + output = 3.14 + output: 3.14 + + - name: "root assignment replaces empty output with boolean" + mapping: | + output = true + output: true + + - name: "root assignment replaces empty output with null" + mapping: | + output = null + output: null + + - name: "root assignment replaces empty output with array" + mapping: | + output = [1, 2, 3] + output: [1, 2, 3] + + - name: "root assignment replaces empty output with object" + mapping: | + output = {"name": "Alice", "age": 30} + output: {"name": "Alice", "age": 30} + + # --- Previous assignments discarded --- + + - name: "root assignment discards previous field assignments" + mapping: | + output.name = "Alice" + output.age = 30 + output = "replaced" + output: "replaced" + + - name: "root assignment discards complex previous structure" + mapping: | + output.user.name = "Alice" + output.user.address.city = "London" + output.items[0] = "a" + output = 99 + output: 99 + + - name: "multiple root assignments keep last one" + mapping: | + output = "first" + output = "second" + output = "third" + output: "third" + + # --- COW from input --- + + - name: "output = input copies various types" + mapping: | + output = input + cases: + - name: "object" + input: {"name": 
"Alice", "age": 30} + output: {"name": "Alice", "age": 30} + - name: "array" + input: [1, 2, 3] + output: [1, 2, 3] + - name: "string" + input: "hello" + output: "hello" + - name: "null" + input: null + output: null + + - name: "output = input is logical copy with COW" + input: {"name": "Alice", "score": 100} + mapping: | + output = input + output.name = "Bob" + output.original = input.name + output: {"name": "Bob", "score": 100, "original": "Alice"} + + # --- Continued building after root assignment --- + + - name: "continue building after root assignment to object" + mapping: | + output = {"name": "Alice"} + output.age = 30 + output: {"name": "Alice", "age": 30} + + - name: "continue building nested after root assignment" + mapping: | + output = {} + output.user.name = "Alice" + output.user.age = 30 + output: {"user": {"name": "Alice", "age": 30}} + + - name: "root assign from input then extend" + input: {"name": "Alice"} + mapping: | + output = input + output.greeting = "Hello, " + input.name + output: {"name": "Alice", "greeting": "Hello, Alice"} + + - name: "root assign object then overwrite field" + mapping: | + output = {"status": "pending", "count": 0} + output.status = "done" + output.count = 1 + output: {"status": "done", "count": 1} + + # --- Root assignment to non-object then field access is error --- + + - name: "field assignment after root assign to string is error" + mapping: | + output = "hello" + output.field = "x" + error: "field" + + - name: "field assignment after root assign to integer is error" + mapping: | + output = 42 + output.field = "x" + error: "field" + + - name: "field assignment after root assign to array is error" + mapping: | + output = [1, 2, 3] + output.field = "x" + error: "field" + + - name: "index assignment after root assign to array works" + mapping: | + output = [10, 20, 30] + output[1] = 99 + output: [10, 99, 30] diff --git a/internal/bloblang2/spec/tests/lambdas/basic.yaml b/internal/bloblang2/spec/tests/lambdas/basic.yaml 
new file mode 100644 index 000000000..133039ebe --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/basic.yaml @@ -0,0 +1,151 @@ +description: "Lambda expressions — single/multi param, block bodies, nested expressions, higher-order methods" + +tests: + # --- Single parameter lambdas --- + + - name: "single param lambda in map" + mapping: | + output.result = [1, 2, 3].map(x -> x * 2) + output: {"result": [2, 4, 6]} + + - name: "single param lambda in filter" + mapping: | + output.result = [1, 2, 3, 4, 5].filter(x -> x > 3) + output: {"result": [4, 5]} + + - name: "single param lambda in sort_by" + mapping: | + output.result = [{"n": 3}, {"n": 1}, {"n": 2}].sort_by(x -> x.n) + output: {"result": [{"n": 1}, {"n": 2}, {"n": 3}]} + + - name: "single param lambda with string method" + mapping: | + output.result = ["hello", "world"].map(s -> s.uppercase()) + output: {"result": ["HELLO", "WORLD"]} + + - name: "single param lambda accessing nested fields" + mapping: | + $items = [{"price": 10, "qty": 2}, {"price": 5, "qty": 4}] + output.totals = $items.map(item -> item.price * item.qty) + output: {"totals": [20, 20]} + + # --- Multi parameter lambdas --- + + - name: "two param lambda in fold" + mapping: | + output.sum = [1, 2, 3, 4].fold(0, (tally, x) -> tally + x) + output: {"sum": 10} + + - name: "two param lambda in map_entries" + mapping: | + output.result = {"a": 1, "b": 2}.map_entries((k, v) -> {"key": k.uppercase(), "value": v * 10}) + output: {"result": {"A": 10, "B": 20}} + + - name: "two param lambda in filter_entries" + mapping: | + output.result = {"a": 1, "b": 5, "c": 3}.filter_entries((k, v) -> v > 2) + output: {"result": {"b": 5, "c": 3}} + + - name: "fold with string accumulator" + mapping: | + output.result = ["a", "b", "c"].fold("", (acc, x) -> acc + x) + output: {"result": "abc"} + + # --- Block body lambdas --- + + - name: "block body with variable declarations" + mapping: | + $items = [{"price": 10, "qty": 3}, {"price": 20, "qty": 1}] + 
output.result = $items.map(item -> { + $base = item.price * item.qty + $tax = $base * 0.1 + $base + $tax + }) + output: {"result": [33.0, 22.0]} + + - name: "block body with multiple variables" + mapping: | + output.result = [1, 2, 3].map(x -> { + $doubled = x * 2 + $tripled = x * 3 + $doubled + $tripled + }) + output: {"result": [5, 10, 15]} + + - name: "block body must end with expression" + mapping: | + output.result = [1].map(x -> { + $y = x + 1 + }) + compile_error: "expression" + + - name: "block body with conditional expression" + mapping: | + output.result = [1, 2, 3, 4].map(x -> { + $label = if x > 2 { "big" } else { "small" } + $label + ":" + x.string() + }) + output: {"result": ["small:1", "small:2", "big:3", "big:4"]} + + # --- Nested lambdas --- + + - name: "nested lambda — map inside map" + mapping: | + $matrix = [[1, 2], [3, 4]] + output.result = $matrix.map(row -> row.map(x -> x * 10)) + output: {"result": [[10, 20], [30, 40]]} + + - name: "filter inside map" + mapping: | + $groups = [[1, 2, 3], [4, 5, 6]] + output.result = $groups.map(g -> g.filter(x -> x % 2 == 0)) + output: {"result": [[2], [4, 6]]} + + # --- Passing map names to higher-order methods --- + + - name: "pass map name directly to .map()" + mapping: | + map double(x) { x * 2 } + output.result = [1, 2, 3].map(double) + output: {"result": [2, 4, 6]} + + - name: "pass map name directly to .filter()" + mapping: | + map is_positive(x) { x > 0 } + output.result = [-1, 2, -3, 4].filter(is_positive) + output: {"result": [2, 4]} + + - name: "pass map name directly to .sort_by()" + mapping: | + map neg(x) { -x } + output.result = [3, 1, 2].sort_by(neg) + output: {"result": [3, 2, 1]} + + # --- Lambda is not a value --- + + - name: "cannot store lambda in variable" + mapping: | + $fn = x -> x * 2 + compile_error: "lambda" + + - name: "cannot assign lambda to output" + mapping: | + output.fn = x -> x * 2 + compile_error: "lambda" + + # --- Parameter is read-only --- + + - name: "lambda parameter 
cannot be assigned to" + mapping: | + output.result = [1].map(x -> { + x = 10 + x + }) + compile_error: "assign" + + # --- map_values with single param lambda --- + + - name: "map_values with lambda" + mapping: | + output.result = {"x": 1, "y": 2}.map_values(v -> v + 100) + output: {"result": {"x": 101, "y": 102}} diff --git a/internal/bloblang2/spec/tests/lambdas/complex_iterators.yaml b/internal/bloblang2/spec/tests/lambdas/complex_iterators.yaml new file mode 100644 index 000000000..910b6ae52 --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/complex_iterators.yaml @@ -0,0 +1,132 @@ +description: > + Complex iterator patterns — nested iterator chains, lambdas with control + flow, and iterators operating on map call results. + +tests: + # --- Nested iterator chains --- + + - name: "map then filter chain" + mapping: | + output.v = [1, 2, 3, 4, 5] + .map(x -> x * x) + .filter(x -> x > 5) + output: {"v": [9, 16, 25]} + + - name: "filter then map then sort" + mapping: | + output.v = [5, 1, 4, 2, 3] + .filter(x -> x > 2) + .map(x -> x * 10) + .sort() + output: {"v": [30, 40, 50]} + + - name: "nested map produces 2D array" + mapping: | + output.v = [1, 2].map(x -> [10, 20].map(y -> x + y)) + output: {"v": [[11, 21], [12, 22]]} + + - name: "flat map via map + flatten" + mapping: | + output.v = [[1, 2], [3, 4]].map(arr -> arr.map(x -> x * 10)).flatten() + output: {"v": [10, 20, 30, 40]} + + # --- Lambda with control flow --- + + - name: "map with if expression in lambda" + mapping: | + output.v = [1, 2, 3, 4].map(x -> + if x > 2 { "big" } else { "small" } + ) + output: {"v": ["small", "small", "big", "big"]} + + - name: "filter with match expression in lambda" + mapping: | + output.v = ["apple", "banana", "avocado", "cherry"].filter(s -> + match s.slice(0, 1) { + "a" => true, + _ => false, + } + ) + output: {"v": ["apple", "avocado"]} + + - name: "map with block body and local variables" + mapping: | + output.v = [1, 2, 3].map(x -> { + $doubled = x * 2 + $label = 
"item_" + $doubled.string() + {"value": $doubled, "label": $label} + }) + output: + v: + - value: 2 + label: "item_2" + - value: 4 + label: "item_4" + - value: 6 + label: "item_6" + + # --- Iterators on map call results --- + + - name: "map call result piped to iterator" + mapping: | + map get_items(data) { data.items } + output.v = get_items(input).map(x -> x * 2) + input: {"items": [1, 2, 3]} + output: {"v": [2, 4, 6]} + + - name: "chained map calls with iterators" + mapping: | + map extract(data) { data.values } + map sum(arr) { arr.fold(0, (acc, x) -> acc + x) } + output.v = sum(extract(input)) + input: {"values": [10, 20, 30]} + output: {"v": 60} + + # --- Fold with complex accumulator --- + + - name: "fold concatenates strings" + mapping: | + output.v = ["a", "b", "c"].fold("", (acc, s) -> + if acc == "" { s } else { acc + "," + s } + ) + output: {"v": "a,b,c"} + + - name: "fold builds object from array" + mapping: | + output.v = [ + {"k": "a", "v": 1}, + {"k": "b", "v": 2}, + ].fold({}, (acc, item) -> { + $acc = acc + $acc[item.k] = item.v + $acc + }) + output: {"v": {"a": 1, "b": 2}} + + - name: "fold with string accumulator" + mapping: | + output.v = ["hello", "world"].fold("", (acc, s) -> + if acc == "" { s } else { acc + " " + s } + ) + output: {"v": "hello world"} + + # --- any/all with complex predicates --- + + - name: "any with method chain in predicate" + mapping: | + $items = [ + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 17}, + ] + output.has_minor = $items.any(p -> p.age < 18) + output: {"has_minor": true} + + - name: "all with outer variable in predicate" + mapping: | + $min_age = 18 + $items = [ + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 30}, + ] + output.all_adult = $items.all(p -> p.age >= $min_age) + output: {"all_adult": true} diff --git a/internal/bloblang2/spec/tests/lambdas/defaults.yaml b/internal/bloblang2/spec/tests/lambdas/defaults.yaml new file mode 100644 index 000000000..d33fc4185 --- /dev/null +++ 
b/internal/bloblang2/spec/tests/lambdas/defaults.yaml @@ -0,0 +1,83 @@ +description: "Default parameter values in lambda expressions" + +tests: + # --- Basic defaults --- + + - name: "single param with default — value provided" + mapping: | + output.result = [1, 2, 3].fold(0, (tally, x = 10) -> tally + x) + output: {"result": 6} + + - name: "lambda default integer literal" + mapping: | + output.result = [1, 2, 3].map((x, y = 10) -> x + y) + output: {"result": [11, 12, 13]} + + - name: "lambda default string literal" + mapping: | + output.result = {"a": 1, "b": 2}.map_entries((k, v, suffix = "_key") -> {"key": k + suffix, "value": v}) + output: {"result": {"a_key": 1, "b_key": 2}} + + - name: "lambda default boolean literal" + mapping: | + output.result = [1, -2, 3].map((x, negate = false) -> if negate { -x } else { x }) + output: {"result": [1, -2, 3]} + + - name: "lambda default null literal" + mapping: | + output.result = [1, 2, 3].map((x, extra = null) -> x + extra.or(0)) + output: {"result": [1, 2, 3]} + + - name: "lambda default float literal" + mapping: | + output.result = [10, 20].map((x, rate = 0.1) -> x * rate) + output: {"result": [1.0, 2.0]} + + # --- Positional omission --- + + - name: "trailing default params omitted" + mapping: | + output.result = [5, 10].map((x, multiplier = 2, offset = 0) -> x * multiplier + offset) + output: {"result": [10, 20]} + + # --- Defaults must come after required --- + + - name: "default before required is compile error" + mapping: | + output.result = [1].map((x = 0, y) -> x + y) + compile_error: "default" + + # --- Default values must be literals --- + + - name: "default value expression is compile error" + mapping: | + output.result = [1].map((x, y = 1 + 2) -> x + y) + compile_error: "literal" + + - name: "default value variable reference is compile error" + mapping: | + $val = 5 + output.result = [1].map((x, y = $val) -> x + y) + compile_error: "literal" + + - name: "default value function call is compile error" + 
mapping: | + output.result = [1].map((x, y = uuid_v4()) -> x) + compile_error: "literal" + + # --- Multiple defaults --- + + - name: "multiple default params all using defaults" + mapping: | + output.result = [100].map((x, tax_rate = 0.1, discount = 0) -> { + $subtotal = x - discount + $subtotal + $subtotal * tax_rate + }) + output: {"result": [110.0]} + + # --- Discard cannot have default --- + + - name: "discard param with default is compile error" + mapping: | + output.result = [1].map((_ = 0, x) -> x) + compile_error: "discard" diff --git a/internal/bloblang2/spec/tests/lambdas/discard_params.yaml b/internal/bloblang2/spec/tests/lambdas/discard_params.yaml new file mode 100644 index 000000000..7d553cc6a --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/discard_params.yaml @@ -0,0 +1,82 @@ +description: "Discard parameters (_) in lambda expressions" + +tests: + # --- Basic discard --- + + - name: "discard key in map_entries" + mapping: | + output.result = {"a": 1}.map_entries((_, v) -> {"key": "x", "value": v * 2}) + output: {"result": {"x": 2}} + + - name: "discard value in map_entries" + mapping: | + output.result = {"a": 1, "b": 2}.map_entries((k, _) -> {"key": k.uppercase(), "value": 0}) + output: {"result": {"A": 0, "B": 0}} + + - name: "discard accumulator in fold" + mapping: | + output.result = [10, 20, 30].fold(0, (_, x) -> x) + output: {"result": 30} + + - name: "discard element in fold" + mapping: | + output.result = [10, 20, 30].fold(0, (acc, _) -> acc + 1) + output: {"result": 3} + + - name: "discard in filter_entries — use value only" + mapping: | + output.result = {"a": 1, "b": 5, "c": 3}.filter_entries((_, v) -> v > 2) + output: {"result": {"b": 5, "c": 3}} + + - name: "discard in filter_entries — use key only" + mapping: | + output.result = {"aa": 1, "b": 5, "cc": 3}.filter_entries((k, _) -> k.length() > 1) + output: {"result": {"aa": 1, "cc": 3}} + + # --- Multiple discards --- + + - name: "both params discarded returns constant" + 
mapping: | + output.result = {"a": 1, "b": 2}.map_entries((_, _) -> {"key": "z", "value": 99}) + output: {"result": {"z": 99}} + + - name: "both params discarded in fold" + mapping: | + output.result = [1, 2, 3].fold(0, (_, _) -> 42) + output: {"result": 42} + + # --- Referencing _ in body is compile error --- + + - name: "referencing discarded param is compile error" + mapping: | + output.result = {"a": 1}.map_entries((_, v) -> {"key": _, "value": v}) + compile_error: "_" + + - name: "referencing _ when both discarded is compile error" + mapping: | + output.result = [1, 2].fold(0, (_, _) -> _) + compile_error: "_" + + - name: "referencing _ in single param discard" + mapping: | + output.result = [1, 2].map(_ -> _ * 2) + compile_error: "_" + + # --- Discard with non-discard params --- + + - name: "discard first keep second in two-param lambda" + mapping: | + output.result = [10, 20, 30].fold("start", (_, elem) -> elem.string()) + output: {"result": "30"} + + - name: "keep first discard second in two-param lambda" + mapping: | + output.result = [10, 20, 30].fold(0, (acc, _) -> acc + 100) + output: {"result": 300} + + # --- Discard in map (single param) --- + + - name: "discard single param in map returns constant array" + mapping: | + output.result = [1, 2, 3].map(_ -> "x") + output: {"result": ["x", "x", "x"]} diff --git a/internal/bloblang2/spec/tests/lambdas/fold_patterns.yaml b/internal/bloblang2/spec/tests/lambdas/fold_patterns.yaml new file mode 100644 index 000000000..6ae8373ae --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/fold_patterns.yaml @@ -0,0 +1,114 @@ +description: > + Fold patterns that stress variable slot management — building objects + from arrays, nested folds, fold calling maps, fold with conditional + accumulation, and fold combined with other iterators. 
+ +tests: + # --- Build object from key-value pairs --- + + - name: "fold builds object from pairs array" + mapping: | + output.v = [ + {"k": "name", "v": "Alice"}, + {"k": "age", "v": 30}, + ].fold({}, (acc, pair) -> { + $a = acc + $a[pair.k] = pair.v + $a + }) + output: {"v": {"name": "Alice", "age": 30}} + + - name: "fold builds object with computed keys" + mapping: | + output.v = ["x", "y", "z"].enumerate().fold({}, (acc, e) -> { + $a = acc + $a["key_" + e.value] = e.index + 1 + $a + }) + output: {"v": {"key_x": 1, "key_y": 2, "key_z": 3}} + + # --- Fold with conditional accumulation --- + + - name: "fold conditionally adds to accumulator" + mapping: | + output.v = [1, -2, 3, -4, 5].fold([], (acc, x) -> + if x > 0 { acc.concat([x]) } else { acc } + ) + output: {"v": [1, 3, 5]} + + - name: "fold with match in body" + mapping: | + output.v = ["apple", "banana", "avocado", "cherry"].fold( + {"a": [], "other": []}, + (acc, s) -> { + $a = acc + $bucket = match s.slice(0, 1) { "a" => "a", _ => "other" } + $a[$bucket] = $a[$bucket].concat([s]) + $a + } + ) + output: {"v": {"a": ["apple", "avocado"], "other": ["banana", "cherry"]}} + + # --- Nested fold --- + + - name: "nested fold — outer builds rows, inner sums columns" + mapping: | + $matrix = [[1, 2, 3], [4, 5, 6]] + output.v = $matrix.fold([], (rows, row) -> { + $total = row.fold(0, (sum, x) -> sum + x) + rows.concat([$total]) + }) + output: {"v": [6, 15]} + + - name: "fold inside map inside fold" + mapping: | + $groups = [ + {"name": "a", "items": [1, 2]}, + {"name": "b", "items": [3, 4, 5]}, + ] + output.v = $groups.fold({}, (acc, g) -> { + $a = acc + $a[g.name] = g.items.fold(0, (sum, x) -> sum + x) + $a + }) + output: {"v": {"a": 3, "b": 12}} + + # --- Fold calling user maps --- + + - name: "fold body calls user map" + mapping: | + map transform(val) { val * val } + output.v = [1, 2, 3, 4].fold(0, (acc, x) -> acc + transform(x)) + output: {"v": 30} + + - name: "fold body calls user map that returns object" + 
mapping: | + map make_entry(key, val) { + {"key": key, "value": val} + } + output.v = ["a", "b", "c"].enumerate().fold({}, (acc, e) -> { + $entry = make_entry(e.value, e.index) + $a = acc + $a[$entry.key] = $entry.value + $a + }) + output: {"v": {"a": 0, "b": 1, "c": 2}} + + # --- Fold preserves accumulator across iterations --- + + - name: "fold accumulator carries forward correctly" + mapping: | + output.v = [10, 20, 30].fold({"sum": 0, "count": 0}, (acc, x) -> { + $a = acc + $a.sum = $a.sum + x + $a.count = $a.count + 1 + $a + }) + output: {"v": {"sum": 60, "count": 3}} + + - name: "fold with string accumulator and separator" + mapping: | + output.v = ["a", "b", "c"].fold("", (acc, s) -> + if acc == "" { s } else { acc + ", " + s } + ) + output: {"v": "a, b, c"} diff --git a/internal/bloblang2/spec/tests/lambdas/outer_capture.yaml b/internal/bloblang2/spec/tests/lambdas/outer_capture.yaml new file mode 100644 index 000000000..b53c00e5d --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/outer_capture.yaml @@ -0,0 +1,97 @@ +description: > + Lambda capture of outer variables — lambdas reading variables from enclosing + scopes, including nested lambdas and lambdas inside maps. 
+ +tests: + # --- Simple outer variable capture --- + + - name: "lambda reads single outer variable" + mapping: | + $factor = 10 + output.v = [1, 2, 3].map(x -> x * $factor) + output: {"v": [10, 20, 30]} + + - name: "lambda reads multiple outer variables" + mapping: | + $offset = 100 + $scale = 3 + output.v = [1, 2].map(x -> x * $scale + $offset) + output: {"v": [103, 106]} + + - name: "lambda reads outer variable in filter condition" + mapping: | + $threshold = 5 + output.v = [1, 3, 7, 9, 2].filter(x -> x > $threshold) + output: {"v": [7, 9]} + + - name: "lambda reads outer variable in fold" + mapping: | + $base = 100 + output.v = [1, 2, 3].fold($base, (acc, x) -> acc + x) + output: {"v": 106} + + # --- Nested lambda capture --- + + - name: "inner lambda captures outer lambda parameter" + mapping: | + output.v = [1, 2].map(x -> [10, 20].map(y -> x * 100 + y)) + output: {"v": [[110, 120], [210, 220]]} + + - name: "inner lambda captures top-level variable through outer lambda" + mapping: | + $prefix = "item" + output.v = [1, 2].map(x -> $prefix + "_" + x.string()) + output: {"v": ["item_1", "item_2"]} + + - name: "triple nested lambda captures all levels" + mapping: | + $base = 1000 + output.v = [1].map(a -> [2].map(b -> [3].map(c -> $base + a * 100 + b * 10 + c))) + output: {"v": [[[1123]]]} + + # --- Capture with shadowing --- + + - name: "lambda parameter does not modify outer variable" + mapping: | + $x = "outer" + $result = ["inner"].map(x -> x + "_mapped") + output.outer = $x + output.mapped = $result + output: + outer: "outer" + mapped: ["inner_mapped"] + + - name: "lambda local variable does not modify outer variable" + mapping: | + $x = "outer" + $result = [1].map(n -> { + $x = "shadowed" + $x + "_" + n.string() + }) + output.outer = $x + output.result = $result + output: + outer: "outer" + result: ["shadowed_1"] + + # --- Capture across iterator chains --- + + - name: "chained iterators both capture outer variable" + mapping: | + $min = 2 + $scale = 10 + 
output.v = [1, 2, 3, 4, 5] + .filter(x -> x >= $min) + .map(x -> x * $scale) + output: {"v": [20, 30, 40, 50]} + + - name: "sort_by with outer variable in key function" + mapping: | + $field = "priority" + $items = [ + {"name": "c", "priority": 3}, + {"name": "a", "priority": 1}, + {"name": "b", "priority": 2}, + ] + output.v = $items.sort_by(x -> x[$field]).map(x -> x.name) + output: {"v": ["a", "b", "c"]} diff --git a/internal/bloblang2/spec/tests/lambdas/position_restriction.yaml b/internal/bloblang2/spec/tests/lambdas/position_restriction.yaml new file mode 100644 index 000000000..570cf040d --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/position_restriction.yaml @@ -0,0 +1,128 @@ +description: "Lambda position restriction — lambdas only appear as call arguments, never as general expressions (Sections 3.4 and 10)" + +tests: + # --- Parse errors: lambda in non-argument positions --- + + - name: "lambda as variable declaration RHS is a parse/compile error" + mapping: | + $fn = x -> x * 2 + output.v = 1 + compile_error: "lambda" + + - name: "lambda as output assignment RHS is a parse/compile error" + mapping: | + output.fn = x -> x * 2 + compile_error: "lambda" + + - name: "lambda in array literal is a parse/compile error" + mapping: | + output.xs = [1, x -> x, 3] + compile_error: "lambda" + + - name: "lambda as object value is a parse/compile error" + mapping: | + output.obj = {"a": 1, "b": x -> x} + compile_error: "lambda" + + - name: "lambda as operator operand is a parse/compile error" + mapping: | + output.v = 5 + x -> x * 2 + compile_error: "lambda" + + - name: "lambda as unary operand is a parse/compile error" + mapping: | + output.v = !(x -> x) + compile_error: "lambda" + + - name: "lambda in paren_expr is a parse/compile error" + mapping: | + output.fn = (x -> x * 2) + compile_error: "lambda" + + - name: "lambda in if branch is a parse/compile error" + mapping: | + output.fn = if true { x -> x } else { x -> x } + compile_error: "lambda" + + - 
name: "lambda in match arm is a parse/compile error" + mapping: | + output.fn = match input.k { + "a" => x -> x, + _ => x -> x, + } + input: {"k": "a"} + compile_error: "lambda" + + # --- Semantic errors: lambda passed to a callee that doesn't accept one --- + + - name: "lambda passed to .or() is a compile error" + mapping: | + output.v = input.name.or(x -> x) + input: {"name": null} + compile_error: "lambda" + + - name: "lambda passed to a user map is a compile error" + mapping: | + map wrap(v) { v + 1 } + output.v = wrap(x -> x) + compile_error: "lambda" + + - name: "lambda passed to throw() is a compile error" + mapping: | + output.v = throw(x -> x) + compile_error: "lambda" + + # --- Per-position: a lambda is rejected at a non-lambda param position + # even when the method accepts one elsewhere in its signature. --- + + - name: "lambda at fold() initial position is a compile error" + mapping: | + output.v = [1, 2, 3].fold(x -> x, 0) + compile_error: "lambda" + + - name: "lambda at fold() initial by name is a compile error" + mapping: | + output.v = [1, 2, 3].fold(initial: x -> x, fn: (a, b) -> a + b) + compile_error: "lambda" + + - name: "lambda at slice() low position is a compile error" + mapping: | + output.v = [1, 2, 3].slice(x -> x, 2) + compile_error: "lambda" + + - name: "valid lambda at fold() fn position (value at initial) still works" + mapping: | + output.total = [1, 2, 3, 4].fold(0, (tally, x) -> tally + x) + output: {"total": 10} + + # --- Positive cases: lambdas in legal argument positions --- + + - name: "lambda as positional method argument is valid" + mapping: | + output.xs = [1, 2, 3].map(x -> x * 2) + output: {"xs": [2, 4, 6]} + + - name: "lambda as named method argument is valid" + mapping: | + output.xs = [1, 2, 3].map(fn: x -> x * 2) + output: {"xs": [2, 4, 6]} + + - name: "multi-line lambda block as method argument is valid" + mapping: | + output.xs = [1, 2, 3].map(x -> { + $y = x * 10 + $y + 1 + }) + output: {"xs": [11, 21, 31]} + + - 
name: "multi-param lambda as method argument is valid" + mapping: | + output.total = [1, 2, 3, 4].fold(0, (tally, x) -> tally + x) + output: {"total": 10} + + # --- Distinguishing lambda params from paren_expr at parser level --- + + - name: "parens around a single param for lambda are valid" + mapping: | + output.xs = [1, 2].map((x) -> x + 1) + output: {"xs": [2, 3]} diff --git a/internal/bloblang2/spec/tests/lambdas/purity.yaml b/internal/bloblang2/spec/tests/lambdas/purity.yaml new file mode 100644 index 000000000..74ffbb64e --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/purity.yaml @@ -0,0 +1,141 @@ +description: "Lambda purity — no output assignment, variable shadowing, context inheritance" + +tests: + # --- Cannot assign to output from lambda --- + + - name: "lambda cannot assign to output field" + mapping: | + output.result = [1, 2, 3].map(x -> { + output.side = x + x * 2 + }) + compile_error: "output" + + - name: "lambda cannot assign to output root" + mapping: | + output.result = [1].map(x -> { + output = x + x + }) + compile_error: "output" + + - name: "lambda cannot assign to output metadata" + mapping: | + output.result = [1].map(x -> { + output@.key = x.string() + x + }) + compile_error: "output" + + - name: "nested lambda cannot assign to output" + mapping: | + output.result = [[1, 2]].map(row -> row.map(x -> { + output.inner = x + x + })) + compile_error: "output" + + # --- Variable shadowing (not mutation) in expression context --- + + - name: "lambda shadows outer variable" + mapping: | + $x = 100 + output.inner = [1].map(item -> { + $x = 999 + $x + }) + output.outer = $x + output: {"inner": [999], "outer": 100} + + - name: "lambda shadows — outer unchanged after map" + mapping: | + $count = 0 + output.mapped = [1, 2, 3].map(x -> { + $count = $count + x + $count + }) + output.count = $count + output: {"mapped": [1, 2, 3], "count": 0} + + - name: "fold lambda shadows outer variable" + mapping: | + $total = "untouched" + output.sum = [1, 
2, 3].fold(0, (acc, x) -> { + $total = "modified" + acc + x + }) + output.total = $total + output: {"sum": 6, "total": "untouched"} + + - name: "block body variable does not leak" + mapping: | + output.result = [1].map(x -> { + $temp = x * 10 + $temp + }) + output.leaked = $temp + compile_error: "temp" + + # --- Context inheritance: top-level lambda reads input and output --- + + - name: "top-level lambda reads input" + input: {"multiplier": 10} + mapping: | + output.result = [1, 2, 3].map(x -> x * input.multiplier) + output: {"result": [10, 20, 30]} + + - name: "top-level lambda reads output" + mapping: | + output.base = 100 + output.result = [1, 2, 3].map(x -> x + output.base) + output: {"base": 100, "result": [101, 102, 103]} + + - name: "top-level lambda reads input metadata" + input: null + input_metadata: {"scale": "5"} + mapping: | + output.result = [1, 2].map(x -> x.string() + input@.scale) + output: {"result": ["15", "25"]} + + - name: "top-level lambda reads outer variable" + mapping: | + $factor = 3 + output.result = [10, 20].map(x -> x * $factor) + output: {"result": [30, 60]} + + # --- Lambda inside map cannot access input --- + + - name: "lambda inside map cannot read input" + input: {"value": 42} + mapping: | + map transform(items) { + items.map(x -> x * input.value) + } + output.result = transform([1, 2]) + compile_error: "input" + + - name: "lambda inside map cannot read output" + mapping: | + output.base = 10 + map transform(items) { + items.map(x -> x + output.base) + } + output.result = transform([1, 2]) + compile_error: "output" + + - name: "lambda inside map can read map parameter" + mapping: | + map scale(items, factor) { + items.map(x -> x * factor) + } + output.result = scale([1, 2, 3], 5) + output: {"result": [5, 10, 15]} + + - name: "lambda inside map can read map local variable" + mapping: | + map transform(items) { + $offset = 100 + items.map(x -> x + $offset) + } + output.result = transform([1, 2, 3]) + output: {"result": [101, 102, 
103]} diff --git a/internal/bloblang2/spec/tests/lambdas/return_values.yaml b/internal/bloblang2/spec/tests/lambdas/return_values.yaml new file mode 100644 index 000000000..a2547dbe4 --- /dev/null +++ b/internal/bloblang2/spec/tests/lambdas/return_values.yaml @@ -0,0 +1,143 @@ +description: "Lambda return values — void errors, deleted() omission, filter boolean requirement, catch handler semantics" + +tests: + # --- Void in .map() is error --- + + - name: "map lambda returning void is error" + mapping: | + output.result = [1, 2, 3].map(x -> if x > 10 { x }) + error: "void" + + - name: "map lambda void from match is error" + mapping: | + output.result = [1, 2].map(x -> match x { 99 => "found" }) + error: "void" + + # --- Void in .filter() is error --- + + - name: "filter lambda returning void is error" + mapping: | + output.result = [1, 2, 3].filter(x -> if x > 10 { true }) + error: "void" + + # --- Filter requires boolean --- + + - name: "filter lambda returning non-boolean is error" + mapping: | + output.result = [1, 2, 3].filter(x -> x * 2) + error: "bool" + + - name: "filter lambda returning string is error" + mapping: | + output.result = [1, 2, 3].filter(x -> "yes") + error: "bool" + + - name: "filter lambda returning null is error" + mapping: | + output.result = [1, 2, 3].filter(x -> null) + error: "bool" + + - name: "filter lambda returning boolean works" + mapping: | + output.result = [1, 2, 3, 4].filter(x -> x % 2 == 0) + output: {"result": [2, 4]} + + # --- deleted() in .map() omits element --- + + - name: "map lambda returning deleted omits element" + mapping: | + output.result = [1, -2, 3, -4].map(x -> if x > 0 { x } else { deleted() }) + output: {"result": [1, 3]} + + - name: "map lambda all deleted returns empty array" + mapping: | + output.result = [1, 2, 3].map(x -> deleted()) + output: {"result": []} + + - name: "map lambda deleted from match" + mapping: | + output.result = ["a", "bb", "c", "dd"].map(s -> match { + s.length() > 1 => s.uppercase(), + 
_ => deleted(), + }) + output: {"result": ["BB", "DD"]} + + # --- deleted() in .map_values() omits entry --- + + - name: "map_values deleted omits entry" + mapping: | + output.result = {"a": 1, "b": -2, "c": 3}.map_values(v -> if v > 0 { v } else { deleted() }) + output: {"result": {"a": 1, "c": 3}} + + # --- .catch() handler returning deleted --- + + - name: "catch handler returning deleted removes field" + mapping: | + output.value = "not a number".int64().catch(err -> deleted()) + output: {} + + - name: "catch handler returning deleted with prior value" + mapping: | + output.value = "prior" + output.value = "not a number".int64().catch(err -> deleted()) + output: {} + + # --- .catch() handler returning void --- + + - name: "catch handler returning void skips assignment" + mapping: | + output.value = "prior" + output.value = "not a number".int64().catch(err -> if false { 0 }) + output: {"value": "prior"} + + - name: "catch handler void on fresh field leaves it absent" + mapping: | + output.value = "not a number".int64().catch(err -> if false { 0 }) + output: {} + + # --- .catch() handler normal value --- + + - name: "catch handler provides fallback value" + mapping: | + output.value = "not a number".int64().catch(err -> -1) + output: {"value": {_type: "int64", value: "-1"}} + + - name: "catch handler can access error message" + mapping: | + output.msg = throw("custom error").catch(err -> err.what) + output: {"msg": "custom error"} + + # --- Void in .fold() is error --- + + - name: "fold lambda returning void is error" + mapping: | + output.result = [1, 2, 3].fold(0, (acc, x) -> if x > 10 { acc + x }) + error: "void" + + # --- Void in .map_values() is error --- + + - name: "map_values lambda returning void is error" + mapping: | + output.result = {"a": 1}.map_values(v -> if v > 10 { v }) + error: "void" + + # --- Void in .sort_by() is error --- + + - name: "sort_by lambda returning void is error" + mapping: | + output.result = [1, 2].sort_by(x -> if x > 10 { x }) 
+ error: "void" + + # --- .catch() passes through on success --- + + - name: "catch not triggered on success" + mapping: | + output.value = "42".int64().catch(err -> -1) + output: {"value": {_type: "int64", value: "42"}} + + # --- deleted() in filter is error (not boolean) --- + + - name: "filter lambda returning deleted is error" + mapping: | + output.result = [1, 2, 3].filter(x -> deleted()) + error: "bool" diff --git a/internal/bloblang2/spec/tests/maps/basic.yaml b/internal/bloblang2/spec/tests/maps/basic.yaml new file mode 100644 index 000000000..503e3d269 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/basic.yaml @@ -0,0 +1,192 @@ +description: "Maps: zero/single/multi parameter definitions, hoisting, return values, duplicate name errors" + +tests: + # --- Zero parameters --- + + - name: "zero parameter map returns literal object" + mapping: | + map headers() { + {"content_type": "json"} + } + output.h = headers() + output: {"h": {"content_type": "json"}} + + - name: "zero parameter map returns literal string" + mapping: | + map greeting() { + "hello world" + } + output.msg = greeting() + output: {"msg": "hello world"} + + - name: "zero parameter map returns literal integer" + mapping: | + map answer() { + 42 + } + output.v = answer() + output: {"v": 42} + + - name: "zero parameter map called multiple times" + mapping: | + map tag() { + "v1" + } + output.a = tag() + output.b = tag() + output: {"a": "v1", "b": "v1"} + + # --- Single parameter --- + + - name: "single parameter map" + mapping: | + map double(x) { + x * 2 + } + output.v = double(5) + output: {"v": 10} + + - name: "single parameter map with string" + mapping: | + map shout(msg) { + msg.uppercase() + } + output.v = shout("hello") + output: {"v": "HELLO"} + + - name: "single parameter map with object access" + mapping: | + map get_name(data) { + data.name + } + output.v = get_name({"name": "Alice", "age": 30}) + output: {"v": "Alice"} + + # --- Multiple parameters --- + + - name: "two 
parameter map" + mapping: | + map add(a, b) { + a + b + } + output.v = add(3, 7) + output: {"v": 10} + + - name: "three parameter map" + mapping: | + map calc(x, y, z) { + x + y * z + } + output.v = calc(1, 2, 3) + output: {"v": 7} + + - name: "map with variables in body" + mapping: | + map total(subtotal, tax_rate) { + $tax = subtotal * tax_rate + subtotal + $tax + } + output.v = total(100, 0.1) + output: {"v": 110.0} + + - name: "map with multiple variables in body" + mapping: | + map combine(a, b, c) { + $ab = a + b + $abc = $ab + c + $abc + } + output.v = combine("hello", " ", "world") + output: {"v": "hello world"} + + # --- Return value semantics --- + + - name: "map returns last expression value" + mapping: | + map last_wins(x) { + $unused = x + 1 + x * 10 + } + output.v = last_wins(5) + output: {"v": 50} + + - name: "map returns array" + mapping: | + map wrap(x) { + [x, x, x] + } + output.v = wrap(7) + output: {"v": [7, 7, 7]} + + - name: "map returns null" + mapping: | + map nothing(x) { + null + } + output.v = nothing(42) + output: {"v": null} + + - name: "map returns boolean" + mapping: | + map is_positive(x) { + x > 0 + } + output.v = is_positive(5) + output: {"v": true} + + # --- Hoisting --- + + - name: "map called before declaration" + mapping: | + output.v = double(21) + map double(x) { + x * 2 + } + output: {"v": 42} + + - name: "map called before declaration with multiple maps" + mapping: | + output.v = add(triple(2), 1) + map triple(x) { x * 3 } + map add(a, b) { a + b } + output: {"v": 7} + + # --- Duplicate map name --- + + - name: "duplicate map name in same file is compile error" + mapping: | + map foo(x) { x } + map foo(y) { y * 2 } + output.v = foo(1) + compile_error: "duplicate" + + # --- Parameter is read-only --- + + - name: "assigning to parameter is compile error" + mapping: | + map bad(data) { + data = 42 + data + } + output.v = bad(1) + compile_error: "assign" + + # --- Arity errors --- + + - name: "too few positional arguments is 
error" + mapping: | + map need_two(a, b) { a + b } + output.v = need_two(1) + compile_error: "arity" + + - name: "too many positional arguments is error" + mapping: | + map need_one(a) { a } + output.v = need_one(1, 2) + compile_error: "arity" + + - name: "zero args to parameterised map is error" + mapping: | + map need_one(x) { x } + output.v = need_one() + compile_error: "arity" diff --git a/internal/bloblang2/spec/tests/maps/defaults.yaml b/internal/bloblang2/spec/tests/maps/defaults.yaml new file mode 100644 index 000000000..80d754ef3 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/defaults.yaml @@ -0,0 +1,176 @@ +description: "Default parameter values: literal defaults, positional/named omission, non-literal default errors, dynamic defaults pattern" + +tests: + # --- Basic default values --- + + - name: "single default parameter omitted uses default" + mapping: | + map greet(name, greeting = "Hello") { + greeting + ", " + name + } + output.v = greet("Alice") + output: {"v": "Hello, Alice"} + + - name: "single default parameter provided overrides default" + mapping: | + map greet(name, greeting = "Hello") { + greeting + ", " + name + } + output.v = greet("Alice", "Hi") + output: {"v": "Hi, Alice"} + + - name: "multiple default parameters all omitted" + mapping: | + map fmt(amount, currency = "USD", decimals = 2) { + currency + " " + amount.round(decimals).string() + } + output.v = fmt(99.99) + output: {"v": "USD 99.99"} + + - name: "multiple defaults first provided second omitted" + mapping: | + map fmt(amount, currency = "USD", decimals = 2) { + currency + " " + amount.round(decimals).string() + } + output.v = fmt(99.99, "EUR") + output: {"v": "EUR 99.99"} + + - name: "multiple defaults all provided" + mapping: | + map fmt(amount, currency = "USD", decimals = 2) { + currency + " " + amount.round(decimals).string() + } + output.v = fmt(99.999, "EUR", 0) + output: {"v": "EUR 100.0"} + + # --- Default value types --- + + - name: "default integer literal" + 
mapping: | + map inc(x, step = 1) { + x + step + } + output.v = inc(10) + output: {"v": 11} + + - name: "default string literal" + mapping: | + map tag(value, prefix = "tag") { + prefix + ":" + value + } + output.v = tag("foo") + output: {"v": "tag:foo"} + + - name: "default true literal" + mapping: | + map check(x, strict = true) { + if strict { x > 0 } else { x >= 0 } + } + output.v = check(0) + output: {"v": false} + + - name: "default false literal" + mapping: | + map check(x, lenient = false) { + if lenient { true } else { x > 0 } + } + output.v = check(-1) + output: {"v": false} + + - name: "default null literal" + mapping: | + map maybe(x, fallback = null) { + x.or(fallback) + } + output.v = maybe(null, "backup") + output: {"v": "backup"} + + # --- Named arguments with defaults --- + + - name: "named args skip middle optional parameter" + mapping: | + map fmt(amount, currency = "USD", decimals = 2) { + currency + " " + amount.round(decimals).string() + } + output.v = fmt(amount: 99.99, decimals: 0) + output: {"v": "USD 100.0"} + + - name: "named args provide only required" + mapping: | + map fmt(amount, currency = "USD", decimals = 2) { + currency + " " + amount.round(decimals).string() + } + output.v = fmt(amount: 99.99) + output: {"v": "USD 99.99"} + + - name: "named args override all defaults" + mapping: | + map fmt(amount, currency = "USD", decimals = 2) { + currency + " " + amount.round(decimals).string() + } + output.v = fmt(amount: 99.999, currency: "GBP", decimals: 1) + output: {"v": "GBP 100.0"} + + # --- Non-literal defaults are compile errors --- + + - name: "expression as default value is compile error" + mapping: | + map bad(x, y = 1 + 2) { x + y } + output.v = bad(1) + compile_error: "literal" + + - name: "function call as default value is compile error" + mapping: | + map bad(x, y = now()) { x } + output.v = bad(1) + compile_error: "literal" + + - name: "variable reference as default value is compile error" + mapping: | + $z = 10 + map bad(x, 
y = $z) { x + y } + output.v = bad(1) + compile_error: "literal" + + - name: "parameter reference as default value is compile error" + mapping: | + map bad(x, y = x) { x + y } + output.v = bad(1) + compile_error: "literal" + + # --- Default before required is compile error --- + + - name: "default parameter before required parameter is compile error" + mapping: | + map bad(x = 1, y) { x + y } + output.v = bad(1, 2) + compile_error: "default" + + # --- Dynamic defaults pattern --- + + - name: "dynamic default with null and or" + mapping: | + map connect(host, port = null) { + $p = port.or(if host.has_prefix("https") { 443 } else { 80 }) + host + ":" + $p.string() + } + output.v = connect("https://example.com") + output: {"v": "https://example.com:443"} + + - name: "dynamic default overridden by caller" + mapping: | + map connect(host, port = null) { + $p = port.or(if host.has_prefix("https") { 443 } else { 80 }) + host + ":" + $p.string() + } + output.v = connect("http://example.com", 8080) + output: {"v": "http://example.com:8080"} + + - name: "dynamic default with http fallback" + mapping: | + map connect(host, port = null) { + $p = port.or(if host.has_prefix("https") { 443 } else { 80 }) + host + ":" + $p.string() + } + output.v = connect("http://example.com") + output: {"v": "http://example.com:80"} diff --git a/internal/bloblang2/spec/tests/maps/discard_params.yaml b/internal/bloblang2/spec/tests/maps/discard_params.yaml new file mode 100644 index 000000000..c31cd13e2 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/discard_params.yaml @@ -0,0 +1,141 @@ +description: "Discard parameters (_): ignoring arguments, referencing _ error, multiple _, no defaults, named call restriction" + +tests: + # --- Basic discard --- + + - name: "single discard parameter ignores first arg" + mapping: | + map second(_, b) { + b + } + output.v = second("ignored", "kept") + output: {"v": "kept"} + + - name: "discard last parameter" + mapping: | + map first(a, _) { + a + } + 
output.v = first("kept", "ignored") + output: {"v": "kept"} + + - name: "discard middle parameter" + mapping: | + map ends(a, _, c) { + a + c + } + output.v = ends("hello", "ignored", " world") + output: {"v": "hello world"} + + # --- Multiple discards --- + + - name: "multiple discard parameters" + mapping: | + map handle(_, _, payload) { + payload.uppercase() + } + output.v = handle("x", "y", "data") + output: {"v": "DATA"} + + - name: "all parameters discarded except one" + mapping: | + map pick_last(_, _, _, x) { + x + } + output.v = pick_last(1, 2, 3, 42) + output: {"v": 42} + + - name: "all parameters discarded" + mapping: | + map constant(_, _) { + "fixed" + } + output.v = constant(1, 2) + output: {"v": "fixed"} + + # --- Referencing _ is compile error --- + + - name: "referencing _ in body is compile error" + mapping: | + map bad(_, b) { + _ + b + } + output.v = bad(1, 2) + compile_error: "_" + + - name: "referencing _ as method target in body is compile error" + mapping: | + map bad(_) { + _.string() + } + output.v = bad(42) + compile_error: "_" + + - name: "referencing _ in variable declaration is compile error" + mapping: | + map bad(_, x) { + $v = _ + x + } + output.v = bad(1, 2) + compile_error: "_" + + # --- Discard with defaults is compile error --- + + - name: "discard parameter with default value is compile error" + mapping: | + map bad(_ = 1, x) { + x + } + output.v = bad(1, 2) + compile_error: "default" + + - name: "discard after default parameters is compile error" + mapping: | + map bad(x, _ = null) { + x + } + output.v = bad(1, 2) + compile_error: "default" + + # --- Named call to map with discard is compile error --- + + - name: "named call to map with discard parameter is compile error" + mapping: | + map handle(_, payload) { + payload + } + output.v = handle(payload: "data") + compile_error: "named" + + - name: "named call to map with multiple discards is compile error" + mapping: | + map handle(_, _, payload) { + payload + } + output.v = 
handle(payload: "data") + compile_error: "named" + + # --- Positional call still works --- + + - name: "positional call to map with discard works normally" + mapping: | + map extract(_, _, data) { + {"result": data} + } + output = extract("a", "b", 42) + output: {"result": 42} + + # --- Arity still enforced with discards --- + + - name: "too few args with discard is error" + mapping: | + map need_three(_, _, x) { x } + output.v = need_three(1, 2) + compile_error: "arity" + + - name: "too many args with discard is error" + mapping: | + map need_two(_, x) { x } + output.v = need_two(1, 2, 3) + compile_error: "arity" diff --git a/internal/bloblang2/spec/tests/maps/higher_order.yaml b/internal/bloblang2/spec/tests/maps/higher_order.yaml new file mode 100644 index 000000000..57f5cca5b --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/higher_order.yaml @@ -0,0 +1,125 @@ +description: "Higher-order maps: maps as arguments to .map()/.filter()/.sort_by(), store-in-variable error, bare-name error" + +tests: + # --- Maps as argument to .map() --- + + - name: "map name passed to .map() method" + mapping: | + map double(x) { x * 2 } + output.v = [1, 2, 3].map(double) + output: {"v": [2, 4, 6]} + + - name: "map name passed to .map() is same as lambda" + mapping: | + map inc(x) { x + 1 } + output.direct = [10, 20].map(inc) + output.lambda = [10, 20].map(x -> inc(x)) + output: {"direct": [11, 21], "lambda": [11, 21]} + + - name: "map with string transformation passed to .map()" + mapping: | + map shout(s) { s.uppercase() } + output.v = ["hello", "world"].map(shout) + output: {"v": ["HELLO", "WORLD"]} + + - name: "map returning object passed to .map()" + mapping: | + map wrap(x) { {"value": x} } + output.v = [1, 2].map(wrap) + output: {"v": [{"value": 1}, {"value": 2}]} + + - name: "map passed to .map() from input" + input: {"items": [5, 10, 15]} + mapping: | + map halve(x) { x / 2 } + output.v = input.items.map(halve) + output: {"v": [2.5, 5.0, 7.5]} + + # --- Maps as 
argument to .filter() --- + + - name: "map name passed to .filter() method" + mapping: | + map is_positive(x) { x > 0 } + output.v = [-1, 0, 1, 2, -3].filter(is_positive) + output: {"v": [1, 2]} + + - name: "map name passed to .filter() with strings" + mapping: | + map is_long(s) { s.length() > 3 } + output.v = ["hi", "hello", "yo", "world"].filter(is_long) + output: {"v": ["hello", "world"]} + + # --- Maps as argument to .sort_by() --- + + - name: "map name passed to .sort_by() method" + mapping: | + map get_age(person) { person.age } + output.v = [{"name": "B", "age": 30}, {"name": "A", "age": 20}].sort_by(get_age) + output: {"v": [{"name": "A", "age": 20}, {"name": "B", "age": 30}]} + + # --- Chaining higher-order calls --- + + - name: "chained .filter() and .map() with map names" + mapping: | + map is_even(x) { x % 2 == 0 } + map double(x) { x * 2 } + output.v = [1, 2, 3, 4, 5].filter(is_even).map(double) + output: {"v": [4, 8]} + + # --- Map defined after use (hoisting with higher-order) --- + + - name: "map name used in .map() before declaration" + mapping: | + output.v = [1, 2, 3].map(triple) + map triple(x) { x * 3 } + output: {"v": [3, 6, 9]} + + # --- Cannot store map reference in variable --- + + - name: "storing map name in variable is compile error" + mapping: | + map double(x) { x * 2 } + $fn = double + output.v = $fn(5) + compile_error: "variable" + + - name: "storing map name in variable without calling is compile error" + mapping: | + map double(x) { x * 2 } + $fn = double + output.v = 1 + compile_error: "variable" + + # --- Cannot use bare map name as expression --- + + - name: "bare map name assigned to output is compile error" + mapping: | + map double(x) { x * 2 } + output.x = double + compile_error: "expression" + + - name: "bare map name in array literal is compile error" + mapping: | + map double(x) { x * 2 } + output.v = [double] + compile_error: "expression" + + - name: "bare map name in object value is compile error" + mapping: | + map 
double(x) { x * 2 } + output.v = {"fn": double} + compile_error: "expression" + + # --- Only single-param maps work with higher-order (arity mismatch) --- + + - name: "multi-param map passed to .map() is error" + mapping: | + map add(a, b) { a + b } + output.v = [1, 2, 3].map(add) + compile_error: "arity" + + - name: "zero-param map passed to .map() is error" + mapping: | + map constant() { 42 } + output.v = [1, 2, 3].map(constant) + compile_error: "arity" diff --git a/internal/bloblang2/spec/tests/maps/isolation.yaml b/internal/bloblang2/spec/tests/maps/isolation.yaml new file mode 100644 index 000000000..ee6f27335 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/isolation.yaml @@ -0,0 +1,158 @@ +description: "Map isolation: no access to input, output, or top-level $variables from map body; lambdas inside maps also isolated" + +tests: + # --- Cannot access input --- + + - name: "accessing input in map body is compile error" + input: {"x": 42} + mapping: | + map bad(v) { + input.x + v + } + output.v = bad(1) + compile_error: "input" + + - name: "accessing input directly in map body is compile error" + input: "hello" + mapping: | + map bad() { + input + } + output.v = bad() + compile_error: "input" + + # --- Cannot access output --- + + - name: "accessing output in map body is compile error" + mapping: | + output.x = 10 + map bad(v) { + output.x + v + } + output.v = bad(1) + compile_error: "output" + + - name: "assigning to output in map body is compile error" + mapping: | + map bad(v) { + output.x = v + v + } + output.v = bad(1) + compile_error: "output" + + # --- Cannot access top-level variables --- + + - name: "accessing top-level variable in map body is compile error" + mapping: | + $global = 100 + map bad(v) { + $global + v + } + output.v = bad(1) + compile_error: "undeclared" + + - name: "top-level variable with same name not accessible in map" + mapping: | + $x = 99 + map get(v) { + $x + } + output.v = get(1) + compile_error: "undeclared" + + # --- 
Local variables inside map are fine --- + + - name: "local variables declared inside map work" + mapping: | + map compute(x) { + $doubled = x * 2 + $doubled + 1 + } + output.v = compute(5) + output: {"v": 11} + + - name: "multiple local variables inside map work" + mapping: | + map transform(a, b) { + $sum = a + b + $product = a * b + {"sum": $sum, "product": $product} + } + output = transform(3, 4) + output: {"sum": 7, "product": 12} + + # --- Parameters are accessible --- + + - name: "parameters are accessible by name" + mapping: | + map echo(msg) { + msg + } + output.v = echo("test") + output: {"v": "test"} + + - name: "all parameters accessible in multi-param map" + mapping: | + map triple(a, b, c) { + [a, b, c] + } + output.v = triple(1, 2, 3) + output: {"v": [1, 2, 3]} + + # --- Lambdas inside maps are also isolated --- + + - name: "lambda inside map cannot access input" + input: {"multiplier": 10} + mapping: | + map scale(items) { + items.map(x -> x * input.multiplier) + } + output.v = scale([1, 2, 3]) + compile_error: "input" + + - name: "lambda inside map cannot access output" + mapping: | + output.factor = 5 + map scale(items) { + items.map(x -> x * output.factor) + } + output.v = scale([1, 2, 3]) + compile_error: "output" + + - name: "lambda inside map cannot access top-level variable" + mapping: | + $factor = 10 + map scale(items) { + items.map(x -> x * $factor) + } + output.v = scale([1, 2, 3]) + compile_error: "undeclared" + + - name: "lambda inside map can access map parameters" + mapping: | + map scale(items, factor) { + items.map(x -> x * factor) + } + output.v = scale([1, 2, 3], 10) + output: {"v": [10, 20, 30]} + + - name: "lambda inside map can access map local variables" + mapping: | + map process(items) { + $prefix = "item" + items.map(x -> $prefix + ":" + x) + } + output.v = process(["a", "b"]) + output: {"v": ["item:a", "item:b"]} + + # --- Passing external context as parameter works --- + + - name: "pass input data as parameter to map" + 
input: {"items": [1, 2, 3], "multiplier": 10} + mapping: | + map scale(items, multiplier) { + items.map(x -> x * multiplier) + } + output.v = scale(input.items, input.multiplier) + output: {"v": [10, 20, 30]} diff --git a/internal/bloblang2/spec/tests/maps/named_args.yaml b/internal/bloblang2/spec/tests/maps/named_args.yaml new file mode 100644 index 000000000..8a408bac1 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/named_args.yaml @@ -0,0 +1,141 @@ +description: "Named arguments: syntax, order independence, mixing error, duplicate error, unknown arg error" + +tests: + # --- Basic named argument syntax --- + + - name: "single named argument" + mapping: | + map double(x) { x * 2 } + output.v = double(x: 21) + output: {"v": 42} + + - name: "two named arguments" + mapping: | + map add(a, b) { a + b } + output.v = add(a: 10, b: 20) + output: {"v": 30} + + - name: "three named arguments" + mapping: | + map calc(x, y, z) { x + y * z } + output.v = calc(x: 1, y: 2, z: 3) + output: {"v": 7} + + # --- Order independence --- + + - name: "named arguments in reverse order" + mapping: | + map sub(a, b) { a - b } + output.v = sub(b: 3, a: 10) + output: {"v": 7} + + - name: "named arguments in arbitrary order" + mapping: | + map calc(x, y, z) { x + y * z } + output.v = calc(z: 3, x: 1, y: 2) + output: {"v": 7} + + - name: "named arguments shuffled with three params" + mapping: | + map join(sep, first, second) { + first + sep + second + } + output.v = join(first: "hello", second: "world", sep: " ") + output: {"v": "hello world"} + + # --- Named with defaults --- + + - name: "named args with some defaults omitted" + mapping: | + map tag(value, prefix = "item", suffix = "end") { + prefix + ":" + value + ":" + suffix + } + output.v = tag(value: "foo") + output: {"v": "item:foo:end"} + + - name: "named args override specific default" + mapping: | + map tag(value, prefix = "item", suffix = "end") { + prefix + ":" + value + ":" + suffix + } + output.v = tag(value: "foo", 
suffix: "done") + output: {"v": "item:foo:done"} + + - name: "named args override all defaults" + mapping: | + map tag(value, prefix = "item", suffix = "end") { + prefix + ":" + value + ":" + suffix + } + output.v = tag(value: "foo", prefix: "x", suffix: "y") + output: {"v": "x:foo:y"} + + # --- Mixing positional and named is compile error --- + + - name: "mixing positional and named arguments is compile error" + mapping: | + map add(a, b) { a + b } + output.v = add(1, b: 2) + compile_error: "mix" + + - name: "mixing named then positional is compile error" + mapping: | + map add(a, b) { a + b } + output.v = add(a: 1, 2) + compile_error: "mix" + + # --- Duplicate named argument is compile error --- + + - name: "duplicate named argument is compile error" + mapping: | + map add(a, b) { a + b } + output.v = add(a: 1, a: 2) + compile_error: "duplicate" + + - name: "duplicate named argument second param is compile error" + mapping: | + map add(a, b) { a + b } + output.v = add(a: 1, b: 2, b: 3) + compile_error: "duplicate" + + # --- Unknown named argument is error --- + + - name: "unknown named argument is error" + mapping: | + map add(a, b) { a + b } + output.v = add(a: 1, c: 2) + compile_error: "unknown" + + - name: "all unknown named arguments is error" + mapping: | + map identity(x) { x } + output.v = identity(y: 42) + compile_error: "unknown" + + # --- Missing required named argument is error --- + + - name: "missing required named argument is error" + mapping: | + map add(a, b) { a + b } + output.v = add(a: 1) + compile_error: "arity" + + - name: "missing all required named arguments is error" + mapping: | + map add(a, b) { a + b } + output.v = add() + compile_error: "arity" + + # --- Named arguments with expressions --- + + - name: "named argument values can be expressions" + mapping: | + map add(a, b) { a + b } + output.v = add(a: 3 * 2, b: 10 - 3) + output: {"v": 13} + + - name: "named argument values from input" + input: {"x": 5, "y": 10} + mapping: | + map 
add(a, b) { a + b } + output.v = add(a: input.x, b: input.y) + output: {"v": 15} diff --git a/internal/bloblang2/spec/tests/maps/parameter_shadowing.yaml b/internal/bloblang2/spec/tests/maps/parameter_shadowing.yaml new file mode 100644 index 000000000..d6dcac431 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/parameter_shadowing.yaml @@ -0,0 +1,106 @@ +description: > + Parameter shadowing — parameter names shadow map names within the map + body. The parameter always wins. Lambda parameters can also shadow. + +tests: + # --- Parameter shadows map name --- + + - name: "parameter with same name as another map shadows it" + mapping: | + map double(x) { x * 2 } + map apply(double) { double + 1 } + output.v = apply(10) + output: {"v": 11} + + - name: "shadowed map name not callable in map body" + mapping: | + map helper(x) { x * 10 } + map run(helper) { + helper + 1 + } + output.v = run(5) + output: {"v": 6} + + - name: "non-shadowed maps still callable" + mapping: | + map add_one(x) { x + 1 } + map mul_two(x) { x * 2 } + map process(add_one) { + mul_two(add_one) + } + output.v = process(5) + output: {"v": 10} + + - name: "parameter shadows map but map callable outside" + mapping: | + map double(x) { x * 2 } + map use_param(double) { double + 100 } + output.shadowed = use_param(5) + output.original = double(5) + output: {"shadowed": 105, "original": 10} + + # --- Lambda parameter shadows outer variable --- + + - name: "lambda parameter shadows outer variable" + mapping: | + $x = 100 + output.v = [1, 2, 3].map(x -> x * 2) + output: {"v": [2, 4, 6]} + + - name: "lambda parameter shadows map parameter" + mapping: | + map process(x) { + [1, 2, 3].map(x -> x * 10) + } + output.v = process(999) + output: {"v": [10, 20, 30]} + + - name: "nested lambda parameters shadow each level" + mapping: | + output.v = [1, 2].map(x -> [10, 20].map(x -> x * 100)) + output: {"v": [[1000, 2000], [1000, 2000]]} + + # --- Parameter shadow does not affect caller --- + + - name: "shadowing 
does not leak to caller" + mapping: | + map double(x) { x * 2 } + map wrapper(double) { + double + 1 + } + output.wrapped = wrapper(10) + output.doubled = double(10) + output: {"wrapped": 11, "doubled": 20} + + # --- Discard parameter does not shadow --- + + - name: "discard parameter does not shadow anything" + mapping: | + map double(x) { x * 2 } + map run(_, x) { + double(x) + } + output.v = run("ignored", 5) + output: {"v": 10} + + # --- Multiple parameters, one shadows --- + + - name: "one parameter shadows map, other does not" + mapping: | + map helper(x) { x + 100 } + map run(helper, val) { + helper + val + } + output.v = run(1, 2) + output: {"v": 3} + + # --- Parameter read-only --- + + - name: "parameter is read-only in map body" + mapping: | + map bad(x) { + x = 5 + x + } + output.v = bad(10) + compile_error: "assign" diff --git a/internal/bloblang2/spec/tests/maps/recursion.yaml b/internal/bloblang2/spec/tests/maps/recursion.yaml new file mode 100644 index 000000000..57bf447e8 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/recursion.yaml @@ -0,0 +1,142 @@ +description: "Recursion: self-recursion, mutual recursion, depth limits, uncatchable recursion errors" + +tests: + # --- Self recursion --- + + - name: "simple self recursion with base case" + mapping: | + map factorial(n) { + if n <= 1 { 1 } else { n * factorial(n - 1) } + } + output.v = factorial(5) + output: {"v": 120} + + - name: "self recursion base case returns immediately" + mapping: | + map factorial(n) { + if n <= 1 { 1 } else { n * factorial(n - 1) } + } + output.v = factorial(0) + output: {"v": 1} + + - name: "recursive countdown to build array" + mapping: | + map countdown(n) { + if n <= 0 { [] } else { [n].concat(countdown(n - 1)) } + } + output.v = countdown(5) + output: {"v": [5, 4, 3, 2, 1]} + + - name: "recursive sum of array" + mapping: | + map sum_list(items, idx) { + if idx >= items.length() { 0 } else { items[idx] + sum_list(items, idx + 1) } + } + output.v = sum_list([10, 
20, 30], 0) + output: {"v": 60} + + - name: "recursive string repeat" + mapping: | + map repeat(s, n) { + if n <= 0 { "" } else { s + repeat(s, n - 1) } + } + output.v = repeat("ab", 3) + output: {"v": "ababab"} + + # --- Mutual recursion --- + + - name: "mutual recursion is_even and is_odd" + mapping: | + map is_even(n) { + if n == 0 { true } else { is_odd(n - 1) } + } + map is_odd(n) { + if n == 0 { false } else { is_even(n - 1) } + } + output.even = is_even(4) + output.odd = is_odd(3) + output: {"even": true, "odd": true} + + - name: "mutual recursion with larger values" + mapping: | + map is_even(n) { + if n == 0 { true } else { is_odd(n - 1) } + } + map is_odd(n) { + if n == 0 { false } else { is_even(n - 1) } + } + output.v = is_even(100) + output: {"v": true} + + - name: "mutual recursion called before declaration" + mapping: | + output.v = ping(5) + map ping(n) { + if n <= 0 { "done" } else { pong(n - 1) } + } + map pong(n) { + if n <= 0 { "done" } else { ping(n - 1) } + } + output: {"v": "done"} + + # --- Recursion depth support (at least 1000) --- + + - name: "recursion supports 1000 levels deep" + mapping: | + map deep(n) { + if n <= 0 { 0 } else { 1 + deep(n - 1) } + } + output.v = deep(1000) + output: {"v": 1000} + + # --- Recursion depth limit exceeded --- + + - name: "exceeding recursion limit is runtime error" + mapping: | + map infinite(n) { + 1 + infinite(n) + } + output.v = infinite(0) + error: "recursion" + + - name: "mutual recursion exceeding limit is runtime error" + mapping: | + map ping(n) { pong(n) } + map pong(n) { ping(n) } + output.v = ping(0) + error: "recursion" + + # --- Recursion limit error cannot be caught --- + + - name: "recursion limit error cannot be caught with catch" + mapping: | + map infinite(n) { + 1 + infinite(n) + } + output.v = infinite(0).catch(err -> "recovered") + error: "recursion" + + - name: "recursion limit in nested call cannot be caught" + mapping: | + map infinite(n) { + infinite(n) + } + map wrapper(x) { + 
infinite(x).catch(err -> "safe") + } + output.v = wrapper(0) + error: "recursion" + + # --- Recursion with match expression --- + + - name: "recursive map using match" + mapping: | + map fib(n) { + match { + n <= 0 => 0, + n == 1 => 1, + _ => fib(n - 1) + fib(n - 2), + } + } + output.v = fib(10) + output: {"v": 55} diff --git a/internal/bloblang2/spec/tests/maps/recursion_advanced.yaml b/internal/bloblang2/spec/tests/maps/recursion_advanced.yaml new file mode 100644 index 000000000..d7d8b32b5 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/recursion_advanced.yaml @@ -0,0 +1,185 @@ +description: > + Advanced recursion patterns — mutual recursion with accumulator, + recursion through match, recursive data structure processing, + recursion with lambdas, and recursion limit interaction with + map composition. + +tests: + # --- Mutual recursion with accumulator --- + + - name: "mutual recursion with accumulator" + mapping: | + map collatz_steps(n, count) { + if n == 1 { count } else { + collatz_next(n, count) + } + } + map collatz_next(n, count) { + if n % 2 == 0 { + collatz_steps(n / 2, count + 1) + } else { + collatz_steps(n * 3 + 1, count + 1) + } + } + output.steps = collatz_steps(6, 0) + output: {"steps": 8} + + # --- Recursive data structure traversal --- + + - name: "recursive tree depth calculation" + mapping: | + map depth(node) { + if node == null { 0 } else { + $left = depth(node.left) + $right = depth(node.right) + 1 + (if $left > $right { $left } else { $right }) + } + } + $tree = { + "val": 1, + "left": {"val": 2, "left": {"val": 4, "left": null, "right": null}, "right": null}, + "right": {"val": 3, "left": null, "right": null}, + } + output.d = depth($tree) + output: {"d": 3} + + - name: "recursive tree node count" + mapping: | + map count_nodes(node) { + if node == null { 0 } else { + 1 + count_nodes(node.left) + count_nodes(node.right) + } + } + $tree = { + "val": 1, + "left": {"val": 2, "left": null, "right": null}, + "right": {"val": 3, "left": 
null, "right": {"val": 4, "left": null, "right": null}}, + } + output.count = count_nodes($tree) + output: {"count": 4} + + # --- Recursion through match --- + + - name: "recursive map using match with multiple arms" + mapping: | + map gcd(a, b) { + match { + b == 0 => a, + _ => gcd(b, a % b), + } + } + output.v = gcd(48, 18) + output: {"v": 6} + + - name: "recursive map using equality match" + mapping: | + map describe(n) { + match n { + 0 => "zero", + 1 => "one", + _ => describe(n - 1) + "+", + } + } + output.v = describe(3) + output: {"v": "one++"} + + # --- Recursion with iterators --- + + - name: "recursive map called from within lambda" + mapping: | + map sum_nested(arr) { + arr.fold(0, (acc, item) -> + acc + (if item.type() == "array" { sum_nested(item) } else { item }) + ) + } + output.v = sum_nested([1, [2, 3], [4, [5, 6]]]) + output: {"v": 21} + + # --- Recursion with local variables --- + + - name: "recursive map with local variables per frame" + mapping: | + map sum_tree(node) { + $val = node.value + $left = if node.left != null { sum_tree(node.left) } else { 0 } + $right = if node.right != null { sum_tree(node.right) } else { 0 } + $val + $left + $right + } + $tree = { + "value": 10, + "left": {"value": 20, "left": null, "right": null}, + "right": {"value": 30, "left": {"value": 5, "left": null, "right": null}, "right": null}, + } + output.total = sum_tree($tree) + output: {"total": 65} + + - name: "recursive flatten with fold and dynamic keys" + mapping: | + map flatten_obj(obj, prefix) { + $entries = obj.iter() + $entries.fold({}, (acc, e) -> { + $key = if prefix == "" { e.key } else { prefix + "." 
+ e.key } + $acc = acc + if e.value.type() == "object" { + $nested = flatten_obj(e.value, $key) + $nested.iter().fold($acc, (a, ne) -> { + $a2 = a + $a2[ne.key] = ne.value + $a2 + }) + } else { + $acc[$key] = e.value + $acc + } + }) + } + output = flatten_obj({"a": 1, "b": {"c": 2, "d": 3}}, "") + output: {"a": 1, "b.c": 2, "b.d": 3} + + # --- Mutual recursion depth limit --- + + - name: "three-way mutual recursion works within depth limit" + mapping: | + map a(n) { if n <= 0 { "done" } else { b(n - 1) } } + map b(n) { if n <= 0 { "done" } else { c(n - 1) } } + map c(n) { if n <= 0 { "done" } else { a(n - 1) } } + output.v = a(9) + output: {"v": "done"} + + - name: "three-way mutual recursion exceeds limit" + mapping: | + map a(n) { b(n) } + map b(n) { c(n) } + map c(n) { a(n) } + output.v = a(0) + error: "recursion" + + # --- Recursion limit cannot be caught even deeply nested --- + + - name: "recursion limit from mutual recursion is uncatchable" + mapping: | + map a(n) { b(n) } + map b(n) { a(n) } + output.v = a(0).catch(err -> "safe") + error: "recursion" + + - name: "recursion limit in lambda context is uncatchable" + mapping: | + map infinite(n) { infinite(n) } + output.v = [1].map(x -> infinite(x)).catch(err -> "safe") + error: "recursion" + + # --- Recursion called before declaration (hoisting) --- + + - name: "map hoisting allows forward reference" + mapping: | + output.v = double(5) + map double(x) { x * 2 } + output: {"v": 10} + + - name: "mutual recursion with both maps declared after use" + mapping: | + output.v = a(6) + map a(n) { if n <= 0 { true } else { b(n - 1) } } + map b(n) { if n <= 0 { false } else { a(n - 1) } } + output: {"v": true} diff --git a/internal/bloblang2/spec/tests/maps/recursive_with_iterators.yaml b/internal/bloblang2/spec/tests/maps/recursive_with_iterators.yaml new file mode 100644 index 000000000..b50d880f4 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/recursive_with_iterators.yaml @@ -0,0 +1,103 @@ +description: > + 
Recursive maps combined with iterators — recursive map called from + within fold, map, and filter lambdas. These stress variable stack + frame isolation during recursion. + +tests: + # --- Recursive map called from map lambda --- + + - name: "recursive deep_values extracts all leaf values" + mapping: | + map deep_values(obj) { + obj.iter().fold([], (acc, e) -> + if e.value.type() == "object" { + acc.concat(deep_values(e.value)) + } else { + acc.concat([e.value]) + } + ) + } + output.v = deep_values({"a": 1, "b": {"c": 2, "d": 3}}).sort() + output: {"v": [1, 2, 3]} + + - name: "recursive deep_values with three levels" + mapping: | + map deep_values(obj) { + obj.iter().fold([], (acc, e) -> + if e.value.type() == "object" { + acc.concat(deep_values(e.value)) + } else { + acc.concat([e.value]) + } + ) + } + output.v = deep_values({"a": {"b": {"c": 1}}, "d": 2}).sort() + output: {"v": [1, 2]} + + # --- Recursive map with map_values --- + + - name: "recursive double_all doubles nested values" + mapping: | + map double_all(data) { + if data.type() == "object" { + data.map_values(v -> double_all(v)) + } else if data.type() == "array" { + data.map(v -> double_all(v)) + } else if data.type() == "int64" || data.type() == "float64" { + data * 2 + } else { + data + } + } + output = double_all({"a": 1, "b": [2, 3], "c": {"d": 4}}) + output: {"a": 2, "b": [4, 6], "c": {"d": 8}} + + # --- Recursive count with filter --- + + - name: "recursive count_strings counts string leaves" + mapping: | + map count_strings(data) { + if data.type() == "string" { + 1 + } else if data.type() == "object" { + data.iter().fold(0, (acc, e) -> acc + count_strings(e.value)) + } else if data.type() == "array" { + data.fold(0, (acc, item) -> acc + count_strings(item)) + } else { + 0 + } + } + output.v = count_strings({"name": "Alice", "age": 30, "tags": ["admin", "user"], "meta": {"role": "lead"}}) + output: {"v": 4} + + # --- Recursive with local variables in fold --- + + - name: "recursive sum_nested 
with accumulator variable" + mapping: | + map sum_nested(data) { + if data.type() == "array" { + data.fold(0, (acc, item) -> { + $sub = sum_nested(item) + acc + $sub + }) + } else { + data + } + } + output.v = sum_nested([1, [2, [3, 4]], 5]) + output: {"v": 15} + + # --- Recursive map returning modified copy --- + + - name: "recursive redact replaces strings with ***" + mapping: | + map redact(data) { + match data.type() { + "string" => "***", + "object" => data.map_values(v -> redact(v)), + "array" => data.map(v -> redact(v)), + _ => data, + } + } + output = redact({"name": "Alice", "age": 30, "contacts": [{"email": "a@b.com"}]}) + output: {"name": "***", "age": 30, "contacts": [{"email": "***"}]} diff --git a/internal/bloblang2/spec/tests/maps/transitive_calls.yaml b/internal/bloblang2/spec/tests/maps/transitive_calls.yaml new file mode 100644 index 000000000..bcf5446b1 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/transitive_calls.yaml @@ -0,0 +1,92 @@ +description: > + Transitive map calls — maps calling other maps, maps called from lambdas, + and maps with complex parameter passing patterns. 
+ +tests: + # --- Map calling another map --- + + - name: "map A calls map B" + mapping: | + map double(x) { x * 2 } + map quad(x) { double(double(x)) } + output.v = quad(3) + output: {"v": 12} + + - name: "three-level transitive call" + mapping: | + map inc(x) { x + 1 } + map double_inc(x) { inc(x) * 2 } + map process(x) { double_inc(x) + 100 } + output.v = process(5) + output: {"v": 112} + + - name: "map passes its parameter to another map" + mapping: | + map format(s) { "[" + s + "]" } + map wrap(s) { format(s) } + output.v = wrap("hello") + output: {"v": "[hello]"} + + # --- Map called from lambda --- + + - name: "top-level lambda calls a map" + mapping: | + map double(x) { x * 2 } + output.v = [1, 2, 3].map(n -> double(n)) + output: {"v": [2, 4, 6]} + + - name: "lambda calls map that calls another map" + mapping: | + map inc(x) { x + 1 } + map double_inc(x) { inc(x) * 2 } + output.v = [1, 2, 3].map(n -> double_inc(n)) + output: {"v": [4, 6, 8]} + + - name: "filter lambda calls a map" + mapping: | + map is_big(x) { x > 5 } + output.v = [1, 3, 7, 9, 2].filter(n -> is_big(n)) + output: {"v": [7, 9]} + + # --- Map returning complex values --- + + - name: "map returns object used in lambda" + mapping: | + map make_pair(k, v) { + {"key": k, "value": v} + } + output.v = ["a", "b"].map(s -> make_pair(s, s.length())) + output: + v: + - key: "a" + value: 1 + - key: "b" + value: 1 + + # --- Map with local variables calling another map --- + + - name: "map with locals calls another map" + mapping: | + map add(a, b) { a + b } + map process(x) { + $doubled = x * 2 + add($doubled, 10) + } + output.v = process(5) + output: {"v": 20} + + # --- Frame isolation under transitive calls --- + + - name: "transitive calls do not share local variables" + mapping: | + map inner(x) { + $local = x + 100 + $local + } + map outer(x) { + $local = x + 1 + $result = inner($local) + $local * 1000 + $result + } + output.v = outer(5) + output: {"v": 6106} diff --git 
a/internal/bloblang2/spec/tests/maps/void_returns.yaml b/internal/bloblang2/spec/tests/maps/void_returns.yaml new file mode 100644 index 000000000..4c80c9ea5 --- /dev/null +++ b/internal/bloblang2/spec/tests/maps/void_returns.yaml @@ -0,0 +1,114 @@ +description: > + Map void returns — when a map body's final expression produces void + (from if-without-else or match-without-wildcard), the map call produces + void. Void propagates to the calling context with normal semantics. + +tests: + # --- Map returns void from if-without-else --- + + - name: "map returns void — output assignment skipped" + mapping: | + map maybe(val) { + if val > 10 { val * 2 } + } + output.v = "prior" + output.v = maybe(5) + output: {"v": "prior"} + + - name: "map returns value when condition true" + mapping: | + map maybe(val) { + if val > 10 { val * 2 } + } + output.v = maybe(20) + output: {"v": 40} + + - name: "map returns void — variable declaration errors" + mapping: | + map maybe(val) { + if val > 10 { val * 2 } + } + $x = maybe(5) + error: "void" + + - name: "map returns void — variable reassignment skipped" + mapping: | + map maybe(val) { + if val > 10 { val * 2 } + } + $x = "original" + $x = maybe(5) + output.v = $x + output: {"v": "original"} + + # --- Map returns void from non-exhaustive match --- + + - name: "map returns void from match — assignment skipped" + mapping: | + map classify(val) { + match val { + "a" => "alpha", + "b" => "beta", + } + } + output.v = "default" + output.v = classify("c") + output: {"v": "default"} + + - name: "map returns value from match when case matches" + mapping: | + map classify(val) { + match val { + "a" => "alpha", + "b" => "beta", + } + } + output.v = classify("a") + output: {"v": "alpha"} + + # --- Void from map in collection literal errors --- + + - name: "void from map in array literal is error" + mapping: | + map maybe(val) { + if val > 10 { val } + } + output.v = [maybe(5)] + error: "void" + + - name: "void from map in object literal is 
error" + mapping: | + map maybe(val) { + if val > 10 { val } + } + output.v = {"key": maybe(5)} + error: "void" + + # --- Void from map rescued with .or() --- + + - name: "void from map rescued with or" + mapping: | + map maybe(val) { + if val > 10 { val * 2 } + } + output.v = maybe(5).or(0) + output: {"v": 0} + + - name: "non-void from map not rescued by or" + mapping: | + map maybe(val) { + if val > 10 { val * 2 } + } + output.v = maybe(20).or(0) + output: {"v": 40} + + # --- Void from map as argument to another map errors --- + + - name: "void from map as argument to another map is error" + mapping: | + map maybe(val) { + if val > 10 { val } + } + map double(val) { val * 2 } + output.v = double(maybe(5)) + error: "void" diff --git a/internal/bloblang2/spec/tests/operators/arithmetic.yaml b/internal/bloblang2/spec/tests/operators/arithmetic.yaml new file mode 100644 index 000000000..f4a657290 --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/arithmetic.yaml @@ -0,0 +1,200 @@ +description: "Arithmetic operators (+, -, *, /, %) with all type combos, division/modulo by zero, null errors" + +tests: + # --- Addition --- + + - name: "add int64 + int64" + mapping: | + output.result = 5 + 3 + output: {"result": 8} + + - name: "add float64 + float64" + mapping: | + output.result = 2.5 + 3.5 + output: {"result": 6.0} + + - name: "add int32 + int32 stays int32" + mapping: | + output.result = 5.int32() + 3.int32() + output: {"result": {_type: "int32", value: "8"}} + + - name: "add uint64 + uint64 stays uint64" + mapping: | + output.result = 10.uint64() + 20.uint64() + output: {"result": {_type: "uint64", value: "30"}} + + - name: "add float32 + float32 stays float32" + mapping: | + output.result = 1.5.float32() + 2.5.float32() + output: {"result": {_type: "float32", value: "4.0"}} + + # --- Subtraction --- + + - name: "subtract int64 - int64" + mapping: | + output.result = 10 - 3 + output: {"result": 7} + + - name: "subtract resulting in negative int64" + mapping: 
| + output.result = 3 - 10 + output: {"result": -7} + + # --- Multiplication --- + + - name: "multiply int64 * int64" + mapping: | + output.result = 6 * 7 + output: {"result": 42} + + - name: "multiply float64 * float64" + mapping: | + output.result = 2.5 * 4.0 + output: {"result": 10.0} + + # --- Division (always produces float) --- + + - name: "divide int64 / int64 produces float64" + mapping: | + output.result = 7 / 2 + output: {"result": 3.5} + + - name: "divide int64 / int64 exact still produces float64" + mapping: | + output.result = 10 / 2 + output: {"result": 5.0} + + - name: "divide float32 / float32 produces float32" + mapping: | + output.result = 10.0.float32() / 4.0.float32() + output: {"result": {_type: "float32", value: "2.5"}} + + - name: "divide int32 / int32 produces float64 not int" + mapping: | + output.result = 7.int32() / 2.int32() + output: {"result": 3.5} + + - name: "divide negative int64" + mapping: | + output.result = -7 / 2 + output: {"result": -3.5} + + - name: "chained division is left-associative" + mapping: | + output.result = 20 / 4 / 2 + output: {"result": 2.5} + + # --- Modulo (follows standard promotion, NOT division rule) --- + + - name: "modulo int64 % int64 produces int64" + mapping: | + output.result = 7 % 2 + output: {"result": 1} + + - name: "modulo float64 % float64 (fmod)" + mapping: | + output.result = 7.5 % 2.0 + output: {"result": 1.5} + + - name: "modulo int64 % float64 promotes to float64" + mapping: | + output.result = 7 % 2.0 + output: {"result": 1.0} + + - name: "modulo negative dividend (truncated division remainder)" + mapping: | + output.result = -7 % 2 + output: {"result": -1} + + - name: "modulo negative float dividend (fmod semantics)" + mapping: | + output.result = -7.5 % 2.0 + output: {"result": -1.5} + + # --- Division by zero --- + + - name: "division by zero int64" + mapping: | + output.result = 7 / 0 + error: "division by zero" + + - name: "division by zero float64 (no Infinity)" + mapping: | + 
output.result = 7.0 / 0.0 + error: "division by zero" + + # --- Modulo by zero --- + + - name: "modulo by zero int64" + mapping: | + output.result = 7 % 0 + error: "modulo by zero" + + - name: "modulo by zero float64" + mapping: | + output.result = 7.0 % 0.0 + error: "modulo by zero" + + # --- Integer overflow --- + + - name: "int64 addition overflow" + mapping: | + output.result = 9223372036854775807 + 1 + error: "overflow" + + - name: "int64 multiplication overflow" + mapping: | + output.result = 9223372036854775807 * 2 + error: "overflow" + + - name: "int32 overflow" + mapping: | + output.result = 2147483647.int32() + 1.int32() + error: "overflow" + + - name: "uint64 no overflow where int64 would" + mapping: | + output.result = 9223372036854775807.uint64() + 1.uint64() + output: {"result": {_type: "uint64", value: "9223372036854775808"}} + + # --- Null in arithmetic --- + + - name: "null + int64 error" + mapping: | + output.result = null + 5 + error: "cannot add" + + - name: "null * int64 error" + mapping: | + output.result = null * 5 + error: "arithmetic" + + # --- Non-numeric operands --- + + - name: "int64 + string error" + mapping: | + output.result = 5 + "3" + error: "cannot add" + + - name: "int64 * bool error" + mapping: | + output.result = 5 * true + error: "arithmetic" + + - name: "string - string error" + mapping: | + output.result = "hello" - "world" + error: "arithmetic" + + # --- NaN and Infinity arithmetic --- + + - name: "special float addition" + mapping: | + output.result = input.val + 1.0 + cases: + - name: "NaN + float64 produces NaN" + input: {val: {_type: "float64", value: "NaN"}} + output: {"result": {_type: "float64", value: "NaN"}} + - name: "Infinity + float64 stays Infinity" + input: {val: {_type: "float64", value: "Infinity"}} + output: {"result": {_type: "float64", value: "Infinity"}} diff --git a/internal/bloblang2/spec/tests/operators/comparison.yaml b/internal/bloblang2/spec/tests/operators/comparison.yaml new file mode 100644 
index 000000000..a1a82f4bc --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/comparison.yaml @@ -0,0 +1,209 @@ +description: "Comparison operators (>, >=, <, <=) for numeric, string, timestamp, bytes; cross-family errors, null errors, NaN" + +tests: + # --- Numeric comparisons (int64) --- + + - name: "int64 less than true" + mapping: | + output.result = 3 < 5 + output: {"result": true} + + - name: "int64 less than false" + mapping: | + output.result = 5 < 3 + output: {"result": false} + + - name: "int64 greater than" + mapping: | + output.result = 10 > 3 + output: {"result": true} + + - name: "int64 greater than or equal (equal case)" + mapping: | + output.result = 5 >= 5 + output: {"result": true} + + - name: "int64 less than or equal (less case)" + mapping: | + output.result = 4 <= 5 + output: {"result": true} + + - name: "int64 less than or equal false" + mapping: | + output.result = 6 <= 5 + output: {"result": false} + + # --- Numeric with promotion --- + + - name: "int32 < int64 with promotion" + mapping: | + output.result = 3.int32() < 5 + output: {"result": true} + + - name: "int64 > float64 with promotion" + mapping: | + output.result = 5 > 4.5 + output: {"result": true} + + - name: "float32 <= float64 with promotion" + mapping: | + output.result = 3.0.float32() <= 3.0 + output: {"result": true} + + - name: "uint32 >= int32 with promotion to int64" + mapping: | + output.result = 10.uint32() >= 10.int32() + output: {"result": true} + + # --- String comparisons (lexicographic by codepoint) --- + + - name: "string less than lexicographic" + mapping: | + output.result = "apple" < "banana" + output: {"result": true} + + - name: "string equal not less than" + mapping: | + output.result = "hello" <= "hello" + output: {"result": true} + + - name: "string prefix is less than longer string" + mapping: | + output.result = "abc" < "abcd" + output: {"result": true} + + - name: "string uppercase A < lowercase a (codepoint ordering)" + mapping: | + output.result 
= "A" < "a" + output: {"result": true} + + # --- Bytes comparisons (lexicographic by byte) --- + + - name: "bytes less than" + mapping: | + output.result = "abc".bytes() < "abd".bytes() + output: {"result": true} + + - name: "bytes greater than" + mapping: | + output.result = "xyz".bytes() > "abc".bytes() + output: {"result": true} + + - name: "bytes prefix is less than longer bytes" + mapping: | + output.result = "ab".bytes() < "abc".bytes() + output: {"result": true} + + # --- Timestamp comparisons --- + + - name: "earlier timestamp less than later" + input: + a: {_type: "timestamp", value: "2024-01-01T00:00:00Z"} + b: {_type: "timestamp", value: "2024-06-15T00:00:00Z"} + mapping: | + output.result = input.a < input.b + output: {"result": true} + + - name: "same timestamp greater than or equal" + input: + a: {_type: "timestamp", value: "2024-01-01T00:00:00Z"} + mapping: | + output.result = input.a >= input.a + output: {"result": true} + + # --- Cross-family comparisons: error (not just false) --- + + - name: "int64 < string is error" + mapping: | + output.result = 5 < "hello" + error: "cannot compare" + + - name: "string > int64 is error" + mapping: | + output.result = "hello" > 5 + error: "cannot compare" + + - name: "int64 < bool is error" + mapping: | + output.result = 5 < true + error: "cannot compare" + + - name: "string < bytes is error" + mapping: | + output.result = "hello" < "hello".bytes() + error: "cannot compare" + + - name: "timestamp < int64 is error" + input: + a: {_type: "timestamp", value: "2024-01-01T00:00:00Z"} + mapping: | + output.result = input.a < 5 + error: "cannot compare" + + - name: "bool < bool is error (not comparable)" + mapping: | + output.result = true < false + error: "cannot compare" + + - name: "array < array is error (not comparable)" + mapping: | + output.result = [1, 2] < [3, 4] + error: "cannot compare" + + # --- Null in comparison: error --- + + - name: "null < int64 is error" + mapping: | + output.result = null < 5 + 
error: "cannot compare" + + - name: "int64 > null is error" + mapping: | + output.result = 5 > null + error: "cannot compare" + + - name: "null <= null is error" + mapping: | + output.result = null <= null + error: "cannot compare" + + # --- NaN comparisons: all return false --- + + - name: "NaN < float64 is false" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + output.result = input.val < 1.0 + output: {"result": false} + + - name: "NaN > float64 is false" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + output.result = input.val > 1.0 + output: {"result": false} + + - name: "NaN >= NaN is false" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + output.result = input.val >= input.val + output: {"result": false} + + # --- Infinity comparisons --- + + - name: "Infinity > float64" + input: {val: {_type: "float64", value: "Infinity"}} + mapping: | + output.result = input.val > 999999.0 + output: {"result": true} + + - name: "negative Infinity < float64" + input: {val: {_type: "float64", value: "-Infinity"}} + mapping: | + output.result = input.val < -999999.0 + output: {"result": true} + + # --- Non-associative chaining is compile error --- + + - name: "chained comparison is compile error" + mapping: | + output.result = 1 < 2 < 3 + compile_error: "cannot chain" diff --git a/internal/bloblang2/spec/tests/operators/division_modulo.yaml b/internal/bloblang2/spec/tests/operators/division_modulo.yaml new file mode 100644 index 000000000..1b23968aa --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/division_modulo.yaml @@ -0,0 +1,118 @@ +description: > + Division and modulo semantics — division always produces float, + modulo uses truncated division (fmod) semantics, division by zero + errors, and negative zero handling. 
+ +tests: + # --- Division always produces float --- + + - name: "integer / integer produces float" + mapping: | + output.v = 7 / 2 + output: {"v": 3.5} + + - name: "even integer division produces float" + mapping: | + output.v = 6 / 2 + output: {"v": 3.0} + + - name: "large integer division produces float" + mapping: | + output.v = 1000000 / 3 + output: {"v": 333333.33333333331} + + - name: "float / float produces float" + mapping: | + output.v = 7.5 / 2.5 + output: {"v": 3.0} + + - name: "integer / float produces float" + mapping: | + output.v = 10 / 2.5 + output: {"v": 4.0} + + - name: "float / integer produces float" + mapping: | + output.v = 7.5 / 3 + output: {"v": 2.5} + + # --- Division by zero --- + + - name: "integer division by zero is error" + mapping: | + output.v = 5 / 0 + error: "division by zero" + + - name: "float division by zero is error" + mapping: | + output.v = 5.0 / 0.0 + error: "division by zero" + + - name: "zero divided by zero is error" + mapping: | + output.v = 0 / 0 + error: "division by zero" + + # --- Modulo --- + + - name: "integer modulo" + mapping: | + output.v = 7 % 3 + output: {"v": 1} + + - name: "integer modulo evenly divisible" + mapping: | + output.v = 9 % 3 + output: {"v": 0} + + - name: "negative dividend modulo (truncated division)" + mapping: | + output.v = -7 % 3 + output: {"v": -1} + + - name: "negative divisor modulo" + mapping: | + output.v = 7 % -3 + output: {"v": 1} + + - name: "both negative modulo" + mapping: | + output.v = -7 % -3 + output: {"v": -1} + + - name: "float modulo (fmod semantics)" + mapping: | + output.v = 7.5 % 2.5 + output: {"v": 0.0} + + - name: "float modulo with remainder" + mapping: | + output.v = 7.0 % 2.5 + output: {"v": 2.0} + + - name: "negative float modulo" + mapping: | + output.v = -7.5 % 2.5 + output: {"v": -0.0} + + - name: "modulo by zero is error" + mapping: | + output.v = 5 % 0 + error: "modulo by zero" + + - name: "float modulo by zero is error" + mapping: | + output.v = 5.0 % 0.0 + 
error: "modulo by zero" + + # --- Negative zero --- + + - name: "negative zero equals positive zero" + mapping: | + output.v = -0.0 == 0.0 + output: {"v": true} + + - name: "negative zero is not less than positive zero" + mapping: | + output.v = -0.0 < 0.0 + output: {"v": false} diff --git a/internal/bloblang2/spec/tests/operators/equality.yaml b/internal/bloblang2/spec/tests/operators/equality.yaml new file mode 100644 index 000000000..8af9ca5dc --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/equality.yaml @@ -0,0 +1,215 @@ +description: "Equality operators (==, !=) for all type combos, NaN, -0, cross-family, collections" + +tests: + # --- Numeric equality (same type) --- + + - name: "int64 == int64 true" + mapping: | + output.result = 5 == 5 + output: {"result": true} + + - name: "int64 != int64 true" + mapping: | + output.result = 5 != 6 + output: {"result": true} + + # --- Numeric equality with promotion --- + + - name: "int64 == float64 with promotion true" + mapping: | + output.result = 5 == 5.0 + output: {"result": true} + + - name: "int64 == float64 with promotion false" + mapping: | + output.result = 5 == 5.5 + output: {"result": false} + + - name: "int32 == int64 with promotion true" + mapping: | + output.result = 5.int32() == 5 + output: {"result": true} + + - name: "uint32 == int64 with promotion true" + mapping: | + output.result = 42.uint32() == 42 + output: {"result": true} + + # --- String equality --- + + - name: "string == string true" + mapping: | + output.result = "hello" == "hello" + output: {"result": true} + + - name: "empty string == empty string" + mapping: | + output.result = "" == "" + output: {"result": true} + + # --- Boolean equality --- + + - name: "true == true" + mapping: | + output.result = true == true + output: {"result": true} + + - name: "true == false" + mapping: | + output.result = true == false + output: {"result": false} + + # --- Null equality --- + + - name: "null == null true" + mapping: | + output.result = 
null == null + output: {"result": true} + + - name: "null != null false" + mapping: | + output.result = null != null + output: {"result": false} + + # --- Cross-family: always false, NOT error --- + + - name: "int64 == string is false (not error)" + mapping: | + output.result = 5 == "5" + output: {"result": false} + + - name: "int64 != string is true" + mapping: | + output.result = 5 != "5" + output: {"result": true} + + - name: "bool == int64 is false" + mapping: | + output.result = true == 1 + output: {"result": false} + + - name: "null == int64 is false" + mapping: | + output.result = null == 0 + output: {"result": false} + + - name: "string == bytes is false" + mapping: | + output.result = "hello" == "hello".bytes() + output: {"result": false} + + # --- NaN equality --- + + - name: "self-equality for special floats" + mapping: | + output.result = input.val == input.val + cases: + - name: "NaN == NaN is false" + input: {val: {_type: "float64", value: "NaN"}} + output: {"result": false} + - name: "Infinity == Infinity is true" + input: {val: {_type: "float64", value: "Infinity"}} + output: {"result": true} + + - name: "NaN != NaN is true" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + output.result = input.val != input.val + output: {"result": true} + + # --- Negative zero --- + + - name: "-0.0 == 0.0 is true" + input: {val: {_type: "float64", value: "-0.0"}} + mapping: | + output.result = input.val == 0.0 + output: {"result": true} + + - name: "Infinity != negative Infinity" + input: + a: {_type: "float64", value: "Infinity"} + b: {_type: "float64", value: "-Infinity"} + mapping: | + output.result = input.a != input.b + output: {"result": true} + + # --- Bytes equality --- + + - name: "bytes == bytes true" + mapping: | + output.result = "hello".bytes() == "hello".bytes() + output: {"result": true} + + # --- Timestamp equality --- + + - name: "same timestamp == true" + input: + a: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + b: {_type: 
"timestamp", value: "2024-01-15T10:30:00Z"} + mapping: | + output.result = input.a == input.b + output: {"result": true} + + - name: "different timestamp != true" + input: + a: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + b: {_type: "timestamp", value: "2024-06-15T10:30:00Z"} + mapping: | + output.result = input.a != input.b + output: {"result": true} + + # --- Array equality --- + + - name: "array == array same order true" + mapping: | + output.result = [1, 2, 3] == [1, 2, 3] + output: {"result": true} + + - name: "array == array different order false (order matters)" + mapping: | + output.result = [1, 2, 3] == [3, 2, 1] + output: {"result": false} + + - name: "array == array different length false" + mapping: | + output.result = [1, 2] == [1, 2, 3] + output: {"result": false} + + - name: "empty array == empty array true" + mapping: | + output.result = [] == [] + output: {"result": true} + + # --- Object equality (order-independent) --- + + - name: "object == object same keys true" + mapping: | + output.result = {"a": 1, "b": 2} == {"a": 1, "b": 2} + output: {"result": true} + + - name: "object == object different key order still true" + mapping: | + output.result = {"a": 1, "b": 2} == {"b": 2, "a": 1} + output: {"result": true} + + - name: "object == object different values false" + mapping: | + output.result = {"a": 1, "b": 2} == {"a": 1, "b": 3} + output: {"result": false} + + - name: "empty object == empty object true" + mapping: | + output.result = {} == {} + output: {"result": true} + + - name: "nested object deep equality" + mapping: | + output.result = {"a": {"x": 1}} == {"a": {"x": 1}} + output: {"result": true} + + # --- Non-associative chaining is compile error --- + + - name: "chained equality is compile error" + mapping: | + output.result = 1 == 1 == true + compile_error: "cannot chain" diff --git a/internal/bloblang2/spec/tests/operators/logical.yaml b/internal/bloblang2/spec/tests/operators/logical.yaml new file mode 100644 index 
000000000..fb0c16c04 --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/logical.yaml @@ -0,0 +1,265 @@ +description: "Logical operators (&&, ||, !) — boolean requirement, short-circuit evaluation, associativity" + +tests: + # --- Basic AND --- + - name: "and_true_true" + mapping: | + output = true && true + output: true + + - name: "and_true_false" + mapping: | + output = true && false + output: false + + - name: "and_false_true" + mapping: | + output = false && true + output: false + + - name: "and_false_false" + mapping: | + output = false && false + output: false + + # --- Basic OR --- + - name: "or_true_true" + mapping: | + output = true || true + output: true + + - name: "or_true_false" + mapping: | + output = true || false + output: true + + - name: "or_false_true" + mapping: | + output = false || true + output: true + + - name: "or_false_false" + mapping: | + output = false || false + output: false + + # --- Basic NOT --- + - name: "not_true" + mapping: | + output = !true + output: false + + - name: "not_false" + mapping: | + output = !false + output: true + + - name: "double_negation" + mapping: | + output = !!true + output: true + + # --- Short-circuit AND --- + - name: "and_short_circuit_false_lhs" + mapping: | + output = false && throw("should not be evaluated") + output: false + + - name: "and_no_short_circuit_true_lhs" + mapping: | + output = true && throw("evaluated") + error: "evaluated" + + # --- Short-circuit OR --- + - name: "or_short_circuit_true_lhs" + mapping: | + output = true || throw("should not be evaluated") + output: true + + - name: "or_no_short_circuit_false_lhs" + mapping: | + output = false || throw("evaluated") + error: "evaluated" + + # --- Non-boolean operands: AND --- + - name: "and_int_lhs" + mapping: | + output = 5 && true + error: "bool" + + - name: "and_int_rhs" + mapping: | + output = true && 1 + error: "bool" + + - name: "and_string_lhs" + mapping: | + output = "yes" && true + error: "bool" + + - name: 
"and_null_lhs" + mapping: | + output = null && true + error: "bool" + + - name: "and_null_rhs" + mapping: | + output = true && null + error: "bool" + + - name: "and_array_lhs" + mapping: | + output = [1, 2] && true + error: "bool" + + - name: "and_object_lhs" + mapping: | + output = {"a": 1} && true + error: "bool" + + # --- Non-boolean operands: OR --- + - name: "or_int_lhs" + mapping: | + output = 0 || false + error: "bool" + + - name: "or_string_rhs" + mapping: | + output = false || "fallback" + error: "bool" + + - name: "or_float_lhs" + mapping: | + output = 1.0 || true + error: "bool" + + # --- Non-boolean operands: NOT --- + - name: "not_int" + mapping: | + output = !5 + error: "bool" + + - name: "not_string" + mapping: | + output = !"hello" + error: "bool" + + - name: "not_null" + mapping: | + output = !null + error: "bool" + + - name: "not_zero" + mapping: | + output = !0 + error: "bool" + + - name: "not_empty_string" + mapping: | + output = !"" + error: "bool" + + # --- Short-circuit skips type check on unevaluated side --- + - name: "and_short_circuit_skips_type_check" + mapping: | + output = false && "not a bool" + output: false + + - name: "or_short_circuit_skips_type_check" + mapping: | + output = true || 42 + output: true + + # --- Left-associativity of AND --- + - name: "and_left_associative" + mapping: | + output = true && false && true + output: false + + - name: "and_chain_all_true" + mapping: | + output = true && true && true + output: true + + # --- Left-associativity of OR --- + - name: "or_left_associative" + mapping: | + output = false || false || true + output: true + + - name: "or_chain_all_false" + mapping: | + output = false || false || false + output: false + + # --- AND has higher precedence than OR --- + - name: "and_binds_tighter_than_or" + mapping: | + output = true || false && false + output: true + + - name: "and_binds_tighter_than_or_reversed" + mapping: | + output = false && false || true + output: true + + - name: 
"parens_override_and_or_precedence" + mapping: | + output = true || (false && false) + output: true + + - name: "parens_force_or_first" + mapping: | + output = (true || false) && false + output: false + + # --- Combinations --- + - name: "not_with_and" + mapping: | + output = !false && true + output: true + + - name: "not_with_or" + mapping: | + output = !true || false + output: false + + - name: "complex_logical_expression" + mapping: | + output = (true || false) && !(false && true) + output: true + + - name: "not_binds_tighter_than_and" + mapping: | + output = !false && !false + output: true + + # --- Short-circuit preserves left-to-right with associativity --- + - name: "and_chain_short_circuits_at_first_false" + mapping: | + output = true && false && throw("should not reach") + output: false + + - name: "or_chain_short_circuits_at_first_true" + mapping: | + output = false || true || throw("should not reach") + output: true + + # --- Using input values --- + - name: "and_with_input_booleans" + mapping: | + output = input.a && input.b + input: {"a": true, "b": false} + output: false + + - name: "or_with_input_booleans" + mapping: | + output = input.a || input.b + input: {"a": false, "b": true} + output: true + + - name: "not_with_input_boolean" + mapping: | + output = !input.flag + input: {"flag": true} + output: false diff --git a/internal/bloblang2/spec/tests/operators/numeric_promotion.yaml b/internal/bloblang2/spec/tests/operators/numeric_promotion.yaml new file mode 100644 index 000000000..7002f3763 --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/numeric_promotion.yaml @@ -0,0 +1,186 @@ +description: "Numeric promotion rules: same type, width promotion, signed/unsigned, int/float, checked errors" + +tests: + # --- Same type: no promotion --- + + - name: "int64 + int64 stays int64" + mapping: | + output.result = 10 + 20 + output: {"result": 30} + + - name: "float64 + float64 stays float64" + mapping: | + output.result = 1.5 + 2.5 + output: 
{"result": 4.0} + + - name: "int32 + int32 stays int32" + mapping: | + output.result = 10.int32() + 20.int32() + output: {"result": {_type: "int32", value: "30"}} + + - name: "uint32 + uint32 stays uint32" + mapping: | + output.result = 10.uint32() + 20.uint32() + output: {"result": {_type: "uint32", value: "30"}} + + - name: "uint64 + uint64 stays uint64" + mapping: | + output.result = 10.uint64() + 20.uint64() + output: {"result": {_type: "uint64", value: "30"}} + + - name: "float32 + float32 stays float32" + mapping: | + output.result = 1.5.float32() + 2.5.float32() + output: {"result": {_type: "float32", value: "4.0"}} + + # --- Same signedness, different width: promote to wider --- + + - name: "int32 + int64 promotes to int64" + mapping: | + output.result = 5.int32() + 10 + output: {"result": 15} + + - name: "int64 + int32 promotes to int64" + mapping: | + output.result = 10 + 5.int32() + output: {"result": 15} + + - name: "uint32 + uint64 promotes to uint64" + mapping: | + output.result = 5.uint32() + 10.uint64() + output: {"result": {_type: "uint64", value: "15"}} + + - name: "float32 + float64 promotes to float64" + mapping: | + output.result = 1.5.float32() + 2.5 + output: {"result": 4.0} + + - name: "float64 + float32 promotes to float64" + mapping: | + output.result = 2.5 + 1.5.float32() + output: {"result": 4.0} + + # --- Signed + unsigned integer: promote to int64 --- + + - name: "int32 + uint32 promotes to int64" + mapping: | + output.result = 5.int32() + 10.uint32() + output: {"result": 15} + + - name: "int64 + uint32 promotes to int64" + mapping: | + output.result = 100 + 50.uint32() + output: {"result": 150} + + - name: "int32 + uint64 promotes to int64 when value fits" + mapping: | + output.result = 5.int32() + 10.uint64() + output: {"result": 15} + + - name: "int64 + uint64 promotes to int64 when value fits" + mapping: | + output.result = 5 + 10.uint64() + output: {"result": 15} + + - name: "uint64 exceeding int64 max causes error" + mapping: | + 
output.result = 5 + "9999999999999999999".uint64() + error: "uint64 value" + + - name: "uint64 at int64 max boundary succeeds" + mapping: | + output.result = 0 + 9223372036854775807.uint64() + output: {"result": 9223372036854775807} + + - name: "uint64 just above int64 max causes error" + mapping: | + output.result = 0 + "9223372036854775808".uint64() + error: "uint64 value" + + # --- Any integer + any float: promote to float64 --- + + - name: "int64 + float64 promotes to float64" + mapping: | + output.result = 5 + 3.0 + output: {"result": 8.0} + + - name: "float64 + int64 promotes to float64" + mapping: | + output.result = 3.0 + 5 + output: {"result": 8.0} + + - name: "int32 + float64 promotes to float64" + mapping: | + output.result = 5.int32() + 3.0 + output: {"result": 8.0} + + - name: "uint64 + float64 promotes to float64 when safe" + mapping: | + output.result = 100.uint64() + 1.5 + output: {"result": 101.5} + + - name: "int32 + float32 promotes to float64" + mapping: | + output.result = 5.int32() + 3.0.float32() + output: {"result": 8.0} + + - name: "uint32 + float32 promotes to float64" + mapping: | + output.result = 10.uint32() + 2.5.float32() + output: {"result": 12.5} + + # --- Checked promotion: integer magnitude > 2^53 to float --- + + - name: "int64 exceeding float64 exact range causes error" + mapping: | + output.result = 9007199254740993 + 1.0 + error: "float64 exact range" + + - name: "int64 at float64 exact limit succeeds" + mapping: | + output.result = 9007199254740992 + 1.0 + output: {"result": 9007199254740992.0} + + - name: "negative int64 exceeding float64 exact range causes error" + mapping: | + output.result = -9007199254740993 + 1.0 + error: "float64 exact range" + + # --- Promotion with comparison operators --- + + - name: "int32 < int64 promotes to int64 for comparison" + mapping: | + output.result = 5.int32() < 10 + output: {"result": true} + + - name: "int64 >= float64 promotes to float64 for comparison" + mapping: | + output.result = 
5 >= 5.0 + output: {"result": true} + + - name: "uint32 > int32 promotes to int64 for comparison" + mapping: | + output.result = 10.uint32() > 5.int32() + output: {"result": true} + + # --- Promotion with equality --- + + - name: "int64 == float64 promotes to float64" + mapping: | + output.result = 5 == 5.0 + output: {"result": true} + + - name: "int32 == int64 promotes to int64" + mapping: | + output.result = 5.int32() == 5 + output: {"result": true} + + - name: "int32 == float64 promotes to float64" + mapping: | + output.result = 5.int32() == 5.0 + output: {"result": true} + + - name: "uint32 == int64 promotes to int64" + mapping: | + output.result = 42.uint32() == 42 + output: {"result": true} diff --git a/internal/bloblang2/spec/tests/operators/numeric_promotion_edge.yaml b/internal/bloblang2/spec/tests/operators/numeric_promotion_edge.yaml new file mode 100644 index 000000000..e6b3feadf --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/numeric_promotion_edge.yaml @@ -0,0 +1,179 @@ +description: > + Numeric promotion edge cases — uint64 boundary values with signed ops, + large integers with float, promotion in subtraction/multiplication/modulo, + and cross-type comparison edge cases. 
+ +tests: + # --- uint64 boundary with signed: 2^63-1 is max safe value --- + + - name: "uint64 above int64 max in subtraction errors" + mapping: | + output.result = 0 - "9223372036854775808".uint64() + error: "uint64 value" + + - name: "uint64 value 1 in mixed arithmetic works" + mapping: | + output.result = 10 - 3.uint64() + output: {"result": 7} + + - name: "uint64 zero in mixed arithmetic works" + mapping: | + output.result = 5 + 0.uint64() + output: {"result": 5} + + - name: "int32 + uint64 exceeding int64 max errors" + mapping: | + output.result = 1.int32() + "9223372036854775808".uint64() + error: "uint64 value" + + # --- Large integers with float: 2^53 boundary --- + + - name: "int64 at 2^53 with float succeeds" + mapping: | + output.result = 9007199254740992 + 0.0 + output: {"result": 9007199254740992.0} + + - name: "int64 at 2^53+1 with float errors" + mapping: | + output.result = 9007199254740993 + 0.0 + error: "float64 exact range" + + - name: "negative int64 at -2^53 with float succeeds" + mapping: | + output.result = -9007199254740992 + 0.0 + output: {"result": -9007199254740992.0} + + - name: "negative int64 at -(2^53+1) with float errors" + mapping: | + output.result = -9007199254740993 + 0.0 + error: "float64 exact range" + + - name: "int64 * float64 checks promotion" + mapping: | + output.result = 9007199254740993 * 1.0 + error: "float64 exact range" + + - name: "int64 at safe range with float multiplication" + mapping: | + output.result = 1000000 * 1.5 + output: {"result": 1500000.0} + + # --- Promotion in subtraction --- + + - name: "int32 - uint32 promotes to int64" + mapping: | + output.result = 100.int32() - 30.uint32() + output: {"result": 70} + + - name: "uint32 - int32 promotes to int64" + mapping: | + output.result = 100.uint32() - 30.int32() + output: {"result": 70} + + - name: "float64 - int64 promotes to float64" + mapping: | + output.result = 10.5 - 3 + output: {"result": 7.5} + + - name: "int64 - float64 promotes to float64" + 
mapping: | + output.result = 10 - 3.5 + output: {"result": 6.5} + + # --- Promotion in multiplication --- + + - name: "int32 * int64 promotes to int64" + mapping: | + output.result = 5.int32() * 10 + output: {"result": 50} + + - name: "int32 * float64 promotes to float64" + mapping: | + output.result = 5.int32() * 2.5 + output: {"result": 12.5} + + - name: "uint32 * int64 promotes to int64" + mapping: | + output.result = 5.uint32() * 10 + output: {"result": 50} + + - name: "uint64 * uint64 stays uint64" + mapping: | + output.result = 5.uint64() * 10.uint64() + output: {"result": {_type: "uint64", value: "50"}} + + # --- Promotion in modulo --- + + - name: "int32 % int64 promotes to int64" + mapping: | + output.result = 7.int32() % 3 + output: {"result": 1} + + - name: "int64 % float64 promotes to float64" + mapping: | + output.result = 7 % 2.5 + output: {"result": 2.0} + + - name: "uint32 % int32 promotes to int64" + mapping: | + output.result = 7.uint32() % 3.int32() + output: {"result": 1} + + # --- Promotion in comparison edge cases --- + + - name: "uint64 at int64 max compared with int64" + mapping: | + output.result = 9223372036854775807.uint64() == 9223372036854775807 + output: {"result": true} + + - name: "uint64 above int64 max in comparison errors" + mapping: | + output.result = "9223372036854775808".uint64() > 0 + error: "uint64 value" + + - name: "int64 at float64 limit compared with float" + mapping: | + output.result = 9007199254740992 == 9007199254740992.0 + output: {"result": true} + + - name: "int64 above float64 limit compared with float errors" + mapping: | + output.result = 9007199254740993 == 9007199254740993.0 + error: "float64 exact range" + + # --- Cross-family comparison always false --- + + - name: "int vs string comparison is false" + mapping: | + output.result = 5 == "5" + output: {"result": false} + + - name: "float vs string comparison is false" + mapping: | + output.result = 5.0 == "5.0" + output: {"result": false} + + - name: "bool vs 
int comparison is false" + mapping: | + output.result = true == 1 + output: {"result": false} + + - name: "null vs empty string comparison is false" + mapping: | + output.result = null == "" + output: {"result": false} + + - name: "null vs zero comparison is false" + mapping: | + output.result = null == 0 + output: {"result": false} + + - name: "null vs false comparison is false" + mapping: | + output.result = null == false + output: {"result": false} + + - name: "array vs string comparison is false" + mapping: | + output.result = [] == "" + output: {"result": false} diff --git a/internal/bloblang2/spec/tests/operators/precedence.yaml b/internal/bloblang2/spec/tests/operators/precedence.yaml new file mode 100644 index 000000000..2208db1a2 --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/precedence.yaml @@ -0,0 +1,260 @@ +description: "Operator precedence, associativity, non-associative parse errors, unary minus + method call trap" + +tests: + # --- Multiplicative before additive --- + - name: "multiply_before_add" + mapping: | + output = 2 + 3 * 4 + output: 14 + + - name: "divide_before_subtract" + mapping: | + output = 10 - 6 / 2 + output: 7.0 + + - name: "modulo_before_add" + mapping: | + output = 10 + 7 % 3 + output: 11 + + - name: "parens_override_mult_add" + mapping: | + output = (2 + 3) * 4 + output: 20 + + # --- Additive before comparison --- + - name: "add_before_greater_than" + mapping: | + output = 3 + 2 > 4 + output: true + + - name: "add_before_less_than" + mapping: | + output = 1 + 1 < 3 + output: true + + - name: "subtract_before_greater_equal" + mapping: | + output = 10 - 5 >= 5 + output: true + + - name: "subtract_before_less_equal" + mapping: | + output = 10 - 5 <= 4 + output: false + + # --- Comparison before equality --- + - name: "comparison_before_equality" + mapping: | + output = 3 > 2 == true + output: true + + - name: "comparison_before_inequality" + mapping: | + output = 3 < 2 != true + output: true + + # --- Equality before 
logical AND --- + - name: "equality_before_and" + mapping: | + output = 1 == 1 && 2 == 2 + output: true + + - name: "inequality_before_and" + mapping: | + output = 1 != 2 && 3 != 4 + output: true + + # --- Logical AND before logical OR --- + - name: "and_before_or" + mapping: | + output = true || false && false + output: true + + - name: "and_before_or_reversed" + mapping: | + output = false && false || true + output: true + + - name: "parens_override_and_or" + mapping: | + output = (true || false) && false + output: false + + # --- Unary minus precedence --- + - name: "unary_minus_before_multiply" + mapping: | + output = -3 * 2 + output: -6 + + - name: "unary_minus_before_add" + mapping: | + output = -3 + 5 + output: 2 + + - name: "unary_not_before_and" + mapping: | + output = !false && true + output: true + + - name: "unary_not_before_or" + mapping: | + output = !true || true + output: true + + # --- Method calls bind tighter than unary minus --- + - name: "unary_minus_method_call_trap" + mapping: | + output = -10.string() + error: "string" + + - name: "unary_minus_method_call_with_parens" + mapping: | + output = (-10).string() + output: "-10" + + - name: "unary_minus_method_chain_trap" + mapping: | + output = -5.float64() + output: -5.0 + + # --- Field access / indexing binds tightest --- + - name: "field_access_before_arithmetic" + mapping: | + output = input.a + input.b * input.c + input: {"a": 2, "b": 3, "c": 4} + output: 14 + + - name: "indexing_before_arithmetic" + mapping: | + output = input.items[0] + input.items[1] + input: {"items": [10, 20]} + output: 30 + + - name: "method_call_before_arithmetic" + mapping: | + output = input.text.length() + 1 + input: {"text": "hello"} + output: 6 + + - name: "method_call_before_comparison" + mapping: | + output = input.text.length() > 3 + input: {"text": "hello"} + output: true + + # --- Left-associativity of arithmetic --- + - name: "subtraction_left_associative" + mapping: | + output = 10 - 5 - 2 + output: 3 + + - 
name: "division_left_associative" + mapping: | + output = 20 / 4 / 2 + output: 2.5 + + - name: "modulo_left_associative" + mapping: | + output = 17 % 10 % 5 + output: 2 + + - name: "addition_left_associative" + mapping: | + output = 1 + 2 + 3 + output: 6 + + - name: "multiplication_left_associative" + mapping: | + output = 2 * 3 * 4 + output: 24 + + # --- Non-associative: comparison chaining is parse error --- + - name: "chained_less_than" + mapping: | + output = 1 < 2 < 3 + compile_error: "chain" + + - name: "chained_greater_than" + mapping: | + output = 3 > 2 > 1 + compile_error: "chain" + + - name: "chained_less_equal" + mapping: | + output = 1 <= 2 <= 3 + compile_error: "chain" + + - name: "chained_greater_equal" + mapping: | + output = 3 >= 2 >= 1 + compile_error: "chain" + + - name: "chained_mixed_comparison" + mapping: | + output = 1 < 2 > 0 + compile_error: "chain" + + # --- Non-associative: equality chaining is parse error --- + - name: "chained_equality" + mapping: | + output = 1 == 1 == true + compile_error: "chain" + + - name: "chained_inequality" + mapping: | + output = 1 != 2 != 3 + compile_error: "chain" + + # --- Correct way to express range checks --- + - name: "range_check_with_and" + mapping: | + output = 1 < 2 && 2 < 3 + output: true + + - name: "equality_chain_with_and" + mapping: | + output = 1 == 1 && 2 == 2 + output: true + + # --- Complex precedence combinations --- + - name: "full_precedence_chain" + mapping: | + output = 2 + 3 * 4 > 10 == true && !false + output: true + + - name: "arithmetic_in_comparison_in_logical" + mapping: | + output = 1 + 2 >= 3 && 4 * 2 <= 8 || false + output: true + + - name: "nested_parens_override_everything" + mapping: | + output = ((2 + 3) * (4 - 1)) > ((10 / 2) + 1) + output: true + + - name: "unary_minus_in_complex_expression" + mapping: | + output = -2 * 3 + 4 + output: -2 + + - name: "double_unary_minus" + mapping: | + output = - -5 + output: 5 + + - name: "unary_minus_with_parens_and_method" + mapping: | 
+ output = (-3 * 2).string() + output: "-6" + + # --- Logical left-associativity --- + - name: "and_left_associativity_with_short_circuit" + mapping: | + output = true && false && throw("should not reach") + output: false + + - name: "or_left_associativity_with_short_circuit" + mapping: | + output = false || true || throw("should not reach") + output: true diff --git a/internal/bloblang2/spec/tests/operators/string_concat.yaml b/internal/bloblang2/spec/tests/operators/string_concat.yaml new file mode 100644 index 000000000..2f104bb61 --- /dev/null +++ b/internal/bloblang2/spec/tests/operators/string_concat.yaml @@ -0,0 +1,201 @@ +description: "String and bytes concatenation with + operator — same-type concat, cross-type errors, explicit conversion" + +tests: + # --- String + String --- + - name: "concat_two_strings" + mapping: | + output = "hello" + " world" + output: "hello world" + + - name: "concat_empty_left" + mapping: | + output = "" + "world" + output: "world" + + - name: "concat_empty_right" + mapping: | + output = "hello" + "" + output: "hello" + + - name: "concat_both_empty" + mapping: | + output = "" + "" + output: "" + + - name: "concat_multiple_strings" + mapping: | + output = "a" + "b" + "c" + output: "abc" + + - name: "concat_unicode_strings" + mapping: | + output = "café" + " ☕" + output: "café ☕" + + - name: "concat_emoji_strings" + mapping: | + output = "hello " + "😀" + output: "hello 😀" + + - name: "concat_with_escape_sequences" + mapping: | + output = "line1\n" + "line2" + output: "line1\nline2" + + - name: "concat_from_input" + mapping: | + output = input.first + " " + input.last + input: {"first": "John", "last": "Doe"} + output: "John Doe" + + # --- Bytes + Bytes --- + - name: "concat_two_bytes" + mapping: | + output = "hello".bytes() + " world".bytes() + output: {_type: "bytes", value: "aGVsbG8gd29ybGQ="} + + - name: "concat_empty_bytes_left" + mapping: | + output = "".bytes() + "world".bytes() + output: {_type: "bytes", value: "d29ybGQ="} + 
+ - name: "concat_empty_bytes_right" + mapping: | + output = "hello".bytes() + "".bytes() + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "concat_both_empty_bytes" + mapping: | + output = "".bytes() + "".bytes() + output: {_type: "bytes", value: ""} + + - name: "concat_multiple_bytes" + mapping: | + output = "a".bytes() + "b".bytes() + "c".bytes() + output: {_type: "bytes", value: "YWJj"} + + # --- Cross-type errors: string + number --- + - name: "string_plus_int" + mapping: | + output = "count: " + 5 + error: "cannot add" + + - name: "int_plus_string" + mapping: | + output = 5 + " items" + error: "cannot add" + + - name: "string_plus_float" + mapping: | + output = "value: " + 3.14 + error: "cannot add" + + - name: "float_plus_string" + mapping: | + output = 3.14 + " meters" + error: "cannot add" + + # --- Cross-type errors: string + bytes --- + - name: "string_plus_bytes" + mapping: | + output = "hello" + "world".bytes() + error: "cannot add" + + - name: "bytes_plus_string" + mapping: | + output = "hello".bytes() + "world" + error: "cannot add" + + # --- Cross-type errors: string + bool --- + - name: "string_plus_bool" + mapping: | + output = "flag: " + true + error: "cannot add" + + - name: "bool_plus_string" + mapping: | + output = true + " is set" + error: "cannot add" + + # --- Cross-type errors: string + null --- + - name: "string_plus_null" + mapping: | + output = "value: " + null + error: "cannot add" + + - name: "null_plus_string" + mapping: | + output = null + "value" + error: "cannot add" + + # --- Cross-type errors: string + array --- + - name: "string_plus_array" + mapping: | + output = "items: " + [1, 2, 3] + error: "cannot add" + + # --- Cross-type errors: string + object --- + - name: "string_plus_object" + mapping: | + output = "data: " + {"a": 1} + error: "cannot add" + + # --- Cross-type errors: bytes + number --- + - name: "bytes_plus_int" + mapping: | + output = "data".bytes() + 42 + error: "cannot add" + + - name: "int_plus_bytes" + 
mapping: | + output = 42 + "data".bytes() + error: "cannot add" + + # --- Cross-type errors: bytes + bool --- + - name: "bytes_plus_bool" + mapping: | + output = "data".bytes() + true + error: "cannot add" + + # --- Cross-type errors: bytes + null --- + - name: "bytes_plus_null" + mapping: | + output = "data".bytes() + null + error: "cannot add" + + # --- Explicit conversion with .string() --- + - name: "int_to_string_concat" + mapping: | + output = 5.string() + "3" + output: "53" + + - name: "string_concat_int_rhs_converted" + mapping: | + output = "count: " + 42.string() + output: "count: 42" + + - name: "float_to_string_concat" + mapping: | + output = 3.14.string() + " meters" + output: "3.14 meters" + + - name: "bool_to_string_concat" + mapping: | + output = "flag is " + true.string() + output: "flag is true" + + - name: "null_to_string_concat" + mapping: | + output = "value is " + null.string() + output: "value is null" + + - name: "multiple_conversions" + mapping: | + output = "sum: " + 2.string() + " + " + 3.string() + " = " + 5.string() + output: "sum: 2 + 3 = 5" + + # --- Left-associativity of string concat --- + - name: "concat_left_associative" + mapping: | + output = "a" + "b" + "c" + "d" + output: "abcd" diff --git a/internal/bloblang2/spec/tests/optimizations/constant_folding.yaml b/internal/bloblang2/spec/tests/optimizations/constant_folding.yaml new file mode 100644 index 000000000..154d35fec --- /dev/null +++ b/internal/bloblang2/spec/tests/optimizations/constant_folding.yaml @@ -0,0 +1,235 @@ +description: "Constant folding: literal-only expressions evaluated at compile time" + +tests: + # --- Integer arithmetic --- + + - name: "fold integer addition" + mapping: | + output.v = 2 + 3 + output: {"v": 5} + + - name: "fold integer subtraction" + mapping: | + output.v = 10 - 4 + output: {"v": 6} + + - name: "fold integer multiplication" + mapping: | + output.v = 3 * 7 + output: {"v": 21} + + - name: "fold integer modulo" + mapping: | + output.v = 17 % 
5 + output: {"v": 2} + + - name: "fold chained integer arithmetic" + mapping: | + output.v = 2 + 3 * 4 + output: {"v": 14} + + - name: "fold left-associative subtraction" + mapping: | + output.v = 10 - 3 - 2 + output: {"v": 5} + + - name: "fold negative result" + mapping: | + output.v = 3 - 10 + output: {"v": -7} + + - name: "fold zero" + mapping: | + output.v = 5 - 5 + output: {"v": 0} + + # --- Integer overflow not folded (runtime error) --- + + - name: "integer overflow not folded" + mapping: | + output.v = 9223372036854775807 + 1 + error: "overflow" + + - name: "integer multiplication overflow not folded" + mapping: | + output.v = 9223372036854775807 * 2 + error: "overflow" + + # --- Float arithmetic --- + + - name: "fold float addition" + mapping: | + output.v = 1.5 + 2.5 + output: {"v": 4.0} + + - name: "fold float subtraction" + mapping: | + output.v = 10.0 - 3.5 + output: {"v": 6.5} + + - name: "fold float multiplication" + mapping: | + output.v = 2.0 * 3.5 + output: {"v": 7.0} + + - name: "fold float division" + mapping: | + output.v = 10.0 / 4.0 + output: {"v": 2.5} + + - name: "fold float modulo" + mapping: | + output.v = 7.5 % 2.0 + output: {"v": 1.5} + + - name: "fold mixed int and float addition" + mapping: | + output.v = 5 + 3.0 + output: {"v": 8.0} + + - name: "division by zero not folded" + mapping: | + output.v = 10.0 / 0.0 + error: "division by zero" + + # --- Large int + float precision loss not folded --- + + - name: "large int plus float not folded due to precision" + mapping: | + output.v = 9007199254740993 + 1.0 + error: "exact" + + - name: "safe int plus float is folded" + mapping: | + output.v = 100 + 1.5 + output: {"v": 101.5} + + # --- String concatenation --- + + - name: "fold string concatenation" + mapping: | + output.v = "hello" + " " + "world" + output: {"v": "hello world"} + + - name: "fold empty string concatenation" + mapping: | + output.v = "" + "abc" + output: {"v": "abc"} + + - name: "fold raw string concatenation" + mapping: 
| + output.v = `raw` + " normal" + output: {"v": "raw normal"} + + # --- Boolean logic --- + + - name: "fold true AND true" + mapping: | + output.v = true && true + output: {"v": true} + + - name: "fold true AND false" + mapping: | + output.v = true && false + output: {"v": false} + + - name: "fold false OR true" + mapping: | + output.v = false || true + output: {"v": true} + + - name: "fold false OR false" + mapping: | + output.v = false || false + output: {"v": false} + + - name: "fold boolean equality" + mapping: | + output.v = true == true + output: {"v": true} + + - name: "fold boolean inequality" + mapping: | + output.v = true != false + output: {"v": true} + + # --- Unary --- + + - name: "fold unary minus on int" + mapping: | + output.v = -42 + output: {"v": -42} + + - name: "fold unary minus on float" + mapping: | + output.v = -3.14 + output: {"v": -3.14} + + - name: "fold unary not on true" + mapping: | + output.v = !true + output: {"v": false} + + - name: "fold unary not on false" + mapping: | + output.v = !false + output: {"v": true} + + # --- Equality of same-type literals --- + + - name: "fold string equality" + mapping: | + output.v = "abc" == "abc" + output: {"v": true} + + - name: "fold string inequality" + mapping: | + output.v = "abc" != "xyz" + output: {"v": true} + + - name: "fold integer equality" + mapping: | + output.v = 42 == 42 + output: {"v": true} + + - name: "fold integer inequality" + mapping: | + output.v = 42 != 43 + output: {"v": true} + + - name: "fold null equality" + mapping: | + output.v = null == null + output: {"v": true} + + # --- Cross-type equality --- + + - name: "fold cross-type equality int vs string" + mapping: | + output.v = 42 == "42" + output: {"v": false} + + - name: "fold cross-type inequality bool vs null" + mapping: | + output.v = true != null + output: {"v": true} + + # --- Non-foldable expressions preserved --- + + - name: "runtime value prevents folding" + input: {"x": 5} + mapping: | + output.v = input.x + 3 + 
output: {"v": 8} + + - name: "one literal one runtime not folded" + input: {"x": 10} + mapping: | + output.v = 2 * input.x + output: {"v": 20} + + - name: "variable prevents folding" + mapping: | + $x = 5 + output.v = $x + 3 + output: {"v": 8} diff --git a/internal/bloblang2/spec/tests/optimizations/dead_code_elimination.yaml b/internal/bloblang2/spec/tests/optimizations/dead_code_elimination.yaml new file mode 100644 index 000000000..7df3b2ba1 --- /dev/null +++ b/internal/bloblang2/spec/tests/optimizations/dead_code_elimination.yaml @@ -0,0 +1,150 @@ +description: "Dead code elimination: unreachable if/match branches with literal boolean conditions are pruned" + +tests: + # --- If expression: literal true --- + + - name: "if true returns then branch" + mapping: | + output.v = if true { "yes" } else { "no" } + output: {"v": "yes"} + + - name: "if true with complex else (else not evaluated)" + mapping: | + output.v = if true { 42 } else { throw("should not reach") } + output: {"v": 42} + + # --- If expression: literal false --- + + - name: "if false returns else branch" + mapping: | + output.v = if false { "yes" } else { "no" } + output: {"v": "no"} + + - name: "if false with complex then (then not evaluated)" + mapping: | + output.v = if false { throw("should not reach") } else { 42 } + output: {"v": 42} + + - name: "if false without else produces void" + mapping: | + output.before = "exists" + output.v = if false { "value" } + output: {"before": "exists"} + + # --- Else-if chains --- + + - name: "else-if chain with literal true in first branch" + mapping: | + output.v = if true { "first" } else if true { "second" } else { "third" } + output: {"v": "first"} + + - name: "else-if chain with false then true" + mapping: | + output.v = if false { "first" } else if true { "second" } else { "third" } + output: {"v": "second"} + + - name: "else-if chain all false reaches else" + mapping: | + output.v = if false { "first" } else if false { "second" } else { "third" } + 
output: {"v": "third"} + + - name: "false branch with throw is eliminated" + mapping: | + output.v = if false { throw("dead") } else if true { "alive" } else { throw("also dead") } + output: {"v": "alive"} + + # --- If statement: literal true --- + + - name: "if true statement executes body" + mapping: | + if true { + output.v = "yes" + } + output: {"v": "yes"} + + - name: "if true statement eliminates else" + mapping: | + if true { + output.v = "yes" + } else { + output.v = "no" + } + output: {"v": "yes"} + + # --- If statement: literal false --- + + - name: "if false statement skips body" + mapping: | + output.before = "exists" + if false { + output.v = "yes" + } + output: {"before": "exists"} + + - name: "if false statement executes else" + mapping: | + if false { + output.v = "yes" + } else { + output.v = "no" + } + output: {"v": "no"} + + # --- Non-boolean literal conditions preserved for runtime error --- + + - name: "string condition is runtime error not eliminated" + mapping: | + output.v = if "hello" { 1 } else { 2 } + error: "bool" + + - name: "integer condition is runtime error not eliminated" + mapping: | + output.v = if 42 { 1 } else { 2 } + error: "bool" + + - name: "null condition is runtime error not eliminated" + mapping: | + output.v = if null { 1 } else { 2 } + error: "bool" + + - name: "integer condition in statement is runtime error" + mapping: | + if 42 { + output.v = "yes" + } + error: "bool" + + # --- Dynamic conditions not affected --- + + - name: "dynamic condition works normally" + mapping: | + output.v = if input.flag { "yes" } else { "no" } + cases: + - name: "true branch" + input: {"flag": true} + output: {"v": "yes"} + - name: "false branch" + input: {"flag": false} + output: {"v": "no"} + + # --- Folded condition feeds into DCE --- + + - name: "folded true condition triggers DCE" + mapping: | + output.v = if !false { "yes" } else { throw("dead") } + output: {"v": "yes"} + + - name: "folded false condition triggers DCE" + mapping: | +
output.v = if !true { throw("dead") } else { "no" } + output: {"v": "no"} + + - name: "folded boolean expression triggers DCE" + mapping: | + output.v = if true && true { "yes" } else { throw("dead") } + output: {"v": "yes"} + + - name: "folded false boolean expression triggers DCE" + mapping: | + output.v = if true && false { throw("dead") } else { "no" } + output: {"v": "no"} diff --git a/internal/bloblang2/spec/tests/optimizations/path_collapse.yaml b/internal/bloblang2/spec/tests/optimizations/path_collapse.yaml new file mode 100644 index 000000000..b11a0e898 --- /dev/null +++ b/internal/bloblang2/spec/tests/optimizations/path_collapse.yaml @@ -0,0 +1,167 @@ +description: "PathExpr collapse: chains of field access, indexing, and method calls are collapsed into flat path expressions" + +tests: + # --- Input field chains --- + + - name: "input single field" + input: {"x": 1} + mapping: | + output.v = input.x + output: {"v": 1} + + - name: "input two-level field chain" + input: {"a": {"b": 2}} + mapping: | + output.v = input.a.b + output: {"v": 2} + + - name: "input deep field chain" + input: {"a": {"b": {"c": {"d": {"e": 42}}}}} + mapping: | + output.v = input.a.b.c.d.e + output: {"v": 42} + + - name: "input field chain with index" + input: {"items": [10, 20, 30]} + mapping: | + output.v = input.items[1] + output: {"v": 20} + + - name: "input field chain with nested index" + input: {"data": {"items": [{"name": "Alice"}, {"name": "Bob"}]}} + mapping: | + output.v = input.data.items[1].name + output: {"v": "Bob"} + + - name: "input field chain with method call" + input: {"name": "Alice"} + mapping: | + output.v = input.name.uppercase() + output: {"v": "ALICE"} + + - name: "input field chain with method and further field access" + input: {"data": [3, 1, 2]} + mapping: | + output.v = input.data.sort()[0] + output: {"v": 1} + + - name: "input field chain with multiple methods" + input: {"text": " Hello World "} + mapping: | + output.v = input.text.trim().lowercase() + 
output: {"v": "hello world"} + + # --- Null-safe chains --- + + - name: "null-safe field chain short-circuits" + input: {"user": null} + mapping: | + output.v = input.user?.name?.trim() + output: {"v": null} + + - name: "null-safe index chain short-circuits" + input: {"items": null} + mapping: | + output.v = input.items?[0]?.name + output: {"v": null} + + - name: "null-safe method chain short-circuits" + input: {"value": null} + mapping: | + output.v = input.value?.uppercase() + output: {"v": null} + + - name: "mixed null-safe and regular access" + input: {"user": {"address": null}} + mapping: | + output.v = input.user.address?.city + output: {"v": null} + + # --- Output field chains --- + + - name: "output field chain read after write" + mapping: | + output.user = {"name": "Alice", "age": 30} + output.v = output.user.name + output: {"user": {"name": "Alice", "age": 30}, "v": "Alice"} + + - name: "output field chain with method" + mapping: | + output.data = {"name": "Alice"} + output.v = output.data.keys().length() + output: {"data": {"name": "Alice"}, "v": 1} + + # --- Variable field chains --- + + - name: "variable field chain" + mapping: | + $user = {"name": "Alice", "address": {"city": "London"}} + output.v = $user.address.city + output: {"v": "London"} + + - name: "variable field chain with index" + mapping: | + $items = [10, 20, 30] + output.v = $items[2] + output: {"v": 30} + + - name: "variable field chain with method" + mapping: | + $name = "hello" + output.v = $name.uppercase() + output: {"v": "HELLO"} + + # --- Input metadata chains --- + + - name: "input metadata field chain" + input_metadata: {"routing": {"region": "us-west"}} + mapping: | + output.v = input@.routing.region + output: {"v": "us-west"} + + - name: "input metadata with method" + input_metadata: {"topic": "events"} + mapping: | + output.v = input@.topic.uppercase() + output: {"v": "EVENTS"} + + # --- Mixed chain with dynamic index --- + + - name: "chain with dynamic string index" + input: 
{"data": {"x": 1, "y": 2}} + mapping: | + $key = "y" + output.v = input.data[$key] + output: {"v": 2} + + - name: "chain with negative index" + input: {"items": [1, 2, 3]} + mapping: | + output.v = input.items[-1] + output: {"v": 3} + + # --- Chain with lambda method --- + + - name: "chain with filter method" + input: {"items": [1, -2, 3, -4]} + mapping: | + output.v = input.items.filter(x -> x > 0) + output: {"v": [1, 3]} + + - name: "chain with map and further access" + input: {"items": [1, 2, 3]} + mapping: | + output.v = input.items.map(x -> x * 2).length() + output: {"v": 3} + + # --- Non-collapsible roots are preserved --- + + - name: "method call on literal is not collapsed" + mapping: | + output.v = "hello".uppercase() + output: {"v": "HELLO"} + + - name: "method call on function result is not collapsed" + mapping: | + output.v = uuid_v4().length() + output: {"v": 36} diff --git a/internal/bloblang2/spec/tests/stdlib/any_all_methods.yaml b/internal/bloblang2/spec/tests/stdlib/any_all_methods.yaml new file mode 100644 index 000000000..3c431805b --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/any_all_methods.yaml @@ -0,0 +1,100 @@ +description: > + .any() and .all() methods — short-circuit semantics (required), + empty array behavior, and boolean return enforcement. 
+ +tests: + # --- .any() basic --- + + - name: "any returns true when one element matches" + mapping: | + output.v = [1, 2, 3, 4, 5].any(x -> x > 4) + output: {"v": true} + + - name: "any returns false when no element matches" + mapping: | + output.v = [1, 2, 3].any(x -> x > 10) + output: {"v": false} + + - name: "any on empty array returns false" + mapping: | + output.v = [].any(x -> true) + output: {"v": false} + + - name: "any returns true on first match" + mapping: | + output.v = [true, true, true].any(x -> x) + output: {"v": true} + + # --- .all() basic --- + + - name: "all returns true when all match" + mapping: | + output.v = [2, 4, 6].all(x -> x % 2 == 0) + output: {"v": true} + + - name: "all returns false when one doesn't match" + mapping: | + output.v = [2, 3, 6].all(x -> x % 2 == 0) + output: {"v": false} + + - name: "all on empty array returns true" + mapping: | + output.v = [].all(x -> false) + output: {"v": true} + + # --- Short-circuit required --- + + - name: "any short-circuits on first true (throw not reached)" + mapping: | + output.v = [1, 2, 3].any(x -> if x == 1 { true } else { throw("boom") }) + output: {"v": true} + + - name: "all short-circuits on first false (throw not reached)" + mapping: | + output.v = [1, 2, 3].all(x -> if x == 1 { false } else { throw("boom") }) + output: {"v": false} + + # --- Complex predicates --- + + - name: "any with object field access" + mapping: | + $items = [ + {"status": "active"}, + {"status": "pending"}, + {"status": "inactive"}, + ] + output.v = $items.any(item -> item.status == "pending") + output: {"v": true} + + - name: "all with method chain in predicate" + mapping: | + output.v = ["hello", "world", "test"].all(s -> s.length() >= 4) + output: {"v": true} + + - name: "all fails with short string" + mapping: | + output.v = ["hello", "hi", "test"].all(s -> s.length() >= 4) + output: {"v": false} + + # --- any and all with outer captures --- + + - name: "any with outer variable in predicate" + mapping: | + 
$min = 10 + output.v = [5, 15, 25].any(x -> x > $min) + output: {"v": true} + + - name: "all with outer variable in predicate" + mapping: | + $min = 0 + output.v = [5, 15, 25].all(x -> x > $min) + output: {"v": true} + + # --- Combined any/all --- + + - name: "any and all on same array" + mapping: | + $nums = [2, 4, 6, 8] + output.any_odd = $nums.any(x -> x % 2 != 0) + output.all_even = $nums.all(x -> x % 2 == 0) + output: {"any_odd": false, "all_even": true} diff --git a/internal/bloblang2/spec/tests/stdlib/array_modify.yaml b/internal/bloblang2/spec/tests/stdlib/array_modify.yaml new file mode 100644 index 000000000..d0d0ca23f --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/array_modify.yaml @@ -0,0 +1,196 @@ +description: "Array modify methods — append, concat, without_index, join, collect" + +tests: + # --- append --- + + - name: "append to array" + mapping: | + output.result = [1, 2, 3].append(4) + output: {"result": [1, 2, 3, 4]} + + - name: "append to empty array" + mapping: | + output.result = [].append("first") + output: {"result": ["first"]} + + - name: "append null" + mapping: | + output.result = [1, 2].append(null) + output: {"result": [1, 2, null]} + + - name: "append array as single element" + mapping: | + output.result = [1, 2].append([3, 4]) + output: {"result": [1, 2, [3, 4]]} + + - name: "append does not modify original" + mapping: | + $arr = [1, 2] + $new = $arr.append(3) + output.original = $arr + output.new = $new + output: {"original": [1, 2], "new": [1, 2, 3]} + + - name: "append bool" + mapping: | + output.result = [1, "two"].append(true) + output: {"result": [1, "two", true]} + + # --- concat --- + + - name: "concat two arrays" + mapping: | + output.result = [1, 2].concat([3, 4]) + output: {"result": [1, 2, 3, 4]} + + - name: "concat with empty array" + mapping: | + output.result = [1, 2].concat([]) + output: {"result": [1, 2]} + + - name: "concat empty with non-empty" + mapping: | + output.result = [].concat([1, 2]) + output: 
{"result": [1, 2]} + + - name: "concat two empty arrays" + mapping: | + output.result = [].concat([]) + output: {"result": []} + + - name: "concat preserves element types" + mapping: | + output.result = ["a", "b"].concat([1, 2]) + output: {"result": ["a", "b", 1, 2]} + + - name: "concat does not modify original" + mapping: | + $a = [1, 2] + $b = [3, 4] + $c = $a.concat($b) + output.a = $a + output.b = $b + output.c = $c + output: {"a": [1, 2], "b": [3, 4], "c": [1, 2, 3, 4]} + + # --- without_index --- + + - name: "without_index removes element at index" + mapping: | + output.result = [10, 20, 30].without_index(1) + output: {"result": [10, 30]} + + - name: "without_index first element" + mapping: | + output.result = [10, 20, 30].without_index(0) + output: {"result": [20, 30]} + + - name: "without_index last element" + mapping: | + output.result = [10, 20, 30].without_index(2) + output: {"result": [10, 20]} + + - name: "without_index negative index" + mapping: | + output.result = [10, 20, 30].without_index(-1) + output: {"result": [10, 20]} + + - name: "without_index negative index second to last" + mapping: | + output.result = [10, 20, 30].without_index(-2) + output: {"result": [10, 30]} + + - name: "without_index out of bounds positive is error" + mapping: | + output.result = [10, 20, 30].without_index(3) + error: "out of bounds" + + - name: "without_index out of bounds negative is error" + mapping: | + output.result = [10, 20, 30].without_index(-4) + error: "out of bounds" + + - name: "without_index single element array" + mapping: | + output.result = [42].without_index(0) + output: {"result": []} + + - name: "without_index does not modify original" + mapping: | + $arr = [1, 2, 3] + $new = $arr.without_index(1) + output.original = $arr + output.new = $new + output: {"original": [1, 2, 3], "new": [1, 3]} + + # --- join --- + + - name: "join strings with delimiter" + mapping: | + output.result = ["a", "b", "c"].join(", ") + output: {"result": "a, b, c"} + + - name: 
"join with empty delimiter" + mapping: | + output.result = ["a", "b", "c"].join("") + output: {"result": "abc"} + + - name: "join empty array" + mapping: | + output.result = [].join(", ") + output: {"result": ""} + + - name: "join single element" + mapping: | + output.result = ["hello"].join(", ") + output: {"result": "hello"} + + - name: "join non-string element is error" + mapping: | + output.result = ["a", 2, "c"].join(", ") + error: "string" + + - name: "join with newline delimiter" + mapping: | + output.result = ["line1", "line2", "line3"].join("\n") + output: {"result": "line1\nline2\nline3"} + + # --- collect --- + + - name: "collect key-value pairs into object" + mapping: | + $pairs = [{"key": "a", "value": 1}, {"key": "b", "value": 2}] + output.result = $pairs.collect() + output: {"result": {"a": 1, "b": 2}} + + - name: "collect empty array" + mapping: | + output.result = [].collect() + output: {"result": {}} + + - name: "collect duplicate keys last wins" + mapping: | + $pairs = [{"key": "a", "value": 1}, {"key": "a", "value": 2}] + output.result = $pairs.collect() + output: {"result": {"a": 2}} + + - name: "collect missing key field is error" + mapping: | + output.result = [{"value": 1}].collect() + error: "key" + + - name: "collect missing value field is error" + mapping: | + output.result = [{"key": "a"}].collect() + error: "value" + + - name: "collect non-object element is error" + mapping: | + output.result = ["not an object"].collect() + error: "key" + + - name: "collect with mixed value types" + mapping: | + $pairs = [{"key": "name", "value": "Alice"}, {"key": "age", "value": 30}, {"key": "active", "value": true}] + output.result = $pairs.collect() + output: {"result": {"name": "Alice", "age": 30, "active": true}} diff --git a/internal/bloblang2/spec/tests/stdlib/array_query.yaml b/internal/bloblang2/spec/tests/stdlib/array_query.yaml new file mode 100644 index 000000000..c4a1d1a06 --- /dev/null +++ 
b/internal/bloblang2/spec/tests/stdlib/array_query.yaml @@ -0,0 +1,352 @@ +description: "Array query methods — any, all, find, contains, index_of, sum, min, max, fold" + +tests: + # --- any --- + + - name: "any returns true when some match" + mapping: | + output.result = [1, 2, 3, 4].any(x -> x > 3) + output: {"result": true} + + - name: "any returns false when none match" + mapping: | + output.result = [1, 2, 3].any(x -> x > 10) + output: {"result": false} + + - name: "any returns false for empty array" + mapping: | + output.result = [].any(x -> x > 0) + output: {"result": false} + + - name: "any short-circuits on first true" + mapping: | + output.result = [1, 2, 3].any(x -> if x == 1 { true } else { throw("should not reach") }) + output: {"result": true} + + - name: "any non-bool return is error" + mapping: | + output.result = [1, 2].any(x -> x * 2) + error: "bool" + + - name: "any void return is error" + mapping: | + output.result = [1, 2].any(x -> if false { true }) + error: "void" + + # --- all --- + + - name: "all returns true when all match" + mapping: | + output.result = [2, 4, 6].all(x -> x % 2 == 0) + output: {"result": true} + + - name: "all returns false when some do not match" + mapping: | + output.result = [2, 3, 6].all(x -> x % 2 == 0) + output: {"result": false} + + - name: "all returns true for empty array" + mapping: | + output.result = [].all(x -> x > 0) + output: {"result": true} + + - name: "all short-circuits on first false" + mapping: | + output.result = [1, 2, 3].all(x -> if x == 1 { false } else { throw("should not reach") }) + output: {"result": false} + + - name: "all non-bool return is error" + mapping: | + output.result = [1, 2].all(x -> x.string()) + error: "bool" + + - name: "all void return is error" + mapping: | + output.result = [1, 2].all(x -> if false { true }) + error: "void" + + # --- find --- + + - name: "find returns first match" + mapping: | + output.result = [1, 2, 3, 4].find(x -> x > 2) + output: {"result": 3} + + - name: 
"find returns void when no match" + mapping: | + output.found = "before" + output.found = [1, 2, 3].find(x -> x > 10) + output: {"found": "before"} + + - name: "find on empty array returns void" + mapping: | + output.found = "default" + output.found = [].find(x -> x > 0) + output: {"found": "default"} + + - name: "find short-circuits on first match" + mapping: | + output.result = [10, 20, 30].find(x -> if x == 10 { true } else { throw("should not reach") }) + output: {"result": 10} + + - name: "find void lambda return is error" + mapping: | + output.result = [1, 2, 3].find(x -> if false { true }) + error: "void" + + - name: "find void result works with or" + mapping: | + output.result = [1, 2, 3].find(x -> x > 10).or("none") + output: {"result": "none"} + + - name: "find void result assigned to output field is void (no assignment)" + mapping: | + output.result = [1, 2, 3].find(x -> x > 10) + output: {} + + # --- contains (array) --- + + - name: "contains finds integer" + mapping: | + output.result = [1, 2, 3].contains(2) + output: {"result": true} + + - name: "contains does not find missing element" + mapping: | + output.result = [1, 2, 3].contains(4) + output: {"result": false} + + - name: "contains finds string" + mapping: | + output.result = ["apple", "banana"].contains("banana") + output: {"result": true} + + - name: "contains on empty array" + mapping: | + output.result = [].contains(1) + output: {"result": false} + + - name: "contains with null" + mapping: | + output.result = [1, null, 3].contains(null) + output: {"result": true} + + - name: "contains with bool" + mapping: | + output.result = [true, false].contains(false) + output: {"result": true} + + - name: "contains type mismatch returns false" + mapping: | + output.result = [1, 2, 3].contains("2") + output: {"result": false} + + # --- index_of (array) --- + + - name: "index_of finds first occurrence" + mapping: | + output.result = [10, 20, 30, 20].index_of(20) + output: {"result": 1} + + - name: 
"index_of returns -1 when not found" + mapping: | + output.result = [10, 20, 30].index_of(99) + output: {"result": -1} + + - name: "index_of on empty array" + mapping: | + output.result = [].index_of(1) + output: {"result": -1} + + - name: "index_of with string" + mapping: | + output.result = ["a", "b", "c"].index_of("b") + output: {"result": 1} + + - name: "index_of type mismatch returns -1" + mapping: | + output.result = [1, 2, 3].index_of("1") + output: {"result": -1} + + # --- sum --- + + - name: "sum of integers" + mapping: | + output.result = [1, 2, 3, 4].sum() + output: {"result": 10} + + - name: "sum of empty array is zero int64" + mapping: | + output.result = [].sum() + output: {"result": 0} + + - name: "sum of floats" + mapping: | + output.result = [1.5, 2.5, 3.0].sum() + output: {"result": 7.0} + + - name: "sum promotes int and float pairwise" + mapping: | + output.result = [1, 2.5, 3].sum() + output: {"result": 6.5} + + - name: "sum non-numeric element is error" + mapping: | + output.result = [1, "two", 3].sum() + error: "numeric" + + - name: "sum single element" + mapping: | + output.result = [42].sum() + output: {"result": 42} + + - name: "sum single non-numeric element is error" + mapping: | + output.result = ["hello"].sum() + error: "numeric" + + - name: "sum single bool element is error" + mapping: | + output.result = [true].sum() + error: "numeric" + + # --- min --- + + - name: "min of integers" + mapping: | + output.result = [3, 1, 4, 1, 5].min() + output: {"result": 1} + + - name: "min of floats" + mapping: | + output.result = [3.14, 1.0, 2.71].min() + output: {"result": 1.0} + + - name: "min of strings" + mapping: | + output.result = ["banana", "apple", "cherry"].min() + output: {"result": "apple"} + + - name: "min of empty array is error" + mapping: | + output.result = [].min() + error: "empty" + + - name: "min cross-family is error" + mapping: | + output.result = [1, "two"].min() + error: "sort" + + - name: "min single element" + mapping: | + 
output.result = [42].min() + output: {"result": 42} + + - name: "min with numeric promotion" + mapping: | + output.result = [3, 1.5, 2].min() + output: {"result": 1.5} + + - name: "min large int64 mixed with float is error" + mapping: | + output.result = [9007199254740993, 1.0].min() + error: "exact" + + - name: "max large int64 mixed with float is error" + mapping: | + output.result = [1.0, 9007199254740993].max() + error: "exact" + + - name: "min mixed numeric result is promoted type" + mapping: | + output.result = [3, 1.5, 2].min().type() + output: {"result": "float64"} + + - name: "max mixed numeric returns promoted type" + mapping: | + $result = [1.5, 3, 2].max() + output.value = $result + output.type = $result.type() + output: {"value": 3.0, "type": "float64"} + + # --- max --- + + - name: "max of integers" + mapping: | + output.result = [3, 1, 4, 1, 5].max() + output: {"result": 5} + + - name: "max of floats" + mapping: | + output.result = [3.14, 1.0, 2.71].max() + output: {"result": 3.14} + + - name: "max of strings" + mapping: | + output.result = ["banana", "apple", "cherry"].max() + output: {"result": "cherry"} + + - name: "max of empty array is error" + mapping: | + output.result = [].max() + error: "empty" + + - name: "max cross-family is error" + mapping: | + output.result = [1, "two"].max() + error: "sort" + + - name: "max single element" + mapping: | + output.result = ["only"].max() + output: {"result": "only"} + + # --- fold --- + + - name: "fold sum with integer accumulator" + mapping: | + output.result = [1, 2, 3, 4].fold(0, (tally, x) -> tally + x) + output: {"result": 10} + + - name: "fold string concatenation" + mapping: | + output.result = ["a", "b", "c"].fold("", (acc, x) -> acc + x) + output: {"result": "abc"} + + - name: "fold on empty array returns initial value" + mapping: | + output.result = [].fold(42, (acc, x) -> acc + x) + output: {"result": 42} + + - name: "fold builds array" + mapping: | + output.result = [1, 2, 3].fold([], (acc, x) 
-> acc.append(x * 10)) + output: {"result": [10, 20, 30]} + + - name: "fold builds object" + mapping: | + $pairs = [{"k": "a", "v": 1}, {"k": "b", "v": 2}] + output.result = $pairs.fold({}, (acc, item) -> acc.merge({item.k: item.v})) + output: {"result": {"a": 1, "b": 2}} + + - name: "fold with product" + mapping: | + output.result = [1, 2, 3, 4].fold(1, (product, x) -> product * x) + output: {"result": 24} + + # --- Named arguments for lambda methods --- + + - name: "fold with named args reordered" + mapping: | + output.result = [1, 2, 3].fold(fn: (tally, x) -> tally + x, initial: 0) + output: {"result": 6} + + - name: "filter with named fn arg" + mapping: | + output.result = [1, 2, 3, 4].filter(fn: x -> x > 2) + output: {"result": [3, 4]} + + - name: "slice with named args reordered" + mapping: | + output.result = [10, 20, 30, 40, 50].slice(high: 3, low: 1) + output: {"result": [20, 30]} diff --git a/internal/bloblang2/spec/tests/stdlib/array_transform.yaml b/internal/bloblang2/spec/tests/stdlib/array_transform.yaml new file mode 100644 index 000000000..5ddf0d16d --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/array_transform.yaml @@ -0,0 +1,292 @@ +description: "Array transform methods — filter, map, sort, sort_by, flatten, unique, enumerate" + +tests: + # --- filter --- + + - name: "filter keeps matching elements" + mapping: | + output.result = [1, 2, 3, 4, 5].filter(x -> x > 3) + output: {"result": [4, 5]} + + - name: "filter with no matches returns empty array" + mapping: | + output.result = [1, 2, 3].filter(x -> x > 100) + output: {"result": []} + + - name: "filter on empty array returns empty array" + mapping: | + output.result = [].filter(x -> x > 0) + output: {"result": []} + + - name: "filter keeps all when all match" + mapping: | + output.result = [10, 20, 30].filter(x -> x > 0) + output: {"result": [10, 20, 30]} + + - name: "filter preserves original element types" + mapping: | + output.result = ["apple", "banana", "avocado"].filter(s -> 
s.has_prefix("a")) + output: {"result": ["apple", "avocado"]} + + - name: "filter non-bool return is error" + mapping: | + output.result = [1, 2, 3].filter(x -> x * 2) + error: "bool" + + - name: "filter void return is error" + mapping: | + output.result = [1, 2, 3].filter(x -> if false { true }) + error: "void" + + - name: "filter does not modify original array" + mapping: | + $arr = [1, 2, 3, 4] + $filtered = $arr.filter(x -> x > 2) + output.original = $arr + output.filtered = $filtered + output: {"original": [1, 2, 3, 4], "filtered": [3, 4]} + + # --- map --- + + - name: "map doubles each element" + mapping: | + output.result = [1, 2, 3].map(x -> x * 2) + output: {"result": [2, 4, 6]} + + - name: "map on empty array returns empty array" + mapping: | + output.result = [].map(x -> x + 1) + output: {"result": []} + + - name: "map can change element types" + mapping: | + output.result = [1, 2, 3].map(x -> x.string()) + output: {"result": ["1", "2", "3"]} + + - name: "map deleted omits element" + mapping: | + output.result = [1, 2, 3, 4].map(x -> if x % 2 == 0 { x * 10 } else { deleted() }) + output: {"result": [20, 40]} + + - name: "map void return is error" + mapping: | + output.result = [1, 2].map(x -> if false { x }) + error: "void" + + - name: "map does not modify original array" + mapping: | + $arr = [1, 2, 3] + $mapped = $arr.map(x -> x + 10) + output.original = $arr + output.mapped = $mapped + output: {"original": [1, 2, 3], "mapped": [11, 12, 13]} + + - name: "map with block body" + mapping: | + output.result = [1, 2, 3].map(x -> { + $doubled = x * 2 + $doubled + 1 + }) + output: {"result": [3, 5, 7]} + + # --- sort --- + + - name: "sort integers ascending" + mapping: | + output.result = [3, 1, 4, 1, 5].sort() + output: {"result": [1, 1, 3, 4, 5]} + + - name: "sort empty array" + mapping: | + output.result = [].sort() + output: {"result": []} + + - name: "sort single element" + mapping: | + output.result = [42].sort() + output: {"result": [42]} + + - name: 
"sort strings lexicographic" + mapping: | + output.result = ["banana", "apple", "cherry"].sort() + output: {"result": ["apple", "banana", "cherry"]} + + - name: "sort floats ascending" + mapping: | + output.result = [3.14, 1.0, 2.71].sort() + output: {"result": [1.0, 2.71, 3.14]} + + - name: "sort numeric promotion int and float" + mapping: | + output.result = [3, 1.5, 2].sort() + output: {"result": [1.5, 2, 3]} + + - name: "sort is stable" + mapping: | + $items = [{"k": "a", "v": 2}, {"k": "b", "v": 1}, {"k": "c", "v": 2}] + output.result = $items.sort_by(x -> x.v).map(x -> x.k) + output: {"result": ["b", "a", "c"]} + + - name: "sort cross-family is error" + mapping: | + output.result = [1, "two", 3].sort() + error: "sort" + + - name: "sort booleans is error" + mapping: | + output.result = [true, false, true].sort() + error: "sort" + + - name: "sort nulls is error" + mapping: | + output.result = [null, null].sort() + error: "sort" + + - name: "sort single boolean is error" + mapping: | + output.result = [true].sort() + error: "sortable" + + - name: "sort single null is error" + mapping: | + output.result = [null].sort() + error: "sortable" + + - name: "sort single object is error" + mapping: | + output.result = [{"a": 1}].sort() + error: "sortable" + + - name: "sort int64 above 2^53 mixed with float is error" + mapping: | + output.result = [9007199254740993, 1.0].sort() + error: "exact" + + - name: "sort NaN sorts after all values" + input: {"nan": {_type: "float64", value: "NaN"}} + mapping: | + output.result = [input.nan, 1.0, 3.0, 2.0].sort() + output: {"result": [1.0, 2.0, 3.0, {_type: "float64", value: "NaN"}]} + + # --- sort_by --- + + - name: "sort_by with key function" + mapping: | + $items = [{"name": "Charlie"}, {"name": "Alice"}, {"name": "Bob"}] + output.result = $items.sort_by(x -> x.name) + output: {"result": [{"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"}]} + + - name: "sort_by numeric key" + mapping: | + $items = [{"score": 80}, {"score": 
95}, {"score": 70}] + output.result = $items.sort_by(x -> x.score) + output: {"result": [{"score": 70}, {"score": 80}, {"score": 95}]} + + - name: "sort_by empty array" + mapping: | + output.result = [].sort_by(x -> x) + output: {"result": []} + + - name: "sort_by cross-family keys is error" + mapping: | + $items = [{"k": 1}, {"k": "two"}] + output.result = $items.sort_by(x -> x.k) + error: "sort" + + # --- flatten --- + + - name: "flatten nested arrays one level" + mapping: | + output.result = [[1, 2], [3, 4], [5]].flatten() + output: {"result": [1, 2, 3, 4, 5]} + + - name: "flatten only one level deep" + mapping: | + output.result = [[[1, 2]], [[3]]].flatten() + output: {"result": [[1, 2], [3]]} + + - name: "flatten non-arrays kept as-is" + mapping: | + output.result = [1, [2, 3], "hello", [4]].flatten() + output: {"result": [1, 2, 3, "hello", 4]} + + - name: "flatten empty inner arrays spliced as zero elements" + mapping: | + output.result = [1, [], 2, [], 3].flatten() + output: {"result": [1, 2, 3]} + + - name: "flatten empty array" + mapping: | + output.result = [].flatten() + output: {"result": []} + + - name: "flatten array of empty arrays" + mapping: | + output.result = [[], [], []].flatten() + output: {"result": []} + + - name: "flatten mixed types with nested arrays" + mapping: | + output.result = [null, [true, false], "abc"].flatten() + output: {"result": [null, true, false, "abc"]} + + # --- unique --- + + - name: "unique removes duplicates" + mapping: | + output.result = [1, 2, 2, 3, 1, 3].unique() + output: {"result": [1, 2, 3]} + + - name: "unique keeps first occurrence" + mapping: | + output.result = [3, 1, 2, 1, 3].unique() + output: {"result": [3, 1, 2]} + + - name: "unique on empty array" + mapping: | + output.result = [].unique() + output: {"result": []} + + - name: "unique strings" + mapping: | + output.result = ["a", "b", "a", "c", "b"].unique() + output: {"result": ["a", "b", "c"]} + + - name: "unique with key function" + mapping: | + $items 
= [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}] + output.result = $items.unique(x -> x.id) + output: {"result": [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]} + + - name: "unique NaN values considered equal" + input: {"nan1": {_type: "float64", value: "NaN"}, "nan2": {_type: "float64", value: "NaN"}} + mapping: | + output.result = [input.nan1, 1.0, input.nan2, 2.0].unique() + output: {"result": [{_type: "float64", value: "NaN"}, 1.0, 2.0]} + + - name: "unique mixed types preserved" + mapping: | + output.result = [1, "1", true, 1, "1", true].unique() + output: {"result": [1, "1", true]} + + # --- enumerate --- + + - name: "enumerate basic" + mapping: | + output.result = ["a", "b", "c"].enumerate() + output: {"result": [{"index": 0, "value": "a"}, {"index": 1, "value": "b"}, {"index": 2, "value": "c"}]} + + - name: "enumerate empty array" + mapping: | + output.result = [].enumerate() + output: {"result": []} + + - name: "enumerate single element" + mapping: | + output.result = [42].enumerate() + output: {"result": [{"index": 0, "value": 42}]} + + - name: "enumerate preserves element types" + mapping: | + output.result = [true, null, 3.14].enumerate() + output: {"result": [{"index": 0, "value": true}, {"index": 1, "value": null}, {"index": 2, "value": 3.14}]} diff --git a/internal/bloblang2/spec/tests/stdlib/collect_method.yaml b/internal/bloblang2/spec/tests/stdlib/collect_method.yaml new file mode 100644 index 000000000..a5f5d6fa7 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/collect_method.yaml @@ -0,0 +1,74 @@ +description: > + .collect() method — converts array of {key, value} objects into an object. + Last value wins on duplicate keys. 
+ +tests: + # --- Basic collect --- + + - name: "collect simple key-value pairs" + mapping: | + output.v = [ + {"key": "a", "value": 1}, + {"key": "b", "value": 2}, + ].collect() + output: {"v": {"a": 1, "b": 2}} + + - name: "collect single entry" + mapping: | + output.v = [{"key": "x", "value": 42}].collect() + output: {"v": {"x": 42}} + + - name: "collect empty array" + mapping: | + output.v = [].collect() + output: {"v": {}} + + # --- Last value wins on duplicates --- + + - name: "collect duplicate keys — last wins" + mapping: | + output.v = [ + {"key": "a", "value": 1}, + {"key": "a", "value": 2}, + ].collect() + output: {"v": {"a": 2}} + + - name: "collect multiple duplicates — last wins" + mapping: | + output.v = [ + {"key": "x", "value": "first"}, + {"key": "x", "value": "second"}, + {"key": "x", "value": "third"}, + ].collect() + output: {"v": {"x": "third"}} + + # --- Mixed value types --- + + - name: "collect with mixed value types" + mapping: | + output.v = [ + {"key": "str", "value": "hello"}, + {"key": "num", "value": 42}, + {"key": "bool", "value": true}, + {"key": "arr", "value": [1, 2]}, + {"key": "null", "value": null}, + ].collect() + output: {"v": {"str": "hello", "num": 42, "bool": true, "arr": [1, 2], "null": null}} + + # --- Collect from transformed data --- + + - name: "enumerate then collect round-trips" + mapping: | + output.v = ["a", "b", "c"] + .enumerate() + .map(e -> {"key": e.index.string(), "value": e.value}) + .collect() + output: {"v": {"0": "a", "1": "b", "2": "c"}} + + - name: "map_entries to key-value then collect" + mapping: | + output.v = {"x": 1, "y": 2} + .iter() + .map(e -> {"key": e.key + "_new", "value": e.value * 10}) + .collect() + output: {"v": {"x_new": 10, "y_new": 20}} diff --git a/internal/bloblang2/spec/tests/stdlib/core_functions.yaml b/internal/bloblang2/spec/tests/stdlib/core_functions.yaml new file mode 100644 index 000000000..9fb8fbcb0 --- /dev/null +++ 
b/internal/bloblang2/spec/tests/stdlib/core_functions.yaml @@ -0,0 +1,247 @@ +description: "Core stdlib functions: uuid_v4, now, random_int, range, timestamp constructor, duration constants" + +tests: + # --- uuid_v4 --- + + - name: "uuid_v4 returns a string" + mapping: | + output = uuid_v4().type() + output: "string" + + - name: "uuid_v4 is non-deterministic" + mapping: | + output = uuid_v4() + no_output_check: true + output_type: "string" + + - name: "uuid_v4 returns 36 chars" + mapping: | + output = uuid_v4().length() + output: 36 + + - name: "uuid_v4 two calls differ" + mapping: | + output = uuid_v4() != uuid_v4() + output: true + + # --- now --- + + - name: "now returns a timestamp" + mapping: | + output = now() + no_output_check: true + output_type: "timestamp" + + - name: "now type check" + mapping: | + output = now().type() + output: "timestamp" + + - name: "now called twice may differ" + mapping: | + $a = now() + $b = now() + output = $a <= $b + output: true + + # --- random_int --- + + - name: "random_int returns int64" + mapping: | + output = random_int(0, 10).type() + output: "int64" + + - name: "random_int within range" + mapping: | + $v = random_int(5, 5) + output = $v + output: 5 + + - name: "random_int is non-deterministic" + mapping: | + output = random_int(0, 1000000) + no_output_check: true + output_type: "int64" + + - name: "random_int min equals max returns that value" + mapping: | + output = random_int(42, 42) + output: 42 + + - name: "random_int min greater than max is error" + mapping: | + output = random_int(10, 5) + error: "min" + + - name: "random_int named args reorder correctly" + mapping: | + output = random_int(max: 5, min: 5) + output: 5 + + - name: "random_int named args reversed still works" + mapping: | + $v = random_int(max: 10, min: 10) + output = $v + output: 10 + + - name: "random_int negative range" + mapping: | + $v = random_int(-10, -10) + output = $v + output: -10 + + # --- range --- + + - name: "range ascending default 
step" + mapping: | + output = range(0, 5) + output: [0, 1, 2, 3, 4] + + - name: "range ascending explicit step" + mapping: | + output = range(0, 10, 2) + output: [0, 2, 4, 6, 8] + + - name: "range descending inferred step" + mapping: | + output = range(5, 0) + output: [5, 4, 3, 2, 1] + + - name: "range descending explicit step" + mapping: | + output = range(10, 0, -3) + output: [10, 7, 4, 1] + + - name: "range start equals stop is empty" + mapping: | + output = range(5, 5) + output: [] + + - name: "range step zero is error" + mapping: | + output = range(0, 5, 0) + error: "step" + + - name: "range step contradicts direction positive" + mapping: | + output = range(0, 5, -1) + error: "step" + + - name: "range step contradicts direction negative" + mapping: | + output = range(5, 0, 1) + error: "step" + + - name: "range single element" + mapping: | + output = range(0, 1) + output: [0] + + - name: "range negative values" + mapping: | + output = range(-3, 3) + output: [-3, -2, -1, 0, 1, 2] + + - name: "range result element type is int64" + mapping: | + output = range(0, 3)[0].type() + output: "int64" + + - name: "range named args reorder correctly" + mapping: | + output = range(stop: 3, start: 0) + output: [0, 1, 2] + + - name: "range named args with step" + mapping: | + output = range(stop: 10, start: 0, step: 3) + output: [0, 3, 6, 9] + + - name: "range named args omit optional step" + mapping: | + output = range(stop: 3, start: 0) + output: [0, 1, 2] + + # --- timestamp constructor --- + + - name: "timestamp required args only" + mapping: | + output = timestamp(2024, 6, 15) + output: {_type: "timestamp", value: "2024-06-15T00:00:00Z"} + + - name: "timestamp all positional args" + mapping: | + output = timestamp(2024, 12, 25, 14, 30, 45, 500000000) + output: {_type: "timestamp", value: "2024-12-25T14:30:45.5Z"} + + - name: "timestamp with named args" + mapping: | + output = timestamp(year: 2025, month: 1, day: 1, hour: 12) + output: {_type: "timestamp", value: 
"2025-01-01T12:00:00Z"} + + - name: "timestamp with timezone" + mapping: | + output = timestamp(2024, 7, 4, 9, 0, 0, 0, "America/Chicago") + output: {_type: "timestamp", value: "2024-07-04T09:00:00-05:00"} + + - name: "timestamp invalid month zero" + mapping: | + output = timestamp(2024, 0, 1) + error: "out of range" + + - name: "timestamp invalid month thirteen" + mapping: | + output = timestamp(2024, 13, 1) + error: "out of range" + + - name: "timestamp invalid day zero" + mapping: | + output = timestamp(2024, 1, 0) + error: "out of range" + + - name: "timestamp invalid timezone" + mapping: | + output = timestamp(2024, 1, 1, 0, 0, 0, 0, "Fake/Zone") + error: "timezone" + + - name: "timestamp named args skip middle optional params" + mapping: | + output = timestamp(year: 2024, month: 7, day: 4, timezone: "America/Chicago") + output: {_type: "timestamp", value: "2024-07-04T00:00:00-05:00"} + + - name: "timestamp named args all required only" + mapping: | + output = timestamp(day: 15, month: 3, year: 2024) + output: {_type: "timestamp", value: "2024-03-15T00:00:00Z"} + + # --- duration constants --- + + - name: "second returns 1e9 nanoseconds" + mapping: | + output = second() + output: 1000000000 + + - name: "minute returns 60e9 nanoseconds" + mapping: | + output = minute() + output: 60000000000 + + - name: "hour returns 3600e9 nanoseconds" + mapping: | + output = hour() + output: 3600000000000 + + - name: "day returns 86400e9 nanoseconds" + mapping: | + output = day() + output: 86400000000000 + + - name: "duration constants are int64" + mapping: | + output = second().type() + output: "int64" + + - name: "duration arithmetic 2 hours plus 30 minutes" + mapping: | + output = 2 * hour() + 30 * minute() + output: 9000000000000 diff --git a/internal/bloblang2/spec/tests/stdlib/encoding.yaml b/internal/bloblang2/spec/tests/stdlib/encoding.yaml new file mode 100644 index 000000000..93411239d --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/encoding.yaml @@ -0,0 
+1,363 @@ +description: "Encoding and parsing methods: parse_json, format_json, encode, decode" + +tests: + # --- parse_json: objects --- + + - name: "parse_json object" + mapping: | + output = `{"name":"Alice","age":30}`.parse_json() + output: {"name": "Alice", "age": 30} + + - name: "parse_json empty object" + mapping: | + output = `{}`.parse_json() + output: {} + + # --- parse_json: arrays --- + + - name: "parse_json array of ints" + mapping: | + output = `[1,2,3]`.parse_json() + output: [1, 2, 3] + + - name: "parse_json empty array" + mapping: | + output = `[]`.parse_json() + output: [] + + # --- parse_json: scalars --- + + - name: "parse_json string" + mapping: | + output = `"hello"`.parse_json() + output: "hello" + + - name: "parse_json integer is int64" + mapping: | + output = `42`.parse_json().type() + output: "int64" + + - name: "parse_json integer value" + mapping: | + output = `42`.parse_json() + output: 42 + + - name: "parse_json float with decimal is float64" + mapping: | + output = `3.14`.parse_json().type() + output: "float64" + + - name: "parse_json float value" + mapping: | + output = `3.14`.parse_json() + output: 3.14 + + - name: "parse_json exponent is float64" + mapping: | + output = `1e3`.parse_json().type() + output: "float64" + + - name: "parse_json exponent value" + mapping: | + output = `1e3`.parse_json() + output: 1000.0 + + - name: "parse_json boolean true" + mapping: | + output = `true`.parse_json() + output: true + + - name: "parse_json boolean false" + mapping: | + output = `false`.parse_json() + output: false + + - name: "parse_json null" + mapping: | + output = `null`.parse_json() + output: null + + - name: "parse_json negative integer" + mapping: | + output = `-100`.parse_json() + output: -100 + + - name: "parse_json zero integer" + mapping: | + output = `0`.parse_json() + output: 0 + + # --- parse_json: errors --- + + - name: "parse_json invalid json" + mapping: | + output = `not json`.parse_json() + error: "parse" + + - name: 
"parse_json empty string" + mapping: | + output = ``.parse_json() + error: "parse" + + - name: "parse_json truncated object" + mapping: | + output = `{"name":`.parse_json() + error: "parse" + + # --- format_json: basic --- + + - name: "format_json object keys sorted" + mapping: | + output = {"b": 2, "a": 1}.format_json() + output: '{"a":1,"b":2}' + + - name: "format_json array" + mapping: | + output = [1, 2, 3].format_json() + output: "[1,2,3]" + + - name: "format_json string" + mapping: | + output = "hello".format_json() + output: '"hello"' + + - name: "format_json integer" + mapping: | + output = 42.format_json() + output: "42" + + - name: "format_json boolean" + mapping: | + output = true.format_json() + output: "true" + + - name: "format_json null" + mapping: | + output = null.format_json() + output: "null" + + - name: "format_json empty object" + mapping: | + output = {}.format_json() + output: "{}" + + - name: "format_json empty array" + mapping: | + output = [].format_json() + output: "[]" + + # --- format_json: indent --- + + - name: "format_json with two-space indent" + mapping: | + output = {"a": 1}.format_json(indent: " ") + output: "{\n \"a\": 1\n}" + + - name: "format_json with tab indent" + mapping: | + output = {"a": 1}.format_json(indent: "\t") + output: "{\n\t\"a\": 1\n}" + + # --- format_json: escape_html --- + + - name: "format_json escapes html by default" + mapping: | + output = {"html": "hi"}.format_json() + output: '{"html":"\u003cb\u003ehi\u003c/b\u003e"}' + + - name: "format_json escape_html false" + mapping: | + output = {"html": "hi"}.format_json(escape_html: false) + output: '{"html":"hi"}' + + # --- format_json: timestamps --- + + - name: "format_json timestamp as RFC 3339" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).format_json() + output: '"2024-01-15T10:30:00Z"' + + - name: "format_json timestamp in object uses shortest precision" + mapping: | + output = {"time": timestamp(2024, 1, 15, 10, 30, 0)}.format_json() + 
output: '{"time":"2024-01-15T10:30:00Z"}' + + - name: "format_json timestamp with fractional seconds in object" + mapping: | + output = {"time": timestamp(2024, 1, 15, 10, 30, 0, 500000000)}.format_json() + output: '{"time":"2024-01-15T10:30:00.5Z"}' + + - name: "format_json timestamp in nested array" + mapping: | + output = [timestamp(2024, 6, 1)].format_json() + output: '["2024-06-01T00:00:00Z"]' + + # --- format_json: errors --- + + - name: "format_json bytes is error" + mapping: | + output = "hello".bytes().format_json() + error: "bytes" + + - name: "format_json nested bytes is error" + mapping: | + output = {"data": "hello".bytes()}.format_json() + error: "bytes" + + - name: "format_json NaN is error" + mapping: | + output = input.nan.format_json() + input: {nan: {_type: "float64", value: "NaN"}} + error: "NaN" + + - name: "format_json Infinity is error" + mapping: | + output = input.inf.format_json() + input: {inf: {_type: "float64", value: "Infinity"}} + error: "Infinity" + + # --- format_json / parse_json round-trip --- + + - name: "format then parse round-trip object" + mapping: | + $obj = {"name": "Alice", "age": 30} + output = $obj.format_json().parse_json() + output: {"name": "Alice", "age": 30} + + - name: "format then parse round-trip array" + mapping: | + $arr = [1, "two", true, null] + output = $arr.format_json().parse_json() + output: [1, "two", true, null] + + # --- encode: base64 --- + + - name: "encode base64 from string" + mapping: | + output = "hello".encode("base64") + output: "aGVsbG8=" + + - name: "encode base64 from bytes" + mapping: | + output = "hello".bytes().encode("base64") + output: "aGVsbG8=" + + - name: "encode base64 empty string" + mapping: | + output = "".encode("base64") + output: "" + + # --- encode: base64url --- + + - name: "encode base64url" + mapping: | + output = "hello?world>".encode("base64url") + output: "aGVsbG8_d29ybGQ-" + + # --- encode: base64rawurl --- + + - name: "encode base64rawurl no padding" + mapping: | + 
output = "hello".encode("base64rawurl") + output: "aGVsbG8" + + # --- encode: hex --- + + - name: "encode hex from string" + mapping: | + output = "hello".encode("hex") + output: "68656c6c6f" + + - name: "encode hex from bytes" + mapping: | + output = "hello".bytes().encode("hex") + output: "68656c6c6f" + + - name: "encode hex empty" + mapping: | + output = "".encode("hex") + output: "" + + # --- decode: base64 --- + + - name: "decode base64 returns bytes" + mapping: | + output = "aGVsbG8=".decode("base64").type() + output: "bytes" + + - name: "decode base64 to string" + mapping: | + output = "aGVsbG8=".decode("base64").string() + output: "hello" + + # --- decode: base64url --- + + - name: "decode base64url to string" + mapping: | + output = "aGVsbG8_d29ybGQ-".decode("base64url").string() + output: "hello?world>" + + # --- decode: base64rawurl --- + + - name: "decode base64rawurl to string" + mapping: | + output = "aGVsbG8".decode("base64rawurl").string() + output: "hello" + + # --- decode: hex --- + + - name: "decode hex returns bytes" + mapping: | + output = "68656c6c6f".decode("hex").type() + output: "bytes" + + - name: "decode hex to string" + mapping: | + output = "68656c6c6f".decode("hex").string() + output: "hello" + + # --- decode: errors --- + + - name: "decode invalid base64" + mapping: | + output = "!!!".decode("base64") + error: "decode" + + - name: "decode invalid hex" + mapping: | + output = "zzzz".decode("hex") + error: "decode" + + # --- encode/decode round-trips --- + + - name: "base64 encode decode round-trip" + mapping: | + output = "hello world!".encode("base64").decode("base64").string() + output: "hello world!" + + - name: "hex encode decode round-trip" + mapping: | + output = "hello world!".encode("hex").decode("hex").string() + output: "hello world!" 
+ + - name: "base64rawurl encode decode round-trip" + mapping: | + output = "test data".encode("base64rawurl").decode("base64rawurl").string() + output: "test data" + + - name: "base64url encode decode round-trip" + mapping: | + output = "test data".encode("base64url").decode("base64url").string() + output: "test data" + + # --- encode: named arg --- + + - name: "encode with named scheme arg" + mapping: | + output = "hello".encode(scheme: "hex") + output: "68656c6c6f" + + - name: "decode with named scheme arg" + mapping: | + output = "68656c6c6f".decode(scheme: "hex").string() + output: "hello" diff --git a/internal/bloblang2/spec/tests/stdlib/enumerate_method.yaml b/internal/bloblang2/spec/tests/stdlib/enumerate_method.yaml new file mode 100644 index 000000000..49a440807 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/enumerate_method.yaml @@ -0,0 +1,139 @@ +description: > + .enumerate() method — returns array of {index, value} objects. + Also tests .without_index(), .sum(), .min(), .max(), .join() basics. 
+ +tests: + # --- enumerate --- + + - name: "enumerate basic" + mapping: | + output.v = ["a", "b", "c"].enumerate() + output: {"v": [{"index": 0, "value": "a"}, {"index": 1, "value": "b"}, {"index": 2, "value": "c"}]} + + - name: "enumerate empty array" + mapping: | + output.v = [].enumerate() + output: {"v": []} + + - name: "enumerate single element" + mapping: | + output.v = [42].enumerate() + output: {"v": [{"index": 0, "value": 42}]} + + - name: "enumerate then map to transform with index" + mapping: | + output.v = ["a", "b"].enumerate().map(e -> e.index.string() + ":" + e.value) + output: {"v": ["0:a", "1:b"]} + + # --- without_index --- + + - name: "without_index removes element at index" + mapping: | + output.v = [10, 20, 30, 40].without_index(1) + output: {"v": [10, 30, 40]} + + - name: "without_index first element" + mapping: | + output.v = [10, 20, 30].without_index(0) + output: {"v": [20, 30]} + + - name: "without_index last element" + mapping: | + output.v = [10, 20, 30].without_index(2) + output: {"v": [10, 20]} + + - name: "without_index out of bounds is error" + mapping: | + output.v = [1, 2, 3].without_index(5) + error: "out of bounds" + + # --- join --- + + - name: "join array of strings" + mapping: | + output.v = ["a", "b", "c"].join(",") + output: {"v": "a,b,c"} + + - name: "join with empty separator" + mapping: | + output.v = ["a", "b", "c"].join("") + output: {"v": "abc"} + + - name: "join empty array" + mapping: | + output.v = [].join(",") + output: {"v": ""} + + - name: "join single element" + mapping: | + output.v = ["only"].join(",") + output: {"v": "only"} + + - name: "join with non-string element is error" + mapping: | + output.v = ["a", 1, "b"].join(",") + error: "string" + + # --- sum --- + + - name: "sum of integers" + mapping: | + output.v = [1, 2, 3, 4].sum() + output: {"v": 10} + + - name: "sum of floats" + mapping: | + output.v = [1.5, 2.5, 3.0].sum() + output: {"v": 7.0} + + - name: "sum of empty array returns zero" + mapping: | + 
output.v = [].sum() + output: {"v": 0} + + - name: "sum of single element" + mapping: | + output.v = [42].sum() + output: {"v": 42} + + # --- min / max --- + + - name: "min of integers" + mapping: | + output.v = [3, 1, 4, 1, 5].min() + output: {"v": 1} + + - name: "max of integers" + mapping: | + output.v = [3, 1, 4, 1, 5].max() + output: {"v": 5} + + - name: "min of strings" + mapping: | + output.v = ["banana", "apple", "cherry"].min() + output: {"v": "apple"} + + - name: "max of strings" + mapping: | + output.v = ["banana", "apple", "cherry"].max() + output: {"v": "cherry"} + + - name: "min of empty array is error" + mapping: | + output.v = [].min() + error: "empty" + + - name: "max of empty array is error" + mapping: | + output.v = [].max() + error: "empty" + + - name: "min of single element" + mapping: | + output.v = [42].min() + output: {"v": 42} + + - name: "max of single element" + mapping: | + output.v = [42].max() + output: {"v": 42} diff --git a/internal/bloblang2/spec/tests/stdlib/find_method.yaml b/internal/bloblang2/spec/tests/stdlib/find_method.yaml new file mode 100644 index 000000000..426811cff --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/find_method.yaml @@ -0,0 +1,81 @@ +description: > + .find() method — returns first matching element or void. Short-circuits + on first match. Void result can be rescued with .or(). 
+ +tests: + # --- Basic find --- + + - name: "find returns first matching element" + mapping: | + output.v = [1, 2, 3, 4, 5].find(x -> x > 3) + output: {"v": 4} + + - name: "find returns first match, not all matches" + mapping: | + output.v = [10, 20, 30, 40].find(x -> x > 15) + output: {"v": 20} + + - name: "find with string array" + mapping: | + output.v = ["apple", "banana", "cherry"].find(s -> s.has_prefix("b")) + output: {"v": "banana"} + + # --- find returns void when no match --- + + - name: "find no match — void skips output assignment" + mapping: | + output.v = "default" + output.v = [1, 2, 3].find(x -> x > 100) + output: {"v": "default"} + + - name: "find no match — void errors in variable declaration" + mapping: | + $x = [1, 2, 3].find(x -> x > 100) + error: "void" + + - name: "find no match — void rescued with or" + mapping: | + output.v = [1, 2, 3].find(x -> x > 100).or(-1) + output: {"v": -1} + + - name: "find no match — void in array literal errors" + mapping: | + output.v = [[1, 2, 3].find(x -> x > 100)] + error: "void" + + # --- Short-circuit behavior --- + + - name: "find short-circuits after first match" + mapping: | + output.v = [1, 2, 3, 4, 5].find(x -> x == 2) + output: {"v": 2} + + # --- find on empty array --- + + - name: "find on empty array returns void — rescued with or" + mapping: | + output.v = [].find(x -> true).or("nothing") + output: {"v": "nothing"} + + # --- find with complex predicates --- + + - name: "find with object elements" + mapping: | + $users = [ + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 17}, + {"name": "Carol", "age": 30}, + ] + output.v = $users.find(u -> u.age < 18) + output: {"v": {"name": "Bob", "age": 17}} + + - name: "find with method chain in predicate" + mapping: | + output.v = ["hello", "world", "hi"].find(s -> s.length() < 4) + output: {"v": "hi"} + + - name: "find with outer variable capture" + mapping: | + $threshold = 15 + output.v = [10, 20, 30].find(x -> x > $threshold) + output: {"v": 20} 
diff --git a/internal/bloblang2/spec/tests/stdlib/into_method.yaml b/internal/bloblang2/spec/tests/stdlib/into_method.yaml new file mode 100644 index 000000000..da6f00bb3 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/into_method.yaml @@ -0,0 +1,173 @@ +description: ".into(lambda) pipeline method — invoke lambda with receiver, return result (Section 13.12)" + +tests: + # --- Basic semantics --- + + - name: "into() passes receiver to lambda and returns result" + mapping: | + output = "hello".into(s -> s.uppercase()) + output: "HELLO" + + - name: "into() with a literal receiver" + mapping: | + output = 5.into(n -> n * 2) + output: 10 + + - name: "into() chainable after other methods" + mapping: | + output = input.name.trim().lowercase().into(s -> s + "_id") + input: {"name": " Alice "} + output: "alice_id" + + - name: "into() chainable before other methods" + mapping: | + output = input.n.into(x -> x + 1).string() + input: {"n": 41} + output: "42" + + # --- Lambda binding --- + + - name: "into() lambda param references the receiver" + mapping: | + output = {"a": 1, "b": 2}.into(obj -> obj.a + obj.b) + output: 3 + + - name: "into() lambda can use the param multiple times" + mapping: | + output = input.items.filter(x -> x > 0).into(pos -> { + "count": pos.length(), + "sum": pos.fold(0, (tally, x) -> tally + x), + }) + input: {"items": [1, -2, 3, -4, 5]} + output: {"count": 3, "sum": 9} + + - name: "into() lambda param shadowing in nested into()" + mapping: | + output = 10.into(x -> 20.into(x -> x)) + output: 20 + + # --- Scoping: input and outer variables --- + + - name: "into() lambda sees outer input via input keyword" + mapping: | + output = input.n.into(_ -> input.m) + input: {"n": 1, "m": 42} + output: 42 + + - name: "into() lambda sees outer variables" + mapping: | + $factor = 3 + output = 10.into(n -> n * $factor) + output: 30 + + # --- Sentinel returns --- + + - name: "into() lambda returning deleted() in field position removes the field" + mapping: | + 
output.a = 1 + output.b = input.val.into(v -> if v > 0 { v } else { deleted() }) + output.c = 3 + input: {"val": -5} + output: {"a": 1, "c": 3} + + - name: "into() lambda returning void() skips the assignment" + mapping: | + output.status = "pending" + output.status = input.val.into(v -> if v > 0 { v.string() } else { void() }) + input: {"val": -5} + output: {"status": "pending"} + + - name: "into() lambda returning void in array literal is an error" + mapping: | + output.items = [1, "x".into(_ -> void()), 3] + error: "" + + # --- Error propagation --- + + - name: "error in receiver propagates (into() not called)" + mapping: | + output = (1 / 0).into(n -> n + 1).catch(_ -> "caught") + output: "caught" + + - name: "error from lambda propagates and is caught" + mapping: | + output = 5.into(_ -> throw("boom")).catch(err -> err.what) + output: "boom" + + # --- Nested and complex pipelines --- + + - name: "nested into() chains work" + mapping: | + output = input.raw + .split(",") + .into(parts -> { + "head": parts[0], + "tail": parts.slice(1), + }) + .into(split -> split.head + ":" + split.tail.length().string()) + input: {"raw": "a,b,c,d"} + output: "a:3" + + - name: "into() with match in the lambda body" + mapping: | + output = input.kind.into(k -> match k { + "http" => 80, + "https" => 443, + _ => 0, + }) + input: {"kind": "https"} + output: 443 + + # --- Arity / compile-time rules --- + + - name: "into() without a lambda argument is a compile error" + mapping: | + output = "hi".into() + compile_error: "into" + + - name: "into() with a non-lambda argument is a runtime error" + mapping: | + output = "hi".into("not a lambda") + error: "" + + - name: "into() with a two-parameter lambda is a runtime error" + mapping: | + output = "hi".into((a, b) -> a) + error: "" + + # --- Special receivers: void, deleted(), error --- + + - name: "into() on a void receiver is a runtime error" + mapping: | + output = (if false { 1 }).into(n -> n + 1) + error: "void" + + - name: "into() 
on a void() receiver is a runtime error" + mapping: | + output = void().into(n -> "x") + error: "void" + + - name: "into() on a deleted() receiver is a runtime error" + mapping: | + output = deleted().into(n -> "x") + error: "deleted" + + - name: "into() passes errors through unchanged (catchable downstream)" + mapping: | + output = (1 / 0).into(n -> n).catch(_ -> "caught") + output: "caught" + + - name: "rescue void before into() via .or()" + mapping: | + output = (if false { 1 }).or(0).into(n -> n + 1) + output: 1 + + - name: "into() on a null receiver passes null to the lambda" + mapping: | + output = null.into(n -> n) + output: null + + - name: "into() on a null receiver, lambda can null-check" + mapping: | + output = null.into(n -> if n == null { "was null" } else { "was value" }) + output: "was null" diff --git a/internal/bloblang2/spec/tests/stdlib/iter_chain_patterns.yaml b/internal/bloblang2/spec/tests/stdlib/iter_chain_patterns.yaml new file mode 100644 index 000000000..354be05b9 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/iter_chain_patterns.yaml @@ -0,0 +1,108 @@ +description: > + Iterator chain patterns — complex combinations of .iter(), .map(), + .filter(), .fold(), .map_entries(), .map_values() with nested lambdas, + outer variable capture, and control flow inside lambda bodies. 
+ +tests: + # --- iter + fold to rebuild object --- + + - name: "iter then fold to filter and rebuild object" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3, "d": 4} + output.v = $obj.iter().filter(e -> e.value > 2).fold({}, (acc, e) -> { + $a = acc + $a[e.key] = e.value + $a + }) + output: {"v": {"c": 3, "d": 4}} + + - name: "iter then map then fold" + mapping: | + output.v = {"x": 1, "y": 2}.iter() + .map(e -> {"key": e.key.uppercase(), "value": e.value * 10}) + .fold({}, (acc, e) -> { + $a = acc + $a[e.key] = e.value + $a + }) + output: {"v": {"X": 10, "Y": 20}} + + # --- map_entries with complex lambda --- + + - name: "map_entries with if in lambda body" + mapping: | + output.v = {"a": 1, "b": -2, "c": 3}.map_entries((k, v) -> + if v < 0 { deleted() } else { {"key": k.uppercase(), "value": v * 10} } + ) + output: {"v": {"A": 10, "C": 30}} + + - name: "map_entries with block body and local variables" + mapping: | + output.v = {"name": "alice", "city": "london"}.map_entries((k, v) -> { + $upper_key = k.uppercase() + $upper_val = v.uppercase() + {"key": $upper_key, "value": $upper_val} + }) + output: {"v": {"NAME": "ALICE", "CITY": "LONDON"}} + + # --- filter + map + sort chain --- + + - name: "filter then map then sort" + mapping: | + output.v = [5, -3, 8, -1, 4, -6] + .filter(x -> x > 0) + .map(x -> x * x) + .sort() + output: {"v": [16, 25, 64]} + + # --- map with outer capture + dynamic path --- + + - name: "map with outer capture building objects" + mapping: | + $prefix = "item" + output.v = [10, 20, 30].enumerate().map(e -> { + $obj = {} + $obj[$prefix + "_" + e.index.string()] = e.value + $obj + }) + output: {"v": [{"item_0": 10}, {"item_1": 20}, {"item_2": 30}]} + + # --- Nested map operations --- + + - name: "map_values calling map on inner arrays" + mapping: | + output.v = {"a": [1, 2], "b": [3, 4]}.map_values(arr -> arr.map(x -> x * 10)) + output: {"v": {"a": [10, 20], "b": [30, 40]}} + + - name: "map_values with sum on inner arrays" + mapping: | + 
output.v = {"a": [1, 2, 3], "b": [4, 5]}.map_values(arr -> arr.sum()) + output: {"v": {"a": 6, "b": 9}} + + # --- Combined filter_entries and map_values --- + + - name: "filter_entries then map_values chain" + mapping: | + output.v = {"a": 1, "b": 2, "c": 3} + .filter_entries((k, v) -> v > 1) + .map_values(v -> v * 100) + output: {"v": {"b": 200, "c": 300}} + + # --- map producing objects then collect --- + + - name: "map to key-value then collect" + mapping: | + output.v = [1, 2, 3].map(x -> {"key": "item_" + x.string(), "value": x * 10}).collect() + output: {"v": {"item_1": 10, "item_2": 20, "item_3": 30}} + + # --- Chained any/all after transform --- + + - name: "map then any" + mapping: | + output.v = [1, 2, 3, 4, 5].map(x -> x * x).any(sq -> sq > 20) + output: {"v": true} + + - name: "filter then all" + mapping: | + output.v = [2, 4, 6, 8, 10].filter(x -> x > 3).all(x -> x % 2 == 0) + output: {"v": true} diff --git a/internal/bloblang2/spec/tests/stdlib/method_composition.yaml b/internal/bloblang2/spec/tests/stdlib/method_composition.yaml new file mode 100644 index 000000000..6db0eca6e --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/method_composition.yaml @@ -0,0 +1,165 @@ +description: "Cross-method composition tests: chaining stdlib methods in realistic combinations" + +tests: + # --- String + encoding chains --- + + - name: "trim then encode base64" + mapping: | + output = " hello ".trim().encode("base64") + output: "aGVsbG8=" + + - name: "split then map uppercase" + mapping: | + output = "a,b,c".split(",").map(v -> v.uppercase()) + output: ["A", "B", "C"] + + - name: "split then filter then join" + mapping: | + output = "a,,b,,c".split(",").filter(v -> v.length() > 0).join(",") + output: "a,b,c" + + - name: "replace_all then split then length" + mapping: | + output = "one::two::three".replace_all("::", ",").split(",").length() + output: 3 + + # --- Numeric + type chains --- + + - name: "abs then floor" + mapping: | + output = (-3.7).abs().floor() 
+ output: 3.0 + + - name: "ceil then int64 conversion" + mapping: | + output = 3.2.ceil().int64() + output: 4 + + - name: "round then format_json" + mapping: | + output = 3.14159.round(2).format_json() + output: "3.14" + + - name: "abs preserves int32 type through chain" + mapping: | + output = (-5).int32().abs().type() + output: "int32" + + # --- Timestamp + format chains --- + + - name: "ts_parse then ts_format custom" + mapping: | + output = "2024-01-15T10:30:00Z".ts_parse().ts_format("%Y-%m-%d") + output: "2024-01-15" + + - name: "timestamp constructor then ts_add then ts_format" + mapping: | + output = timestamp(2024, 1, 15, 10, 0, 0).ts_add(30 * minute()).ts_format() + output: "2024-01-15T10:30:00Z" + + - name: "ts_parse then ts_unix then ts_from_unix round-trip" + mapping: | + $s = "2024-01-15T10:30:00Z" + output = $s.ts_parse().ts_unix().ts_from_unix().ts_format() + output: "2024-01-15T10:30:00Z" + + - name: "timestamp subtraction then comparison" + mapping: | + $a = timestamp(2024, 1, 15, 10, 0, 0) + $b = timestamp(2024, 1, 15, 11, 0, 0) + output = ($b - $a) == hour() + output: true + + # --- JSON + object chains --- + + - name: "parse_json then keys sorted" + mapping: | + output = `{"b":2,"a":1}`.parse_json().keys().sort() + output: ["a", "b"] + + - name: "object merge then format_json" + mapping: | + output = {"a": 1}.merge({"b": 2}).format_json() + output: '{"a":1,"b":2}' + + - name: "parse_json then map values" + mapping: | + output = `{"x":1,"y":2}`.parse_json().map_values(v -> v * 10) + output: {"x": 10, "y": 20} + + - name: "format_json then parse_json round-trip" + mapping: | + $obj = {"name": "Alice", "age": 30, "active": true} + output = $obj.format_json().parse_json() == $obj + output: true + + # --- Array + query chains --- + + - name: "range then filter then sum" + mapping: | + output = range(1, 11).filter(v -> v % 2 == 0).sum() + output: 30 + + - name: "range then map then sort descending" + mapping: | + output = range(1, 4).map(v -> v * 
v).sort().reverse() + output: [9, 4, 1] + + - name: "array map abs then max" + mapping: | + output = [-3, 1, -7, 4].map(v -> v.abs()).max() + output: 7 + + - name: "split then unique then sort then join" + mapping: | + output = "b,a,c,a,b".split(",").unique().sort().join(",") + output: "a,b,c" + + # --- Encoding round-trip chains --- + + - name: "string to base64 to hex round-trip" + mapping: | + output = "hello".encode("base64").decode("base64").encode("hex") + output: "68656c6c6f" + + - name: "format_json then encode base64 then decode" + mapping: | + output = {"key": "value"}.format_json().encode("base64").decode("base64").string() + output: '{"key":"value"}' + + # --- Error handling in chains --- + + - name: "parse_json catch returns default" + mapping: | + output = "bad json".parse_json().catch(err -> "fallback") + output: "fallback" + + - name: "ts_parse catch returns null" + mapping: | + output = "not-a-date".ts_parse().catch(err -> null) + output: null + + - name: "chained method after or" + mapping: | + output = null.or("hello").uppercase() + output: "HELLO" + + - name: "abs on parse_json result" + mapping: | + output = `"-42"`.parse_json().int64().abs() + output: 42 + + # --- Timestamp arithmetic composition --- + + - name: "ts_add chained multiple times" + mapping: | + output = timestamp(2024, 1, 15, 10, 0, 0).ts_add(hour()).ts_add(30 * minute()).ts_add(15 * second()) + output: {_type: "timestamp", value: "2024-01-15T11:30:15Z"} + + - name: "ts_unix_nano to int then back preserves nanos" + mapping: | + $ts = timestamp(2024, 1, 15, 10, 30, 0, 123456789) + $nanos = $ts.ts_unix_nano() + $ts2 = $nanos.ts_from_unix_nano() + output = $ts2.ts_unix_nano() == $nanos + output: true diff --git a/internal/bloblang2/spec/tests/stdlib/numeric_methods.yaml b/internal/bloblang2/spec/tests/stdlib/numeric_methods.yaml new file mode 100644 index 000000000..5d18effe2 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/numeric_methods.yaml @@ -0,0 +1,325 @@ 
+description: "Numeric methods: abs, floor, ceil, round (half-even)" + +tests: + # --- abs: float64 --- + + - name: "abs positive float64 is identity" + mapping: | + output = 3.14.abs() + output: 3.14 + + - name: "abs negative float64" + mapping: | + output = (-3.14).abs() + output: 3.14 + + - name: "abs zero float64" + mapping: | + output = 0.0.abs() + output: 0.0 + + - name: "abs float64 returns float64" + mapping: | + output = (-2.5).abs().type() + output: "float64" + + # --- abs: int64 --- + + - name: "abs positive int64 is identity" + mapping: | + output = 42.abs() + output: 42 + + - name: "abs negative int64" + mapping: | + output = (-42).abs() + output: 42 + + - name: "abs zero int64" + mapping: | + output = 0.abs() + output: 0 + + - name: "abs int64 returns int64" + mapping: | + output = (-5).abs().type() + output: "int64" + + - name: "abs int64 min value overflows" + mapping: | + output = (-9223372036854775807 - 1).abs() + error: "overflow" + + # --- abs: int32 --- + + - name: "abs negative int32" + mapping: | + output = (-100).int32().abs() + output: {_type: "int32", value: "100"} + + - name: "abs int32 min value overflows" + mapping: | + output = (-2147483648).int32().abs() + error: "overflow" + + - name: "abs int32 returns int32" + mapping: | + output = (-7).int32().abs().type() + output: "int32" + + # --- abs: uint64 identity --- + + - name: "abs uint64 is identity" + mapping: | + output = 99.uint64().abs() + output: {_type: "uint64", value: "99"} + + # --- abs: uint32 identity --- + + - name: "abs uint32 is identity" + mapping: | + output = 99.uint32().abs() + output: {_type: "uint32", value: "99"} + + # --- abs: float32 --- + + - name: "abs negative float32" + mapping: | + output = (-1.5).float32().abs() + output: {_type: "float32", value: "1.5"} + + - name: "abs float32 returns float32" + mapping: | + output = (-1.5).float32().abs().type() + output: "float32" + + # --- floor --- + + - name: "floor positive float64" + mapping: | + output = 
3.7.floor() + output: 3.0 + + - name: "floor negative float64" + mapping: | + output = (-3.2).floor() + output: -4.0 + + - name: "floor already integer float64" + mapping: | + output = 5.0.floor() + output: 5.0 + + - name: "floor zero" + mapping: | + output = 0.0.floor() + output: 0.0 + + - name: "floor negative already integer" + mapping: | + output = (-4.0).floor() + output: -4.0 + + - name: "floor small positive" + mapping: | + output = 0.1.floor() + output: 0.0 + + - name: "floor small negative" + mapping: | + output = (-0.1).floor() + output: -1.0 + + - name: "floor returns float64" + mapping: | + output = 3.7.floor().type() + output: "float64" + + - name: "floor float32 returns float32" + mapping: | + output = 3.7.float32().floor().type() + output: "float32" + + - name: "floor float32 value" + mapping: | + output = 3.7.float32().floor() + output: {_type: "float32", value: "3.0"} + + # --- ceil --- + + - name: "ceil positive float64" + mapping: | + output = 3.2.ceil() + output: 4.0 + + - name: "ceil negative float64" + mapping: | + output = (-3.7).ceil() + output: -3.0 + + - name: "ceil already integer float64" + mapping: | + output = 5.0.ceil() + output: 5.0 + + - name: "ceil zero" + mapping: | + output = 0.0.ceil() + output: 0.0 + + - name: "ceil small positive" + mapping: | + output = 0.1.ceil() + output: 1.0 + + - name: "ceil small negative" + mapping: | + output = (-0.1).ceil() + output: 0.0 + + - name: "ceil returns float64" + mapping: | + output = 3.2.ceil().type() + output: "float64" + + - name: "ceil float32 returns float32" + mapping: | + output = 3.2.float32().ceil().type() + output: "float32" + + - name: "ceil float32 value" + mapping: | + output = 3.2.float32().ceil() + output: {_type: "float32", value: "4.0"} + + # --- round: default n=0 (half-even) --- + + - name: "round up from 3.7" + mapping: | + output = 3.7.round() + output: 4.0 + + - name: "round down from 3.2" + mapping: | + output = 3.2.round() + output: 3.0 + + - name: "round half-even 
2.5 rounds to 2" + mapping: | + output = 2.5.round() + output: 2.0 + + - name: "round half-even 3.5 rounds to 4" + mapping: | + output = 3.5.round() + output: 4.0 + + - name: "round half-even 0.5 rounds to 0" + mapping: | + output = 0.5.round() + output: 0.0 + + - name: "round half-even 1.5 rounds to 2" + mapping: | + output = 1.5.round() + output: 2.0 + + - name: "round half-even 4.5 rounds to 4" + mapping: | + output = 4.5.round() + output: 4.0 + + - name: "round half-even negative -2.5 rounds to -2" + mapping: | + output = (-2.5).round() + output: -2.0 + + - name: "round half-even negative -3.5 rounds to -4" + mapping: | + output = (-3.5).round() + output: -4.0 + + - name: "round returns float64" + mapping: | + output = 3.7.round().type() + output: "float64" + + # --- round: positive n (decimal places) --- + + - name: "round to 2 decimal places" + mapping: | + output = 3.456.round(2) + output: 3.46 + + - name: "round to 1 decimal place" + mapping: | + output = 3.456.round(1) + output: 3.5 + + - name: "round half-even at 2 decimal places" + mapping: | + output = 2.125.round(2) + output: 2.12 + + - name: "round half-even at 2 decimal places odd" + mapping: | + output = 2.375.round(2) + output: 2.38 + + # --- round: negative n (powers of 10) --- + + - name: "round to nearest 10" + mapping: | + output = 1234.0.round(-1) + output: 1230.0 + + - name: "round to nearest 100" + mapping: | + output = 1234.0.round(-2) + output: 1200.0 + + - name: "round half-even to nearest 100" + mapping: | + output = 1250.0.round(-2) + output: 1200.0 + + - name: "round half-even to nearest 100 odd" + mapping: | + output = 1350.0.round(-2) + output: 1400.0 + + - name: "round to nearest 1000" + mapping: | + output = 1500.0.round(-3) + output: 2000.0 + + - name: "round half-even to nearest 1000" + mapping: | + output = 2500.0.round(-3) + output: 2000.0 + + # --- round: float32 --- + + - name: "round float32 returns float32" + mapping: | + output = 3.7.float32().round().type() + output: 
"float32" + + - name: "round float32 value" + mapping: | + output = 3.7.float32().round() + output: {_type: "float32", value: "4.0"} + + # --- round: explicit n=0 --- + + - name: "round with explicit n=0 same as default" + mapping: | + output = 2.5.round(0) + output: 2.0 + + # --- round: named arg --- + + - name: "round with named arg n" + mapping: | + output = 3.456.round(n: 2) + output: 3.46 diff --git a/internal/bloblang2/spec/tests/stdlib/object_methods.yaml b/internal/bloblang2/spec/tests/stdlib/object_methods.yaml new file mode 100644 index 000000000..894a67110 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/object_methods.yaml @@ -0,0 +1,224 @@ +description: "Object methods — iter, keys, values, has_key, merge, without" + +tests: + # --- iter --- + + - name: "iter produces key-value pairs" + mapping: | + $obj = {"a": 1} + output.result = $obj.iter() + output: {"result": [{"key": "a", "value": 1}]} + + - name: "iter empty object" + mapping: | + output.result = {}.iter() + output: {"result": []} + + - name: "iter preserves value types" + mapping: | + $obj = {"s": "hello", "n": 42, "b": true, "nil": null} + $entries = $obj.iter() + output.len = $entries.length() + output: {"len": 4} + + - name: "iter entries have key and value fields" + mapping: | + $obj = {"x": 10} + $entry = $obj.iter()[0] + output.k = $entry.key + output.v = $entry.value + output: {"k": "x", "v": 10} + + - name: "iter result can be used with map" + mapping: | + $obj = {"a": 1, "b": 2} + output.result = $obj.iter().map(e -> e.key + "=" + e.value.string()).sort() + output: {"result": ["a=1", "b=2"]} + + # --- keys --- + + - name: "keys of single-key object" + mapping: | + output.result = {"name": "Alice"}.keys() + output: {"result": ["name"]} + + - name: "keys of empty object" + mapping: | + output.result = {}.keys() + output: {"result": []} + + - name: "keys returns strings" + mapping: | + $obj = {"a": 1} + output.result = $obj.keys()[0].type() + output: {"result": "string"} + + - 
name: "keys count matches object length" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3} + output.result = $obj.keys().length() + output: {"result": 3} + + - name: "keys can be sorted for deterministic comparison" + mapping: | + $obj = {"c": 3, "a": 1, "b": 2} + output.result = $obj.keys().sort() + output: {"result": ["a", "b", "c"]} + + # --- values --- + + - name: "values of single-key object" + mapping: | + output.result = {"x": 42}.values() + output: {"result": [42]} + + - name: "values of empty object" + mapping: | + output.result = {}.values() + output: {"result": []} + + - name: "values count matches object length" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3} + output.result = $obj.values().length() + output: {"result": 3} + + - name: "values can be sorted for deterministic comparison" + mapping: | + $obj = {"c": 3, "a": 1, "b": 2} + output.result = $obj.values().sort() + output: {"result": [1, 2, 3]} + + - name: "values preserves types" + mapping: | + $obj = {"s": "hello", "n": 42} + $vals = $obj.values().sort_by(v -> v.type()) + output.types = $vals.map(v -> v.type()) + output: {"types": ["int64", "string"]} + + # --- has_key --- + + - name: "has_key returns true for existing key" + mapping: | + output.result = {"name": "Alice"}.has_key("name") + output: {"result": true} + + - name: "has_key returns false for missing key" + mapping: | + output.result = {"name": "Alice"}.has_key("age") + output: {"result": false} + + - name: "has_key on empty object" + mapping: | + output.result = {}.has_key("anything") + output: {"result": false} + + - name: "has_key with null value still returns true" + mapping: | + output.result = {"x": null}.has_key("x") + output: {"result": true} + + - name: "has_key checks only top-level" + mapping: | + $obj = {"a": {"b": 1}} + output.top = $obj.has_key("a") + output.nested = $obj.has_key("b") + output: {"top": true, "nested": false} + + # --- merge --- + + - name: "merge two objects" + mapping: | + output.result = {"a": 
1}.merge({"b": 2}) + output: {"result": {"a": 1, "b": 2}} + + - name: "merge other wins on conflict" + mapping: | + output.result = {"a": 1, "b": 2}.merge({"b": 99, "c": 3}) + output: {"result": {"a": 1, "b": 99, "c": 3}} + + - name: "merge with empty object" + mapping: | + output.result = {"a": 1}.merge({}) + output: {"result": {"a": 1}} + + - name: "merge empty with non-empty" + mapping: | + output.result = {}.merge({"a": 1}) + output: {"result": {"a": 1}} + + - name: "merge two empty objects" + mapping: | + output.result = {}.merge({}) + output: {"result": {}} + + - name: "merge does not modify original" + mapping: | + $a = {"x": 1} + $b = {"y": 2} + $c = $a.merge($b) + output.a = $a + output.c = $c + output: {"a": {"x": 1}, "c": {"x": 1, "y": 2}} + + - name: "merge nested objects are replaced not deep merged" + mapping: | + $a = {"config": {"host": "localhost", "port": 8080}} + $b = {"config": {"host": "remote"}} + output.result = $a.merge($b) + output: {"result": {"config": {"host": "remote"}}} + + # --- without --- + + - name: "without removes specified keys" + mapping: | + output.result = {"a": 1, "b": 2, "c": 3}.without(["b"]) + output: {"result": {"a": 1, "c": 3}} + + - name: "without multiple keys" + mapping: | + output.result = {"a": 1, "b": 2, "c": 3}.without(["a", "c"]) + output: {"result": {"b": 2}} + + - name: "without missing key is ignored" + mapping: | + output.result = {"a": 1, "b": 2}.without(["c", "d"]) + output: {"result": {"a": 1, "b": 2}} + + - name: "without empty key list" + mapping: | + output.result = {"a": 1, "b": 2}.without([]) + output: {"result": {"a": 1, "b": 2}} + + - name: "without all keys" + mapping: | + output.result = {"a": 1, "b": 2}.without(["a", "b"]) + output: {"result": {}} + + - name: "without on empty object" + mapping: | + output.result = {}.without(["a"]) + output: {"result": {}} + + - name: "without does not modify original" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3} + $new = $obj.without(["b"]) + output.original 
= $obj + output.new = $new + output: {"original": {"a": 1, "b": 2, "c": 3}, "new": {"a": 1, "c": 3}} + + - name: "without mix of present and missing keys" + mapping: | + output.result = {"a": 1, "b": 2}.without(["a", "z"]) + output: {"result": {"b": 2}} + + - name: "without non-string key in array is error" + mapping: | + output.result = {"a": 1, "b": 2}.without(["a", 42]) + error: "string" + + - name: "without null key in array is error" + mapping: | + output.result = {"a": 1}.without([null]) + error: "string" diff --git a/internal/bloblang2/spec/tests/stdlib/object_transform.yaml b/internal/bloblang2/spec/tests/stdlib/object_transform.yaml new file mode 100644 index 000000000..cefe4cd12 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/object_transform.yaml @@ -0,0 +1,192 @@ +description: "Object transform methods — map_values, map_keys, map_entries, filter_entries" + +tests: + # --- map_values --- + + - name: "map_values transforms all values" + mapping: | + output.result = {"a": 1, "b": 2, "c": 3}.map_values(v -> v * 10) + output: {"result": {"a": 10, "b": 20, "c": 30}} + + - name: "map_values on empty object" + mapping: | + output.result = {}.map_values(v -> v + 1) + output: {"result": {}} + + - name: "map_values can change value types" + mapping: | + output.result = {"a": 1, "b": 2}.map_values(v -> v.string()) + output: {"result": {"a": "1", "b": "2"}} + + - name: "map_values void return is error" + mapping: | + output.result = {"a": 1}.map_values(v -> if false { v }) + error: "void" + + - name: "map_values deleted omits entry" + mapping: | + output.result = {"a": 1, "b": 2, "c": 3}.map_values(v -> if v > 1 { v * 10 } else { deleted() }) + output: {"result": {"b": 20, "c": 30}} + + - name: "map_values does not modify original" + mapping: | + $obj = {"x": 1, "y": 2} + $new = $obj.map_values(v -> v + 100) + output.original = $obj + output.new = $new + output: {"original": {"x": 1, "y": 2}, "new": {"x": 101, "y": 102}} + + - name: "map_values with block 
body" + mapping: | + output.result = {"a": 5, "b": 10}.map_values(v -> { + $doubled = v * 2 + $doubled + 1 + }) + output: {"result": {"a": 11, "b": 21}} + + # --- map_keys --- + + - name: "map_keys transforms all keys" + mapping: | + output.result = {"a": 1, "b": 2}.map_keys(k -> k.uppercase()) + output: {"result": {"A": 1, "B": 2}} + + - name: "map_keys on empty object" + mapping: | + output.result = {}.map_keys(k -> k + "_suffix") + output: {"result": {}} + + - name: "map_keys adds prefix" + mapping: | + output.result = {"name": "Alice", "age": 30}.map_keys(k -> "user_" + k) + output: {"result": {"user_name": "Alice", "user_age": 30}} + + - name: "map_keys must return string" + mapping: | + output.result = {"a": 1}.map_keys(k -> 42) + error: "string" + + - name: "map_keys void return is error" + mapping: | + output.result = {"a": 1}.map_keys(k -> if false { k }) + error: "void" + + - name: "map_keys deleted omits entry" + mapping: | + output.result = {"keep": 1, "drop": 2, "also_keep": 3}.map_keys(k -> if k.has_prefix("drop") { deleted() } else { k }) + output: {"result": {"keep": 1, "also_keep": 3}} + + - name: "map_keys does not modify original" + mapping: | + $obj = {"a": 1, "b": 2} + $new = $obj.map_keys(k -> k.uppercase()) + output.original = $obj + output.new = $new + output: {"original": {"a": 1, "b": 2}, "new": {"A": 1, "B": 2}} + + # --- map_entries --- + + - name: "map_entries transforms keys and values" + mapping: | + output.result = {"a": 1, "b": 2}.map_entries((k, v) -> {"key": k.uppercase(), "value": v * 10}) + output: {"result": {"A": 10, "B": 20}} + + - name: "map_entries on empty object" + mapping: | + output.result = {}.map_entries((k, v) -> {"key": k, "value": v}) + output: {"result": {}} + + - name: "map_entries swap keys and values" + mapping: | + output.result = {"x": "alpha", "y": "beta"}.map_entries((k, v) -> {"key": v, "value": k}) + output: {"result": {"alpha": "x", "beta": "y"}} + + - name: "map_entries deleted omits entry" + mapping: | 
+ output.result = {"a": 1, "b": 2, "c": 3}.map_entries((k, v) -> if v == 2 { deleted() } else { {"key": k, "value": v} }) + output: {"result": {"a": 1, "c": 3}} + + - name: "map_entries void return is error" + mapping: | + output.result = {"a": 1}.map_entries((k, v) -> if false { {"key": k, "value": v} }) + error: "void" + + - name: "map_entries with computed keys" + mapping: | + output.result = {"name": "Alice", "city": "London"}.map_entries((k, v) -> {"key": "user_" + k, "value": v}) + output: {"result": {"user_name": "Alice", "user_city": "London"}} + + - name: "map_entries does not modify original" + mapping: | + $obj = {"a": 1, "b": 2} + $new = $obj.map_entries((k, v) -> {"key": k, "value": v + 100}) + output.original = $obj + output.new = $new + output: {"original": {"a": 1, "b": 2}, "new": {"a": 101, "b": 102}} + + - name: "map_entries with block body" + mapping: | + output.result = {"x": 5}.map_entries((k, v) -> { + $new_key = k + "_modified" + $new_val = v * 2 + {"key": $new_key, "value": $new_val} + }) + output: {"result": {"x_modified": 10}} + + # --- filter_entries --- + + - name: "filter_entries keeps matching entries" + mapping: | + output.result = {"a": 1, "b": 5, "c": 3}.filter_entries((k, v) -> v > 2) + output: {"result": {"b": 5, "c": 3}} + + - name: "filter_entries on empty object" + mapping: | + output.result = {}.filter_entries((k, v) -> true) + output: {"result": {}} + + - name: "filter_entries no matches" + mapping: | + output.result = {"a": 1, "b": 2}.filter_entries((k, v) -> v > 100) + output: {"result": {}} + + - name: "filter_entries all match" + mapping: | + output.result = {"a": 1, "b": 2}.filter_entries((k, v) -> v > 0) + output: {"result": {"a": 1, "b": 2}} + + - name: "filter_entries by key" + mapping: | + output.result = {"name": "Alice", "age": 30, "note": "test"}.filter_entries((k, v) -> k.has_prefix("n")) + output: {"result": {"name": "Alice", "note": "test"}} + + - name: "filter_entries non-bool return is error" + mapping: | + 
output.result = {"a": 1}.filter_entries((k, v) -> v * 2) + error: "bool" + + - name: "filter_entries void return is error" + mapping: | + output.result = {"a": 1}.filter_entries((k, v) -> if false { true }) + error: "void" + + - name: "filter_entries does not modify original" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3} + $new = $obj.filter_entries((k, v) -> v > 1) + output.original = $obj + output.new = $new + output: {"original": {"a": 1, "b": 2, "c": 3}, "new": {"b": 2, "c": 3}} + + - name: "filter_entries combined key and value condition" + mapping: | + output.result = {"x": 10, "y": 20, "z": 5}.filter_entries((k, v) -> k != "y" && v > 3) + output: {"result": {"x": 10, "z": 5}} + + - name: "filter_entries with block body" + mapping: | + output.result = {"a": 1, "b": 10, "c": 5}.filter_entries((k, v) -> { + $threshold = 3 + v > $threshold + }) + output: {"result": {"b": 10, "c": 5}} diff --git a/internal/bloblang2/spec/tests/stdlib/sequence_methods.yaml b/internal/bloblang2/spec/tests/stdlib/sequence_methods.yaml new file mode 100644 index 000000000..3c2edd659 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/sequence_methods.yaml @@ -0,0 +1,322 @@ +description: "Sequence methods: .length(), .contains(), .index_of(), .slice(), .reverse() for strings, arrays, and bytes" + +tests: + # --- .length() on strings --- + + - name: "string length ascii" + mapping: | + output = "hello".length() + output: 5 + + - name: "string length empty" + mapping: | + output = "".length() + output: 0 + + - name: "string length non-ascii codepoint-based" + mapping: | + output = "caf\u00E9".length() + output: 4 + + - name: "string length emoji single codepoint" + mapping: | + output = "\u{1F600}".length() + output: 1 + + # --- .length() on arrays --- + + - name: "array length" + mapping: | + output = [1, 2, 3].length() + output: 3 + + - name: "array length empty" + mapping: | + output = [].length() + output: 0 + + - name: "array length nested" + mapping: | + output = [[1, 2], 
[3]].length() + output: 2 + + # --- .length() on bytes --- + + - name: "bytes length ascii" + mapping: | + output = "hello".bytes().length() + output: 5 + + - name: "bytes length multibyte" + mapping: | + output = "\u{1F600}".bytes().length() + output: 4 + + - name: "bytes length empty" + mapping: | + output = "".bytes().length() + output: 0 + + # --- .contains() on strings --- + + - name: "string contains true" + mapping: | + output = "hello world".contains("world") + output: true + + - name: "string contains false" + mapping: | + output = "hello world".contains("xyz") + output: false + + - name: "string contains empty substring" + mapping: | + output = "hello".contains("") + output: true + + - name: "string contains full match" + mapping: | + output = "hello".contains("hello") + output: true + + - name: "string contains empty in empty" + mapping: | + output = "".contains("") + output: true + + # --- .contains() on arrays --- + + - name: "array contains int true" + mapping: | + output = [1, 2, 3].contains(2) + output: true + + - name: "array contains int false" + mapping: | + output = [1, 2, 3].contains(5) + output: false + + - name: "array contains string" + mapping: | + output = ["a", "b", "c"].contains("b") + output: true + + - name: "array contains null" + mapping: | + output = [1, null, 3].contains(null) + output: true + + - name: "array contains empty array" + mapping: | + output = [].contains(1) + output: false + + # --- .contains() on bytes --- + + - name: "bytes contains subsequence true" + mapping: | + output = "hello".bytes().contains("ll".bytes()) + output: true + + - name: "bytes contains subsequence false" + mapping: | + output = "hello".bytes().contains("xyz".bytes()) + output: false + + # --- .index_of() on strings --- + + - name: "string index_of found" + mapping: | + output = "hello world".index_of("world") + output: 6 + + - name: "string index_of not found" + mapping: | + output = "hello world".index_of("xyz") + output: -1 + + - name: "string 
index_of first occurrence" + mapping: | + output = "abcabc".index_of("abc") + output: 0 + + - name: "string index_of empty needle" + mapping: | + output = "hello".index_of("") + output: 0 + + - name: "string index_of codepoint-based" + mapping: | + output = "caf\u00E9!".index_of("!") + output: 4 + + # --- .index_of() on arrays --- + + - name: "array index_of found" + mapping: | + output = [10, 20, 30].index_of(20) + output: 1 + + - name: "array index_of not found" + mapping: | + output = [10, 20, 30].index_of(99) + output: -1 + + - name: "array index_of first occurrence" + mapping: | + output = [1, 2, 1, 2].index_of(2) + output: 1 + + - name: "array index_of string element" + mapping: | + output = ["a", "b", "c"].index_of("c") + output: 2 + + # --- .index_of() on bytes --- + + - name: "bytes index_of found" + mapping: | + output = "hello".bytes().index_of("ll".bytes()) + output: 2 + + - name: "bytes index_of not found" + mapping: | + output = "hello".bytes().index_of("xyz".bytes()) + output: -1 + + # --- .slice() on strings --- + + - name: "string slice basic" + mapping: | + output = "hello world".slice(0, 5) + output: "hello" + + - name: "string slice to end" + mapping: | + output = "hello world".slice(6) + output: "world" + + - name: "string slice negative indices" + mapping: | + output = "hello world".slice(-5, -1) + output: "worl" + + - name: "string slice clamped beyond length" + mapping: | + output = "hello".slice(0, 100) + output: "hello" + + - name: "string slice empty result from inverted" + mapping: | + output = "hello".slice(3, 1) + output: "" + + - name: "string slice full string" + mapping: | + output = "hello".slice(0) + output: "hello" + + # --- .slice() on arrays --- + + - name: "array slice basic" + mapping: | + output = [10, 20, 30, 40, 50].slice(1, 4) + output: [20, 30, 40] + + - name: "array slice to end" + mapping: | + output = [10, 20, 30].slice(1) + output: [20, 30] + + - name: "array slice negative indices" + mapping: | + output = [10, 20, 
30, 40, 50].slice(-3, -1) + output: [30, 40] + + - name: "array slice clamped beyond length" + mapping: | + output = [1, 2, 3].slice(0, 100) + output: [1, 2, 3] + + - name: "array slice empty from inverted" + mapping: | + output = [1, 2, 3].slice(2, 0) + output: [] + + - name: "array slice empty array" + mapping: | + output = [].slice(0) + output: [] + + # --- .slice() on bytes --- + + - name: "bytes slice basic" + mapping: | + output = "hello".bytes().slice(0, 3) + output: {_type: "bytes", value: "aGVs"} + + - name: "bytes slice to end" + mapping: | + output = "hello".bytes().slice(3) + output: {_type: "bytes", value: "bG8="} + + - name: "bytes slice negative" + mapping: | + output = "hello".bytes().slice(-2) + output: {_type: "bytes", value: "bG8="} + + # --- .reverse() on strings --- + + - name: "string reverse ascii" + mapping: | + output = "hello".reverse() + output: "olleh" + + - name: "string reverse empty" + mapping: | + output = "".reverse() + output: "" + + - name: "string reverse single char" + mapping: | + output = "a".reverse() + output: "a" + + - name: "string reverse palindrome" + mapping: | + output = "racecar".reverse() + output: "racecar" + + # --- .reverse() on arrays --- + + - name: "array reverse" + mapping: | + output = [1, 2, 3].reverse() + output: [3, 2, 1] + + - name: "array reverse empty" + mapping: | + output = [].reverse() + output: [] + + - name: "array reverse single element" + mapping: | + output = [42].reverse() + output: [42] + + - name: "array reverse mixed types" + mapping: | + output = [1, "two", true].reverse() + output: [true, "two", 1] + + # --- .reverse() on bytes --- + + - name: "bytes reverse" + mapping: | + output = "hello".bytes().reverse() + output: {_type: "bytes", value: "b2xsZWg="} + + - name: "bytes reverse empty" + mapping: | + output = "".bytes().reverse() + output: {_type: "bytes", value: ""} diff --git a/internal/bloblang2/spec/tests/stdlib/sort_edge_cases.yaml 
b/internal/bloblang2/spec/tests/stdlib/sort_edge_cases.yaml new file mode 100644 index 000000000..f8b29b0aa --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/sort_edge_cases.yaml @@ -0,0 +1,152 @@ +description: > + Sort edge cases — stable sort, mixed numeric families error, NaN total + ordering, sort_by with complex keys, and sort with various types. + +tests: + # --- Basic stable sort --- + + - name: "sort integers ascending" + mapping: | + output.v = [3, 1, 4, 1, 5, 9, 2, 6].sort() + output: {"v": [1, 1, 2, 3, 4, 5, 6, 9]} + + - name: "sort already sorted" + mapping: | + output.v = [1, 2, 3].sort() + output: {"v": [1, 2, 3]} + + - name: "sort reverse sorted" + mapping: | + output.v = [3, 2, 1].sort() + output: {"v": [1, 2, 3]} + + - name: "sort single element" + mapping: | + output.v = [42].sort() + output: {"v": [42]} + + - name: "sort empty array" + mapping: | + output.v = [].sort() + output: {"v": []} + + - name: "sort strings lexicographic" + mapping: | + output.v = ["banana", "apple", "cherry", "date"].sort() + output: {"v": ["apple", "banana", "cherry", "date"]} + + - name: "sort floats" + mapping: | + output.v = [3.14, 1.41, 2.72].sort() + output: {"v": [1.41, 2.72, 3.14]} + + # --- Mixed numeric types within same family --- + + - name: "sort mixed int32 and int64" + mapping: | + output.v = [3, 1.int32(), 2].sort() + output: {"v": [{_type: "int32", value: "1"}, 2, 3]} + + # --- Mixed type families error --- + + - name: "sort mixed strings and integers errors" + mapping: | + output.v = ["a", 1, "b"].sort() + error: "sort" + + - name: "sort mixed booleans and integers errors" + mapping: | + output.v = [true, 1, false].sort() + error: "sort" + + # --- NaN in sort (total ordering: after all numbers) --- + + - name: "sort with special float values" + mapping: | + output.v = input.arr.sort() + cases: + - name: "NaN after positive infinity" + input: {"arr": [{_type: "float64", value: "NaN"}, {_type: "float64", value: "Infinity"}, 1.0]} + output: {"v": 
[1.0, {_type: "float64", value: "Infinity"}, {_type: "float64", value: "NaN"}]} + - name: "negative infinity before all" + input: {"arr": [0.0, {_type: "float64", value: "-Infinity"}, 1.0]} + output: {"v": [{_type: "float64", value: "-Infinity"}, 0.0, 1.0]} + + # --- sort_by --- + + - name: "sort_by numeric field" + mapping: | + output.v = [ + {"name": "Charlie", "age": 30}, + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 35}, + ].sort_by(x -> x.age) + output: + v: + - name: "Alice" + age: 25 + - name: "Charlie" + age: 30 + - name: "Bob" + age: 35 + + - name: "sort_by string field" + mapping: | + output.v = [ + {"name": "Charlie"}, + {"name": "Alice"}, + {"name": "Bob"}, + ].sort_by(x -> x.name) + output: + v: + - name: "Alice" + - name: "Bob" + - name: "Charlie" + + - name: "sort_by computed key" + mapping: | + output.v = ["banana", "fig", "apple", "kiwi"].sort_by(s -> s.length()) + output: {"v": ["fig", "kiwi", "apple", "banana"]} + + - name: "sort_by with block body and local variables" + mapping: | + output.v = [ + {"first": "John", "last": "Doe"}, + {"first": "Jane", "last": "Abc"}, + {"first": "Bob", "last": "Xyz"}, + ].sort_by(p -> { + $key = p.last + ", " + p.first + $key + }) + output: + v: + - first: "Jane" + last: "Abc" + - first: "John" + last: "Doe" + - first: "Bob" + last: "Xyz" + + - name: "sort_by with outer variable capture" + mapping: | + $field = "age" + output.v = [ + {"name": "B", "age": 20}, + {"name": "A", "age": 10}, + ].sort_by(x -> x[$field]) + output: + v: + - name: "A" + age: 10 + - name: "B" + age: 20 + + # --- sort does not modify original --- + + - name: "sort returns new array" + mapping: | + $arr = [3, 1, 2] + $sorted = $arr.sort() + output.original = $arr + output.sorted = $sorted + output: {"original": [3, 1, 2], "sorted": [1, 2, 3]} diff --git a/internal/bloblang2/spec/tests/stdlib/string_methods.yaml b/internal/bloblang2/spec/tests/stdlib/string_methods.yaml new file mode 100644 index 000000000..c0d555c26 --- /dev/null 
+++ b/internal/bloblang2/spec/tests/stdlib/string_methods.yaml @@ -0,0 +1,267 @@ +description: "String methods: uppercase, lowercase, trim, trim_prefix, trim_suffix, has_prefix, has_suffix, split, replace_all, repeat" + +tests: + # --- .uppercase() --- + + - name: "uppercase basic" + mapping: | + output = "hello world".uppercase() + output: "HELLO WORLD" + + - name: "uppercase already uppercase" + mapping: | + output = "HELLO".uppercase() + output: "HELLO" + + - name: "uppercase empty string" + mapping: | + output = "".uppercase() + output: "" + + - name: "uppercase mixed case" + mapping: | + output = "hElLo".uppercase() + output: "HELLO" + + - name: "uppercase non-ascii" + mapping: | + output = "caf\u00E9".uppercase() + output: "CAF\u00C9" + + # --- .lowercase() --- + + - name: "lowercase basic" + mapping: | + output = "HELLO WORLD".lowercase() + output: "hello world" + + - name: "lowercase already lowercase" + mapping: | + output = "hello".lowercase() + output: "hello" + + - name: "lowercase empty string" + mapping: | + output = "".lowercase() + output: "" + + - name: "lowercase mixed case" + mapping: | + output = "HeLLo".lowercase() + output: "hello" + + # --- .trim() --- + + - name: "trim spaces" + mapping: | + output = " hello ".trim() + output: "hello" + + - name: "trim tabs and newlines" + mapping: | + output = "\t\nhello\n\t".trim() + output: "hello" + + - name: "trim no whitespace unchanged" + mapping: | + output = "hello".trim() + output: "hello" + + - name: "trim empty string" + mapping: | + output = "".trim() + output: "" + + - name: "trim only whitespace" + mapping: | + output = " \t\n ".trim() + output: "" + + # --- .trim_prefix() --- + + - name: "trim_prefix present" + mapping: | + output = "hello world".trim_prefix("hello ") + output: "world" + + - name: "trim_prefix absent" + mapping: | + output = "hello world".trim_prefix("xyz") + output: "hello world" + + - name: "trim_prefix empty prefix" + mapping: | + output = "hello".trim_prefix("") + output: 
"hello" + + - name: "trim_prefix entire string" + mapping: | + output = "hello".trim_prefix("hello") + output: "" + + - name: "trim_prefix only removes first occurrence" + mapping: | + output = "aaa".trim_prefix("a") + output: "aa" + + # --- .trim_suffix() --- + + - name: "trim_suffix present" + mapping: | + output = "hello world".trim_suffix(" world") + output: "hello" + + - name: "trim_suffix absent" + mapping: | + output = "hello world".trim_suffix("xyz") + output: "hello world" + + - name: "trim_suffix empty suffix" + mapping: | + output = "hello".trim_suffix("") + output: "hello" + + - name: "trim_suffix entire string" + mapping: | + output = "hello".trim_suffix("hello") + output: "" + + # --- .has_prefix() --- + + - name: "has_prefix true" + mapping: | + output = "hello world".has_prefix("hello") + output: true + + - name: "has_prefix false" + mapping: | + output = "hello world".has_prefix("world") + output: false + + - name: "has_prefix empty prefix is always true" + mapping: | + output = "hello".has_prefix("") + output: true + + - name: "has_prefix exact match" + mapping: | + output = "hello".has_prefix("hello") + output: true + + # --- .has_suffix() --- + + - name: "has_suffix true" + mapping: | + output = "hello world".has_suffix("world") + output: true + + - name: "has_suffix false" + mapping: | + output = "hello world".has_suffix("hello") + output: false + + - name: "has_suffix empty suffix is always true" + mapping: | + output = "hello".has_suffix("") + output: true + + - name: "has_suffix exact match" + mapping: | + output = "hello".has_suffix("hello") + output: true + + # --- .split() --- + + - name: "split by comma" + mapping: | + output = "a,b,c".split(",") + output: ["a", "b", "c"] + + - name: "split no delimiter found" + mapping: | + output = "hello".split(",") + output: ["hello"] + + - name: "split empty string by comma" + mapping: | + output = "".split(",") + output: [""] + + - name: "split empty string by empty string" + mapping: | + output = 
"".split("") + output: [] + + - name: "split by empty string produces codepoints" + mapping: | + output = "hello".split("") + output: ["h", "e", "l", "l", "o"] + + - name: "split by empty string with non-ascii" + mapping: | + output = "caf\u00E9".split("") + output: ["c", "a", "f", "\u00E9"] + + - name: "split with multi-char delimiter" + mapping: | + output = "one::two::three".split("::") + output: ["one", "two", "three"] + + - name: "split trailing delimiter" + mapping: | + output = "a,b,".split(",") + output: ["a", "b", ""] + + # --- .replace_all() --- + + - name: "replace_all basic" + mapping: | + output = "hello world".replace_all("world", "earth") + output: "hello earth" + + - name: "replace_all multiple occurrences" + mapping: | + output = "aabaa".replace_all("a", "x") + output: "xxbxx" + + - name: "replace_all no match" + mapping: | + output = "hello".replace_all("xyz", "abc") + output: "hello" + + - name: "replace_all empty old with new" + mapping: | + output = "ab".replace_all("", "-") + output: "-a-b-" + + - name: "replace_all remove substring" + mapping: | + output = "hello world".replace_all(" world", "") + output: "hello" + + # --- .repeat() --- + + - name: "repeat basic" + mapping: | + output = "ab".repeat(3) + output: "ababab" + + - name: "repeat zero times" + mapping: | + output = "hello".repeat(0) + output: "" + + - name: "repeat one time" + mapping: | + output = "hello".repeat(1) + output: "hello" + + - name: "repeat negative count is error" + mapping: | + output = "hello".repeat(-1) + error: "count" + + - name: "repeat empty string" + mapping: | + output = "".repeat(5) + output: "" diff --git a/internal/bloblang2/spec/tests/stdlib/string_regex.yaml b/internal/bloblang2/spec/tests/stdlib/string_regex.yaml new file mode 100644 index 000000000..047a40656 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/string_regex.yaml @@ -0,0 +1,158 @@ +description: "String regex methods: re_match, re_find_all, re_replace_all" + +tests: + # --- 
.re_match() --- + + - name: "re_match simple match" + mapping: | + output = "hello world".re_match("world") + output: true + + - name: "re_match no match" + mapping: | + output = "hello world".re_match("xyz") + output: false + + - name: "re_match partial match returns true" + mapping: | + output = "foobar".re_match("oba") + output: true + + - name: "re_match anchored start" + mapping: | + output = "hello world".re_match("^hello") + output: true + + - name: "re_match anchored start no match" + mapping: | + output = "hello world".re_match("^world") + output: false + + - name: "re_match anchored end" + mapping: | + output = "hello world".re_match("world$") + output: true + + - name: "re_match full anchor" + mapping: | + output = "hello".re_match("^hello$") + output: true + + - name: "re_match digit pattern" + mapping: | + output = "abc123def".re_match("[0-9]+") + output: true + + - name: "re_match digit pattern no match" + mapping: | + output = "abcdef".re_match("[0-9]+") + output: false + + - name: "re_match empty pattern always matches" + mapping: | + output = "anything".re_match("") + output: true + + - name: "re_match empty string with empty pattern" + mapping: | + output = "".re_match("") + output: true + + - name: "re_match character class" + mapping: | + output = "Hello World".re_match("[A-Z][a-z]+") + output: true + + # --- .re_find_all() --- + + - name: "re_find_all basic" + mapping: | + output = "cat bat rat".re_find_all("[a-z]at") + output: ["cat", "bat", "rat"] + + - name: "re_find_all no matches" + mapping: | + output = "hello world".re_find_all("[0-9]+") + output: [] + + - name: "re_find_all digits" + mapping: | + output = "a1b22c333".re_find_all("[0-9]+") + output: ["1", "22", "333"] + + - name: "re_find_all overlapping non-overlap" + mapping: | + output = "aaa".re_find_all("aa") + output: ["aa"] + + - name: "re_find_all single char pattern" + mapping: | + output = "abcabc".re_find_all("a") + output: ["a", "a"] + + - name: "re_find_all empty pattern" + 
mapping: | + output = "ab".re_find_all("") + output: ["", "", ""] + + - name: "re_find_all word boundaries" + mapping: | + output = "foo bar baz".re_find_all("\\b[a-z]+\\b") + output: ["foo", "bar", "baz"] + + - name: "re_find_all capture groups return full match" + mapping: | + output = "2024-03-01".re_find_all("([0-9]{4})-([0-9]{2})-([0-9]{2})") + output: ["2024-03-01"] + + # --- .re_replace_all() --- + + - name: "re_replace_all basic" + mapping: | + output = "hello world".re_replace_all("world", "earth") + output: "hello earth" + + - name: "re_replace_all with pattern" + mapping: | + output = "abc123def456".re_replace_all("[0-9]+", "NUM") + output: "abcNUMdefNUM" + + - name: "re_replace_all no match unchanged" + mapping: | + output = "hello".re_replace_all("[0-9]+", "X") + output: "hello" + + - name: "re_replace_all backreference $0" + mapping: | + output = "hello world".re_replace_all("[a-z]+", "[$0]") + output: "[hello] [world]" + + - name: "re_replace_all capture group $1" + mapping: | + output = "2024-03-01".re_replace_all("([0-9]{4})-([0-9]{2})-([0-9]{2})", "$2/$3/$1") + output: "03/01/2024" + + - name: "re_replace_all named group" + mapping: | + output = "hello world".re_replace_all("(?P<word>[a-z]+)", "${word}!") + output: "hello! world!" 
+ + - name: "re_replace_all remove matches" + mapping: | + output = "a1b2c3".re_replace_all("[0-9]", "") + output: "abc" + + - name: "re_replace_all entire string" + mapping: | + output = "hello".re_replace_all("^.*$", "replaced") + output: "replaced" + + - name: "re_replace_all empty string match" + mapping: | + output = "ab".re_replace_all("", "-") + output: "-a-b-" + + - name: "re_replace_all special regex chars in replacement" + mapping: | + output = "hello world".re_replace_all("world", "w.o.r.l.d") + output: "hello w.o.r.l.d" diff --git a/internal/bloblang2/spec/tests/stdlib/timestamp_methods.yaml b/internal/bloblang2/spec/tests/stdlib/timestamp_methods.yaml new file mode 100644 index 000000000..a4f3dbc4f --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/timestamp_methods.yaml @@ -0,0 +1,369 @@ +description: "Timestamp methods: ts_parse, ts_format, ts_unix*, ts_from_unix*, ts_add" + +tests: + # --- ts_parse --- + + - name: "ts_parse default format RFC 3339" + mapping: | + output = "2024-01-15T10:30:00Z".ts_parse() + output: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + + - name: "ts_parse with fractional seconds" + mapping: | + output = "2024-01-15T10:30:00.123Z".ts_parse() + output: {_type: "timestamp", value: "2024-01-15T10:30:00.123Z"} + + - name: "ts_parse with timezone offset" + mapping: | + output = "2024-01-15T10:30:00+05:30".ts_parse() + output: {_type: "timestamp", value: "2024-01-15T10:30:00+05:30"} + + - name: "ts_parse with negative offset" + mapping: | + output = "2024-01-15T10:30:00-08:00".ts_parse() + output: {_type: "timestamp", value: "2024-01-15T10:30:00-08:00"} + + - name: "ts_parse custom format date only" + mapping: | + output = "2024-01-15".ts_parse("%Y-%m-%d") + output: {_type: "timestamp", value: "2024-01-15T00:00:00Z"} + + - name: "ts_parse invalid string errors" + mapping: | + output = "not-a-date".ts_parse("%Y-%m-%d") + error: "parse" + + - name: "ts_parse returns timestamp type" + mapping: | + output = 
"2024-01-15T10:30:00Z".ts_parse().type() + output: "timestamp" + + - name: "ts_parse named format arg" + mapping: | + output = "2024-01-15".ts_parse(format: "%Y-%m-%d") + output: {_type: "timestamp", value: "2024-01-15T00:00:00Z"} + + # --- ts_format --- + + - name: "ts_format default RFC 3339" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_format() + output: "2024-01-15T10:30:00Z" + + - name: "ts_format custom date only" + mapping: | + output = timestamp(2024, 1, 15).ts_format("%Y-%m-%d") + output: "2024-01-15" + + - name: "ts_format with fractional seconds" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0, 500000000).ts_format() + output: "2024-01-15T10:30:00.5Z" + + - name: "ts_format whole seconds omit fraction" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_format() + output: "2024-01-15T10:30:00Z" + + - name: "ts_format matches string conversion" + mapping: | + $ts = timestamp(2024, 1, 15, 10, 30, 0) + output = $ts.ts_format() == $ts.string() + output: true + + - name: "ts_format named format arg" + mapping: | + output = timestamp(2024, 1, 15).ts_format(format: "%Y-%m-%d") + output: "2024-01-15" + + # --- ts_unix --- + + - name: "ts_unix returns epoch seconds" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_unix() + output: 1705314600 + + - name: "ts_unix returns int64" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_unix().type() + output: "int64" + + - name: "ts_unix epoch zero" + mapping: | + output = timestamp(1970, 1, 1, 0, 0, 0).ts_unix() + output: 0 + + - name: "ts_unix truncates sub-second" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0, 999999999).ts_unix() + output: 1705314600 + + # --- ts_unix_milli --- + + - name: "ts_unix_milli returns epoch millis" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_unix_milli() + output: 1705314600000 + + - name: "ts_unix_milli with fractional seconds" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0, 
123000000).ts_unix_milli() + output: 1705314600123 + + - name: "ts_unix_milli returns int64" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_unix_milli().type() + output: "int64" + + # --- ts_unix_micro --- + + - name: "ts_unix_micro returns epoch micros" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_unix_micro() + output: 1705314600000000 + + - name: "ts_unix_micro with fractional seconds" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0, 123456000).ts_unix_micro() + output: 1705314600123456 + + # --- ts_unix_nano --- + + - name: "ts_unix_nano returns epoch nanos" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_unix_nano() + output: 1705314600000000000 + + - name: "ts_unix_nano with full precision" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0, 123456789).ts_unix_nano() + output: 1705314600123456789 + + - name: "ts_unix_nano returns int64" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_unix_nano().type() + output: "int64" + + # --- ts_from_unix --- + + - name: "ts_from_unix integer seconds" + mapping: | + output = 1705314600.ts_from_unix() + output: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + + - name: "ts_from_unix float sub-second" + mapping: | + output = 1705314600.5.ts_from_unix() + output: {_type: "timestamp", value: "2024-01-15T10:30:00.5Z"} + + - name: "ts_from_unix epoch zero" + mapping: | + output = 0.ts_from_unix() + output: {_type: "timestamp", value: "1970-01-01T00:00:00Z"} + + - name: "ts_from_unix returns timestamp type" + mapping: | + output = 1705314600.ts_from_unix().type() + output: "timestamp" + + - name: "ts_from_unix non-numeric receiver is error" + mapping: | + output = "not a number".ts_from_unix() + error: "numeric" + + - name: "ts_from_unix bool receiver is error" + mapping: | + output = true.ts_from_unix() + error: "numeric" + + # --- ts_from_unix_milli --- + + - name: "ts_from_unix_milli whole seconds" + mapping: | + output = 
1705314600000.ts_from_unix_milli() + output: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + + - name: "ts_from_unix_milli with millis" + mapping: | + output = 1705314600123.ts_from_unix_milli() + output: {_type: "timestamp", value: "2024-01-15T10:30:00.123Z"} + + # --- ts_from_unix_micro --- + + - name: "ts_from_unix_micro whole seconds" + mapping: | + output = 1705314600000000.ts_from_unix_micro() + output: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + + - name: "ts_from_unix_micro with micros" + mapping: | + output = 1705314600123456.ts_from_unix_micro() + output: {_type: "timestamp", value: "2024-01-15T10:30:00.123456Z"} + + # --- ts_from_unix_nano --- + + - name: "ts_from_unix_nano whole seconds" + mapping: | + output = 1705314600000000000.ts_from_unix_nano() + output: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + + - name: "ts_from_unix_nano with full precision" + mapping: | + output = 1705314600123456789.ts_from_unix_nano() + output: {_type: "timestamp", value: "2024-01-15T10:30:00.123456789Z"} + + # --- Round-trips --- + + - name: "ts_unix round-trip" + mapping: | + $ts = timestamp(2024, 1, 15, 10, 30, 0) + output = $ts.ts_unix().ts_from_unix() == $ts + output: true + + - name: "ts_unix_milli round-trip" + mapping: | + $ts = timestamp(2024, 1, 15, 10, 30, 0, 123000000) + output = $ts.ts_unix_milli().ts_from_unix_milli() == $ts + output: true + + - name: "ts_unix_micro round-trip" + mapping: | + $ts = timestamp(2024, 1, 15, 10, 30, 0, 123456000) + output = $ts.ts_unix_micro().ts_from_unix_micro() == $ts + output: true + + - name: "ts_unix_nano round-trip lossless" + mapping: | + $ts = timestamp(2024, 1, 15, 10, 30, 0, 123456789) + output = $ts.ts_unix_nano().ts_from_unix_nano() == $ts + output: true + + - name: "ts_parse ts_format round-trip" + mapping: | + $s = "2024-01-15T10:30:00.123Z" + output = $s.ts_parse().ts_format() + output: "2024-01-15T10:30:00.123Z" + + # --- Stored zone behavior (Section 13.9) --- + # Parsing without a zone 
directive yields UTC; parsing with a zone stores + # that zone. Formatting uses the stored zone. These tests avoid `now()` + # and `.ts_from_unix*` because their stored zone is the process's local + # zone (implementation-defined and host-dependent). + + - name: "ts_parse no zone directive yields UTC stored zone" + mapping: | + $ts = "2024-03-01T12:00:00".ts_parse("%Y-%m-%dT%H:%M:%S") + output = $ts.ts_format() + output: "2024-03-01T12:00:00Z" + + - name: "ts_parse date-only yields UTC midnight" + mapping: | + $ts = "2024-03-01".ts_parse("%Y-%m-%d") + output = $ts.ts_format() + output: "2024-03-01T00:00:00Z" + + - name: "ts_parse explicit UTC offset preserves Z on format" + mapping: | + $ts = "2024-03-01T12:00:00Z".ts_parse() + output = $ts.ts_format() + output: "2024-03-01T12:00:00Z" + + - name: "ts_parse explicit positive offset preserves offset on format" + mapping: | + $ts = "2024-03-01T12:00:00+05:30".ts_parse() + output = $ts.ts_format() + output: "2024-03-01T12:00:00+05:30" + + - name: "ts_parse explicit negative offset preserves offset on format" + mapping: | + $ts = "2024-03-01T12:00:00-08:00".ts_parse() + output = $ts.ts_format() + output: "2024-03-01T12:00:00-08:00" + + - name: "timestamp constructor UTC default formats with Z" + mapping: | + output = timestamp(2024, 3, 1, 12, 0, 0).ts_format() + output: "2024-03-01T12:00:00Z" + + - name: "timestamp constructor with explicit zone formats with that offset" + mapping: | + output = timestamp(2024, 3, 1, 12, 0, 0, 0, "America/New_York").ts_format() + output: "2024-03-01T12:00:00-05:00" + + - name: "ts_format without zone directive uses stored zone clock" + mapping: | + $ts = timestamp(2024, 3, 1, 12, 0, 0, 0, "America/New_York") + output = $ts.ts_format("%Y-%m-%d %H:%M:%S") + output: "2024-03-01 12:00:00" + + - name: "different stored zones with same instant compare equal" + mapping: | + $a = "2024-03-01T12:00:00Z".ts_parse() + $b = "2024-03-01T07:00:00-05:00".ts_parse() + output = $a == $b + output: true + 
+ - name: "ts_unix is instant-based and ignores stored zone" + mapping: | + $a = "2024-03-01T12:00:00Z".ts_parse() + $b = "2024-03-01T07:00:00-05:00".ts_parse() + output = $a.ts_unix() == $b.ts_unix() + output: true + + # --- ts_add --- + + - name: "ts_add one second" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(second()) + output: {_type: "timestamp", value: "2024-01-15T10:30:01Z"} + + - name: "ts_add one minute" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(minute()) + output: {_type: "timestamp", value: "2024-01-15T10:31:00Z"} + + - name: "ts_add one hour" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(hour()) + output: {_type: "timestamp", value: "2024-01-15T11:30:00Z"} + + - name: "ts_add one day" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(day()) + output: {_type: "timestamp", value: "2024-01-16T10:30:00Z"} + + - name: "ts_add negative subtracts" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(second() * -1) + output: {_type: "timestamp", value: "2024-01-15T10:29:59Z"} + + - name: "ts_add negative crosses day boundary" + mapping: | + output = timestamp(2024, 1, 15, 0, 0, 0).ts_add(second() * -1) + output: {_type: "timestamp", value: "2024-01-14T23:59:59Z"} + + - name: "ts_add multiple hours" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(hour() * 3) + output: {_type: "timestamp", value: "2024-01-15T13:30:00Z"} + + - name: "ts_add compound duration" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(hour() + 30 * minute()) + output: {_type: "timestamp", value: "2024-01-15T12:00:00Z"} + + - name: "ts_add zero is identity" + mapping: | + $ts = timestamp(2024, 1, 15, 10, 30, 0) + output = $ts.ts_add(0) == $ts + output: true + + - name: "ts_add named arg" + mapping: | + output = timestamp(2024, 1, 15, 10, 30, 0).ts_add(nanos: second()) + output: {_type: "timestamp", value: "2024-01-15T10:30:01Z"} + + - name: "ts_add crosses leap day" + 
mapping: | + output = timestamp(2024, 2, 28, 12, 0, 0).ts_add(day()) + output: {_type: "timestamp", value: "2024-02-29T12:00:00Z"} diff --git a/internal/bloblang2/spec/tests/stdlib/type_conversion.yaml b/internal/bloblang2/spec/tests/stdlib/type_conversion.yaml new file mode 100644 index 000000000..937d96f31 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/type_conversion.yaml @@ -0,0 +1,357 @@ +description: "Type conversion methods: .string(), .int32(), .int64(), .uint32(), .uint64(), .float32(), .float64(), .bool(), .char(), .bytes()" + +tests: + # --- .string() --- + + - name: "string from int64" + mapping: | + output = 42.string() + output: "42" + + - name: "string from negative int64" + mapping: | + output = (-100).string() + output: "-100" + + - name: "string from float64" + mapping: | + output = 3.14.string() + output: "3.14" + + - name: "string from float64 whole number" + mapping: | + output = 5.0.string() + output: "5.0" + + - name: "string from negative zero" + mapping: | + output = input.nz.string() + input: {nz: {_type: "float64", value: "-0.0"}} + output: "0.0" + + - name: "string from float32" + mapping: | + output = 3.14.float32().string() + output: "3.14" + + - name: "string from float32 whole number" + mapping: | + output = 5.0.float32().string() + output: "5.0" + + - name: "string from nan" + mapping: | + output = input.nan.string() + input: {nan: {_type: "float64", value: "NaN"}} + output: "NaN" + + - name: "string from bool true" + mapping: | + output = true.string() + output: "true" + + - name: "string from bool false" + mapping: | + output = false.string() + output: "false" + + - name: "string from null" + mapping: | + output = null.string() + output: "null" + + - name: "string from timestamp" + mapping: | + output = timestamp(2024, 3, 1, 12, 0, 0).string() + output: "2024-03-01T12:00:00Z" + + - name: "string from array compact json" + mapping: | + output = [1, 2, 3].string() + output: "[1,2,3]" + + - name: "string from object keys 
sorted" + mapping: | + output = {"b": 2, "a": 1}.string() + output: "{\"a\":1,\"b\":2}" + + - name: "string from string is identity" + mapping: | + output = "hello".string() + output: "hello" + + - name: "string from bytes valid utf8" + mapping: | + output = "hello".bytes().string() + output: "hello" + + # --- .int32() --- + + - name: "int32 from int64" + mapping: | + output = 100.int32() + output: {_type: "int32", value: "100"} + + - name: "int32 from float64 truncates" + mapping: | + output = 3.9.int32() + output: {_type: "int32", value: "3"} + + - name: "int32 from negative float truncates toward zero" + mapping: | + output = (-3.9).int32() + output: {_type: "int32", value: "-3"} + + - name: "int32 from string" + mapping: | + output = "42".int32() + output: {_type: "int32", value: "42"} + + - name: "int32 overflow positive" + mapping: | + output = 2147483648.int32() + error: "overflow" + + - name: "int32 overflow negative" + mapping: | + output = (-2147483649).int32() + error: "overflow" + + # --- .int64() --- + + - name: "int64 from float64 truncates" + mapping: | + output = 9.99.int64() + output: 9 + + - name: "int64 from negative float truncates toward zero" + mapping: | + output = (-7.8).int64() + output: -7 + + - name: "int64 from string" + mapping: | + output = "-500".int64() + output: -500 + + - name: "int64 from invalid string" + mapping: | + output = "abc".int64() + error: "convert" + + - name: "int64 identity" + mapping: | + output = 42.int64() + output: 42 + + # --- .uint32() --- + + - name: "uint32 from int64" + mapping: | + output = 255.uint32() + output: {_type: "uint32", value: "255"} + + - name: "uint32 from float truncates toward zero" + mapping: | + output = 3.7.uint32() + output: {_type: "uint32", value: "3"} + + - name: "uint32 negative is error" + mapping: | + output = (-1).uint32() + error: "overflow" + + - name: "uint32 overflow" + mapping: | + output = 4294967296.uint32() + error: "overflow" + + - name: "uint32 from string" + mapping: | + 
output = "100".uint32() + output: {_type: "uint32", value: "100"} + + # --- .uint64() --- + + - name: "uint64 from int64" + mapping: | + output = 1000.uint64() + output: {_type: "uint64", value: "1000"} + + - name: "uint64 from float truncates toward zero" + mapping: | + output = 99.9.uint64() + output: {_type: "uint64", value: "99"} + + - name: "uint64 negative is error" + mapping: | + output = (-1).uint64() + error: "overflow" + + - name: "uint64 from string" + mapping: | + output = "18446744073709551615".uint64() + output: {_type: "uint64", value: "18446744073709551615"} + + # --- .float32() --- + + - name: "float32 from int64" + mapping: | + output = 42.float32() + output: {_type: "float32", value: "42.0"} + + - name: "float32 from float64" + mapping: | + output = 3.14.float32() + output: {_type: "float32", value: "3.14"} + + - name: "float32 from string" + mapping: | + output = "2.5".float32() + output: {_type: "float32", value: "2.5"} + + - name: "float32 from invalid string" + mapping: | + output = "nope".float32() + error: "convert" + + # --- .float64() --- + + - name: "float64 from int64" + mapping: | + output = 42.float64() + output: 42.0 + + - name: "float64 from string" + mapping: | + output = "3.14".float64() + output: 3.14 + + - name: "float64 from bool is error" + mapping: | + output = true.float64() + error: "float64" + + - name: "float64 from invalid string" + mapping: | + output = "xyz".float64() + error: "convert" + + # --- .bool() --- + + - name: "bool from true" + mapping: | + output = true.bool() + output: true + + - name: "bool from false" + mapping: | + output = false.bool() + output: false + + - name: "bool from string true" + mapping: | + output = "true".bool() + output: true + + - name: "bool from string false" + mapping: | + output = "false".bool() + output: false + + - name: "bool from int zero is false" + mapping: | + output = 0.bool() + output: false + + - name: "bool from int nonzero is true" + mapping: | + output = 1.bool() + 
output: true + + - name: "bool from negative int is true" + mapping: | + output = (-5).bool() + output: true + + - name: "bool from float zero is false" + mapping: | + output = 0.0.bool() + output: false + + - name: "bool from negative zero is false" + mapping: | + output = input.nz.bool() + input: {nz: {_type: "float64", value: "-0.0"}} + output: false + + - name: "bool from infinity is true" + mapping: | + output = input.inf.bool() + input: {inf: {_type: "float64", value: "Infinity"}} + output: true + + - name: "bool from nan is error" + mapping: | + output = input.nan.bool() + input: {nan: {_type: "float64", value: "NaN"}} + error: "NaN" + + - name: "bool from invalid string is error" + mapping: | + output = "maybe".bool() + error: "convert" + + # --- .char() --- + + - name: "char from ascii codepoint" + mapping: | + output = 65.char() + output: "A" + + - name: "char from emoji codepoint" + mapping: | + output = 128512.char() + output: "\U0001F600" + + - name: "char from non-ascii codepoint" + mapping: | + output = 233.char() + output: "\u00E9" + + - name: "char from zero codepoint" + mapping: | + output = 0.char() + output: "\u0000" + + - name: "char from invalid codepoint" + mapping: | + output = (-1).char() + error: "codepoint" + + # --- .bytes() --- + + - name: "bytes from string" + mapping: | + output = "hello".bytes() + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "bytes from bytes identity" + mapping: | + output = "hello".bytes().bytes() + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "bytes from int goes through string" + mapping: | + output = 42.bytes() + output: {_type: "bytes", value: "NDI="} + + - name: "bytes from bool goes through string" + mapping: | + output = true.bytes() + output: {_type: "bytes", value: "dHJ1ZQ=="} + + - name: "bytes from empty string" + mapping: | + output = "".bytes() + output: {_type: "bytes", value: ""} diff --git a/internal/bloblang2/spec/tests/stdlib/unique_flatten.yaml 
b/internal/bloblang2/spec/tests/stdlib/unique_flatten.yaml new file mode 100644 index 000000000..b7ee551bf --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/unique_flatten.yaml @@ -0,0 +1,98 @@ +description: > + .unique() and .flatten() methods — deduplication and one-level flattening. + +tests: + # --- unique basic --- + + - name: "unique integers" + mapping: | + output.v = [1, 2, 3, 2, 1].unique() + output: {"v": [1, 2, 3]} + + - name: "unique strings" + mapping: | + output.v = ["a", "b", "a", "c", "b"].unique() + output: {"v": ["a", "b", "c"]} + + - name: "unique preserves first occurrence order" + mapping: | + output.v = [3, 1, 2, 1, 3, 2].unique() + output: {"v": [3, 1, 2]} + + - name: "unique on empty array" + mapping: | + output.v = [].unique() + output: {"v": []} + + - name: "unique single element" + mapping: | + output.v = [42].unique() + output: {"v": [42]} + + - name: "unique all same elements" + mapping: | + output.v = [5, 5, 5, 5].unique() + output: {"v": [5]} + + - name: "unique booleans" + mapping: | + output.v = [true, false, true, false, true].unique() + output: {"v": [true, false]} + + - name: "unique with null values" + mapping: | + output.v = [1, null, 2, null, 3].unique() + output: {"v": [1, null, 2, 3]} + + # --- unique with key function --- + + - name: "unique with key function on objects" + mapping: | + output.v = [ + {"id": 1, "name": "Alice"}, + {"id": 2, "name": "Bob"}, + {"id": 1, "name": "Alice2"}, + ].unique(x -> x.id) + output: {"v": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]} + + - name: "unique with key function — string length" + mapping: | + output.v = ["hi", "hey", "yo", "sup"].unique(s -> s.length()) + output: {"v": ["hi", "hey"]} + + # --- flatten basic --- + + - name: "flatten nested arrays one level" + mapping: | + output.v = [[1, 2], [3, 4], [5]].flatten() + output: {"v": [1, 2, 3, 4, 5]} + + - name: "flatten empty inner arrays" + mapping: | + output.v = [[], [1], [], [2, 3], []].flatten() + output: {"v": 
[1, 2, 3]} + + - name: "flatten empty outer array" + mapping: | + output.v = [].flatten() + output: {"v": []} + + - name: "flatten non-array elements kept as-is" + mapping: | + output.v = [1, [2, 3], 4, [5]].flatten() + output: {"v": [1, 2, 3, 4, 5]} + + - name: "flatten only goes one level deep" + mapping: | + output.v = [[[1, 2]], [[3]]].flatten() + output: {"v": [[1, 2], [3]]} + + - name: "flatten single nested array" + mapping: | + output.v = [[1, 2, 3]].flatten() + output: {"v": [1, 2, 3]} + + - name: "flatten preserves non-array types in mixed" + mapping: | + output.v = ["a", ["b", "c"], "d"].flatten() + output: {"v": ["a", "b", "c", "d"]} diff --git a/internal/bloblang2/spec/tests/stdlib/void_function.yaml b/internal/bloblang2/spec/tests/stdlib/void_function.yaml new file mode 100644 index 000000000..c16592923 --- /dev/null +++ b/internal/bloblang2/spec/tests/stdlib/void_function.yaml @@ -0,0 +1,204 @@ +description: "void() builtin — explicit spelling of the void sentinel (Section 13.1)" + +tests: + # --- Statement RHS: assignment skipped --- + + - name: "void() as output RHS skips assignment (no prior)" + mapping: | + output.a = void() + output.b = 1 + output: {"b": 1} + + - name: "void() as output RHS leaves prior value intact" + mapping: | + output.v = "pending" + output.v = void() + output: {"v": "pending"} + + - name: "void() as root output RHS with no prior is skipped (stays {})" + mapping: | + output = void() + output: {} + + - name: "void() as root output RHS preserves prior root assignment" + mapping: | + output = {"kept": true} + output = void() + output: {"kept": true} + + - name: "void() as root output RHS preserves prior field assignments" + mapping: | + output.a = 1 + output.b = 2 + output = void() + output: {"a": 1, "b": 2} + + - name: "void() as root output via if-without-else is skipped" + mapping: | + output.keep = 1 + output = if false { "replace" } + output: {"keep": 1} + + - name: "void() in if branch skips assignment" + mapping: | + 
output.status = if input.ready { "done" } else { void() } + input: {"ready": false} + output: {} + + - name: "void() in if branch leaves prior value intact" + mapping: | + output.status = "pending" + output.status = if input.ready { "done" } else { void() } + input: {"ready": false} + output: {"status": "pending"} + + # --- Match: void() in a case arm --- + + - name: "void() in match case skips assignment" + mapping: | + output.c = match input.kind { + "a" => "apple", + "b" => "banana", + _ => void(), + } + input: {"kind": "c"} + output: {} + + - name: "void() in match case works alongside other arms" + mapping: | + output.c = match input.kind { + "a" => "apple", + _ => void(), + } + input: {"kind": "a"} + output: {"c": "apple"} + + # --- Variable assignments --- + + - name: "void() as variable declaration RHS is a runtime error" + mapping: | + $x = void() + output.v = $x + error: "" + + - name: "void() as variable reassignment preserves prior value" + mapping: | + $x = 10 + $x = void() + output.v = $x + output: {"v": 10} + + # --- Collection literals: void() errors, use deleted() instead --- + + - name: "void() in array literal is a runtime error" + mapping: | + output.items = [1, void(), 3] + error: "" + + - name: "void() in object literal value is a runtime error" + mapping: | + output.obj = {"a": 1, "b": void(), "c": 3} + error: "" + + - name: "deleted() omits in collection literal (contrast with void)" + mapping: | + output.items = [1, deleted(), 3] + output.obj = {"a": 1, "b": deleted(), "c": 3} + output: {"items": [1, 3], "obj": {"a": 1, "c": 3}} + + # --- Operand / argument: void() errors --- + + - name: "void() as operator operand is a runtime error" + mapping: | + output.v = void() + 1 + error: "" + + - name: "void() as function argument is a runtime error" + mapping: | + output.v = throw(void()) + error: "" + + # --- .or() rescues void --- + + - name: ".or() rescues void() to a default" + mapping: | + output.v = void().or("fallback") + output: {"v": 
"fallback"} + + # --- .catch() passes void through unchanged (then rescued later) --- + + - name: ".catch().or() rescues void() produced earlier in the chain" + mapping: | + output.v = void().catch(_ -> "caught").or("fallback") + output: {"v": "fallback"} + + # --- In .map() lambdas: error --- + + - name: ".map() lambda returning void() is a runtime error" + mapping: | + output.xs = [1, 2, 3].map(_ -> void()) + error: "" + + # --- type() on void: error --- + + - name: ".type() on void() is a runtime error" + mapping: | + output.t = void().type() + error: "" + + # --- void() has no arguments --- + + - name: "void() with an argument is a compile error" + mapping: | + output.v = void(1) + compile_error: "void" + + # --- void is a reserved name (Section 1.3) --- + # Mirrors the restrictions on `deleted` and `throw`: cannot be a variable, + # map, or parameter name. Field names are unaffected (they use `word`). + + - name: "void cannot be used as a variable name" + mapping: | + $void = 1 + output.v = $void + compile_error: "void" + + - name: "void cannot be used as a map name" + mapping: | + map void(x) { x + 1 } + output.v = void(1) + compile_error: "void" + + - name: "void cannot be used as a map parameter name" + mapping: | + map wrap(void) { void } + output.v = wrap(1) + compile_error: "void" + + - name: "void cannot be used as a lambda parameter name" + mapping: | + output.xs = [1, 2, 3].map(void -> void + 1) + compile_error: "void" + + - name: "void cannot be used as a match as binding" + mapping: | + output.v = match input.x as void { + void > 0 => "pos", + _ => "other", + } + input: {"x": 1} + compile_error: "void" + + - name: "void is valid as a field name after ." 
+ mapping: | + output.void = input.void + input: {"void": 42} + output: {"void": 42} + + - name: "void is valid as a metadata key" + mapping: | + output@.void = "routed" + output.x = input@.void + input_metadata: {"void": "inbound"} + output: {"x": "inbound"} + output_metadata: {"void": "routed"} diff --git a/internal/bloblang2/spec/tests/types/array.yaml b/internal/bloblang2/spec/tests/types/array.yaml new file mode 100644 index 000000000..8fc1aa7b0 --- /dev/null +++ b/internal/bloblang2/spec/tests/types/array.yaml @@ -0,0 +1,167 @@ +description: "Array literals, indexing, nesting, trailing commas, and methods" + +tests: + # --- Literals --- + + - name: "empty array literal" + mapping: | + output.arr = [] + output: {"arr": []} + + - name: "single element array" + mapping: | + output.arr = [42] + output: {"arr": [42]} + + - name: "multi-element array" + mapping: | + output.arr = [1, 2, 3] + output: {"arr": [1, 2, 3]} + + - name: "mixed type array" + mapping: | + output.arr = [1, "two", true, null, 3.14] + output: {"arr": [1, "two", true, null, 3.14]} + + - name: "trailing comma allowed" + mapping: | + output.arr = [1, 2, 3,] + output: {"arr": [1, 2, 3]} + + - name: "nested arrays" + mapping: | + output.arr = [[1, 2], [3, 4]] + output: {"arr": [[1, 2], [3, 4]]} + + - name: "array with expressions" + mapping: | + $x = 10 + output.arr = [$x, $x + 1, $x * 2] + output: {"arr": [10, 11, 20]} + + - name: "deeply nested arrays" + mapping: | + output.arr = [[[1]]] + output: {"arr": [[[1]]]} + + # --- Indexing --- + + - name: "index first element" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[0] + output: {"v": 10} + + - name: "index last element positive" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[2] + output: {"v": 30} + + - name: "negative index last element" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-1] + output: {"v": 30} + + - name: "negative index second to last" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-2] + output: {"v": 20} + 
+ - name: "negative index first element" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-3] + output: {"v": 10} + + - name: "out of bounds positive index" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[3] + error: "out of bounds" + + - name: "out of bounds negative index" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[-4] + error: "out of bounds" + + - name: "index empty array" + mapping: | + $arr = [] + output.v = $arr[0] + error: "out of bounds" + + - name: "index with float whole number accepted" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[2.0] + output: {"v": 30} + + - name: "index with non-whole float is error" + mapping: | + $arr = [10, 20, 30] + output.v = $arr[1.5] + error: "whole number" + + - name: "index with string is error" + mapping: | + $arr = [10, 20, 30] + output.v = $arr["0"] + error: "non-numeric" + + - name: "nested array indexing" + mapping: | + $arr = [[1, 2], [3, 4]] + output.v = $arr[1][0] + output: {"v": 3} + + # --- Length --- + + - name: "length of empty array" + mapping: | + output.len = [].length() + output: {"len": 0} + + - name: "length of non-empty array" + mapping: | + output.len = [1, 2, 3].length() + output: {"len": 3} + + # --- Type --- + + - name: "array type" + mapping: | + output.t = [1, 2, 3].type() + output: {"t": "array"} + + - name: "empty array type" + mapping: | + output.t = [].type() + output: {"t": "array"} + + # --- Array from input --- + + - name: "array from input field" + input: {"items": [10, 20, 30]} + mapping: | + output.first = input.items[0] + output.last = input.items[-1] + output.len = input.items.length() + output: {"first": 10, "last": 30, "len": 3} + + # --- Deleted in array literal --- + + - name: "deleted in array literal removes element" + mapping: | + output.arr = [1, deleted(), 3] + output: {"arr": [1, 3]} + + # --- Void in array literal is error --- + + - name: "void in array literal is error" + mapping: | + output.arr = [1, if false { 2 }, 3] + error: "void" diff --git 
a/internal/bloblang2/spec/tests/types/bool_null.yaml b/internal/bloblang2/spec/tests/types/bool_null.yaml new file mode 100644 index 000000000..8fcf4393a --- /dev/null +++ b/internal/bloblang2/spec/tests/types/bool_null.yaml @@ -0,0 +1,244 @@ +description: "Boolean and null literals, equality, type checking" + +tests: + # --- Boolean literals --- + + - name: "true literal" + mapping: | + output = true + output: true + + - name: "false literal" + mapping: | + output = false + output: false + + - name: "true type" + mapping: | + output = true.type() + output: "bool" + + - name: "false type" + mapping: | + output = false.type() + output: "bool" + + # --- Boolean equality --- + + - name: "true equals true" + mapping: | + output = true == true + output: true + + - name: "false equals false" + mapping: | + output = false == false + output: true + + - name: "true not equals false" + mapping: | + output = true != false + output: true + + - name: "true equals false is false" + mapping: | + output = true == false + output: false + + # --- Boolean logical operators --- + + - name: "logical and true true" + mapping: | + output = true && true + output: true + + - name: "logical and true false" + mapping: | + output = true && false + output: false + + - name: "logical or false true" + mapping: | + output = false || true + output: true + + - name: "logical or false false" + mapping: | + output = false || false + output: false + + - name: "logical not true" + mapping: | + output = !true + output: false + + - name: "logical not false" + mapping: | + output = !false + output: true + + # --- Boolean cross-type equality --- + + - name: "bool equals int is false" + mapping: | + output = true == 1 + output: false + + - name: "bool equals string is false" + mapping: | + output = true == "true" + output: false + + - name: "bool equals null is false" + mapping: | + output = false == null + output: false + + # --- Boolean conversions --- + + - name: "bool from string true" + mapping: | + 
output = "true".bool() + output: true + + - name: "bool from string false" + mapping: | + output = "false".bool() + output: false + + - name: "bool from int64 nonzero" + mapping: | + output = 1.bool() + output: true + + - name: "bool from int64 zero" + mapping: | + output = 0.bool() + output: false + + - name: "bool from float64 nonzero" + mapping: | + output = 3.14.bool() + output: true + + - name: "bool from float64 zero" + mapping: | + output = 0.0.bool() + output: false + + - name: "bool to string true" + mapping: | + output = true.string() + output: "true" + + - name: "bool to string false" + mapping: | + output = false.string() + output: "false" + + - name: "bool to int64 is error" + mapping: | + output = true.int64() + error: "int64" + + # --- Boolean errors with non-boolean operators --- + + - name: "bool arithmetic is error" + mapping: | + output = true + false + error: "cannot add" + + - name: "bool comparison is error" + mapping: | + output = true > false + error: "comparable" + + - name: "logical and with non-bool is error" + mapping: | + output = true && 1 + error: "bool" + + - name: "logical or with non-bool is error" + mapping: | + output = false || "yes" + error: "bool" + + - name: "logical not with non-bool is error" + mapping: | + output = !42 + error: "bool" + + # --- Null literal --- + + - name: "null literal" + mapping: | + output = null + output: null + + - name: "null type" + mapping: | + output = null.type() + output: "null" + + # --- Null equality --- + + - name: "null equals null" + mapping: | + output = null == null + output: true + + - name: "null not equals null" + mapping: | + output = null != null + output: false + + - name: "null equals zero is false" + mapping: | + output = null == 0 + output: false + + - name: "null equals empty string is false" + mapping: | + output = null == "" + output: false + + - name: "null equals false is false" + mapping: | + output = null == false + output: false + + # --- Null errors in operations --- + + 
- name: "null arithmetic is error" + mapping: | + output = null + 5 + error: "add" + + - name: "null comparison is error" + mapping: | + output = null > 5 + error: "comparable" + + - name: "null method call is error" + mapping: | + output = null.uppercase() + error: "null" + + # --- Null string conversion --- + + - name: "null to string" + mapping: | + output = null.string() + output: "null" + + # --- Null with .or() --- + + - name: "null or default" + mapping: | + output = null.or("default") + output: "default" + + - name: "non-null or default returns value" + mapping: | + output = "hello".or("default") + output: "hello" diff --git a/internal/bloblang2/spec/tests/types/bytes.yaml b/internal/bloblang2/spec/tests/types/bytes.yaml new file mode 100644 index 000000000..bdf22699d --- /dev/null +++ b/internal/bloblang2/spec/tests/types/bytes.yaml @@ -0,0 +1,263 @@ +description: "Bytes type: creation from strings, byte-level operations, encoding" + +tests: + # --- Bytes creation --- + + - name: "bytes from string" + mapping: | + output = "hello".bytes() + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "bytes type check" + mapping: | + output = "hello".bytes().type() + output: "bytes" + + - name: "bytes from empty string" + mapping: | + output = "".bytes() + output: {_type: "bytes", value: ""} + + - name: "bytes from bytes is unchanged" + mapping: | + output = "hello".bytes().bytes() + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "bytes from integer goes through string" + mapping: | + output = 42.bytes() + output: {_type: "bytes", value: "NDI="} + + - name: "bytes from bool goes through string" + mapping: | + output = true.bytes() + output: {_type: "bytes", value: "dHJ1ZQ=="} + + - name: "bytes from null goes through string" + mapping: | + output = null.bytes() + output: {_type: "bytes", value: "bnVsbA=="} + + # --- Bytes length (byte-based) --- + + - name: "bytes length ascii" + mapping: | + output = "hello".bytes().length() + output: 5 + + - name: 
"bytes length empty" + mapping: | + output = "".bytes().length() + output: 0 + + - name: "bytes length multibyte utf8" + mapping: | + output = "\u{1F44B}".bytes().length() + output: 4 + + - name: "bytes length non-ascii two byte" + mapping: | + output = "\u00E9".bytes().length() + output: 2 + + # --- Bytes indexing (byte-based, returns int64) --- + + - name: "bytes index first byte" + mapping: | + output = "hello".bytes()[0] + output: 104 + + - name: "bytes index last byte" + mapping: | + output = "hello".bytes()[4] + output: 111 + + - name: "bytes negative index" + mapping: | + output = "hello".bytes()[-1] + output: 111 + + - name: "bytes index out of bounds" + mapping: | + output = "hello".bytes()[5] + error: "out of bounds" + + - name: "bytes negative index out of bounds" + mapping: | + output = "hello".bytes()[-6] + error: "out of bounds" + + - name: "bytes index returns byte value 0-255" + mapping: | + output = "\u00E9".bytes()[0] + output: 195 + + # --- Bytes to string --- + + - name: "bytes to string utf8" + mapping: | + output = "hello".bytes().string() + output: "hello" + + - name: "bytes to string empty" + mapping: | + output = "".bytes().string() + output: "" + + # --- Bytes concatenation --- + + - name: "bytes concatenation" + mapping: | + output = "hello".bytes() + " world".bytes() + output: {_type: "bytes", value: "aGVsbG8gd29ybGQ="} + + - name: "bytes concat with empty" + mapping: | + output = "hello".bytes() + "".bytes() + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "bytes plus string is error" + mapping: | + output = "hello".bytes() + " world" + error: "cannot add" + + - name: "string plus bytes is error" + mapping: | + output = "hello" + " world".bytes() + error: "cannot add" + + # --- Bytes slicing --- + + - name: "bytes slice basic" + mapping: | + output = "hello".bytes().slice(0, 3) + output: {_type: "bytes", value: "aGVs"} + + - name: "bytes slice to end" + mapping: | + output = "hello".bytes().slice(3) + output: {_type: "bytes", 
value: "bG8="} + + - name: "bytes slice negative" + mapping: | + output = "hello".bytes().slice(-3, -1) + output: {_type: "bytes", value: "bGw="} + + - name: "bytes slice clamped" + mapping: | + output = "hello".bytes().slice(0, 100) + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "bytes slice empty result" + mapping: | + output = "hello".bytes().slice(3, 1) + output: {_type: "bytes", value: ""} + + # --- Bytes reverse --- + + - name: "bytes reverse" + mapping: | + output = "hello".bytes().reverse() + output: {_type: "bytes", value: "b2xsZWg="} + + - name: "bytes reverse empty" + mapping: | + output = "".bytes().reverse() + output: {_type: "bytes", value: ""} + + # --- Bytes contains --- + + - name: "bytes contains subsequence true" + mapping: | + output = "hello".bytes().contains("ll".bytes()) + output: true + + - name: "bytes contains subsequence false" + mapping: | + output = "hello".bytes().contains("xyz".bytes()) + output: false + + # --- Bytes index_of --- + + - name: "bytes index_of found" + mapping: | + output = "hello".bytes().index_of("ll".bytes()) + output: 2 + + - name: "bytes index_of not found" + mapping: | + output = "hello".bytes().index_of("xyz".bytes()) + output: -1 + + # --- Bytes encoding --- + + - name: "bytes encode base64" + mapping: | + output = "hello".bytes().encode("base64") + output: "aGVsbG8=" + + - name: "bytes encode hex" + mapping: | + output = "hello".bytes().encode("hex") + output: "68656c6c6f" + + - name: "string encode base64" + mapping: | + output = "hello".encode("base64") + output: "aGVsbG8=" + + # --- Bytes decoding --- + + - name: "decode base64 to bytes" + mapping: | + output = "aGVsbG8=".decode("base64") + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "decode hex to bytes" + mapping: | + output = "68656c6c6f".decode("hex") + output: {_type: "bytes", value: "aGVsbG8="} + + - name: "decode base64 then string" + mapping: | + output = "aGVsbG8=".decode("base64").string() + output: "hello" + + - name: 
"decode invalid base64 is error" + mapping: | + output = "not-valid-base64!!!".decode("base64") + error: "decode" + + - name: "decode invalid hex is error" + mapping: | + output = "zzzz".decode("hex") + error: "decode" + + # --- Bytes equality --- + + - name: "bytes equal same content" + mapping: | + output = "hello".bytes() == "hello".bytes() + output: true + + - name: "bytes not equal different content" + mapping: | + output = "hello".bytes() == "world".bytes() + output: false + + - name: "bytes not equal to string cross type" + mapping: | + output = "hello".bytes() == "hello" + output: false + + # --- Bytes comparison --- + + - name: "bytes less than lexicographic" + mapping: | + output = "abc".bytes() < "abd".bytes() + output: true + + - name: "bytes greater than lexicographic" + mapping: | + output = "b".bytes() > "a".bytes() + output: true diff --git a/internal/bloblang2/spec/tests/types/floats.yaml b/internal/bloblang2/spec/tests/types/floats.yaml new file mode 100644 index 000000000..463d2b49c --- /dev/null +++ b/internal/bloblang2/spec/tests/types/floats.yaml @@ -0,0 +1,251 @@ +description: "Float types: float32, float64, NaN, Infinity, negative zero" + +tests: + # --- float64 literals (default) --- + + - name: "float literal is float64" + mapping: | + output = 3.14.type() + output: "float64" + + - name: "float64 zero" + mapping: | + output = 0.0 + output: 0.0 + + - name: "float64 negative" + mapping: | + output = -3.14 + output: -3.14 + + - name: "float64 small decimal" + mapping: | + output = 0.001 + output: 0.001 + + - name: "float64 large value" + mapping: | + output = 1000000.5 + output: 1000000.5 + + # --- float32 conversions --- + + - name: "float32 from float64" + mapping: | + output = 3.14.float32() + output: {_type: "float32", value: "3.14"} + + - name: "float32 type check" + mapping: | + output = 3.14.float32().type() + output: "float32" + + - name: "float32 from string" + mapping: | + output = "3.14".float32() + output: {_type: "float32", 
value: "3.14"} + + - name: "float32 from integer" + mapping: | + output = 42.float32() + output: {_type: "float32", value: "42.0"} + + - name: "float32 zero" + mapping: | + output = 0.0.float32() + output: {_type: "float32", value: "0.0"} + + - name: "float32 negative" + mapping: | + output = (-2.5).float32() + output: {_type: "float32", value: "-2.5"} + + # --- float64 conversions --- + + - name: "float64 from string" + mapping: | + output = "3.14".float64() + output: 3.14 + + - name: "float64 from integer" + mapping: | + output = 42.float64() + output: 42.0 + + - name: "float64 from bool is error" + mapping: | + output = true.float64() + error: "float64" + + - name: "float64 from invalid string" + mapping: | + output = "not_a_number".float64() + error: "convert" + + # --- float64 arithmetic --- + + - name: "float64 addition" + mapping: | + output = 1.5 + 2.5 + output: 4.0 + + - name: "float64 subtraction" + mapping: | + output = 5.0 - 3.5 + output: 1.5 + + - name: "float64 multiplication" + mapping: | + output = 2.5 * 4.0 + output: 10.0 + + - name: "float64 division" + mapping: | + output = 10.0 / 3.0 + output: 3.3333333333333335 + + - name: "float64 modulo" + mapping: | + output = 7.5 % 2.0 + output: 1.5 + + # --- Division by zero --- + + - name: "float division by zero is error" + mapping: | + output = 7.0 / 0.0 + error: "division by zero" + + - name: "integer division by zero is error" + mapping: | + output = 7 / 0 + error: "division by zero" + + # --- NaN behavior --- + + - name: "nan not equal to itself" + mapping: | + $nan = input.nan + output = $nan == $nan + input: {nan: {_type: "float64", value: "NaN"}} + output: false + + - name: "nan not equal to nan explicit" + mapping: | + $nan = input.nan + output = $nan != $nan + input: {nan: {_type: "float64", value: "NaN"}} + output: true + + - name: "nan less than any is false" + mapping: | + output = input.nan < 1.0 + input: {nan: {_type: "float64", value: "NaN"}} + output: false + + - name: "nan greater than 
any is false" + mapping: | + output = input.nan > 1.0 + input: {nan: {_type: "float64", value: "NaN"}} + output: false + + - name: "nan arithmetic produces nan" + mapping: | + output = input.nan + 1.0 + input: {nan: {_type: "float64", value: "NaN"}} + output: {_type: "float64", value: "NaN"} + + - name: "nan type is float64" + mapping: | + output = input.nan.type() + input: {nan: {_type: "float64", value: "NaN"}} + output: "float64" + + - name: "nan bool conversion is error" + mapping: | + output = input.nan.bool() + input: {nan: {_type: "float64", value: "NaN"}} + error: "NaN" + + # --- Infinity behavior --- + + - name: "infinity greater than any number" + mapping: | + output = input.inf > 999999999.0 + input: {inf: {_type: "float64", value: "Infinity"}} + output: true + + - name: "infinity equals infinity" + mapping: | + output = input.inf == input.inf + input: {inf: {_type: "float64", value: "Infinity"}} + output: true + + - name: "negative infinity less than any number" + mapping: | + output = input.ninf < -999999999.0 + input: {ninf: {_type: "float64", value: "-Infinity"}} + output: true + + - name: "infinity type is float64" + mapping: | + output = input.inf.type() + input: {inf: {_type: "float64", value: "Infinity"}} + output: "float64" + + - name: "infinity minus infinity is nan" + mapping: | + output = input.inf - input.inf + input: {inf: {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "NaN"} + + # --- Negative zero --- + + - name: "negative zero equals positive zero" + mapping: | + output = input.nz == 0.0 + input: {nz: {_type: "float64", value: "-0.0"}} + output: true + + - name: "negative zero not less than positive zero" + mapping: | + output = input.nz < 0.0 + input: {nz: {_type: "float64", value: "-0.0"}} + output: false + + - name: "negative zero string normalizes to zero" + mapping: | + output = input.nz.string() + input: {nz: {_type: "float64", value: "-0.0"}} + output: "0.0" + + # --- Float-integer promotion --- + + - 
name: "int plus float promotes to float64"
+    mapping: |
+      output = 5 + 3.0
+    output: 8.0
+
+  - name: "int plus float result type"
+    mapping: |
+      output = (5 + 3.0).type()
+    output: "float64"
+
+  - name: "large int to float precision error"
+    mapping: |
+      output = 9007199254740993 + 1.0
+    error: "exact"
+
+  # --- float32 arithmetic promotion ---
+
+  - name: "float32 plus float64 promotes to float64"
+    mapping: |
+      output = (1.5.float32() + 2.5).type()
+    output: "float64"
+
+  - name: "float32 division result"
+    mapping: |
+      $a = 10.0.float32()
+      $b = 3.0.float32()
+      output = ($a / $b).type()
+    output: "float32"
diff --git a/internal/bloblang2/spec/tests/types/integers.yaml b/internal/bloblang2/spec/tests/types/integers.yaml
new file mode 100644
index 000000000..dff63f021
--- /dev/null
+++ b/internal/bloblang2/spec/tests/types/integers.yaml
@@ -0,0 +1,278 @@
+description: "Integer types: int32, int64, uint32, uint64 literals, limits, and conversions"
+
+tests:
+  # --- int64 literals (default) ---
+
+  - name: "integer literal is int64"
+    mapping: |
+      output = 42.type()
+    output: "int64"
+
+  - name: "zero literal is int64"
+    mapping: |
+      output = 0.type()
+    output: "int64"
+
+  - name: "negative integer via unary minus"
+    mapping: |
+      output = (-10).type()
+    output: "int64"
+
+  - name: "negative integer value"
+    mapping: |
+      output = -10
+    output: -10
+
+  - name: "int64 max value"
+    mapping: |
+      output = 9223372036854775807
+    output: 9223372036854775807
+
+  - name: "int64 min literal exceeds int64 range"
+    mapping: |
+      output = -9223372036854775808
+    compile_error: "exceeds"
+
+  - name: "int64 min value via arithmetic"
+    mapping: |
+      output = -9223372036854775807 - 1
+    output: -9223372036854775808
+
+  # --- int32 conversions ---
+
+  - name: "int32 from int64 literal"
+    mapping: |
+      output = 42.int32()
+    output: {_type: "int32", value: "42"}
+
+  - name: "int32 type check"
+    mapping: |
+      output = 42.int32().type()
+    output: "int32"
+
+  - name: "int32 from string"
+    mapping: |
+      output = "42".int32()
+    output: {_type: "int32", value: "42"}
+
+  - name: "int32 max value"
+    mapping: |
+      output = 2147483647.int32()
+    output: {_type: "int32", value: "2147483647"}
+
+  - name: "int32 min value"
+    mapping: |
+      output = (-2147483648).int32()
+    output: {_type: "int32", value: "-2147483648"}
+
+  - name: "int32 overflow positive"
+    mapping: |
+      output = 2147483648.int32()
+    error: "overflow"
+
+  - name: "int32 overflow negative"
+    mapping: |
+      output = (-2147483649).int32()
+    error: "overflow"
+
+  - name: "int32 zero"
+    mapping: |
+      output = 0.int32()
+    output: {_type: "int32", value: "0"}
+
+  - name: "int32 negative"
+    mapping: |
+      output = (-100).int32()
+    output: {_type: "int32", value: "-100"}
+
+  # --- uint32 conversions ---
+
+  - name: "uint32 from int64"
+    mapping: |
+      output = 42.uint32()
+    output: {_type: "uint32", value: "42"}
+
+  - name: "uint32 type check"
+    mapping: |
+      output = 42.uint32().type()
+    output: "uint32"
+
+  - name: "uint32 from string"
+    mapping: |
+      output = "255".uint32()
+    output: {_type: "uint32", value: "255"}
+
+  - name: "uint32 max value"
+    mapping: |
+      output = 4294967295.uint32()
+    output: {_type: "uint32", value: "4294967295"}
+
+  - name: "uint32 zero"
+    mapping: |
+      output = 0.uint32()
+    output: {_type: "uint32", value: "0"}
+
+  - name: "uint32 overflow"
+    mapping: |
+      output = 4294967296.uint32()
+    error: "overflow"
+
+  - name: "uint32 negative is error"
+    mapping: |
+      output = (-1).uint32()
+    error: "overflow"
+
+  # --- uint64 conversions ---
+
+  - name: "uint64 from int64"
+    mapping: |
+      output = 42.uint64()
+    output: {_type: "uint64", value: "42"}
+
+  - name: "uint64 type check"
+    mapping: |
+      output = 42.uint64().type()
+    output: "uint64"
+
+  - name: "uint64 from string"
+    mapping: |
+      output = "1000".uint64()
+    output: {_type: "uint64", value: "1000"}
+
+  - name: "uint64 max from string"
+    mapping: |
+      output = "18446744073709551615".uint64()
+    output: {_type: "uint64", value: "18446744073709551615"}
+
+  - name: "uint64 zero"
+    mapping: |
+      output = 0.uint64()
+    output: {_type: "uint64", value: "0"}
+
+  - name: "uint64 negative is error"
+    mapping: |
+      output = (-1).uint64()
+    error: "overflow"
+
+  - name: "uint64 max as bare literal is compile error"
+    mapping: |
+      output = 18446744073709551615.uint64()
+    compile_error: "exceeds"
+
+  - name: "uint64 overflow from string"
+    mapping: |
+      output = "18446744073709551616".uint64()
+    error: "overflow"
+
+  # --- int64 conversions ---
+
+  - name: "int64 from string"
+    mapping: |
+      output = "42".int64()
+    output: 42
+
+  - name: "int64 from float truncates"
+    mapping: |
+      output = 3.9.int64()
+    output: 3
+
+  - name: "int64 from negative float truncates toward zero"
+    mapping: |
+      output = (-3.9).int64()
+    output: -3
+
+  - name: "int64 from bool is error"
+    mapping: |
+      output = true.int64()
+    error: "int64"
+
+  - name: "int64 from invalid string"
+    mapping: |
+      output = "not_a_number".int64()
+    error: "convert"
+
+  # --- Cross-type integer equality (promotion) ---
+
+  - name: "int32 equals int64 same value"
+    mapping: |
+      output = 5.int32() == 5
+    output: true
+
+  - name: "int32 not equals int64 different value"
+    mapping: |
+      output = 5.int32() == 6
+    output: false
+
+  - name: "uint32 equals int64 same value"
+    mapping: |
+      output = 42.uint32() == 42
+    output: true
+
+  - name: "uint64 equals int64 same value"
+    mapping: |
+      output = 42.uint64() == 42
+    output: true
+
+  - name: "int64 equals float64 same value"
+    mapping: |
+      output = 5 == 5.0
+    output: true
+
+  # --- Integer arithmetic basics ---
+
+  - name: "int64 addition"
+    mapping: |
+      output = 5 + 3
+    output: 8
+
+  - name: "int64 subtraction"
+    mapping: |
+      output = 10 - 3
+    output: 7
+
+  - name: "int64 multiplication"
+    mapping: |
+      output = 6 * 7
+    output: 42
+
+  - name: "int64 modulo"
+    mapping: |
+      output = 7 % 2
+    output: 1
+
+  - name: "int64 division produces float64"
+    mapping: |
+      output = 7 / 2
+    output: 3.5
+
+  - name: "int64 overflow addition"
+    mapping: |
+      output = 9223372036854775807 + 1
+    error: "overflow"
+
+  - name: "int64 min literal is compile error in subtraction"
+    mapping: |
+      output = -9223372036854775808 - 1
+    compile_error: "exceeds"
+
+  - name: "int64 overflow subtraction"
+    mapping: |
+      output = (-9223372036854775807 - 1) - 1
+    error: "overflow"
+
+  # --- Conversion from float to integer types ---
+
+  - name: "float to int32"
+    mapping: |
+      output = 3.14.int32()
+    output: {_type: "int32", value: "3"}
+
+  - name: "float to uint32"
+    mapping: |
+      output = 100.0.uint32()
+    output: {_type: "uint32", value: "100"}
+
+  - name: "float to uint64"
+    mapping: |
+      output = 100.0.uint64()
+    output: {_type: "uint64", value: "100"}
diff --git a/internal/bloblang2/spec/tests/types/object.yaml b/internal/bloblang2/spec/tests/types/object.yaml
new file mode 100644
index 000000000..95e80456d
--- /dev/null
+++ b/internal/bloblang2/spec/tests/types/object.yaml
@@ -0,0 +1,198 @@
+description: "Object literals, field access, expression keys, key ordering, and methods"
+
+tests:
+  # --- Literals ---
+
+  - name: "empty object literal"
+    mapping: |
+      output.obj = {}
+    output: {"obj": {}}
+
+  - name: "single field object"
+    mapping: |
+      output.obj = {"name": "Alice"}
+    output: {"obj": {"name": "Alice"}}
+
+  - name: "multi-field object"
+    mapping: |
+      output.obj = {"name": "Alice", "age": 30}
+    output: {"obj": {"name": "Alice", "age": 30}}
+
+  - name: "trailing comma allowed"
+    mapping: |
+      output.obj = {"a": 1, "b": 2,}
+    output: {"obj": {"a": 1, "b": 2}}
+
+  - name: "mixed value types"
+    mapping: |
+      output.obj = {"s": "hello", "n": 42, "f": 3.14, "b": true, "nil": null}
+    output: {"obj": {"s": "hello", "n": 42, "f": 3.14, "b": true, "nil": null}}
+
+  - name: "nested objects"
+    mapping: |
+      output.obj = {"user": {"name": "Alice", "address": {"city": "London"}}}
+    output: {"obj": {"user": {"name": "Alice", "address": {"city": "London"}}}}
+
+  - name: "object containing array"
+    mapping: |
+      output.obj = {"items": [1, 2, 3]}
+    output: {"obj": {"items": [1, 2, 3]}}
+
+  # --- Field access ---
+
+  - name: "field access dot notation"
+    mapping: |
+      $obj = {"name": "Alice", "age": 30}
+      output.name = $obj.name
+    output: {"name": "Alice"}
+
+  - name: "nested field access"
+    mapping: |
+      $obj = {"user": {"name": "Alice"}}
+      output.name = $obj.user.name
+    output: {"name": "Alice"}
+
+  - name: "non-existent field returns null"
+    mapping: |
+      $obj = {"name": "Alice"}
+      output.v = $obj.missing
+    output: {"v": null}
+
+  - name: "deeply nested non-existent field returns null"
+    mapping: |
+      $obj = {"a": {"b": {}}}
+      output.v = $obj.a.b.c
+    output: {"v": null}
+
+  - name: "dynamic field access with bracket notation"
+    mapping: |
+      $obj = {"name": "Alice"}
+      $key = "name"
+      output.v = $obj[$key]
+    output: {"v": "Alice"}
+
+  - name: "dynamic field access non-string key is error"
+    mapping: |
+      $obj = {"name": "Alice"}
+      output.v = $obj[42]
+    error: "non-string"
+
+  # --- Expression keys ---
+
+  - name: "variable as key"
+    mapping: |
+      $key = "dynamic"
+      output.obj = {$key: "value"}
+    output: {"obj": {"dynamic": "value"}}
+
+  - name: "concatenation as key"
+    mapping: |
+      $prefix = "pre"
+      output.obj = {$prefix + "_field": "value"}
+    output: {"obj": {"pre_field": "value"}}
+
+  - name: "non-string expression key is runtime error"
+    mapping: |
+      $key = 42
+      output.obj = {$key: "value"}
+    error: "string"
+
+  - name: "null expression key is runtime error"
+    mapping: |
+      $key = null
+      output.obj = {$key: "value"}
+    error: "string"
+
+  - name: "bool expression key is runtime error"
+    mapping: |
+      $key = true
+      output.obj = {$key: "value"}
+    error: "string"
+
+  # --- Key ordering is NOT preserved ---
+
+  - name: "object equality ignores key order"
+    mapping: |
+      $a = {"x": 1, "y": 2}
+      $b = {"y": 2, "x": 1}
+      output.eq = $a == $b
+    output: {"eq": true}
+
+  - name: "object inequality when values differ"
+    mapping: |
+      $a = {"x": 1, "y": 2}
+      $b = {"x": 1, "y": 3}
+      output.eq = $a == $b
+    output: {"eq": false}
+
+  - name: "object inequality different keys"
+    mapping: |
+      $a = {"x": 1}
+      $b = {"y": 1}
+      output.eq = $a == $b
+    output: {"eq": false}
+
+  # --- Length ---
+
+  - name: "length of empty object"
+    mapping: |
+      output.len = {}.length()
+    output: {"len": 0}
+
+  - name: "length of non-empty object"
+    mapping: |
+      output.len = {"a": 1, "b": 2, "c": 3}.length()
+    output: {"len": 3}
+
+  # --- Type ---
+
+  - name: "object type"
+    mapping: |
+      output.t = {"a": 1}.type()
+    output: {"t": "object"}
+
+  - name: "empty object type"
+    mapping: |
+      output.t = {}.type()
+    output: {"t": "object"}
+
+  # --- From input ---
+
+  - name: "object from input"
+    input: {"user": {"name": "Alice", "age": 30}}
+    mapping: |
+      output.name = input.user.name
+      output.age = input.user.age
+    output: {"name": "Alice", "age": 30}
+
+  # --- Deleted in object literal ---
+
+  - name: "deleted value in object literal omits field"
+    mapping: |
+      output.obj = {"a": 1, "b": deleted(), "c": 3}
+    output: {"obj": {"a": 1, "c": 3}}
+
+  # --- Void in object literal is error ---
+
+  - name: "void value in object literal is error"
+    mapping: |
+      output.obj = {"a": 1, "b": if false { 2 }}
+    error: "void"
+
+  # --- Quoted field names ---
+
+  - name: "quoted field name with special characters"
+    mapping: |
+      output.obj = {"field-with-dashes": "value"}
+    output: {"obj": {"field-with-dashes": "value"}}
+
+  - name: "quoted field name starting with digit"
+    mapping: |
+      output.obj = {"123abc": "value"}
+    output: {"obj": {"123abc": "value"}}
+
+  - name: "access quoted field name"
+    mapping: |
+      $obj = {"field-name": "hello"}
+      output.v = $obj."field-name"
+    output: {"v": "hello"}
diff --git a/internal/bloblang2/spec/tests/types/string.yaml b/internal/bloblang2/spec/tests/types/string.yaml
new file mode 100644
index 000000000..6114304af
--- /dev/null
+++ b/internal/bloblang2/spec/tests/types/string.yaml
@@ -0,0 +1,320 @@
+description: "String literals, escape sequences, raw strings, and codepoint semantics"
+
+tests:
+  # --- Basic string literals ---
+
+  - name: "empty string literal"
+    mapping: |
+      output = ""
+    output: ""
+
+  - name: "simple string literal"
+    mapping: |
+      output = "hello world"
+    output: "hello world"
+
+  - name: "string type introspection"
+    mapping: |
+      output = "hello".type()
+    output: "string"
+
+  # --- Escape sequences ---
+
+  - name: "escape newline"
+    mapping: |
+      output = "line1\nline2"
+    output: "line1\nline2"
+
+  - name: "escape tab"
+    mapping: |
+      output = "col1\tcol2"
+    output: "col1\tcol2"
+
+  - name: "escape carriage return"
+    mapping: |
+      output = "hello\rworld"
+    output: "hello\rworld"
+
+  - name: "escape double quote"
+    mapping: |
+      output = "say \"hello\""
+    output: "say \"hello\""
+
+  - name: "escape backslash"
+    mapping: |
+      output = "back\\slash"
+    output: "back\\slash"
+
+  - name: "unicode escape 4 digit BMP"
+    mapping: |
+      output = "\u0041"
+    output: "A"
+
+  - name: "unicode escape 4 digit non-ascii"
+    mapping: |
+      output = "\u00E9"
+    output: "\u00E9"
+
+  - name: "unicode escape braced single digit"
+    mapping: |
+      output = "\u{41}"
+    output: "A"
+
+  - name: "unicode escape braced emoji"
+    mapping: |
+      output = "\u{1F600}"
+    output: "\U0001F600"
+
+  # --- Unicode escape range + surrogate rejection (Section 10) ---
+
+  - name: "unicode escape above U+10FFFF is a compile error"
+    mapping: |
+      output = "\u{110000}"
+    compile_error: "unicode"
+
+  - name: "unicode escape of a high surrogate is a compile error"
+    mapping: |
+      output = "\u{D800}"
+    compile_error: "surrogate"
+
+  - name: "unicode escape of a low surrogate is a compile error"
+    mapping: |
+      output = "\u{DFFF}"
+    compile_error: "surrogate"
+
+  - name: "fixed-width unicode escape of a surrogate is a compile error"
+    mapping: |
+      output = "\uD800"
+    compile_error: "surrogate"
+
+  - name: "multiple escapes in one string"
+    mapping: |
+      output = "a\tb\nc\\d\"e"
+    output: "a\tb\nc\\d\"e"
+
+  # --- Raw strings ---
+
+  - name: "raw string basic"
+    mapping: |
+      output = `hello world`
+    output: "hello world"
+
+  - name: "raw string no escape processing"
+    mapping: |
+      output = `no\nescape\there`
+    output: "no\\nescape\\there"
+
+  - name: "raw string preserves quotes"
+    mapping: |
+      output = `she said "hello"`
+    output: "she said \"hello\""
+
+  - name: "raw string preserves backslashes"
+    mapping: |
+      output = `C:\path\to\file`
+    output: "C:\\path\\to\\file"
+
+  - name: "raw string with newlines preserved"
+    mapping: "output = `line1\nline2`"
+    output: "line1\nline2"
+
+  # --- String length (codepoint-based) ---
+
+  - name: "length of ascii string"
+    mapping: |
+      output = "hello".length()
+    output: 5
+
+  - name: "length of empty string"
+    mapping: |
+      output = "".length()
+    output: 0
+
+  - name: "length of string with non-ascii"
+    mapping: |
+      output = "caf\u00E9".length()
+    output: 4
+
+  - name: "length of single codepoint emoji"
+    mapping: |
+      output = "\u{1F600}".length()
+    output: 1
+
+  - name: "length of multi-codepoint emoji"
+    mapping: |
+      output = "\u{1F44B}\u{1F3FD}".length()
+    output: 2
+
+  # --- String indexing (codepoint-based, returns int64) ---
+
+  - name: "index first codepoint"
+    mapping: |
+      output = "hello"[0]
+    output: 104
+
+  - name: "index last codepoint positive"
+    mapping: |
+      output = "hello"[4]
+    output: 111
+
+  - name: "index negative last codepoint"
+    mapping: |
+      output = "hello"[-1]
+    output: 111
+
+  - name: "index negative second to last"
+    mapping: |
+      output = "hello"[-2]
+    output: 108
+
+  - name: "index non-ascii codepoint"
+    mapping: |
+      output = "caf\u00E9"[3]
+    output: 233
+
+  - name: "index emoji codepoint"
+    mapping: |
+      output = "\u{1F600}"[0]
+    output: 128512
+
+  - name: "index out of bounds positive"
+    mapping: |
+      output = "hello"[5]
+    error: "out of bounds"
+
+  - name: "index out of bounds negative"
+    mapping: |
+      output = "hello"[-6]
+    error: "out of bounds"
+
+  # --- Codepoint round-trip with .char() ---
+
+  - name: "char round trip ascii"
+    mapping: |
+      output = "hello"[0].char()
+    output: "h"
+
+  - name: "char round trip non-ascii"
+    mapping: |
+      output = "caf\u00E9"[3].char()
+    output: "\u00E9"
+
+  # --- String concatenation ---
+
+  - name: "string concatenation"
+    mapping: |
+      output = "hello" + " " + "world"
+    output: "hello world"
+
+  - name: "string concat with empty"
+    mapping: |
+      output = "" + "hello" + ""
+    output: "hello"
+
+  - name: "string plus number is error"
+    mapping: |
+      output = "hello" + 5
+    error: "cannot add"
+
+  # --- String comparison ---
+
+  - name: "string equality same"
+    mapping: |
+      output = "abc" == "abc"
+    output: true
+
+  - name: "string equality different"
+    mapping: |
+      output = "abc" == "abd"
+    output: false
+
+  - name: "string less than lexicographic"
+    mapping: |
+      output = "abc" < "abd"
+    output: true
+
+  - name: "string greater than lexicographic"
+    mapping: |
+      output = "b" > "a"
+    output: true
+
+  - name: "string equality cross type is false"
+    mapping: |
+      output = "5" == 5
+    output: false
+
+  # --- No Unicode normalization ---
+
+  - name: "no normalization precomposed vs decomposed not equal"
+    mapping: |
+      output = "\u00E9" == "e\u0301"
+    output: false
+
+  - name: "no normalization different lengths"
+    mapping: |
+      output.precomposed = "\u00E9".length()
+      output.decomposed = "e\u0301".length()
+    output:
+      precomposed: 1
+      decomposed: 2
+
+  # --- String slicing ---
+
+  - name: "string slice basic"
+    mapping: |
+      output = "hello world".slice(0, 5)
+    output: "hello"
+
+  - name: "string slice to end"
+    mapping: |
+      output = "hello world".slice(6)
+    output: "world"
+
+  - name: "string slice negative indices"
+    mapping: |
+      output = "hello world".slice(-5, -1)
+    output: "worl"
+
+  - name: "string slice clamped"
+    mapping: |
+      output = "hello".slice(0, 100)
+    output: "hello"
+
+  - name: "string slice empty result"
+    mapping: |
+      output = "hello".slice(3, 1)
+    output: ""
+
+  # --- String reverse ---
+
+  - name: "string reverse ascii"
+    mapping: |
+      output = "hello".reverse()
+    output: "olleh"
+
+  - name: "string reverse empty"
+    mapping: |
+      output = "".reverse()
+    output: ""
+
+  # --- String contains and index_of ---
+
+  - name: "string contains true"
+    mapping: |
+      output = "hello world".contains("world")
+    output: true
+
+  - name: "string contains false"
+    mapping: |
+      output = "hello world".contains("xyz")
+    output: false
+
+  - name: "string index_of found"
+    mapping: |
+      output = "hello world".index_of("world")
+    output: 6
+
+  - name: "string index_of not found"
+    mapping: |
+      output = "hello world".index_of("xyz")
+    output: -1
diff --git a/internal/bloblang2/spec/tests/types/timestamp.yaml b/internal/bloblang2/spec/tests/types/timestamp.yaml
new file mode 100644
index 000000000..e98e5456e
--- /dev/null
+++ b/internal/bloblang2/spec/tests/types/timestamp.yaml
@@ -0,0 +1,244 @@
+description: "Timestamp creation, formatting, arithmetic, and comparison"
+
+tests:
+  # --- Creation ---
+
+  - name: "timestamp constructor with required args only"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1)
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T00:00:00Z"}}
+
+  - name: "timestamp constructor with all positional args"
+    mapping: |
+      output.ts = timestamp(2024, 12, 25, 8, 30, 45, 123000000)
+    output: {"ts": {_type: "timestamp", value: "2024-12-25T08:30:45.123Z"}}
+
+  - name: "timestamp constructor with named args"
+    mapping: |
+      output.ts = timestamp(year: 2024, month: 1, day: 15, hour: 10)
+    output: {"ts": {_type: "timestamp", value: "2024-01-15T10:00:00Z"}}
+
+  - name: "timestamp constructor with timezone"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 12, 30, 0, 0, "America/New_York")
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T12:30:00-05:00"}}
+
+  - name: "timestamp constructor invalid month"
+    mapping: |
+      output.ts = timestamp(2024, 13, 1)
+    error: "out of range"
+
+  - name: "timestamp constructor invalid timezone"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0, 0, "Not/A/Zone")
+    error: "timezone"
+
+  - name: "now returns a timestamp"
+    mapping: |
+      output = now()
+    no_output_check: true
+    output_type: "timestamp"
+
+  # --- Parsing ---
+
+  - name: "ts_parse with default format (RFC 3339)"
+    mapping: |
+      output.ts = "2024-03-01T12:00:00Z".ts_parse()
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T12:00:00Z"}}
+
+  - name: "ts_parse with explicit format"
+    mapping: |
+      output.ts = "2024-03-01".ts_parse("%Y-%m-%d")
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T00:00:00Z"}}
+
+  - name: "ts_parse with fractional seconds"
+    mapping: |
+      output.ts = "2024-03-01T12:00:00.123Z".ts_parse()
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T12:00:00.123Z"}}
+
+  - name: "ts_parse with timezone offset"
+    mapping: |
+      output.ts = "2024-03-01T12:00:00+05:30".ts_parse()
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T12:00:00+05:30"}}
+
+  - name: "ts_parse invalid string"
+    mapping: |
+      output.ts = "not-a-date".ts_parse("%Y-%m-%d")
+    error: "parse"
+
+  # --- Formatting ---
+
+  - name: "ts_format default is RFC 3339"
+    mapping: |
+      output.s = timestamp(2024, 3, 1, 12, 0, 0).ts_format()
+    output: {"s": "2024-03-01T12:00:00Z"}
+
+  - name: "ts_format custom format"
+    mapping: |
+      output.s = timestamp(2024, 3, 1).ts_format("%Y-%m-%d")
+    output: {"s": "2024-03-01"}
+
+  - name: "timestamp string serialization trims trailing zeros"
+    mapping: |
+      output.s = timestamp(2024, 3, 1, 12, 0, 0, 500000000).string()
+    output: {"s": "2024-03-01T12:00:00.5Z"}
+
+  - name: "timestamp string serialization whole seconds omit fraction"
+    mapping: |
+      output.s = timestamp(2024, 3, 1, 12, 0, 0).string()
+    output: {"s": "2024-03-01T12:00:00Z"}
+
+  # --- Comparison ---
+
+  - name: "timestamp equality same value"
+    mapping: |
+      $a = timestamp(2024, 3, 1)
+      $b = timestamp(2024, 3, 1)
+      output.eq = $a == $b
+    output: {"eq": true}
+
+  - name: "timestamp equality different values"
+    mapping: |
+      $a = timestamp(2024, 3, 1)
+      $b = timestamp(2024, 3, 2)
+      output.eq = $a == $b
+    output: {"eq": false}
+
+  - name: "timestamp inequality"
+    mapping: |
+      $a = timestamp(2024, 3, 1)
+      $b = timestamp(2024, 3, 2)
+      output.neq = $a != $b
+    output: {"neq": true}
+
+  - name: "timestamp less than"
+    mapping: |
+      $a = timestamp(2024, 3, 1)
+      $b = timestamp(2024, 3, 2)
+      output.lt = $a < $b
+    output: {"lt": true}
+
+  - name: "timestamp greater than"
+    mapping: |
+      $a = timestamp(2024, 3, 2)
+      $b = timestamp(2024, 3, 1)
+      output.gt = $a > $b
+    output: {"gt": true}
+
+  - name: "timestamp less than or equal (equal case)"
+    mapping: |
+      $a = timestamp(2024, 3, 1)
+      $b = timestamp(2024, 3, 1)
+      output.le = $a <= $b
+    output: {"le": true}
+
+  - name: "timestamp greater than or equal (greater case)"
+    mapping: |
+      $a = timestamp(2024, 3, 2)
+      $b = timestamp(2024, 3, 1)
+      output.ge = $a >= $b
+    output: {"ge": true}
+
+  # --- Arithmetic ---
+
+  - name: "timestamp subtraction returns nanoseconds"
+    mapping: |
+      $a = timestamp(2024, 3, 1, 0, 0, 0)
+      $b = timestamp(2024, 3, 1, 0, 0, 1)
+      output.diff = $b - $a
+    output: {"diff": 1000000000}
+
+  - name: "timestamp subtraction negative result"
+    mapping: |
+      $a = timestamp(2024, 3, 1, 0, 0, 1)
+      $b = timestamp(2024, 3, 1, 0, 0, 0)
+      output.diff = $b - $a
+    output: {"diff": -1000000000}
+
+  - name: "timestamp addition is an error"
+    mapping: |
+      $a = timestamp(2024, 3, 1)
+      $b = timestamp(2024, 3, 2)
+      output.bad = $a + $b
+    error: "cannot add"
+
+  - name: "timestamp plus number is an error"
+    mapping: |
+      output.bad = timestamp(2024, 3, 1) + 1
+    error: "cannot add"
+
+  - name: "number minus timestamp is an error"
+    mapping: |
+      output.bad = 1 - timestamp(2024, 3, 1)
+    error: "cannot"
+
+  - name: "timestamp multiply is an error"
+    mapping: |
+      output.bad = timestamp(2024, 3, 1) * 2
+    error: "cannot"
+
+  - name: "timestamp divide is an error"
+    mapping: |
+      output.bad = timestamp(2024, 3, 1) / 2
+    error: "cannot"
+
+  - name: "timestamp modulo is an error"
+    mapping: |
+      output.bad = timestamp(2024, 3, 1) % 2
+    error: "cannot"
+
+  # --- ts_add ---
+
+  - name: "ts_add positive duration"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(second())
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T00:00:01Z"}}
+
+  - name: "ts_add negative duration"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(second() * -1)
+    output: {"ts": {_type: "timestamp", value: "2024-02-29T23:59:59Z"}}
+
+  - name: "ts_add with minute constant"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(minute())
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T00:01:00Z"}}
+
+  - name: "ts_add with hour constant"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(hour())
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T01:00:00Z"}}
+
+  - name: "ts_add with day constant"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(day())
+    output: {"ts": {_type: "timestamp", value: "2024-03-02T00:00:00Z"}}
+
+  # --- Duration constants ---
+
+  - name: "second returns nanoseconds"
+    mapping: |
+      output.v = second()
+    output: {"v": 1000000000}
+
+  - name: "minute returns nanoseconds"
+    mapping: |
+      output.v = minute()
+    output: {"v": 60000000000}
+
+  - name: "hour returns nanoseconds"
+    mapping: |
+      output.v = hour()
+    output: {"v": 3600000000000}
+
+  - name: "day returns nanoseconds"
+    mapping: |
+      output.v = day()
+    output: {"v": 86400000000000}
+
+  # --- Type ---
+
+  - name: "timestamp type"
+    mapping: |
+      output.t = timestamp(2024, 3, 1).type()
+    output: {"t": "timestamp"}
diff --git a/internal/bloblang2/spec/tests/types/timestamp_arithmetic.yaml b/internal/bloblang2/spec/tests/types/timestamp_arithmetic.yaml
new file mode 100644
index 000000000..7c9310696
--- /dev/null
+++ b/internal/bloblang2/spec/tests/types/timestamp_arithmetic.yaml
@@ -0,0 +1,195 @@
+description: >
+  Timestamp arithmetic edge cases — subtraction overflow for far-apart
+  timestamps, ts_add overflow, unix conversion round-trips, nanosecond
+  precision, fractional second formatting, and timezone handling.
+
+tests:
+  # --- Subtraction precision ---
+
+  - name: "timestamp subtraction with nanosecond precision"
+    mapping: |
+      $a = timestamp(2024, 3, 1, 0, 0, 0, 0)
+      $b = timestamp(2024, 3, 1, 0, 0, 0, 123456789)
+      output.diff = $b - $a
+    output: {"diff": 123456789}
+
+  - name: "timestamp subtraction across days"
+    mapping: |
+      $a = timestamp(2024, 3, 1, 0, 0, 0)
+      $b = timestamp(2024, 3, 3, 0, 0, 0)
+      output.diff = $b - $a
+    output: {"diff": 172800000000000}
+
+  - name: "timestamp subtraction across months"
+    mapping: |
+      $a = timestamp(2024, 1, 1, 0, 0, 0)
+      $b = timestamp(2024, 2, 1, 0, 0, 0)
+      output.diff = ($b - $a) / second()
+    output: {"diff": 2678400.0}
+
+  - name: "timestamp subtraction yields zero for same timestamp"
+    mapping: |
+      $t = timestamp(2024, 6, 15, 12, 0, 0)
+      output.diff = $t - $t
+    output: {"diff": 0}
+
+  # --- Subtraction overflow (timestamps > ~292 years apart) ---
+
+  - name: "timestamp subtraction overflow — far future minus far past"
+    mapping: |
+      $past = timestamp(1700, 1, 1, 0, 0, 0)
+      $future = timestamp(2300, 1, 1, 0, 0, 0)
+      output.diff = $future - $past
+    error: "overflow"
+
+  # --- ts_add edge cases ---
+
+  - name: "ts_add with nanosecond precision"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(1)
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T00:00:00.000000001Z"}}
+
+  - name: "ts_add with sub-millisecond precision"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(1500000)
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T00:00:00.0015Z"}}
+
+  - name: "ts_add negative one day"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(day() * -1)
+    output: {"ts": {_type: "timestamp", value: "2024-02-29T00:00:00Z"}}
+
+  - name: "ts_add crossing year boundary"
+    mapping: |
+      output.ts = timestamp(2024, 12, 31, 23, 59, 59).ts_add(second())
+    output: {"ts": {_type: "timestamp", value: "2025-01-01T00:00:00Z"}}
+
+  - name: "ts_add multiple days as seconds"
+    mapping: |
+      output.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(second() * 86400 * 7)
+    output: {"ts": {_type: "timestamp", value: "2024-03-08T00:00:00Z"}}
+
+  # --- Unix conversion round-trips ---
+
+  - name: "ts_unix round-trip"
+    mapping: |
+      $ts = timestamp(2024, 3, 1, 12, 0, 0)
+      output.rt = $ts.ts_unix().ts_from_unix().ts_format()
+    output: {"rt": "2024-03-01T12:00:00Z"}
+
+  - name: "ts_unix_milli round-trip with milliseconds"
+    mapping: |
+      $ts = timestamp(2024, 3, 1, 12, 0, 0, 123000000)
+      output.rt = $ts.ts_unix_milli().ts_from_unix_milli().ts_format()
+    output: {"rt": "2024-03-01T12:00:00.123Z"}
+
+  - name: "ts_unix_micro round-trip with microseconds"
+    mapping: |
+      $ts = timestamp(2024, 3, 1, 12, 0, 0, 123456000)
+      output.rt = $ts.ts_unix_micro().ts_from_unix_micro().ts_format()
+    output: {"rt": "2024-03-01T12:00:00.123456Z"}
+
+  - name: "ts_unix_nano lossless round-trip with nanoseconds"
+    mapping: |
+      $ts = timestamp(2024, 3, 1, 12, 0, 0, 123456789)
+      output.rt = $ts.ts_unix_nano().ts_from_unix_nano().ts_format()
+    output: {"rt": "2024-03-01T12:00:00.123456789Z"}
+
+  - name: "ts_unix returns int64"
+    mapping: |
+      output.v = timestamp(2024, 3, 1, 12, 0, 0).ts_unix().type()
+    output: {"v": "int64"}
+
+  - name: "ts_unix_nano returns int64"
+    mapping: |
+      output.v = timestamp(2024, 3, 1, 12, 0, 0).ts_unix_nano().type()
+    output: {"v": "int64"}
+
+  # --- Fractional second formatting ---
+
+  - name: "fractional seconds trim trailing zeros to shortest"
+    mapping: |
+      output.s = timestamp(2024, 3, 1, 12, 0, 0, 100000000).string()
+    output: {"s": "2024-03-01T12:00:00.1Z"}
+
+  - name: "microsecond precision formatting"
+    mapping: |
+      output.s = timestamp(2024, 3, 1, 12, 0, 0, 123456000).string()
+    output: {"s": "2024-03-01T12:00:00.123456Z"}
+
+  - name: "nanosecond precision formatting"
+    mapping: |
+      output.s = timestamp(2024, 3, 1, 12, 0, 0, 123456789).string()
+    output: {"s": "2024-03-01T12:00:00.123456789Z"}
+
+  # --- ts_parse timezone handling ---
+
+  - name: "ts_parse with Z timezone"
+    mapping: |
+      $ts = "2024-03-01T12:00:00Z".ts_parse()
+      output.s = $ts.ts_format()
+    output: {"s": "2024-03-01T12:00:00Z"}
+
+  - name: "ts_parse with positive offset"
+    mapping: |
+      $ts = "2024-03-01T12:00:00+05:30".ts_parse()
+      output.s = $ts.ts_format()
+    output: {"s": "2024-03-01T12:00:00+05:30"}
+
+  - name: "ts_parse with negative offset"
+    mapping: |
+      $ts = "2024-03-01T12:00:00-08:00".ts_parse()
+      output.s = $ts.ts_format()
+    output: {"s": "2024-03-01T12:00:00-08:00"}
+
+  - name: "ts_parse fractional seconds with nanosecond precision"
+    mapping: |
+      output.ts = "2024-03-01T12:00:00.123456789Z".ts_parse()
+    output: {"ts": {_type: "timestamp", value: "2024-03-01T12:00:00.123456789Z"}}
+
+  # --- ts_from_unix with float (limited precision) ---
+
+  - name: "ts_from_unix with integer"
+    mapping: |
+      output.s = 1709294400.ts_from_unix().ts_format()
+    output: {"s": "2024-03-01T12:00:00Z"}
+
+  - name: "ts_from_unix with float gives sub-second"
+    mapping: |
+      output.s = 1709294400.5.ts_from_unix().ts_format()
+    output: {"s": "2024-03-01T12:00:00.5Z"}
+
+  # --- Comparison with timestamps from different construction methods ---
+
+  - name: "timestamps from constructor and parse are equal"
+    mapping: |
+      $a = timestamp(2024, 3, 1, 12, 0, 0)
+      $b = "2024-03-01T12:00:00Z".ts_parse()
+      output.eq = $a == $b
+    output: {"eq": true}
+
+  - name: "timestamps from constructor and unix round-trip are equal"
+    mapping: |
+      $a = timestamp(2024, 3, 1, 12, 0, 0)
+      $b = $a.ts_unix().ts_from_unix()
+      output.eq = $a == $b
+    output: {"eq": true}
+
+  # --- Arithmetic type errors ---
+
+  - name: "number minus timestamp is error"
+    mapping: |
+      output.v = 100 - timestamp(2024, 3, 1)
+    error: "cannot"
+
+  - name: "timestamp plus timestamp is error"
+    mapping: |
+      $a = timestamp(2024, 3, 1)
+      $b = timestamp(2024, 3, 2)
+      output.v = $a + $b
+    error: "cannot add"
+
+  - name: "timestamp minus number is error"
+    mapping: |
+      output.v = timestamp(2024, 3, 1) - 1
+    error: "cannot"
diff --git a/internal/bloblang2/spec/tests/types/type_introspection.yaml b/internal/bloblang2/spec/tests/types/type_introspection.yaml
new file mode 100644
index 000000000..1bf10a510
--- /dev/null
+++ b/internal/bloblang2/spec/tests/types/type_introspection.yaml
@@ -0,0 +1,184 @@
+description: ".type() method for every runtime type"
+
+tests:
+  # --- String ---
+
+  - name: "type of string"
+    mapping: |
+      output.t = "hello".type()
+    output: {"t": "string"}
+
+  - name: "type of empty string"
+    mapping: |
+      output.t = "".type()
+    output: {"t": "string"}
+
+  # --- Integer types ---
+
+  - name: "type of int64 literal"
+    mapping: |
+      output.t = 42.type()
+    output: {"t": "int64"}
+
+  - name: "type of negative int64"
+    mapping: |
+      output.t = (-10).type()
+    output: {"t": "int64"}
+
+  - name: "type of zero int64"
+    mapping: |
+      output.t = 0.type()
+    output: {"t": "int64"}
+
+  - name: "type of int32"
+    mapping: |
+      output.t = 42.int32().type()
+    output: {"t": "int32"}
+
+  - name: "type of uint32"
+    mapping: |
+      output.t = 42.uint32().type()
+    output: {"t": "uint32"}
+
+  - name: "type of uint64"
+    mapping: |
+      output.t = 42.uint64().type()
+    output: {"t": "uint64"}
+
+  # --- Float types ---
+
+  - name: "type of float64 literal"
+    mapping: |
+      output.t = 3.14.type()
+    output: {"t": "float64"}
+
+  - name: "type of float64 zero"
+    mapping: |
+      output.t = 0.0.type()
+    output: {"t": "float64"}
+
+  - name: "type of float32"
+    mapping: |
+      output.t = 3.14.float32().type()
+    output: {"t": "float32"}
+
+  # --- Bool ---
+
+  - name: "type of true"
+    mapping: |
+      output.t = true.type()
+    output: {"t": "bool"}
+
+  - name: "type of false"
+    mapping: |
+      output.t = false.type()
+    output: {"t": "bool"}
+
+  # --- Null ---
+
+  - name: "type of null"
+    mapping: |
+      output.t = null.type()
+    output: {"t": "null"}
+
+  # --- Bytes ---
+
+  - name: "type of bytes"
+    mapping: |
+      output.t = "hello".bytes().type()
+    output: {"t": "bytes"}
+
+  # --- Timestamp ---
+
+  - name: "type of timestamp from constructor"
+    mapping: |
+      output.t = timestamp(2024, 3, 1).type()
+    output: {"t": "timestamp"}
+
+  - name: "type of timestamp from now"
+    mapping: |
+      output.t = now().type()
+    output: {"t": "timestamp"}
+
+  - name: "type of timestamp from parse"
+    mapping: |
+      output.t = "2024-03-01T00:00:00Z".ts_parse().type()
+    output: {"t": "timestamp"}
+
+  # --- Array ---
+
+  - name: "type of array"
+    mapping: |
+      output.t = [1, 2, 3].type()
+    output: {"t": "array"}
+
+  - name: "type of empty array"
+    mapping: |
+      output.t = [].type()
+    output: {"t": "array"}
+
+  # --- Object ---
+
+  - name: "type of object"
+    mapping: |
+      output.t = {"a": 1}.type()
+    output: {"t": "object"}
+
+  - name: "type of empty object"
+    mapping: |
+      output.t = {}.type()
+    output: {"t": "object"}
+
+  # --- Type checking pattern ---
+
+  - name: "type comparison for runtime check"
+    mapping: |
+      $v = 42
+      output.is_int = $v.type() == "int64"
+      output.is_str = $v.type() == "string"
+    output: {"is_int": true, "is_str": false}
+
+  - name: "type of null is not object"
+    mapping: |
+      output.is_obj = null.type() == "object"
+      output.is_null = null.type() == "null"
+    output: {"is_obj": false, "is_null": true}
+
+  # --- Type from input ---
+
+  - name: "type of input string field"
+    input: {"name": "Alice"}
+    mapping: |
+      output.t = input.name.type()
+    output: {"t": "string"}
+
+  - name: "type of input number field"
+    input: {"count": 42}
+    mapping: |
+      output.t = input.count.type()
+    output: {"t": "int64"}
+
+  - name: "type of input null field"
+    input: {"missing": null}
+    mapping: |
+      output.t = input.missing.type()
+    output: {"t": "null"}
+
+  - name: "type of input array field"
+    input: {"items": [1, 2]}
+    mapping: |
+      output.t = input.items.type()
+    output: {"t": "array"}
+
+  - name: "type of input object field"
+    input: {"user": {"name": "Alice"}}
+    mapping: |
+      output.t = input.user.type()
+    output: {"t": "object"}
+
+  # --- Void: type() is not callable ---
+
+  - name: "type on void is error"
+    mapping: |
+      output.t = (if false { 42 }).type()
+    error: "void"
diff --git a/internal/bloblang2/spec/tests/types/void.yaml b/internal/bloblang2/spec/tests/types/void.yaml
new file mode 100644
index 000000000..3e01ffb04
--- /dev/null
+++ b/internal/bloblang2/spec/tests/types/void.yaml
@@ -0,0 +1,184 @@
+description: "Void behavior in every context — void is not a type, it is the absence of a value"
+
+tests:
+  # --- Output field assignment: void skips assignment ---
+
+  - name: "void skips output assignment (no prior value)"
+    mapping: |
+      output.x = if false { "hello" }
+    output: {}
+
+  - name: "void preserves prior output value"
+    mapping: |
+      output.status = "pending"
+      output.status = if false { "override" }
+    output: {"status": "pending"}
+
+  - name: "void from else-if chain without final else"
+    mapping: |
+      output.tier = "default"
+      output.tier = if false { "gold" } else if false { "silver" }
+    output: {"tier": "default"}
+
+  - name: "void from non-exhaustive match"
+    mapping: |
+      output.sound = "unknown"
+      output.sound = match "bird" {
+        "cat" => "meow",
+        "dog" => "woof",
+      }
+    output: {"sound": "unknown"}
+
+  # --- Variable declaration: runtime error ---
+
+  - name: "void in variable declaration is runtime error (if)"
+    mapping: |
+      $x = if false { 42 }
+    error: "void"
+
+  - name: "void in variable declaration is runtime error (match)"
+    mapping: |
+      $x = match "nope" {
+        "a" => 1,
+      }
+    error: "void"
+
+  # --- Variable reassignment: void skips ---
+
+  - name: "void skips variable reassignment"
+    mapping: |
+      $x = 10
+      $x = if false { 42 }
+      output.result = $x
+    output: {"result": 10}
+
+  - name: "void skips variable reassignment from match"
+    mapping: |
+      $x = "original"
+      $x = match "nope" {
+        "a" => "found",
+      }
+      output.result = $x
+    output: {"result": "original"}
+
+  # --- Collection literal: error ---
+
+  - name: "void in array literal is error"
+    mapping: |
+      output.arr = [1, if false { 2 }, 3]
+    error: "void"
+
+  - name: "void in object literal is error"
+    mapping: |
+      output.obj = {"a": 1, "b": if false { 2 }}
+    error: "void"
+
+  - name: "void in array from match is error"
+    mapping: |
+      output.arr = [match "x" { "y" => 1 }]
+    error: "void"
+
+  # --- Function/map argument: error ---
+
+  - name: "void as map argument is error"
+    mapping: |
+      map double(val) { val * 2 }
+      output.result = double(if false { 42 })
+    error: "void"
+
+  # --- .or() rescues void ---
+
+  - name: "or rescues void from if-without-else"
+    mapping: |
+      output.result = (if false { "hello" }).or("default")
+    output: {"result": "default"}
+
+  - name: "or rescues void from non-exhaustive match"
+    mapping: |
+      output.result = (match "bird" { "cat" => "meow" }).or("unknown")
+    output: {"result": "unknown"}
+
+  - name: "or does not trigger when value exists"
+    mapping: |
+      output.result = (if true { "hello" }).or("default")
+    output: {"result": "hello"}
+
+  - name: "or short-circuits argument on non-void"
+    mapping: |
+      output.result = (if true { "hello" }).or(throw("should not run"))
+    output: {"result": "hello"}
+
+  # --- .catch() passes void through unchanged ---
+
+  - name: "catch does not trigger on void"
+    mapping: |
+      output.x = "prior"
+      output.x = (if false { 1 }).catch(err -> 0)
+    output: {"x": "prior"}
+
+  - name: "catch passes void through then method errors"
+    mapping: |
+      output.result = (if false { 1 }).catch(err -> 0).string().catch(err -> "caught")
+    output: {"result": "caught"}
+
+  # --- Method calls on void: error ---
+
+  - name: "type on void is error"
+    mapping: |
+      output.t = (if false { 42 }).type()
+    error: "void"
+
+  - name: "string on void is error"
+    mapping: |
+      output.s = (if false { "hello" }).string()
+    error: "void"
+
+  - name: "length on void is error"
+    mapping: |
+      output.l = (if false { [1, 2] }).length()
+    error: "void"
+
+  - name: "uppercase on void is error"
+    mapping: |
+      output.s = (if false { "hello" }).uppercase()
+    error: "void"
+
+  # --- Expression operand: error ---
+
+  - name: "void plus
number is error" + mapping: | + output.result = (if false { 42 }) + 1 + error: "void" + + - name: "number plus void is error" + mapping: | + output.result = 1 + (if false { 42 }) + error: "void" + + - name: "void in boolean negation is error" + mapping: | + output.result = !(if false { true }) + error: "void" + + - name: "void equality comparison is error" + mapping: | + output.result = (if false { 42 }) == 42 + error: "void" + + # --- Void rescued with or then used in variable declaration --- + + - name: "or rescues void for variable declaration" + mapping: | + $x = (if false { 42 }).or(0) + output.result = $x + output: {"result": 0} + + # --- Void vs deleted distinction --- + + - name: "void preserves prior value while deleted removes it" + mapping: | + output.a = "exists" + output.b = "exists" + output.a = if false { "override" } + output.b = deleted() + output: {"a": "exists"} diff --git a/internal/bloblang2/spec/tests/variables/bare_ident_resolution.yaml b/internal/bloblang2/spec/tests/variables/bare_ident_resolution.yaml new file mode 100644 index 000000000..f154f908e --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/bare_ident_resolution.yaml @@ -0,0 +1,89 @@ +description: > + Bare identifier resolution — bare identifiers (without $ prefix) must NOT + resolve to variables. Variables require the $ prefix for both declaration + and reference. Bare identifiers resolve only to map parameters, lambda + parameters, match-as bindings, map names (in call/method-arg context), + and standard library functions (in call/method-arg context). 
+ +tests: + # --- Bare identifier must not resolve to a $variable --- + + - name: "bare identifier does not resolve to variable of same name" + mapping: | + $foo = "hello world" + output = foo + compile_error: "undeclared" + + - name: "bare identifier in expression does not resolve to variable" + mapping: | + $x = 10 + output.v = x + 1 + compile_error: "undeclared" + + - name: "bare identifier in method chain does not resolve to variable" + mapping: | + $name = "alice" + output.v = name.uppercase() + compile_error: "undeclared" + + - name: "bare identifier in array literal does not resolve to variable" + mapping: | + $val = 42 + output.v = [val] + compile_error: "undeclared" + + - name: "bare identifier in object value does not resolve to variable" + mapping: | + $val = 42 + output.v = {"key": val} + compile_error: "undeclared" + + - name: "bare identifier in if condition does not resolve to variable" + mapping: | + $flag = true + output.v = if flag { "yes" } else { "no" } + compile_error: "undeclared" + + - name: "bare identifier with $ prefix works correctly" + mapping: | + $foo = "hello world" + output = $foo + output: "hello world" + + # --- Bare identifiers that ARE valid (parameters, match-as bindings) --- + + - name: "bare identifier as lambda parameter is valid" + mapping: | + output.v = [1, 2, 3].map(x -> x * 2) + output: {"v": [2, 4, 6]} + + - name: "bare identifier as match-as binding is valid" + mapping: | + output.v = match 42 as val { + val > 0 => "positive", + _ => "other", + } + output: {"v": "positive"} + + - name: "bare identifier as map parameter is valid" + mapping: | + map double(x) { x * 2 } + output.v = double(21) + output: {"v": 42} + + # --- Variable with same name as parameter does not leak through bare ident --- + + - name: "variable does not shadow lambda parameter via bare ident" + mapping: | + $x = 999 + output.v = [1, 2, 3].map(x -> x * 2) + output: {"v": [2, 4, 6]} + + - name: "bare ident after lambda still requires $ for variable" + 
mapping: | + $x = 10 + output.items = [1, 2].map(x -> x + 1) + output.v = $x + output: + items: [2, 3] + v: 10 diff --git a/internal/bloblang2/spec/tests/variables/copy_on_write.yaml b/internal/bloblang2/spec/tests/variables/copy_on_write.yaml new file mode 100644 index 000000000..bf32d40eb --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/copy_on_write.yaml @@ -0,0 +1,185 @@ +description: "Copy-on-write semantics: independence from input, from output, between variables, nested mutation independence" + +tests: + # --- Independence from input --- + + - name: "variable copy from input is independent" + input: {"user": {"name": "Alice", "age": 30}} + mapping: | + $data = input.user + $data.name = "Bob" + output.var_name = $data.name + output.input_name = input.user.name + output: {"var_name": "Bob", "input_name": "Alice"} + + - name: "variable copy from input nested field is independent" + input: {"config": {"settings": {"theme": "dark", "lang": "en"}}} + mapping: | + $settings = input.config.settings + $settings.theme = "light" + output.var_theme = $settings.theme + output.input_theme = input.config.settings.theme + output: {"var_theme": "light", "input_theme": "dark"} + + - name: "variable copy from input array is independent" + input: {"items": [1, 2, 3]} + mapping: | + $arr = input.items + $arr[0] = 99 + output.var_first = $arr[0] + output.input_first = input.items[0] + output: {"var_first": 99, "input_first": 1} + + - name: "variable copy from entire input is independent" + input: {"a": 1, "b": 2} + mapping: | + $copy = input + $copy.a = 100 + output.var_a = $copy.a + output.input_a = input.a + output: {"var_a": 100, "input_a": 1} + + # --- Independence from output --- + + - name: "variable snapshot of output is independent from later output changes" + mapping: | + output.user.name = "Alice" + $snap = output.user + output.user.name = "Bob" + output.snap_name = $snap.name + output: {"user": {"name": "Bob"}, "snap_name": "Alice"} + + - name: "mutating 
variable snapshot does not affect output" + mapping: | + output.data = {"x": 1, "y": 2} + $snap = output.data + $snap.x = 99 + output.snap_x = $snap.x + output.original_x = output.data.x + output: {"data": {"x": 1, "y": 2}, "snap_x": 99, "original_x": 1} + + - name: "variable snapshot of output array is independent" + mapping: | + output.items = [10, 20, 30] + $snap = output.items + output.items[0] = 99 + output.snap_first = $snap[0] + output: {"items": [99, 20, 30], "snap_first": 10} + + # --- Independence between variables --- + + - name: "copy between variables is independent" + mapping: | + $a = {"x": 1} + $b = $a + $b.x = 2 + output.a = $a.x + output.b = $b.x + output: {"a": 1, "b": 2} + + - name: "copy between variables reverse mutation" + mapping: | + $a = {"x": 1} + $b = $a + $a.x = 99 + output.a = $a.x + output.b = $b.x + output: {"a": 99, "b": 1} + + - name: "multiple copies from same source are independent" + mapping: | + $source = {"val": 0} + $copy1 = $source + $copy2 = $source + $copy1.val = 1 + $copy2.val = 2 + output.source = $source.val + output.c1 = $copy1.val + output.c2 = $copy2.val + output: {"source": 0, "c1": 1, "c2": 2} + + - name: "chain of copies are all independent" + mapping: | + $a = {"v": "a"} + $b = $a + $b.v = "b" + $c = $b + $c.v = "c" + output.a = $a.v + output.b = $b.v + output.c = $c.v + output: {"a": "a", "b": "b", "c": "c"} + + - name: "array copy between variables is independent" + mapping: | + $a = [1, 2, 3] + $b = $a + $b[0] = 99 + output.a = $a + output.b = $b + output: {"a": [1, 2, 3], "b": [99, 2, 3]} + + # --- Nested mutation independence --- + + - name: "nested object mutation independent from input" + input: {"record": {"address": {"city": "London", "zip": "SW1"}}} + mapping: | + $rec = input.record + $rec.address.city = "Paris" + output.var_city = $rec.address.city + output.input_city = input.record.address.city + output: {"var_city": "Paris", "input_city": "London"} + + - name: "nested array mutation independent 
between variables" + mapping: | + $a = {"items": [1, 2, 3]} + $b = $a + $b.items[0] = 99 + output.a_first = $a.items[0] + output.b_first = $b.items[0] + output: {"a_first": 1, "b_first": 99} + + - name: "deeply nested mutation independent between variables" + mapping: | + $a = {"level1": {"level2": {"level3": "original"}}} + $b = $a + $b.level1.level2.level3 = "modified" + output.a_val = $a.level1.level2.level3 + output.b_val = $b.level1.level2.level3 + output: {"a_val": "original", "b_val": "modified"} + + - name: "nested array in object mutation independent from output" + mapping: | + output.data = {"tags": ["a", "b", "c"]} + $snap = output.data + output.data.tags[0] = "z" + output.snap_tag = $snap.tags[0] + output: {"data": {"tags": ["z", "b", "c"]}, "snap_tag": "a"} + + - name: "adding nested field to copy does not affect source" + mapping: | + $a = {"user": {"name": "Alice"}} + $b = $a + $b.user.email = "alice@example.com" + output.a_email = $a.user.email + output.b_email = $b.user.email + output: {"a_email": null, "b_email": "alice@example.com"} + + - name: "deleting nested field in copy does not affect source" + mapping: | + $a = {"user": {"name": "Alice", "age": 30}} + $b = $a + $b.user.age = deleted() + output.a = $a.user + output.b = $b.user + output: {"a": {"name": "Alice", "age": 30}, "b": {"name": "Alice"}} + + # --- Independence with whole-document copies --- + + - name: "output = input then mutate output leaves input unchanged" + input: {"name": "Alice", "score": 100} + mapping: | + output = input + output.name = "Bob" + output.original_name = input.name + output: {"name": "Bob", "score": 100, "original_name": "Alice"} diff --git a/internal/bloblang2/spec/tests/variables/declaration.yaml b/internal/bloblang2/spec/tests/variables/declaration.yaml new file mode 100644 index 000000000..b2bb04512 --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/declaration.yaml @@ -0,0 +1,155 @@ +description: "Variable declaration: basic usage, types, 
use-before-declare errors, void and deleted errors" + +tests: + # --- Basic declaration and use --- + + - name: "declare and use integer variable" + mapping: | + $x = 42 + output.v = $x + output: {"v": 42} + + - name: "declare and use string variable" + mapping: | + $name = "hello" + output.v = $name + output: {"v": "hello"} + + - name: "declare and use boolean variable" + mapping: | + $flag = true + output.v = $flag + output: {"v": true} + + - name: "declare and use null variable" + mapping: | + $val = null + output.v = $val + output: {"v": null} + + - name: "declare and use float variable" + mapping: | + $pi = 3.14 + output.v = $pi + output: {"v": 3.14} + + - name: "declare and use array variable" + mapping: | + $arr = [1, 2, 3] + output.v = $arr + output: {"v": [1, 2, 3]} + + - name: "declare and use object variable" + mapping: | + $obj = {"a": 1, "b": 2} + output.v = $obj + output: {"v": {"a": 1, "b": 2}} + + - name: "declare variable from expression" + mapping: | + $x = 10 + 5 + output.v = $x + output: {"v": 15} + + - name: "declare variable from input field" + input: {"name": "Alice"} + mapping: | + $n = input.name + output.v = $n + output: {"v": "Alice"} + + - name: "declare variable from other variable" + mapping: | + $a = 100 + $b = $a + output.v = $b + output: {"v": 100} + + - name: "multiple independent variables" + mapping: | + $x = 1 + $y = 2 + $z = 3 + output.sum = $x + $y + $z + output: {"sum": 6} + + - name: "variable used in expression" + mapping: | + $base = 10 + output.v = $base * 2 + 1 + output: {"v": 21} + + # --- Use before declare is compile error --- + + - name: "use undeclared variable is compile error" + mapping: | + output.v = $x + compile_error: "undeclared" + + - name: "use variable before its declaration is compile error" + mapping: | + output.v = $x + $x = 42 + compile_error: "undeclared" + + - name: "reference undeclared variable in expression is compile error" + mapping: | + $y = $x + 1 + compile_error: "undeclared" + + # --- Void 
in declaration is runtime error --- + + - name: "void from if-without-else in declaration is runtime error" + mapping: | + $x = if false { 42 } + error: "void" + + - name: "void from non-exhaustive match in declaration is runtime error" + mapping: | + $x = match "nope" { + "a" => 1, + } + error: "void" + + - name: "void from else-if chain without final else in declaration is runtime error" + input: {"score": 10} + mapping: | + $tier = if false { "gold" } else if false { "silver" } + error: "void" + + # --- Deleted in declaration is runtime error --- + + - name: "deleted in variable declaration is runtime error" + mapping: | + $x = deleted() + error: "deleted" + + # --- Void rescued with or is ok --- + + - name: "or rescues void for variable declaration" + mapping: | + $x = (if false { 42 }).or(0) + output.v = $x + output: {"v": 0} + + - name: "if-else provides value for variable declaration" + mapping: | + $x = if false { 42 } else { 0 } + output.v = $x + output: {"v": 0} + + # --- Variable holds bytes --- + + - name: "declare variable with bytes value" + mapping: | + $b = "hello".bytes() + output.v = $b + output: {"v": {_type: "bytes", value: "aGVsbG8="}} + + # --- Variable holds timestamp --- + + - name: "declare variable with timestamp value" + mapping: | + $t = "2024-01-15T10:30:00Z".ts_parse() + output.v = $t + output: {"v": {_type: "timestamp", value: "2024-01-15T10:30:00Z"}} diff --git a/internal/bloblang2/spec/tests/variables/dynamic_assignment.yaml b/internal/bloblang2/spec/tests/variables/dynamic_assignment.yaml new file mode 100644 index 000000000..ece36609e --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/dynamic_assignment.yaml @@ -0,0 +1,78 @@ +description: > + Dynamic index assignments — assigning to output or variable paths using + computed indices from variables or expressions. 
+ +tests: + # --- Variable as object key --- + + - name: "output with variable string key" + mapping: | + $key = "name" + output.data[$key] = "Alice" + output: {"data": {"name": "Alice"}} + + - name: "output with variable integer index" + mapping: | + $idx = 0 + output.arr[$idx] = "first" + output: {"arr": ["first"]} + + - name: "output with multiple variable keys" + mapping: | + $k1 = "a" + $k2 = "b" + output[$k1] = 1 + output[$k2] = 2 + output: {"a": 1, "b": 2} + + # --- Nested dynamic paths --- + + - name: "nested dynamic keys" + mapping: | + $outer = "data" + $inner = "name" + output[$outer] = {} + output[$outer][$inner] = "Alice" + output: {"data": {"name": "Alice"}} + + # --- Dynamic assignment on variables --- + + - name: "variable path with dynamic key" + mapping: | + $obj = {"a": 1, "b": 2} + $key = "a" + $obj[$key] = 99 + output = $obj + output: {"a": 99, "b": 2} + + - name: "variable path with dynamic index" + mapping: | + $arr = [10, 20, 30] + $idx = 1 + $arr[$idx] = 99 + output = $arr + output: [10, 99, 30] + + # --- Computed expressions as keys --- + + - name: "computed string key from concatenation" + mapping: | + $prefix = "key" + output[$prefix + "_1"] = "val" + output: {"key_1": "val"} + + - name: "computed key from method result" + mapping: | + $name = "Hello" + output[$name.lowercase()] = true + output: {"hello": true} + + # --- Deleted with dynamic keys --- + + - name: "delete object field with dynamic key" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3} + $key = "b" + $obj[$key] = deleted() + output = $obj + output: {"a": 1, "c": 3} diff --git a/internal/bloblang2/spec/tests/variables/expr_body_path_assign.yaml b/internal/bloblang2/spec/tests/variables/expr_body_path_assign.yaml new file mode 100644 index 000000000..0d9c2906d --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/expr_body_path_assign.yaml @@ -0,0 +1,144 @@ +description: > + Path assignments inside expression bodies — variable path mutations + like $obj[$key] = val in 
if-expressions, match-expressions, lambda + blocks, and nested combinations. These must resolve to the existing + variable slot, not shadow it. + +tests: + # --- Path assignment in if-expression body --- + + - name: "path assign with dynamic key in if-expression" + mapping: | + $key = "x" + output.v = if true { + $obj = {} + $obj[$key] = 42 + $obj + } else { {} } + output: {"v": {"x": 42}} + + - name: "path assign in else branch reads correct slot" + mapping: | + output.v = if false { + "never" + } else { + $result = {"base": true} + $result.extra = "added" + $result + } + output: {"v": {"base": true, "extra": "added"}} + + - name: "path assign in if-expression both branches" + mapping: | + $flag = true + output.v = if $flag { + $data = {"branch": "then"} + $data.tag = "tagged" + $data + } else { + $data = {"branch": "else"} + $data.tag = "tagged" + $data + } + output: {"v": {"branch": "then", "tag": "tagged"}} + + # --- Path assignment in lambda block body --- + + - name: "path assign in map lambda block" + mapping: | + output.v = [1, 2, 3].map(x -> { + $item = {"val": x} + $item.doubled = x * 2 + $item + }) + output: {"v": [{"val": 1, "doubled": 2}, {"val": 2, "doubled": 4}, {"val": 3, "doubled": 6}]} + + - name: "path assign with dynamic key in map lambda" + mapping: | + $keys = ["a", "b", "c"] + output.v = [0, 1, 2].map(i -> { + $obj = {} + $obj[$keys[i]] = i * 10 + $obj + }) + output: {"v": [{"a": 0}, {"b": 10}, {"c": 20}]} + + - name: "path assign in fold accumulator" + mapping: | + output.v = ["x", "y", "z"].fold({}, (acc, key) -> { + $a = acc + $a[key] = true + $a + }) + output: {"v": {"x": true, "y": true, "z": true}} + + - name: "path assign in fold with index tracking" + mapping: | + output.v = ["a", "b", "c"].enumerate().fold({}, (acc, e) -> { + $a = acc + $a[e.value] = e.index + $a + }) + output: {"v": {"a": 0, "b": 1, "c": 2}} + + # --- Nested path assignments across lambda + if --- + + - name: "path assign via statement if inside lambda" + mapping: | 
+ output.v = [1, -2, 3, -4].map(x -> { + $item = {"value": x} + $item.sign = if x > 0 { "positive" } else { "negative" } + $item + }) + output: + v: + - value: 1 + sign: "positive" + - value: -2 + sign: "negative" + - value: 3 + sign: "positive" + - value: -4 + sign: "negative" + + - name: "path assign in nested lambdas" + mapping: | + output.v = [[1, 2], [3, 4]].map(row -> { + $result = {"items": []} + $result.items = row.map(x -> x * 10) + $result + }) + output: {"v": [{"items": [10, 20]}, {"items": [30, 40]}]} + + # --- Multiple path assigns in lambda block --- + + - name: "multiple path assigns in lambda block accumulate" + mapping: | + output.v = [1].map(x -> { + $obj = {} + $obj.a = 1 + $obj.b = 2 + $obj.c = 3 + $obj + }) + output: {"v": [{"a": 1, "b": 2, "c": 3}]} + + - name: "path assign to array index in lambda block" + mapping: | + output.v = [1].map(x -> { + $arr = [0, 0, 0] + $arr[0] = 10 + $arr[1] = 20 + $arr[2] = 30 + $arr + }) + output: {"v": [[10, 20, 30]]} + + - name: "path assign then read modified field in lambda" + mapping: | + output.v = [1].map(x -> { + $data = {"count": 0} + $data.count = 5 + $data.count + 10 + }) + output: {"v": [15]} diff --git a/internal/bloblang2/spec/tests/variables/nested_scope_mutations.yaml b/internal/bloblang2/spec/tests/variables/nested_scope_mutations.yaml new file mode 100644 index 000000000..241fc410f --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/nested_scope_mutations.yaml @@ -0,0 +1,167 @@ +description: > + Variable mutation patterns across nested scopes — statement-mode + write-through in nested if/match, expression-mode shadowing, + and interactions between the two contexts. 
+ +tests: + # --- Statement-mode nested if: both levels modify outer --- + + - name: "nested if-statements both modify outer variable" + mapping: | + $x = 0 + if true { + $x = 1 + if true { + $x = 2 + } + } + output.v = $x + output: {"v": 2} + + - name: "nested if-statement inner false — outer modified only" + mapping: | + $x = 0 + if true { + $x = 1 + if false { + $x = 2 + } + } + output.v = $x + output: {"v": 1} + + - name: "match-statement arms modify outer variable" + mapping: | + $result = "init" + match "b" { + "a" => { $result = "alpha" }, + "b" => { $result = "beta" }, + _ => { $result = "other" }, + } + output.v = $result + output: {"v": "beta"} + + - name: "nested match inside if modifies outer" + mapping: | + $val = 0 + if true { + match "x" { + "x" => { $val = 42 }, + _ => { $val = -1 }, + } + } + output.v = $val + output: {"v": 42} + + # --- Expression-mode shadowing does not affect outer --- + + - name: "if-expression shadows outer variable" + mapping: | + $x = "outer" + output.inner = if true { + $x = "inner" + $x + } else { "nope" } + output.outer = $x + output: {"inner": "inner", "outer": "outer"} + + - name: "match-expression shadows outer variable" + mapping: | + $x = "outer" + output.matched = match "go" { + "go" => { + $x = "matched" + $x + }, + _ => "nope", + } + output.outer = $x + output: {"matched": "matched", "outer": "outer"} + + - name: "lambda shadows outer variable" + mapping: | + $x = 100 + output.mapped = [1, 2, 3].map(x -> { + $x = x * 10 + $x + }) + output.outer = $x + output: {"mapped": [10, 20, 30], "outer": 100} + + # --- Statement modifies then expression reads --- + + - name: "statement modifies variable, expression reads it" + mapping: | + $data = {"status": "pending"} + if true { + $data.status = "done" + } + output.v = if true { $data.status } else { "unknown" } + output: {"v": "done"} + + # --- Expression body variable not visible outside --- + + - name: "variable declared in if-expression not visible outside" + mapping: | 
+ output.v = if true { + $temp = 42 + $temp + } else { 0 } + output.leaked = $temp + compile_error: "undeclared" + + - name: "variable declared in match-expression not visible outside" + mapping: | + output.v = match "a" { + "a" => { + $inner = "hello" + $inner + }, + _ => "nope", + } + output.leaked = $inner + compile_error: "undeclared" + + - name: "variable declared in lambda not visible outside" + mapping: | + output.v = [1].map(x -> { + $inner = x * 10 + $inner + }) + output.leaked = $inner + compile_error: "undeclared" + + # --- Path assign in statement mode modifies outer --- + + - name: "path assign in if-statement modifies outer object" + mapping: | + $config = {"debug": false, "verbose": false} + if true { + $config.debug = true + } + output.v = $config + output: {"v": {"debug": true, "verbose": false}} + + - name: "path assign in match-statement modifies outer object" + mapping: | + $config = {"level": "info"} + match "debug" { + "debug" => { $config.level = "debug" }, + "trace" => { $config.level = "trace" }, + } + output.v = $config + output: {"v": {"level": "debug"}} + + # --- Complex nested: statement with expression inside --- + + - name: "if-statement body uses expression that shadows" + mapping: | + $items = [] + if true { + $items = [1, 2, 3].map(x -> { + $doubled = x * 2 + $doubled + }) + } + output.v = $items + output: {"v": [2, 4, 6]} diff --git a/internal/bloblang2/spec/tests/variables/path_assignment.yaml b/internal/bloblang2/spec/tests/variables/path_assignment.yaml new file mode 100644 index 000000000..45a8366af --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/path_assignment.yaml @@ -0,0 +1,235 @@ +description: "Variable path assignment: field mutation, index mutation, auto-creation, gap filling, deleted removal, wrong intermediate type errors" + +tests: + # --- Field assignment --- + + - name: "assign field on object variable" + mapping: | + $obj = {"name": "Alice"} + $obj.name = "Bob" + output.v = $obj + output: {"v": 
{"name": "Bob"}} + + - name: "add new field to object variable" + mapping: | + $obj = {"a": 1} + $obj.b = 2 + output.v = $obj + output: {"v": {"a": 1, "b": 2}} + + - name: "assign nested field on object variable" + mapping: | + $obj = {"user": {"name": "Alice"}} + $obj.user.name = "Bob" + output.v = $obj + output: {"v": {"user": {"name": "Bob"}}} + + - name: "auto-create intermediate object for field assignment" + mapping: | + $obj = {} + $obj.user.name = "Alice" + output.v = $obj + output: {"v": {"user": {"name": "Alice"}}} + + - name: "deeply nested auto-creation" + mapping: | + $obj = {} + $obj.a.b.c.d = 42 + output.v = $obj + output: {"v": {"a": {"b": {"c": {"d": 42}}}}} + + # --- Index assignment --- + + - name: "assign index on array variable" + mapping: | + $arr = [10, 20, 30] + $arr[1] = 99 + output.v = $arr + output: {"v": [10, 99, 30]} + + - name: "assign to end of array" + mapping: | + $arr = [1, 2, 3] + $arr[3] = 4 + output.v = $arr + output: {"v": [1, 2, 3, 4]} + + - name: "gaps filled with null" + mapping: | + $arr = [1] + $arr[3] = 99 + output.v = $arr + output: {"v": [1, null, null, 99]} + + - name: "assign to index zero of empty array" + mapping: | + $arr = [] + $arr[0] = "first" + output.v = $arr + output: {"v": ["first"]} + + - name: "negative index assignment" + mapping: | + $arr = [10, 20, 30] + $arr[-1] = 99 + output.v = $arr + output: {"v": [10, 20, 99]} + + # --- Auto-creation based on index type --- + + - name: "auto-create array for numeric index on empty object field" + mapping: | + $obj = {} + $obj.items[0] = "first" + output.v = $obj + output: {"v": {"items": ["first"]}} + + - name: "auto-create object for string field on empty object field" + mapping: | + $obj = {} + $obj.nested.key = "value" + output.v = $obj + output: {"v": {"nested": {"key": "value"}}} + + # --- Deleted removes field --- + + - name: "deleted removes field from variable object" + mapping: | + $obj = {"a": 1, "b": 2, "c": 3} + $obj.b = deleted() + output.v = $obj + 
output: {"v": {"a": 1, "c": 3}} + + - name: "deleted removes nested field from variable object" + mapping: | + $obj = {"user": {"name": "Alice", "age": 30}} + $obj.user.age = deleted() + output.v = $obj + output: {"v": {"user": {"name": "Alice"}}} + + # --- Deleted removes array element (shifts remaining) --- + + - name: "deleted removes array element and shifts" + mapping: | + $arr = [10, 20, 30, 40] + $arr[1] = deleted() + output.v = $arr + output: {"v": [10, 30, 40]} + + - name: "deleted removes first array element" + mapping: | + $arr = [10, 20, 30] + $arr[0] = deleted() + output.v = $arr + output: {"v": [20, 30]} + + - name: "deleted removes last array element" + mapping: | + $arr = [10, 20, 30] + $arr[-1] = deleted() + output.v = $arr + output: {"v": [10, 20]} + + # --- Wrong intermediate type errors --- + + - name: "field assignment on string variable is error" + mapping: | + $val = "hello" + $val.field = "x" + error: "field" + + - name: "field assignment on integer variable is error" + mapping: | + $val = 42 + $val.field = "x" + error: "field" + + - name: "field assignment on boolean variable is error" + mapping: | + $val = true + $val.field = "x" + error: "field" + + - name: "index assignment on string variable is error" + mapping: | + $val = "hello" + $val[0] = "H" + error: "index" + + - name: "nested path wrong intermediate type is error" + mapping: | + $obj = {"name": "Alice"} + $obj.name.first = "A" + error: "field" + + # --- Multiple path assignments --- + + - name: "multiple field assignments build up object" + mapping: | + $record = {} + $record.name = "Alice" + $record.age = 30 + $record.active = true + output.v = $record + output: {"v": {"name": "Alice", "age": 30, "active": true}} + + - name: "mixed field and index assignments" + mapping: | + $data = {"scores": []} + $data.scores[0] = 90 + $data.scores[1] = 85 + $data.name = "Alice" + output.v = $data + output: {"v": {"scores": [90, 85], "name": "Alice"}} + + # --- Path assignment to an 
undeclared variable is a declaration (Section 3.7) --- + + - name: "field path on undeclared variable auto-creates as object" + mapping: | + $user.name = "Alice" + output.v = $user + output: {"v": {"name": "Alice"}} + + - name: "nested field path on undeclared variable auto-creates intermediates" + mapping: | + $user.address.city = "London" + output.v = $user + output: {"v": {"address": {"city": "London"}}} + + - name: "numeric index on undeclared variable auto-creates as array" + mapping: | + $arr[0] = "first" + output.v = $arr + output: {"v": ["first"]} + + - name: "numeric index with gap on undeclared variable fills with null" + mapping: | + $arr[2] = "third" + output.v = $arr + output: {"v": [null, null, "third"]} + + - name: "dynamic string index on undeclared variable auto-creates as object" + mapping: | + $key = "name" + $obj[$key] = "Alice" + output.v = $obj + output: {"v": {"name": "Alice"}} + + - name: "dynamic int index on undeclared variable auto-creates as array" + mapping: | + $i = 0 + $arr[$i] = "first" + output.v = $arr + output: {"v": ["first"]} + + - name: "further path assignments refer to the same declared variable" + mapping: | + $record.name = "Alice" + $record.age = 30 + output.v = $record + output: {"v": {"name": "Alice", "age": 30}} + + - name: "reading an undeclared variable is a compile error" + mapping: | + output.v = $never + compile_error: "$never" diff --git a/internal/bloblang2/spec/tests/variables/reassignment.yaml b/internal/bloblang2/spec/tests/variables/reassignment.yaml new file mode 100644 index 000000000..5389e835f --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/reassignment.yaml @@ -0,0 +1,209 @@ +description: "Variable reassignment: same-scope mutation, statement context outer modification, block-scoped new vars, pre-declare pattern" + +tests: + # --- Same-scope reassignment (mutation) --- + + - name: "reassign variable in same scope" + mapping: | + $x = 1 + $x = 2 + output.v = $x + output: {"v": 2} + + - name: 
"reassign variable multiple times" + mapping: | + $x = 1 + $x = 2 + $x = 3 + $x = 4 + output.v = $x + output: {"v": 4} + + - name: "reassign variable to different type" + mapping: | + $x = 42 + $x = "hello" + output.v = $x + output: {"v": "hello"} + + - name: "reassign variable using its own value" + mapping: | + $x = 10 + $x = $x + 5 + output.v = $x + output: {"v": 15} + + - name: "reassign variable accumulation" + mapping: | + $sum = 0 + $sum = $sum + 1 + $sum = $sum + 2 + $sum = $sum + 3 + output.v = $sum + output: {"v": 6} + + - name: "void skips variable reassignment" + mapping: | + $x = 10 + $x = if false { 42 } + output.v = $x + output: {"v": 10} + + - name: "void from match skips variable reassignment" + mapping: | + $x = "original" + $x = match "nope" { + "a" => "found", + } + output.v = $x + output: {"v": "original"} + + # --- Statement context: if-statement modifies outer variable --- + + - name: "if-statement outer variable modification" + mapping: | + $value = 10 + if input.flag { + $value = 20 + } + output.v = $value + cases: + - name: "modifies when true" + input: {"flag": true} + output: {"v": 20} + - name: "unchanged when false" + input: {"flag": false} + output: {"v": 10} + + - name: "if-else statement modifies outer variable in else branch" + input: {"flag": false} + mapping: | + $value = "initial" + if input.flag { + $value = "from-if" + } else { + $value = "from-else" + } + output.v = $value + output: {"v": "from-else"} + + - name: "match statement outer variable modification" + mapping: | + $result = "none" + match input.kind { + "a" => { + $result = "found-a" + }, + "b" => { + $result = "found-b" + }, + } + output.v = $result + cases: + - name: "modifies on match" + input: {"kind": "b"} + output: {"v": "found-b"} + - name: "unchanged on no match" + input: {"kind": "c"} + output: {"v": "none"} + + # --- New variables in statement blocks are block-scoped --- + + - name: "new variable in if-statement block not visible outside" + input: {"flag": 
true} + mapping: | + if input.flag { + $local = "hello" + output.inner = $local + } + output.outer = $local + compile_error: "undeclared" + + - name: "new variable in match statement block not visible outside" + mapping: | + match "a" { + "a" => { + $local = "found" + output.inner = $local + }, + } + output.outer = $local + compile_error: "undeclared" + + - name: "new variable in else block not visible outside" + input: {"flag": false} + mapping: | + if input.flag { + output.x = 1 + } else { + $temp = "hello" + output.y = $temp + } + output.z = $temp + compile_error: "undeclared" + + # --- Pre-declare pattern --- + + - name: "pre-declare pattern with if-statement" + mapping: | + $temp = null + if input.flag { + $temp = "found" + } + output.v = $temp + cases: + - name: "true branch assigns" + input: {"flag": true} + output: {"v": "found"} + - name: "false branch keeps null" + input: {"flag": false} + output: {"v": null} + + - name: "pre-declare pattern with match statement" + input: {"kind": "gold"} + mapping: | + $discount = 0 + match input.kind { + "gold" => { + $discount = 20 + }, + "silver" => { + $discount = 10 + }, + } + output.v = $discount + output: {"v": 20} + + - name: "pre-declare pattern with nested if-statements" + input: {"a": true, "b": true} + mapping: | + $msg = "none" + if input.a { + $msg = "a" + if input.b { + $msg = "a+b" + } + } + output.v = $msg + output: {"v": "a+b"} + + - name: "pre-declare pattern modification and further use" + input: {"items": ["x", "y", "z"]} + mapping: | + $count = 0 + $count = input.items.length() + output.v = $count + output: {"v": 3} + + # --- Statement context reassignment uses new value after block --- + + - name: "outer variable modified in statement reflects after block" + input: {"score": 95} + mapping: | + $tier = "bronze" + if input.score >= 90 { + $tier = "gold" + } + output.tier = $tier + output.check = $tier == "gold" + output: {"tier": "gold", "check": true} diff --git 
a/internal/bloblang2/spec/tests/variables/scope_boundaries.yaml b/internal/bloblang2/spec/tests/variables/scope_boundaries.yaml new file mode 100644 index 000000000..8539ea8fe --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/scope_boundaries.yaml @@ -0,0 +1,138 @@ +description: > + Variable scope boundary semantics — variables crossing if/match/lambda + boundaries, statement-mode write-through, and expression-mode shadowing. + +tests: + # --- Statement-mode write-through --- + + - name: "if-statement branch modifies outer variable" + mapping: | + $x = 1 + if true { + $x = 99 + } + output.v = $x + output: {"v": 99} + + - name: "if-statement false branch does not execute" + mapping: | + $x = 1 + if false { + $x = 99 + } + output.v = $x + output: {"v": 1} + + - name: "nested if-statements both modify outer" + mapping: | + $x = 0 + if true { + $x = 1 + if true { + $x = 2 + } + } + output.v = $x + output: {"v": 2} + + - name: "match-statement arm modifies outer variable" + mapping: | + $x = 0 + match "a" { + "a" => { + $x = 42 + } + } + output.v = $x + output: {"v": 42} + + # --- Block-scoped variables --- + + - name: "variable declared in if-branch not visible outside" + mapping: | + if true { + $inner = 10 + } + output.v = $inner + compile_error: "undeclared" + + - name: "variable declared in match-arm not visible outside" + mapping: | + match "a" { + "a" => { + $inner = 10 + } + } + output.v = $inner + compile_error: "undeclared" + + # --- Expression-mode shadowing --- + + - name: "if-expression shadows outer variable" + mapping: | + $x = "outer" + $result = if true { + $x = "shadow" + $x + } + output.outer = $x + output.result = $result + output: + outer: "outer" + result: "shadow" + + - name: "match-expression shadows outer variable" + mapping: | + $x = "outer" + $result = match "a" { + "a" => { + $x = "shadow" + $x + } + } + output.outer = $x + output.result = $result + output: + outer: "outer" + result: "shadow" + + # --- Combined patterns --- + + - 
name: "pre-declare then write in both branches" + mapping: | + $result = "" + if input.flag { + $result = "yes" + } else { + $result = "no" + } + output.v = $result + cases: + - name: "true branch" + input: {"flag": true} + output: {"v": "yes"} + - name: "false branch" + input: {"flag": false} + output: {"v": "no"} + + - name: "outer variable and block-scoped variable coexist" + mapping: | + $x = "outer" + if true { + $x = "modified" + $y = "local" + output.local = $y + } + output.outer = $x + output: + outer: "modified" + local: "local" + + - name: "lambda reads outer variable modified by if-statement" + mapping: | + $factor = 1 + if true { + $factor = 10 + } + output.v = [1, 2, 3].map(x -> x * $factor) + output: {"v": [10, 20, 30]} diff --git a/internal/bloblang2/spec/tests/variables/shadowing.yaml b/internal/bloblang2/spec/tests/variables/shadowing.yaml new file mode 100644 index 000000000..4f2380f0b --- /dev/null +++ b/internal/bloblang2/spec/tests/variables/shadowing.yaml @@ -0,0 +1,222 @@ +description: "Variable shadowing in expression contexts: if-expression, match expression, lambda bodies; outer unchanged; same-scope reassignment within expression" + +tests: + # --- If-expression shadows outer variable --- + + - name: "if-expression variable shadowing" + mapping: | + $value = 10 + output.inner = if input.flag { + $value = 20 + $value + } + output.outer = $value + cases: + - name: "shadow when true" + input: {"flag": true} + output: {"inner": 20, "outer": 10} + - name: "outer unchanged when false" + input: {"flag": false} + output: {"outer": 10} + + - name: "if-else expression shadowing" + mapping: | + $x = "original" + output.result = if input.flag { + $x = "from-if" + $x + } else { + $x = "from-else" + $x + } + output.outer = $x + cases: + - name: "if branch shadow" + input: {"flag": true} + output: {"result": "from-if", "outer": "original"} + - name: "else branch shadow" + input: {"flag": false} + output: {"result": "from-else", "outer": "original"} + + 
# --- Match expression shadows outer variable --- + + - name: "match expression shadows outer variable" + input: {"kind": "a"} + mapping: | + $val = "outer" + output.result = match input.kind { + "a" => { + $val = "inner-a" + $val + }, + _ => "default", + } + output.outer = $val + output: {"result": "inner-a", "outer": "outer"} + + - name: "match expression shadow in different arms" + input: {"kind": "b"} + mapping: | + $val = "outer" + output.result = match input.kind { + "a" => { + $val = "inner-a" + $val + }, + "b" => { + $val = "inner-b" + $val + }, + _ => "default", + } + output.outer = $val + output: {"result": "inner-b", "outer": "outer"} + + - name: "match expression default arm also shadows" + input: {"kind": "c"} + mapping: | + $val = 0 + output.result = match input.kind { + "a" => { + $val = 1 + $val + }, + _ => { + $val = 99 + $val + }, + } + output.outer = $val + output: {"result": 99, "outer": 0} + + # --- Same-scope reassignment within expression body --- + + - name: "reassignment within if-expression body is mutation not shadow" + input: {"flag": true} + mapping: | + $outer = "untouched" + output.result = if input.flag { + $x = 1 + $x = 2 + $x = 3 + $x + } + output.outer = $outer + output: {"result": 3, "outer": "untouched"} + + - name: "reassignment within match-expression body is mutation" + input: {"kind": "a"} + mapping: | + output.result = match input.kind { + "a" => { + $acc = 0 + $acc = $acc + 10 + $acc = $acc + 20 + $acc + }, + _ => 0, + } + output: {"result": 30} + + # --- New variables in expression block are block-scoped --- + + - name: "new variable in if-expression not visible outside" + input: {"flag": true} + mapping: | + output.result = if input.flag { + $local = 42 + $local + } + output.after = $local + compile_error: "undeclared" + + - name: "new variable in match-expression not visible outside" + mapping: | + output.result = match "a" { + "a" => { + $inner = "found" + $inner + }, + _ => "default", + } + output.after = $inner + 
compile_error: "undeclared" + + # --- Nested expression contexts --- + + - name: "output assignment in expression context is compile error" + input: {"a": true, "b": true} + mapping: | + $x = "top" + output.inner = if input.a { + $x = "level1" + output.nested = if input.b { + $x = "level2" + $x + } + $x + } + output.outer = $x + compile_error: "output" + + - name: "shadow does not leak between sibling expressions" + input: {"flag": true} + mapping: | + $x = "original" + output.first = if input.flag { + $x = "first" + $x + } + output.second = if input.flag { + $x + } + output.outer = $x + output: {"first": "first", "second": "original", "outer": "original"} + + # --- Lambda expression context shadows --- + + - name: "lambda body shadows outer variable" + mapping: | + $x = "outer" + output.result = [1, 2, 3].map(n -> { + $x = n * 10 + $x + }) + output.outer = $x + output: {"result": [10, 20, 30], "outer": "outer"} + + # --- Map body expression context shadows --- + + - name: "map body shadows outer variable via expression context" + mapping: | + map add_label(data) { + $tag = "inner" + {"value": data, "tag": $tag} + } + $tag = "outer" + output.result = add_label(42) + output.tag = $tag + output: {"result": {"value": 42, "tag": "inner"}, "tag": "outer"} + + # --- Expression context: outer variable read but not mutated --- + + - name: "expression reads outer variable without mutating it" + input: {"flag": true} + mapping: | + $base = 100 + output.result = if input.flag { + $base + 50 + } + output.base = $base + output: {"result": 150, "base": 100} + + - name: "match expression reads outer variable without mutating it" + input: {"multiplier": 3} + mapping: | + $base = 10 + output.result = match input.multiplier { + 3 => $base * 3, + _ => $base, + } + output.base = $base + output: {"result": 30, "base": 10} From 8b190ecbfc0c11a8f322944ebd7850112a5d03a9 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Thu, 30 Apr 2026 15:57:28 +0100 Subject: [PATCH 06/20] bloblang(v2): 
Add internal/bloblang2 package and Taskfile Adds the top-level internal/bloblang2 entry point: bloblang2.go exposes the runtime to other internal packages, benchmark_test.go provides a corpus-wide V1 vs V2 benchmark harness, and the Taskfile ties together Go, TypeScript, and tree-sitter tests behind a unified action-first surface (build, test, demo, clean). Also includes the PARITY.md V1 method tracking table, the package README, and REMAINING.md. --- internal/bloblang2/PARITY.md | 296 ++++++++++++++++ internal/bloblang2/README.md | 173 +++++++++ internal/bloblang2/REMAINING.md | 28 ++ internal/bloblang2/Taskfile.yml | 164 +++++++++ internal/bloblang2/benchmark_test.go | 510 +++++++++++++++++++++++++++ internal/bloblang2/bloblang2.go | 51 +++ internal/bloblang2/bloblang2_test.go | 11 + 7 files changed, 1233 insertions(+) create mode 100644 internal/bloblang2/PARITY.md create mode 100644 internal/bloblang2/README.md create mode 100644 internal/bloblang2/REMAINING.md create mode 100644 internal/bloblang2/Taskfile.yml create mode 100644 internal/bloblang2/benchmark_test.go create mode 100644 internal/bloblang2/bloblang2.go create mode 100644 internal/bloblang2/bloblang2_test.go diff --git a/internal/bloblang2/PARITY.md b/internal/bloblang2/PARITY.md new file mode 100644 index 000000000..9f5ae70ba --- /dev/null +++ b/internal/bloblang2/PARITY.md @@ -0,0 +1,296 @@ +# Bloblang V1 → V2 Plugin Parity + +Tracking sheet for porting V1 stdlib functions and methods to V2 within this +repo. Generated from a sweep of `internal/bloblang/query/` (V1 internal stdlib) +and `internal/impl/pure/bloblang_*.go` (V1 plugin registrations) against +`internal/bloblang2/go/pratt/eval/{stdlib,stdlib_lambda}.go` (V2 internal +stdlib). + +Deprecated V1 entries are excluded. 
+
+## Status legend
+
+- ✅ ported / available in V2
+- ⏸ deferred (named-and-known follow-up)
+- ❌ V1-only by architectural choice; V2 won't port
+- ➕ V2-only addition (no V1 equivalent)
+
+> **Batches 1, 2, and 3 are complete.** Pure ports live in
+> `internal/impl/pure/bloblangv2_*.go`; non-pure (clock / randomness)
+> live in `internal/impl/io/bloblangv2_*.go`. Message-coupled stdlib
+> (batch_index, content, error, …) is registered internally in
+> `internal/bloblang2/go/pratt/eval/stdlib_message.go` and wired
+> through `Executor.QueryMessage(MessageContext)`.
+>
+> The remaining open items are tracked in [Deferred](#deferred-work)
+> below — a `format` API redesign and several V1 → V2 migrator
+> idiom-shift rules.
+
+## Plugin lambda parameters (resolved)
+
+`bloblangv2.PluginSpec` now supports lambda-typed parameters through
+`NewLambdaParam(name)`. Plugin authors retrieve the lambda as a
+`bloblangv2.Lambda` callable via `ParsedParams.GetLambda(name)` and
+invoke it with positional argument values; bare map references
+(spec §5.5) are accepted on the call-site and synthesised into
+single-parameter lambdas automatically.
+
+Plumbing sketch:
+
+- `eval.MethodSpec` / `FunctionSpec` gained a `PluginFn` dispatch shape
+  that receives the interpreter alongside unevaluated AST args. The
+  plugin layer routes specs with any `paramKindLambda` parameter
+  through this path, evaluates non-lambda args eagerly, and wraps
+  lambda positions into a `Lambda` closure that calls back through
+  `interp.CallLambda`.
+- `eval.Interpreter.CallLambda` and `ExtractLambdaOrMapRef` are
+  exported for the plugin layer to use.
+- Static-arg folding is bypassed for plugins with lambda params
+  (lambdas are not values), matching V2 stdlib semantics.
+
+## Architectural choices we are NOT porting
+
+| V1 name | V2 equivalent / reason |
+|---------|------------------------|
+| `from`, `from_all` | V1 batch context. V2 is per-message; batch operations belong in the processor layer. |
+| `apply` | V1 named-map invocation. V2 has explicit `map name(arg) { ... }` and bare-name calls (spec §5). |
+| `map_each` | Superseded by V2 `map`, which already takes a lambda. |
+| `not` | V2 has the `!` operator (spec §3). |
+
+## Comparison
+
+### Type coercion
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `bool` | method | ✅ | |
+| `bytes` | method | ✅ | |
+| `number` | method | ✅ | V2 has typed `int64`/`float64`/etc.; `number` would be a permissive coerce. |
+| `string` | method | ✅ | |
+| `timestamp` | method | ✅ | |
+| `type` | method | ✅ | |
+
+### Strings
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `capitalize` | method | ✅ | |
+| `decode` | method | ✅ | |
+| `encode` | method | ✅ | |
+| `escape_html` | method | ✅ | |
+| `escape_url_query` | method | ✅ | |
+| `filepath_join` | method | ✅ | Path manipulation — pure, no FS access. |
+| `filepath_split` | method | ✅ | |
+| `format` | method | ✅ | V2 takes a single array argument (`"%s".format([args])`); migrator rewrites V1 variadic callsites. |
+| `parse_form_url_encoded` | method | ✅ | |
+| `has_prefix` | method | ✅ | |
+| `has_suffix` | method | ✅ | |
+| `hash` | method | ✅ | |
+| `index_of` | method | ✅ | |
+| `join` | method | ✅ | |
+| `lowercase` | method | ✅ | |
+| `parse_url` | method | ✅ | |
+| `quote` | method | ✅ | |
+| `repeat` | method | ✅ | |
+| `replace` | method | ✅ | |
+| `replace_all` | method | ✅ | |
+| `replace_all_many` | method | ✅ | |
+| `replace_many` | method | ✅ | |
+| `reverse` | method | ✅ | |
+| `split` | method | ✅ | |
+| `trim` | method | ✅ | |
+| `trim_prefix` | method | ✅ | |
+| `trim_suffix` | method | ✅ | |
+| `unescape_html` | method | ✅ | |
+| `unescape_url_query` | method | ✅ | |
+| `unquote` | method | ✅ | |
+| `uppercase` | method | ✅ | |
+
+### Numbers
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `abs` | method | ✅ | |
+| `bitwise_and` | method | ✅ | |
+| `bitwise_or` | method | ✅ | |
+| `bitwise_xor` | method | ✅ | |
+| `ceil` | method | ✅ | |
+| `cos` | method | ✅ | |
+| `floor` | method | ✅ | |
+| `log` | method | ✅ | |
+| `log10` | method | ✅ | |
+| `pi` | function | ✅ | |
+| `pow` | method | ✅ | |
+| `round` | method | ✅ | |
+| `sin` | method | ✅ | |
+| `tan` | method | ✅ | |
+
+### Arrays / sequences
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `all` | method | ✅ | |
+| `any` | method | ✅ | |
+| `append` | method | ✅ | |
+| `collapse` | method | ✅ | |
+| `contains` | method | ✅ | |
+| `enumerated` | method | ✅ | V2 has `enumerate`; confirm shape parity. |
+| `filter` | method | ✅ | |
+| `find` | method | ✅ | V2's `index_of` covers V1 `find(value) → index`; V2's stdlib `find(lambda)` returns the matching element instead. |
+| `find_all` | method | ✅ | |
+| `find_all_by` | method | ✅ | Lambda predicate via plugin lambda support. |
+| `find_by` | method | ✅ | Lambda predicate via plugin lambda support. |
+| `flatten` | method | ✅ | |
+| `fold` | method | ✅ | |
+| `index` | method | ✅ | |
+| `key_values` | method | ✅ | |
+| `length` | method | ✅ | |
+| `map_each` | method | ❌ | V2 `map` covers it. |
+| `max` | method | ✅ | |
+| `min` | method | ✅ | |
+| `not_empty` | method | ✅ | |
+| `slice` | method | ✅ | |
+| `sort` | method | ✅ | |
+| `sort_by` | method | ✅ | |
+| `sum` | method | ✅ | |
+| `unique` | method | ✅ | |
+| `without` | method | ✅ | |
+| `zip` | method | ✅ | V2 takes a single array-of-arrays argument (V1 was variadic). |
+
+### Objects
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `array` | method | ✅ | |
+| `assign` | method | ✅ | |
+| `exists` | method | ✅ | |
+| `explode` | method | ✅ | |
+| `get` | method | ✅ | |
+| `keys` | method | ✅ | |
+| `map_each_key` | method | ✅ | Lambda predicate via plugin lambda support. |
+| `merge` | method | ✅ | |
+| `values` | method | ✅ | |
+| `with` | method | ✅ | V2 takes a single array of dot-paths (V1 was variadic). |
+
+### Regex
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `re_find_all` | method | ✅ | |
+| `re_find_all_object` | method | ✅ | |
+| `re_find_all_submatch` | method | ✅ | |
+| `re_find_object` | method | ✅ | |
+| `re_match` | method | ✅ | |
+| `re_replace` | method | ✅ | |
+| `re_replace_all` | method | ✅ | |
+
+### Time
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `now` | function | ✅ | |
+| `parse_duration` | method | ✅ | |
+| `timestamp_unix` | function | ✅ | Reads current time — non-pure. |
+| `timestamp_unix_micro` | function | ✅ | |
+| `timestamp_unix_milli` | function | ✅ | |
+| `timestamp_unix_nano` | function | ✅ | |
+| `ts_add` | method | ✅ | |
+| `ts_format` | method | ✅ | Accepts strftime format strings (V1 `ts_strftime` is the same shape). |
+| `ts_parse` | method | ✅ | Accepts strptime format strings (V1 `ts_strptime` is the same shape). |
+| `ts_round` | method | ✅ | |
+| `ts_sub` | method | ✅ | |
+| `ts_tz` | method | ✅ | |
+| `ts_unix` / `_milli` / `_micro` / `_nano` | method | ✅ | |
+| `format_timestamp_strftime` / `parse_timestamp_strptime` | method | ✅ | Migrator renames to V2 `ts_format` / `ts_parse` (both V1 and V2 use strftime / strptime). |
+| `format_timestamp_unix` / `_milli` / `_micro` / `_nano` | method | ✅ | Migrator renames to V2 `ts_unix` / `_milli` / `_micro` / `_nano`. |
+| `format_timestamp` / `parse_timestamp` | method | ⏸ | V1 uses Go's reference-time layout; V2 `ts_format` / `ts_parse` use strftime/strptime exclusively. Migrator emits a Note pointing at the strftime variant — manual format-string conversion required. |
+| `ts_strftime` / `ts_strptime` | method | ✅ | V2 has `ts_format` / `ts_parse`; migrator renames V1 callsites. |
+
+### Encoding / parsing
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `compress` | method | ✅ | V2 takes the same `algorithm` + optional `level` parameters. |
+| `decompress` | method | ✅ | |
+| `format_json` | method | ✅ | |
+| `format_yaml` | method | ✅ | |
+| `infer_schema` | method | ❌ | V1-specific JSON schema utility; not ported pending demand. |
+| `json_schema` | method | ✅ | |
+| `parse_csv` | method | ✅ | |
+| `parse_json` | method | ✅ | |
+| `parse_yaml` | method | ✅ | |
+| `squash` | method | ❌ | V1-specific JSON schema utility; not ported pending demand. |
+
+### Crypto / IDs
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `decrypt_aes` | method | ✅ | Deterministic given key — pure. |
+| `encrypt_aes` | method | ✅ | |
+| `ksuid` | function | ✅ | Generates IDs; non-pure. |
+| `nanoid` | function | ✅ | |
+| `uuid_v4` | function | ✅ | |
+| `uuid_v5` | method | ✅ | Deterministic — pure. |
+| `uuid_v7` | function | ✅ | Timestamp-based; non-pure. |
+
+### Error handling
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `catch` | method | ✅ | |
+| `deleted` | function | ✅ | |
+| `error` | function | ✅ | V2 returns structured `{what: string}` (was string in V1); migrator rewrites V1 `error()` → V2 `error().what`. |
+| `error_source_label` | function | ❌ | V1 backwards-compat workaround; V2 surfaces source.* on the structured `error()` object in a future iteration. |
+| `error_source_name` | function | ❌ | See `error_source_label`. |
+| `error_source_path` | function | ❌ | See `error_source_label`. |
+| `errored` | function | ✅ | |
+| `not_null` | method | ✅ | |
+| `or` | method | ✅ | |
+| `throw` | function | ✅ | |
+
+### Message / pipeline context
+
+| Name | Type | Status | Notes |
+|---|---|---|---|
+| `batch_index` | function | ✅ | Bound via `Executor.QueryMessage(MessageContext)`. |
+| `batch_size` | function | ✅ | |
+| `content` | function | ✅ | Returns the raw bytes via `MessageContext.Bytes()`. |
+| `json` | function | ❌ | Redundant in V2 — `input` is the parsed body; `content().parse_json()` re-parses from bytes. Migrator emits a Note. |
+| `metadata` | function | ❌ | Redundant in V2 — `input@[key]` covers the read form. Migrator rewrites `metadata()` / `metadata("k")` to `input@` / `input@["k"]`. |
+| `meta` | function | ❌ | V1's string-only metadata reader; replaced by V2 `input@`. Migrator rewrites with a type-change Note. |
+| `root_meta` | function | ❌ | Redundant in V2 — `output@[key]` covers the read form. Migrator rewrites accordingly. |
+| `tracing_id` | function | ✅ | Backed by `MessageContext.TraceID()`. |
+| `tracing_span` | function | ✅ | Backed by `MessageContext.Span()`. |
+
+### V2-only additions (➕)
+
+Methods: `char`, `collect`, `filter_entries`, `float32`, `float64`, `has_key`,
+`int32`, `int64`, `into`, `iter`, `map_entries`, `map_keys`, `map_values`,
+`uint32`, `uint64`, `without_index`.
+ +Functions: `day`, `hour`, `minute`, `random_int`, `range`, `second`, +`timestamp`, `void`. + +## Deferred work + +### Plugin / stdlib + +- **`format` method** — V1 was variadic (`"%s/%d".format(name, age)`). + V2 dropped variadic params. Needs an array-param API redesign + (e.g. `"%s/%d".format([name, age])`) before the port can land. + +### Migrator idiom-shift rules + +Open follow-ups in the V1 → V2 migrator: + +- V1 `.format_timestamp(fmt)` / `.parse_timestamp(fmt)` — V1 uses + Go's reference-time layout, V2's `ts_format` / `ts_parse` use + strftime/strptime. The format strings are not interchangeable, so + the migrator can't safely auto-rename. A future enhancement could + translate the format string at migrate time. For now, a Note + points the user at the strftime-variant V1 method. +- V1 `error_source_label()` / `_name()` / `_path()` — no V2 + equivalent yet; revisit once `error()` grows the structured + `source.*` fields (deferred from batch 3). +- V1 `json(path)` — no auto-rewrite. Migrator emits a Note pointing + at `input` / `content().parse_json()`. diff --git a/internal/bloblang2/README.md b/internal/bloblang2/README.md new file mode 100644 index 000000000..f2fddf037 --- /dev/null +++ b/internal/bloblang2/README.md @@ -0,0 +1,173 @@ +# Bloblang V2 + +A redesign of the Bloblang mapping language for Redpanda Connect V5. Bloblang V2 is backed by a formal specification, designed for explicit context management, predictable behavior, and first-class tooling support. + +See [`spec/PROPOSAL.md`](spec/PROPOSAL.md) for the motivation and design rationale. 
+ +## Directory Layout + +- **`spec/`** — Language specification (numbered markdown files) and YAML conformance test suite +- **`go/`** — Go reference implementation (Pratt parser, tree-walking interpreter, spec test runner framework) +- **`go/lsp/`** — Editor-agnostic LSP server for diagnostics and completions +- **`ts/`** — TypeScript implementation (scanner, parser, optimizer, resolver, interpreter, stdlib) +- **`tree-sitter/`** — Tree-sitter grammar for editor tooling (syntax highlighting, code folding) +- **`plugins/nvim/`** — Neovim plugin (filetype detection, tree-sitter highlighting, LSP client) +- **`migrator/`** — V1→V2 translator library, V1 AST + corpus packages, and a side-by-side playground (see [`migrator/README.md`](migrator/README.md) for the full details) +- **`demo/`** — Interactive web playground for V2 with live execution and syntax highlighting +- **`speccondenser/`** — Developer tool that runs prompt-based agent exams against the V2 spec to measure its "condenseability" for downstream tooling + +## Spec (`spec/`) + +The specification is split across 13 numbered markdown files covering the full language — from lexical structure and type system through to the standard library. Start with [`spec/README.md`](spec/README.md) for the table of contents and a quick syntax reference. + +The `spec/tests/` directory contains ~130 YAML test files organized into subdirectories by topic (types, operators, control flow, maps, lambdas, error handling, stdlib, edge cases, etc.). These are the canonical conformance tests — any correct implementation must pass them all. See [`spec/tests/README.md`](spec/tests/README.md) for the test schema. + +## Go Implementation (`go/`) + +A Pratt-parser-based compiler and tree-walking interpreter in `go/pratt/`. The compilation pipeline is: + +1. **Parse** — `syntax.Parse()` produces an AST +2. **Optimize** — `syntax.Optimize()` does path collapse, constant folding, dead code elimination +3. 
**Resolve** — `syntax.Resolve()` performs semantic checks (name resolution, arity validation) and annotates AST nodes with opcode IDs and variable stack slot indices +4. **Execute** — `eval.NewWithStdlib()` + `interp.Run()` tree-walks the AST using opcode dispatch and stack-based variable access + +The `go/spectest/` package provides a reusable test runner. Any implementation that satisfies the `spectest.Interpreter` interface can be validated against the full spec test suite. The top-level `bloblang2_test.go` does exactly this for the Go implementation: + +```go +func TestBloblangV2Spec(t *testing.T) { + spectest.RunT(t, "spec/tests", &Interp{}) +} +``` + +## TypeScript Implementation (`ts/`) + +A full TypeScript implementation of the Bloblang V2 language: scanner, parser, optimizer, resolver, interpreter, and standard library. Follows the same compilation pipeline as the Go implementation and is validated against the spec conformance suite via Vitest. + +The bundled output (`bloblang2.mjs`) is used by the demo playground for browser-side execution. + +## LSP Server (`go/lsp/`) + +A minimal LSP server that wraps the Go compiler pipeline (Parse, Optimize, Resolve) to provide real-time editor feedback over JSON-RPC/stdio. Any editor that speaks LSP can use it. + +**Diagnostics** — parse errors, undeclared variables, unknown functions/methods, arity mismatches, scope violations, map isolation, duplicate map names. + +**Completions** — context-aware: methods after `.`, variables after `$`, keywords + stdlib functions + user-defined maps otherwise. + +Build the binary with `task build:nvim:lsp` (output at `plugins/nvim/bin/bloblang2-lsp`). Point any LSP client at it with stdio transport — no arguments needed. + +## Neovim Plugin (`plugins/nvim/`) + +A Neovim plugin providing filetype detection (`.blobl2`), tree-sitter syntax highlighting, and LSP client wiring for diagnostics and completions. 
+ +**Setup** (vim-plug): + +```vim +Plug '~/path/to/bloblang2/plugins/nvim' +" After plug#end(): +lua require("bloblang2").setup() +``` + +**Build prerequisites** (run once, and after grammar changes): + +```sh +task build:nvim:parser # Compile tree-sitter .so +task build:nvim:lsp # Build LSP binary +``` + +**Verify:** open a `.blobl2` file and run `:checkhealth bloblang2`. + +## Tree-sitter Grammar (`tree-sitter/`) + +A full tree-sitter grammar for Bloblang V2, suitable for syntax highlighting, code folding, and editor integration. Uses an external scanner (`src/scanner.c`) for context-sensitive newline handling. + +Build tasks (requires `node_modules/tree-sitter-cli`): + +```sh +task build:tree-sitter:generate # Generate parser from grammar.js +task test:tree-sitter # Run corpus tests +task build:tree-sitter:wasm # Compile to WASM +task build:tree-sitter # Full rebuild (generate, test, WASM, sync into demo) +``` + +## Migrator (`migrator/`) + +A Go library + playground that translates V1 Bloblang mappings to V2, flagging every point at which the semantics have to shift. 100% fidelity isn't the goal — V2 is a deliberate redesign that fixes V1 ambiguities — but every shift the translator introduces is recorded on a `Report` so a human can audit before cutover. + +Key pieces: + +- **`migrator/v1ast/`** — self-contained V1 parser and AST. Preserves source positions on every node and carries standalone comments + blank lines as trivia so they survive the round trip. +- **`migrator/v1spec/`** — V2 spec tests translated to V1, run against the official V1 interpreter. Acts as a pin on V1 behaviour for the translator to target. +- **`migrator/translator/`** — `Migrate(v1Source, Options) (*Report, error)`. Rewrite rules live in `methods.go`; the statement/expression walker lives in `translate.go`. V1 comments and blank lines propagate to the V2 output via a `TriviaSet` embedded on each AST node. 
+ +```go +rep, err := translator.Migrate(v1Source, translator.Options{Verbose: true, Files: imports}) +// rep.V2Mapping — translated text +// rep.Changes — per-site divergence notes (Severity, RuleID, SpecRef, Explanation) +// rep.Coverage — exact vs rewritten vs unsupported node counts +``` + +Testing is layered: V1 parser roundtrip, per-rule unit tests, per-method translation audit (explicit V1→V2 assertions with no warning-as-free-pass escape hatch), contract tests, end-to-end corpus regression (translate → V2-compile → V2-execute → diff against V1's expected output), and property tests. See [`migrator/README.md`](migrator/README.md) for the full write-up. + +## Demo Playgrounds + +Two local web playgrounds are available. Both share the tree-sitter WASM and TypeScript bundle built by `task build:demo`. + +**`demo/` — V2 playground.** Write V2 mappings with tree-sitter-powered syntax highlighting and autocomplete, execute them against JSON input, and see results live. An engine selector toggles between browser-side execution (via the TypeScript interpreter) and server-side execution (via the Go interpreter). + +```sh +task demo +# Builds demo assets then opens http://localhost:4195 in your browser +``` + +**`migrator/demo/` — V1→V2 migrator playground.** Four panes: JSON input top-left, output top-right (with a V1-engine / V2-engine toggle), V1 mapping bottom-left (editable), translated V2 mapping bottom-right (read-only, tree-sitter-highlighted). A case-study dropdown loads any of the real-world V1 mappings from `migrator/v1spec/tests/case_studies/` (GA4 clickstream, Stripe invoice, OTLP traces, GitHub webhook, …) straight into the editor. A notes strip under the V2 pane surfaces every translation warning the migrator recorded (method-does-not-exist, semantic shifts, scoping differences, etc.), and comments + blank lines from the V1 source round-trip into the V2 output. 
+ +```sh +task demo:migrator +# Opens http://localhost:4196 in your browser +``` + +## Performance + +The Go implementation is benchmarked against Bloblang V1 using two non-trivial case studies (Stripe invoice normalization and GitHub webhook processing). V2 is faster than V1 on both, with lower memory usage and fewer allocations: + +| | V2 | V1 | V2 vs V1 | +|---|---|---|---| +| Stripe (ns/op) | 8,483 | 9,620 | 12% faster | +| Stripe (B/op) | 5,112 | 7,528 | 32% less memory | +| Stripe (allocs/op) | 96 | 128 | 25% fewer | +| GitHub (ns/op) | 11,283 | 13,568 | 17% faster | +| GitHub (B/op) | 6,643 | 7,227 | 8% less memory | +| GitHub (allocs/op) | 155 | 189 | 18% fewer | + +Key optimizations in the Go interpreter: + +- **Opcode dispatch** — stdlib methods and functions are assigned compile-time integer IDs by the resolver. The interpreter dispatches via slice index instead of map lookup. +- **Variable stack** — the resolver assigns stack slot indices to all variables and parameters. The interpreter uses a flat `[]any` stack with frame-based indexing instead of a linked scope chain. +- **Interpreter reuse** — compiled mappings reuse the interpreter and its stack across executions. +- **Zero-alloc iterator args** — lambda arguments in iterator methods (map, filter, etc.) use stack-allocated buffers instead of heap-allocated slices. + +Run the benchmarks with `task test:go` or directly: + +```sh +go test ./... -bench=. -benchtime=3s -run='^$' +``` + +## Building & Testing + +All build and test tasks are managed via [go-task](https://taskfile.dev) from the `bloblang2/` directory. Run `task --list` for the full list. 
Key tasks:
+
+```sh
+task test              # Run all tests (Go + TypeScript + tree-sitter)
+task test:go           # Go spec conformance and unit tests only
+task test:ts           # TypeScript spec conformance and unit tests only
+task test:tree-sitter  # Tree-sitter corpus tests only
+task test:v1spec       # Run the V1 corpus against the official V1 interpreter
+                       # (the migrator's ground-truth pin for V1 semantics)
+
+task build             # Build all artifacts (tree-sitter, TS bundle, nvim plugin)
+task build:demo        # Build just the demo assets (tree-sitter WASM + TS bundle)
+task demo              # Build demo assets and launch the V2 playground
+task demo:migrator     # Launch the V1→V2 migrator playground
+
+task clean             # Remove all build artifacts
+```
diff --git a/internal/bloblang2/REMAINING.md b/internal/bloblang2/REMAINING.md
new file mode 100644
index 000000000..f6b25cd1a
--- /dev/null
+++ b/internal/bloblang2/REMAINING.md
@@ -0,0 +1,28 @@
+# Bloblang V2 — Remaining Items
+
+Work outstanding to reach parity with the V1 integration in `public/service`.
+
+## Custom lint rules
+
+The built-in parse-based lint is now wired (see `LintBloblangV2Mapping` in `internal/docs/bloblang.go`), so `bloblang_v2` fields surface compile errors at config-load time.
+
+Still missing: a public surface for field authors to attach **custom** V2 lint rules, equivalent to `FieldBloblang`'s custom-rule pathway. Plugin authors who currently rely on custom V1 lint rules (e.g. to require a particular field on the mapping result) have no V2-side hook.
+
+## Interpolated strings
+
+V2 does not yet plug into the interpolated-string surface. `public/bloblangv2.Environment` has no `NewField` analog of the V1 surface, and `public/service/config_interpolated_string.go` still calls the V1 environment only.
+
+User-visible consequence: a plugin registered as a V2-only method **will not be available inside `${! ... }` fields**, even when the field's host component also accepts a `bloblang_v2` mapping field.
Users of V1 + V2 plugins in the same binary should be warned about this in release notes. + +Out of scope for the initial V2 integration; design needed before wiring. + +## Plugin bridge between V1 and V2 + +V1 and V2 maintain separate plugin registries (`public/bloblang.Environment` and `public/bloblangv2.Environment` respectively), and plugins registered against one are invisible to the other. The current V1 stdlib parity ports under `internal/impl/{pure,io}/bloblangv2_*.go` had to be written by hand against the V2 plugin API. + +Two follow-ups worth considering: + +- A bridging helper that adapts a V1 plugin spec into a V2 registration. The adapter would have to declare any semantic shifts (variadic → array arguments, error-object shape, etc.) explicitly so authors opt in rather than silently accept a behavioural delta. +- A migration guide for plugin maintainers that mirrors the per-method notes in `PARITY.md`. + +Out of scope for this branch; tracked here so users porting plugins know the work isn't done. diff --git a/internal/bloblang2/Taskfile.yml b/internal/bloblang2/Taskfile.yml new file mode 100644 index 000000000..36e575151 --- /dev/null +++ b/internal/bloblang2/Taskfile.yml @@ -0,0 +1,164 @@ +version: "3" + +# Component Taskfiles are included as implementation details — their tasks +# stay invocable (e.g. `task ts:bundle`, shown via `task --list-all`) but are +# hidden from the default `task --list` output. The canonical surface for +# users lives in this file and is organised action-first: `build:X`, +# `test:X`, `clean:X`, `demo:X`. +includes: + tree-sitter: + taskfile: tree-sitter/Taskfile.yml + dir: tree-sitter + internal: true + ts: + taskfile: ts/Taskfile.yml + dir: ts + internal: true + nvim: + taskfile: plugins/nvim/Taskfile.yml + dir: plugins/nvim + internal: true + +vars: + # Go tests must run from the benthos repo root for module resolution. + BENTHOS_ROOT: "{{.TASKFILE_DIR}}/../.." 
+ +tasks: + # --- Tests --------------------------------------------------------------- + + test: + desc: Run all tests (Go, TypeScript, tree-sitter) + cmds: + - task: test:go + - task: test:ts + - task: test:tree-sitter + + test:go: + desc: Run Go spec conformance and unit tests + dir: "{{.BENTHOS_ROOT}}" + cmds: + - go test ./internal/bloblang2/... + + test:ts: + desc: Run the TypeScript spec conformance and unit tests + cmds: + - task: ts:test + + test:tree-sitter: + desc: Run the tree-sitter corpus tests + cmds: + - task: tree-sitter:test + + test:v1spec: + desc: Run V1 mappings under ./migrator/v1spec/tests against the V1 interpreter + dir: "{{.BENTHOS_ROOT}}" + cmds: + - go test ./internal/bloblang2/migrator/v1spec/... -run TestBloblangV1Spec {{.CLI_ARGS}} + + # --- Fuzzing ------------------------------------------------------------- + + fuzz: + desc: Run all Bloblang V2 parser fuzz targets briefly (override duration via FUZZTIME) + cmds: + - task: fuzz:scanner + - task: fuzz:parse + + fuzz:scanner: + desc: Fuzz the scanner (override duration via FUZZTIME, e.g. FUZZTIME=2m task fuzz:scanner) + dir: "{{.BENTHOS_ROOT}}" + vars: + FUZZTIME: '{{default "30s" .FUZZTIME}}' + cmds: + - go test ./internal/bloblang2/go/pratt/syntax -run=^$ -fuzz=^FuzzScanner$ -fuzztime={{.FUZZTIME}} + + fuzz:parse: + desc: Fuzz the parser (override duration via FUZZTIME, e.g. 
FUZZTIME=2m task fuzz:parse) + dir: "{{.BENTHOS_ROOT}}" + vars: + FUZZTIME: '{{default "30s" .FUZZTIME}}' + cmds: + - go test ./internal/bloblang2/go/pratt/syntax -run=^$ -fuzz=^FuzzParse$ -fuzztime={{.FUZZTIME}} + + # --- Build --------------------------------------------------------------- + + build: + desc: Build all artifacts (tree-sitter WASM, TS bundle, Neovim plugin) + cmds: + - task: build:tree-sitter + - task: build:ts + - task: build:nvim + + build:demo: + desc: Build shared assets consumed by the demo playgrounds (tree-sitter WASM + TS bundle) + cmds: + - task: tree-sitter:sync-demo + - task: ts:bundle + + build:ts: + desc: Compile and bundle the TypeScript implementation + cmds: + - task: ts:bundle + + build:tree-sitter: + desc: Generate the grammar, run corpus tests, build WASM, and sync into the demo + cmds: + - task: tree-sitter:all + + build:tree-sitter:generate: + desc: Regenerate the tree-sitter parser from grammar.js + cmds: + - task: tree-sitter:generate + + build:tree-sitter:wasm: + desc: Compile the tree-sitter grammar to WASM (requires Docker or emscripten) + cmds: + - task: tree-sitter:build-wasm + + build:nvim: + desc: Build the Neovim tree-sitter parser and bloblang2-lsp binary + cmds: + - task: nvim:default + + build:nvim:lsp: + desc: Build the bloblang2-lsp binary (output at plugins/nvim/bin/bloblang2-lsp) + cmds: + - task: nvim:lsp + + build:nvim:parser: + desc: Compile the tree-sitter parser into a shared library for Neovim + cmds: + - task: nvim:parser + + # --- Demo playgrounds ---------------------------------------------------- + + demo: + desc: Run the Bloblang V2 playground on http://localhost:4195 + deps: [build:demo] + dir: "{{.BENTHOS_ROOT}}" + cmds: + - go run ./internal/bloblang2/demo + + demo:migrator: + desc: Run the V1→V2 migrator playground on http://localhost:4196 + deps: [build:demo] + dir: "{{.BENTHOS_ROOT}}" + cmds: + - go run ./internal/bloblang2/migrator/demo + + # --- Housekeeping 
-------------------------------------------------------- + + clean: + desc: Remove all build artifacts + cmds: + - task: clean:ts + - task: clean:nvim + + clean:ts: + desc: Remove TypeScript build artifacts + cmds: + - task: ts:clean + + clean:nvim: + desc: Remove Neovim plugin build artifacts + cmds: + - task: nvim:clean diff --git a/internal/bloblang2/benchmark_test.go b/internal/bloblang2/benchmark_test.go new file mode 100644 index 000000000..ce24fdfb3 --- /dev/null +++ b/internal/bloblang2/benchmark_test.go @@ -0,0 +1,510 @@ +package bloblang2 + +import ( + "encoding/json" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/public/bloblang" + + // Register v1 bloblang methods (timestamps, etc.) that live outside the + // core query package. + _ "github.com/redpanda-data/benthos/v4/internal/impl/pure" +) + +// stripeInput is the Stripe invoice.paid webhook payload used by both benchmarks. +// All integers use int64 because the v2 type system requires explicit integer +// width (it does not recognise Go's plain int). 
+var stripeInput = map[string]any{ + "type": "invoice.paid", + "created": int64(1709251200), + "data": map[string]any{ + "object": map[string]any{ + "id": "in_1OqR3m", + "number": "INV-2024-0218", + "customer": "cus_PaB3xK", + "customer_email": "ops@megacorp.io", + "customer_name": "MegaCorp Engineering", + "metadata": map[string]any{ + "internal_account_id": "acct-00482", + "provisioning": `{"tier":"growth","seats":25,"features":["sso","audit_log"]}`, + "salesforce_opp_id": "006Dn000002XLPQ", + }, + "status": "paid", + "subscription": "sub_1NrT7a", + "currency": "usd", + "subtotal": int64(14900), + "tax": int64(1200), + "total": int64(16100), + "status_transitions": map[string]any{ + "paid_at": int64(1709251200), + "voided_at": nil, + }, + "lines": map[string]any{ + "data": []any{ + map[string]any{ + "amount": int64(9900), + "description": "Growth Plan", + "quantity": int64(1), + "product": "prod_Growth", + }, + map[string]any{ + "amount": int64(5000), + "description": "Extra seats", + "quantity": int64(5), + "product": "prod_Seats", + }, + }, + }, + }, + }, +} + +// githubInput is the GitHub PR webhook payload used by both benchmarks. 
+var githubInput = map[string]any{ + "action": "opened", + "number": int64(42), + "pull_request": map[string]any{ + "title": "feat: add retry logic to payment processor", + "body": "## Summary\nAdds exponential backoff.\n\nCloses #38\nRelated: #35, #40", + "html_url": "https://github.com/acme/payments/pull/42", + "state": "open", + "draft": false, + "additions": int64(347), + "deletions": int64(42), + "changed_files": int64(8), + "created_at": "2024-01-15T14:30:00Z", + "user": map[string]any{"login": "alice-dev"}, + "head": map[string]any{"ref": "feat/payment-retry"}, + "base": map[string]any{"ref": "main"}, + "labels": []any{ + map[string]any{"name": "enhancement"}, + map[string]any{"name": "payments"}, + map[string]any{"name": "needs-review"}, + }, + "requested_reviewers": []any{ + map[string]any{"login": "bob-reviewer"}, + map[string]any{"login": "carol-lead"}, + }, + "requested_teams": []any{ + map[string]any{"name": "platform-team"}, + }, + }, +} + +// --------------------------------------------------------------------------- +// Bloblang V2 benchmarks +// --------------------------------------------------------------------------- + +const v2StripeMapping = ` +$inv = input.data.object + +$provisioning = $inv.metadata.provisioning.parse_json() + +$line_items = $inv.lines.data.map(item -> { + "description": item.description, + "amount_dollars": item.amount.float64() / 100.0, + "quantity": item.quantity, + "product_id": item.product, +}) + +output.invoice_id = $inv.id +output.invoice_number = $inv.number +output.event_type = input.type +output.customer = { + "id": $inv.customer, + "name": $inv.customer_name, + "email": $inv.customer_email, +} +output.provisioning = $provisioning +output.currency = $inv.currency.uppercase() +output.subtotal_dollars = $inv.subtotal.float64() / 100.0 +output.tax_dollars = $inv.tax.float64() / 100.0 +output.total_dollars = $inv.total.float64() / 100.0 +output.line_items = $line_items +output.status = $inv.status + +output.paid_at = 
$inv.status_transitions.paid_at + .ts_from_unix().string() + +output.subscription_id = $inv.subscription + +output.external_refs = $inv.metadata + .without(["provisioning"]) + .map_keys(k -> k.replace_all("_", "-")) +` + +const v2GithubMapping = ` +$pr = input.pull_request +$url_parts = $pr.html_url.split("/") +$repo = $url_parts[3] + "/" + $url_parts[4] + +$total_changes = $pr.additions + $pr.deletions +$size_category = match { + $total_changes > 300 => "large", + $total_changes > 100 => "medium", + _ => "small", +} + +$issue_refs = $pr.body.re_find_all("#\\d+") + .map(ref -> ref.trim_prefix("#").int64()) + .sort() + .unique() + +$reviewers = $pr.requested_reviewers.map(r -> r.login) + .concat($pr.requested_teams.map(t -> t.name)) + .sort() + +output.event_type = "pr_" + input.action +output.repo = $repo +output.pr_number = input.number +output.title = $pr.title +output.author = $pr.user.login +output.url = $pr.html_url +output.branch = $pr.head.ref + " -> " + $pr.base.ref +output.labels = $pr.labels.map(l -> l.name).sort() +output.reviewers = $reviewers +output.size = { + "additions": $pr.additions, + "deletions": $pr.deletions, + "files": $pr.changed_files, + "category": $size_category, +} +output.referenced_issues = $issue_refs +output.is_feature = $pr.title.has_prefix("feat") +output.summary = "[" + $repo + "] " + $pr.user.login + " " + input.action + " #" + input.number.string() + ": " + $pr.title + " (" + $size_category + ", " + $pr.changed_files.string() + " files)" +` + +// jsonNormalize round-trips a value through JSON so that both v1 and v2 +// outputs use the same numeric types (float64 for all numbers) and can be +// compared with reflect.DeepEqual via their JSON representation. +func jsonNormalize(t testing.TB, v any) string { + t.Helper() + b, err := json.Marshal(v) + if err != nil { + t.Fatalf("json.Marshal: %v", err) + } + // Re-indent for readable diffs on failure. 
+ var pretty json.RawMessage = b + out, err := json.MarshalIndent(pretty, "", " ") + if err != nil { + t.Fatalf("json.MarshalIndent: %v", err) + } + return string(out) +} + +func compileV2Impl(t testing.TB, mapping string) *compiledMapping { + t.Helper() + prog, errs := syntax.Parse(mapping, "", nil) + if len(errs) > 0 { + t.Fatalf("v2 parse errors: %s", syntax.FormatErrors(errs)) + } + syntax.Optimize(prog) + methods, functions := eval.StdlibNames() + methodOpcodes, functionOpcodes := eval.StdlibOpcodes() + if resolveErrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, + Functions: functions, + MethodOpcodes: methodOpcodes, + FunctionOpcodes: functionOpcodes, + }); len(resolveErrs) > 0 { + t.Fatalf("v2 resolve errors: %s", syntax.FormatErrors(resolveErrs)) + } + return &compiledMapping{ + prog: prog, + interp: eval.NewWithStdlib(prog), + } +} + +func compileV2T(t testing.TB, mapping string) *compiledMapping { + return compileV2Impl(t, mapping) +} + +func compileV1T(t testing.TB, mapping string) *bloblang.Executor { + t.Helper() + exec, err := bloblang.Parse(mapping) + if err != nil { + t.Fatalf("v1 parse error: %v", err) + } + return exec +} + +func compileV2(b *testing.B, mapping string) *compiledMapping { + return compileV2Impl(b, mapping) +} + +func BenchmarkV2StripeInvoice(b *testing.B) { + m := compileV2(b, v2StripeMapping) + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + out, _, _, err := m.Exec(stripeInput, nil) + if err != nil { + b.Fatal(err) + } + if out == nil { + b.Fatal("nil output") + } + } +} + +func BenchmarkV2GithubWebhook(b *testing.B) { + m := compileV2(b, v2GithubMapping) + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + out, _, _, err := m.Exec(githubInput, nil) + if err != nil { + b.Fatal(err) + } + if out == nil { + b.Fatal("nil output") + } + } +} + +// --------------------------------------------------------------------------- +// Bloblang V1 benchmarks (same transformations) +// 
--------------------------------------------------------------------------- + +const v1StripeMapping = ` +let inv = this.data.object + +let provisioning = $inv.metadata.provisioning.parse_json() + +let line_items = $inv.lines.data.map_each(item -> { + "description": item.description, + "amount_dollars": item.amount.number() / 100, + "quantity": item.quantity, + "product_id": item.product, +}) + +root.invoice_id = $inv.id +root.invoice_number = $inv.number +root.event_type = this.type +root.customer = { + "id": $inv.customer, + "name": $inv.customer_name, + "email": $inv.customer_email, +} +root.provisioning = $provisioning +root.currency = $inv.currency.uppercase() +root.subtotal_dollars = $inv.subtotal.number() / 100 +root.tax_dollars = $inv.tax.number() / 100 +root.total_dollars = $inv.total.number() / 100 +root.line_items = $line_items +root.status = $inv.status + +root.paid_at = $inv.status_transitions.paid_at.ts_format("2006-01-02T15:04:05Z", "UTC") + +root.subscription_id = $inv.subscription + +root.external_refs = $inv.metadata.without("provisioning").map_each_key(k -> k.replace_all("_", "-")) +` + +const v1GithubMapping = ` +let pr = this.pull_request +let repo = $pr.html_url.re_find_all("[^/]+").slice(2, 4).join("/") + +let total_changes = $pr.additions + $pr.deletions +let size_category = if $total_changes > 300 { "large" } else if $total_changes > 100 { "medium" } else { "small" } + +let issue_refs = $pr.body.re_find_all("#\\d+").map_each(ref -> ref.trim_prefix("#").number()).sort().unique() + +let reviewers = $pr.requested_reviewers.map_each(r -> r.login).merge($pr.requested_teams.map_each(t -> t.name)).sort() + +root.event_type = "pr_" + this.action +root.repo = $repo +root.pr_number = this.number +root.title = $pr.title +root.author = $pr.user.login +root.url = $pr.html_url +root.branch = $pr.head.ref + " -> " + $pr.base.ref +root.labels = $pr.labels.map_each(l -> l.name).sort() +root.reviewers = $reviewers +root.size = { + "additions": $pr.additions, 
+ "deletions": $pr.deletions, + "files": $pr.changed_files, + "category": $size_category, +} +root.referenced_issues = $issue_refs +root.is_feature = $pr.title.has_prefix("feat") +root.summary = "[" + $repo + "] " + $pr.user.login + " " + this.action + " #" + this.number.string() + ": " + $pr.title + " (" + $size_category + ", " + $pr.changed_files.string() + " files)" +` + +func compileV1(b *testing.B, mapping string) *bloblang.Executor { + b.Helper() + + exec, err := bloblang.Parse(mapping) + if err != nil { + b.Fatalf("v1 parse error: %v", err) + } + return exec +} + +func BenchmarkV1StripeInvoice(b *testing.B) { + exec := compileV1(b, v1StripeMapping) + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + out, err := exec.Query(stripeInput) + if err != nil { + b.Fatal(err) + } + if out == nil { + b.Fatal("nil output") + } + } +} + +func BenchmarkV1GithubWebhook(b *testing.B) { + exec := compileV1(b, v1GithubMapping) + b.ReportAllocs() + b.ResetTimer() + for b.Loop() { + out, err := exec.Query(githubInput) + if err != nil { + b.Fatal(err) + } + if out == nil { + b.Fatal("nil output") + } + } +} + +// --------------------------------------------------------------------------- +// Expected outputs (from the spec case studies, JSON-normalized) +// --------------------------------------------------------------------------- + +const expectedStripeJSON = `{ + "currency": "USD", + "customer": { + "email": "ops@megacorp.io", + "id": "cus_PaB3xK", + "name": "MegaCorp Engineering" + }, + "event_type": "invoice.paid", + "external_refs": { + "internal-account-id": "acct-00482", + "salesforce-opp-id": "006Dn000002XLPQ" + }, + "invoice_id": "in_1OqR3m", + "invoice_number": "INV-2024-0218", + "line_items": [ + { + "amount_dollars": 99, + "description": "Growth Plan", + "product_id": "prod_Growth", + "quantity": 1 + }, + { + "amount_dollars": 50, + "description": "Extra seats", + "product_id": "prod_Seats", + "quantity": 5 + } + ], + "paid_at": "2024-03-01T00:00:00Z", + 
"provisioning": { + "features": [ + "sso", + "audit_log" + ], + "seats": 25, + "tier": "growth" + }, + "status": "paid", + "subscription_id": "sub_1NrT7a", + "subtotal_dollars": 149, + "tax_dollars": 12, + "total_dollars": 161 +}` + +const expectedGithubJSON = `{ + "author": "alice-dev", + "branch": "feat/payment-retry -> main", + "event_type": "pr_opened", + "is_feature": true, + "labels": [ + "enhancement", + "needs-review", + "payments" + ], + "pr_number": 42, + "referenced_issues": [ + 35, + 38, + 40 + ], + "repo": "acme/payments", + "reviewers": [ + "bob-reviewer", + "carol-lead", + "platform-team" + ], + "size": { + "additions": 347, + "category": "large", + "deletions": 42, + "files": 8 + }, + "summary": "[acme/payments] alice-dev opened #42: feat: add retry logic to payment processor (large, 8 files)", + "title": "feat: add retry logic to payment processor", + "url": "https://github.com/acme/payments/pull/42" +}` + +// --------------------------------------------------------------------------- +// Output validation tests +// --------------------------------------------------------------------------- + +func TestBenchmarkOutputs(t *testing.T) { + tests := []struct { + name string + expected string + }{ + {"Stripe", expectedStripeJSON}, + {"Github", expectedGithubJSON}, + } + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + var v2Mapping, v1Mapping string + var input map[string]any + switch tt.name { + case "Stripe": + v2Mapping, v1Mapping, input = v2StripeMapping, v1StripeMapping, stripeInput + case "Github": + v2Mapping, v1Mapping, input = v2GithubMapping, v1GithubMapping, githubInput + } + + // Normalize expected output via JSON round-trip (sorts keys, + // collapses integer types to float64). 
+ want := jsonNormalize(t, json.RawMessage(tt.expected)) + + t.Run("V2", func(t *testing.T) { + m := compileV2T(t, v2Mapping) + out, _, _, err := m.Exec(input, nil) + if err != nil { + t.Fatalf("exec: %v", err) + } + got := jsonNormalize(t, out) + if got != want { + t.Fatalf("output mismatch\nwant:\n%s\ngot:\n%s", want, got) + } + }) + + t.Run("V1", func(t *testing.T) { + exec := compileV1T(t, v1Mapping) + out, err := exec.Query(input) + if err != nil { + t.Fatalf("exec: %v", err) + } + got := jsonNormalize(t, out) + if got != want { + t.Fatalf("output mismatch\nwant:\n%s\ngot:\n%s", want, got) + } + }) + }) + } +} diff --git a/internal/bloblang2/bloblang2.go b/internal/bloblang2/bloblang2.go new file mode 100644 index 000000000..3bdbbadff --- /dev/null +++ b/internal/bloblang2/bloblang2.go @@ -0,0 +1,51 @@ +// Package bloblang2 provides a Bloblang V2 implementation that satisfies +// the spectest.Interpreter interface. +package bloblang2 + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/spectest" +) + +// Interp implements spectest.Interpreter for Bloblang V2. +type Interp struct{} + +// Compile parses and compiles a Bloblang V2 mapping. +func (i *Interp) Compile(mapping string, files map[string]string) (spectest.Mapping, error) { + prog, errs := syntax.Parse(mapping, "", files) + if len(errs) > 0 { + return nil, &spectest.CompileError{Message: syntax.FormatErrors(errs)} + } + + // Optimization pass: path collapse, constant folding, dead code elimination. + syntax.Optimize(prog) + + // Name resolution pass: semantic checks + opcode annotation. 
+ methods, functions := eval.StdlibNames() + methodOpcodes, functionOpcodes := eval.StdlibOpcodes() + resolveErrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, + Functions: functions, + MethodOpcodes: methodOpcodes, + FunctionOpcodes: functionOpcodes, + }) + if len(resolveErrs) > 0 { + return nil, &spectest.CompileError{Message: syntax.FormatErrors(resolveErrs)} + } + + return &compiledMapping{ + prog: prog, + interp: eval.NewWithStdlib(prog), + }, nil +} + +type compiledMapping struct { + prog *syntax.Program + interp *eval.Interpreter +} + +// Exec runs the compiled mapping against input and metadata. +func (m *compiledMapping) Exec(input any, metadata map[string]any) (any, map[string]any, bool, error) { + return m.interp.Run(input, metadata) +} diff --git a/internal/bloblang2/bloblang2_test.go b/internal/bloblang2/bloblang2_test.go new file mode 100644 index 000000000..0be3f7868 --- /dev/null +++ b/internal/bloblang2/bloblang2_test.go @@ -0,0 +1,11 @@ +package bloblang2 + +import ( + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/spectest" +) + +func TestBloblangV2Spec(t *testing.T) { + spectest.RunT(t, "spec/tests", &Interp{}) +} From 448cb127dffc5b316fbcae628c31083b0171ee65 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Wed, 22 Apr 2026 10:55:10 +0100 Subject: [PATCH 07/20] bloblang(v2): Add LSP server and Neovim plugin Adds an LSP server under internal/bloblang2/go/lsp/ (with a small bloblang2-lsp binary) that exposes the V2 parser/resolver as diagnostics and completions over the standard JSON-RPC protocol. Ships a matching Neovim plugin under internal/bloblang2/plugins/nvim/ that wires .blobl2 filetype detection, syntax highlighting via the tree-sitter grammar, and the LSP client. 
--- .../go/lsp/cmd/bloblang2-lsp/main.go | 15 + internal/bloblang2/go/lsp/completion.go | 192 ++++++++++ internal/bloblang2/go/lsp/completion_test.go | 296 +++++++++++++++ internal/bloblang2/go/lsp/diagnostics.go | 71 ++++ internal/bloblang2/go/lsp/diagnostics_test.go | 228 ++++++++++++ internal/bloblang2/go/lsp/document.go | 63 ++++ internal/bloblang2/go/lsp/document_test.go | 130 +++++++ internal/bloblang2/go/lsp/protocol.go | 140 +++++++ internal/bloblang2/go/lsp/server.go | 266 ++++++++++++++ internal/bloblang2/go/lsp/server_test.go | 342 ++++++++++++++++++ internal/bloblang2/plugins/nvim/.gitignore | 3 + internal/bloblang2/plugins/nvim/Taskfile.yml | 53 +++ .../plugins/nvim/ftdetect/bloblang2.lua | 5 + .../plugins/nvim/ftplugin/blobl2.lua | 1 + .../plugins/nvim/lua/bloblang2/health.lua | 40 ++ .../plugins/nvim/lua/bloblang2/init.lua | 83 +++++ .../nvim/queries/bloblang2/highlights.scm | 77 ++++ 17 files changed, 2005 insertions(+) create mode 100644 internal/bloblang2/go/lsp/cmd/bloblang2-lsp/main.go create mode 100644 internal/bloblang2/go/lsp/completion.go create mode 100644 internal/bloblang2/go/lsp/completion_test.go create mode 100644 internal/bloblang2/go/lsp/diagnostics.go create mode 100644 internal/bloblang2/go/lsp/diagnostics_test.go create mode 100644 internal/bloblang2/go/lsp/document.go create mode 100644 internal/bloblang2/go/lsp/document_test.go create mode 100644 internal/bloblang2/go/lsp/protocol.go create mode 100644 internal/bloblang2/go/lsp/server.go create mode 100644 internal/bloblang2/go/lsp/server_test.go create mode 100644 internal/bloblang2/plugins/nvim/.gitignore create mode 100644 internal/bloblang2/plugins/nvim/Taskfile.yml create mode 100644 internal/bloblang2/plugins/nvim/ftdetect/bloblang2.lua create mode 100644 internal/bloblang2/plugins/nvim/ftplugin/blobl2.lua create mode 100644 internal/bloblang2/plugins/nvim/lua/bloblang2/health.lua create mode 100644 internal/bloblang2/plugins/nvim/lua/bloblang2/init.lua create mode 100644 
internal/bloblang2/plugins/nvim/queries/bloblang2/highlights.scm diff --git a/internal/bloblang2/go/lsp/cmd/bloblang2-lsp/main.go b/internal/bloblang2/go/lsp/cmd/bloblang2-lsp/main.go new file mode 100644 index 000000000..a0a1482de --- /dev/null +++ b/internal/bloblang2/go/lsp/cmd/bloblang2-lsp/main.go @@ -0,0 +1,15 @@ +// bloblang2-lsp is a minimal LSP server for Bloblang V2 mappings. +package main + +import ( + "os" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/lsp" +) + +func main() { + s := lsp.NewServer(os.Stdin, os.Stdout) + if err := s.Run(); err != nil { + os.Exit(1) + } +} diff --git a/internal/bloblang2/go/lsp/completion.go b/internal/bloblang2/go/lsp/completion.go new file mode 100644 index 000000000..69225fbdd --- /dev/null +++ b/internal/bloblang2/go/lsp/completion.go @@ -0,0 +1,192 @@ +package lsp + +import ( + "fmt" + "sort" + "strings" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// completionEngine provides completion items for Bloblang V2. +type completionEngine struct { + keywords []completionItem + functions []completionItem + methods []completionItem +} + +func newCompletionEngine(methods map[string]syntax.MethodInfo, functions map[string]syntax.FunctionInfo) *completionEngine { + e := &completionEngine{} + + // Keywords. + for _, kw := range []string{ + "if", "else", "match", "as", "map", "import", + "input", "output", "true", "false", "null", + } { + kind := completionKindKeyword + if kw == "true" || kw == "false" || kw == "null" { + kind = completionKindValue + } + e.keywords = append(e.keywords, completionItem{ + Label: kw, + Kind: kind, + }) + } + + // Global functions. 
+ names := make([]string, 0, len(functions)) + for name := range functions { + names = append(names, name) + } + sort.Strings(names) + for _, name := range names { + fi := functions[name] + detail := formatFunctionArity(name, fi) + e.functions = append(e.functions, completionItem{ + Label: name, + Kind: completionKindFunction, + Detail: detail, + InsertText: name + "()", + }) + } + + // Methods. + methodNames := make([]string, 0, len(methods)) + for name := range methods { + methodNames = append(methodNames, name) + } + sort.Strings(methodNames) + for _, name := range methodNames { + e.methods = append(e.methods, completionItem{ + Label: name, + Kind: completionKindMethod, + InsertText: name + "()", + }) + } + + return e +} + +func formatFunctionArity(name string, fi syntax.FunctionInfo) string { + if fi.Total == 0 { + return name + "()" + } + if fi.Required == fi.Total { + return fmt.Sprintf("%s(%d args)", name, fi.Required) + } + return fmt.Sprintf("%s(%d to %d args)", name, fi.Required, fi.Total) +} + +func (e *completionEngine) complete(text string, prog *syntax.Program, pos position, ctx *completionContext) []completionItem { + trigger := "" + if ctx != nil { + trigger = ctx.TriggerCharacter + } + + // Determine trigger from the character before the cursor if not provided. + if trigger == "" && pos.Character > 0 { + lines := strings.Split(text, "\n") + if pos.Line < len(lines) { + line := lines[pos.Line] + if pos.Character <= len(line) { + ch := line[pos.Character-1] + switch ch { + case '.': + trigger = "." + case '$': + trigger = "$" + case '@': + trigger = "@" + } + } + } + } + + switch trigger { + case ".": + return e.methods + case "$": + return e.variableCompletions(prog) + case "@": + // After @ (metadata access) — no specific completions. + return nil + default: + return e.generalCompletions(prog) + } +} + +// variableCompletions returns variables from the last successful parse. 
+func (e *completionEngine) variableCompletions(prog *syntax.Program) []completionItem { + if prog == nil { + return nil + } + + seen := make(map[string]bool) + + for _, stmt := range prog.Stmts { + collectVariables(stmt, seen) + } + + names := make([]string, 0, len(seen)) + for name := range seen { + names = append(names, name) + } + sort.Strings(names) + + var items []completionItem + for _, name := range names { + items = append(items, completionItem{ + Label: "$" + name, + Kind: completionKindVariable, + }) + } + return items +} + +// collectVariables walks statements to find variable assignments. +func collectVariables(stmt syntax.Stmt, seen map[string]bool) { + switch s := stmt.(type) { + case *syntax.Assignment: + if s.Target.Root == syntax.AssignVar && s.Target.VarName != "" { + seen[s.Target.VarName] = true + } + case *syntax.IfStmt: + for _, b := range s.Branches { + for _, inner := range b.Body { + collectVariables(inner, seen) + } + } + for _, inner := range s.Else { + collectVariables(inner, seen) + } + case *syntax.MatchStmt: + for _, c := range s.Cases { + if body, ok := c.Body.([]syntax.Stmt); ok { + for _, inner := range body { + collectVariables(inner, seen) + } + } + } + } +} + +// generalCompletions returns keywords, functions, and map names. +func (e *completionEngine) generalCompletions(prog *syntax.Program) []completionItem { + items := make([]completionItem, 0, len(e.keywords)+len(e.functions)+10) + items = append(items, e.keywords...) + items = append(items, e.functions...) + + // Add user-defined map names. 
+ if prog != nil { + for _, m := range prog.Maps { + items = append(items, completionItem{ + Label: m.Name, + Kind: completionKindFunction, + Detail: "map " + m.Name, + InsertText: m.Name + "()", + }) + } + } + + return items +} diff --git a/internal/bloblang2/go/lsp/completion_test.go b/internal/bloblang2/go/lsp/completion_test.go new file mode 100644 index 000000000..d95280a84 --- /dev/null +++ b/internal/bloblang2/go/lsp/completion_test.go @@ -0,0 +1,296 @@ +package lsp + +import ( + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +func TestFormatFunctionArity(t *testing.T) { + tests := []struct { + name string + fn string + fi syntax.FunctionInfo + want string + }{ + {"no args", "uuid_v4", syntax.FunctionInfo{Required: 0, Total: 0}, "uuid_v4()"}, + {"one required", "throw", syntax.FunctionInfo{Required: 1, Total: 1}, "throw(1 args)"}, + {"two required", "random_int", syntax.FunctionInfo{Required: 2, Total: 2}, "random_int(2 args)"}, + {"mixed", "foo", syntax.FunctionInfo{Required: 1, Total: 3}, "foo(1 to 3 args)"}, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + got := formatFunctionArity(tt.fn, tt.fi) + if got != tt.want { + t.Errorf("formatFunctionArity(%q, %+v) = %q, want %q", tt.fn, tt.fi, got, tt.want) + } + }) + } +} + +func TestNewCompletionEngine(t *testing.T) { + methods, functions := eval.StdlibNames() + e := newCompletionEngine(methods, functions) + + // Keywords should include standard language keywords. + keywordLabels := labelSet(e.keywords) + for _, kw := range []string{"if", "else", "match", "map", "input", "output"} { + if !keywordLabels[kw] { + t.Errorf("missing keyword %q", kw) + } + } + + // Constants should have value kind. 
+ for _, item := range e.keywords { + if item.Label == "true" || item.Label == "false" || item.Label == "null" { + if item.Kind != completionKindValue { + t.Errorf("keyword %q has kind %d, want %d (value)", item.Label, item.Kind, completionKindValue) + } + } + } + + // Should have functions from stdlib. + fnLabels := labelSet(e.functions) + for _, fn := range []string{"uuid_v4", "now", "throw", "deleted", "random_int", "range", "timestamp"} { + if !fnLabels[fn] { + t.Errorf("missing function %q", fn) + } + } + for _, item := range e.functions { + if item.Kind != completionKindFunction { + t.Errorf("function %q has kind %d, want %d", item.Label, item.Kind, completionKindFunction) + } + if !strings.HasSuffix(item.InsertText, "()") { + t.Errorf("function %q insertText = %q, want suffix ()", item.Label, item.InsertText) + } + } + + // Should have methods from stdlib. + methodLabels := labelSet(e.methods) + for _, m := range []string{"uppercase", "lowercase", "length", "filter", "map", "fold", "keys", "values", "catch", "or"} { + if !methodLabels[m] { + t.Errorf("missing method %q", m) + } + } + for _, item := range e.methods { + if item.Kind != completionKindMethod { + t.Errorf("method %q has kind %d, want %d", item.Label, item.Kind, completionKindMethod) + } + } +} + +func TestCompleteTriggerRouting(t *testing.T) { + methods, functions := eval.StdlibNames() + e := newCompletionEngine(methods, functions) + prog := mustParseProg(t, "$x = 1\noutput = $x") + + t.Run("dot trigger returns methods", func(t *testing.T) { + items := e.complete("input.", prog, position{Line: 0, Character: 6}, &completionContext{ + TriggerKind: 2, + TriggerCharacter: ".", + }) + if len(items) == 0 { + t.Fatal("expected method completions after dot") + } + for _, item := range items { + if item.Kind != completionKindMethod { + t.Errorf("got kind %d for %q after dot, want method", item.Kind, item.Label) + } + } + }) + + t.Run("dollar trigger returns variables", func(t *testing.T) { + items := 
e.complete("$x = 1\noutput = $", prog, position{Line: 1, Character: 10}, &completionContext{ + TriggerKind: 2, + TriggerCharacter: "$", + }) + if len(items) == 0 { + t.Fatal("expected variable completions after $") + } + found := false + for _, item := range items { + if item.Label == "$x" { + found = true + } + if item.Kind != completionKindVariable { + t.Errorf("got kind %d for %q after $, want variable", item.Kind, item.Label) + } + } + if !found { + t.Error("expected $x in variable completions") + } + }) + + t.Run("no trigger returns general completions", func(t *testing.T) { + items := e.complete("output = ", prog, position{Line: 0, Character: 9}, nil) + if len(items) == 0 { + t.Fatal("expected general completions") + } + labels := labelSet(items) + // Should include keywords and functions. + if !labels["if"] { + t.Error("missing keyword 'if' in general completions") + } + if !labels["uuid_v4"] { + t.Error("missing function 'uuid_v4' in general completions") + } + }) + + t.Run("at-sign trigger returns nil", func(t *testing.T) { + items := e.complete("output@", prog, position{Line: 0, Character: 7}, &completionContext{ + TriggerKind: 2, + TriggerCharacter: "@", + }) + if items != nil { + t.Errorf("expected nil after @, got %d items", len(items)) + } + }) +} + +func TestCompleteTriggerDetectionFromText(t *testing.T) { + methods, functions := eval.StdlibNames() + e := newCompletionEngine(methods, functions) + + t.Run("detect dot from text when no trigger context", func(t *testing.T) { + items := e.complete("input.", nil, position{Line: 0, Character: 6}, nil) + if len(items) == 0 { + t.Fatal("expected method completions when cursor after dot") + } + if items[0].Kind != completionKindMethod { + t.Errorf("expected method kind, got %d", items[0].Kind) + } + }) + + t.Run("detect dollar from text when no trigger context", func(t *testing.T) { + prog := mustParseProg(t, "$foo = 1\noutput = $foo") + items := e.complete("$foo = 1\noutput = $", prog, position{Line: 1, 
Character: 10}, nil) + if len(items) == 0 { + t.Fatal("expected variable completions when cursor after $") + } + }) +} + +func TestVariableCompletions(t *testing.T) { + methods, functions := eval.StdlibNames() + e := newCompletionEngine(methods, functions) + + t.Run("nil program returns nil", func(t *testing.T) { + items := e.variableCompletions(nil) + if items != nil { + t.Errorf("expected nil, got %d items", len(items)) + } + }) + + t.Run("extracts top-level variables", func(t *testing.T) { + prog := mustParseProg(t, "$alpha = 1\n$beta = 2\noutput = $alpha + $beta") + items := e.variableCompletions(prog) + labels := labelSet(items) + if !labels["$alpha"] || !labels["$beta"] { + t.Errorf("expected $alpha and $beta, got %v", labelSlice(items)) + } + }) + + t.Run("deduplicates variables", func(t *testing.T) { + prog := mustParseProg(t, "$x = 1\n$x = 2\noutput = $x") + items := e.variableCompletions(prog) + if len(items) != 1 { + t.Errorf("expected 1 unique variable, got %d", len(items)) + } + }) + + t.Run("extracts variables from if branches", func(t *testing.T) { + prog := mustParseProg(t, "if true {\n $inner = 1\n output = $inner\n}") + items := e.variableCompletions(prog) + labels := labelSet(items) + if !labels["$inner"] { + t.Errorf("expected $inner from if body, got %v", labelSlice(items)) + } + }) + + t.Run("extracts variables from match cases", func(t *testing.T) { + prog := mustParseProg(t, "match input {\n 1 => {\n $matched = true\n output = $matched\n }\n}") + items := e.variableCompletions(prog) + labels := labelSet(items) + if !labels["$matched"] { + t.Errorf("expected $matched from match body, got %v", labelSlice(items)) + } + }) + + t.Run("results are sorted", func(t *testing.T) { + prog := mustParseProg(t, "$zebra = 1\n$alpha = 2\n$mid = 3\noutput = $zebra") + items := e.variableCompletions(prog) + for i := 1; i < len(items); i++ { + if items[i].Label < items[i-1].Label { + t.Errorf("items not sorted: %q before %q", items[i-1].Label, items[i].Label) 
+ } + } + }) +} + +func TestGeneralCompletionsIncludesMaps(t *testing.T) { + methods, functions := eval.StdlibNames() + e := newCompletionEngine(methods, functions) + + prog := mustParseProg(t, "map double(x) {\n x * 2\n}\noutput = double(input)") + items := e.generalCompletions(prog) + labels := labelSet(items) + if !labels["double"] { + t.Error("expected user map 'double' in general completions") + } + + // Verify the map completion has correct metadata. + for _, item := range items { + if item.Label == "double" { + if item.Kind != completionKindFunction { + t.Errorf("map completion kind = %d, want %d", item.Kind, completionKindFunction) + } + if item.Detail != "map double" { + t.Errorf("map completion detail = %q, want %q", item.Detail, "map double") + } + } + } +} + +func TestGeneralCompletionsNilProgram(t *testing.T) { + methods, functions := eval.StdlibNames() + e := newCompletionEngine(methods, functions) + + items := e.generalCompletions(nil) + if len(items) == 0 { + t.Fatal("expected keywords and functions even with nil program") + } + labels := labelSet(items) + if !labels["if"] || !labels["uuid_v4"] { + t.Error("expected at least keywords and stdlib functions") + } +} + +// mustParseProg parses a bloblang2 source and returns the optimized program. 
+func mustParseProg(t *testing.T, source string) *syntax.Program { + t.Helper() + prog, errs := syntax.Parse(source, "", nil) + if len(errs) > 0 { + t.Fatalf("parse error: %s", syntax.FormatErrors(errs)) + } + syntax.Optimize(prog) + return prog +} + +func labelSet(items []completionItem) map[string]bool { + s := make(map[string]bool, len(items)) + for _, item := range items { + s[item.Label] = true + } + return s +} + +func labelSlice(items []completionItem) []string { + s := make([]string, len(items)) + for i, item := range items { + s[i] = item.Label + } + return s +} diff --git a/internal/bloblang2/go/lsp/diagnostics.go b/internal/bloblang2/go/lsp/diagnostics.go new file mode 100644 index 000000000..17011d9f3 --- /dev/null +++ b/internal/bloblang2/go/lsp/diagnostics.go @@ -0,0 +1,71 @@ +package lsp + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// diagnose runs the parse+resolve pipeline and publishes diagnostics. +func (s *Server) diagnose(uri string) { + text, _, ok := s.docs.get(uri) + if !ok { + return + } + + var diagnostics []diagnostic + + prog, parseErrs := syntax.Parse(text, "", nil) + if len(parseErrs) > 0 { + for _, e := range parseErrs { + diagnostics = append(diagnostics, posErrorToDiagnostic(e)) + } + s.sendNotification("textDocument/publishDiagnostics", publishDiagnosticsParams{ + URI: uri, + Diagnostics: diagnostics, + }) + return + } + + syntax.Optimize(prog) + + resolveErrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: s.stdlibMethods, + Functions: s.stdlibFunctions, + }) + for _, e := range resolveErrs { + diagnostics = append(diagnostics, posErrorToDiagnostic(e)) + } + + // Store the program for completion use (even if there are resolve errors, + // the parse was successful so we have a valid AST). 
+ s.docs.setProgram(uri, prog) + + if diagnostics == nil { + diagnostics = []diagnostic{} + } + + s.sendNotification("textDocument/publishDiagnostics", publishDiagnosticsParams{ + URI: uri, + Diagnostics: diagnostics, + }) +} + +func posErrorToDiagnostic(e syntax.PosError) diagnostic { + // Pos is 1-based; LSP positions are 0-based. + line := e.Pos.Line - 1 + col := e.Pos.Column - 1 + if line < 0 { + line = 0 + } + if col < 0 { + col = 0 + } + return diagnostic{ + Range: lspRange{ + Start: position{Line: line, Character: col}, + End: position{Line: line, Character: col}, + }, + Severity: severityError, + Source: "bloblang2", + Message: e.Msg, + } +} diff --git a/internal/bloblang2/go/lsp/diagnostics_test.go b/internal/bloblang2/go/lsp/diagnostics_test.go new file mode 100644 index 000000000..c2fa7a7a9 --- /dev/null +++ b/internal/bloblang2/go/lsp/diagnostics_test.go @@ -0,0 +1,228 @@ +package lsp + +import ( + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +func TestPosErrorToDiagnostic(t *testing.T) { + tests := []struct { + name string + err syntax.PosError + wantLine int + wantCol int + wantMsg string + }{ + { + name: "normal 1-based to 0-based conversion", + err: syntax.PosError{Pos: syntax.Pos{Line: 5, Column: 10}, Msg: "undeclared variable $x"}, + wantLine: 4, + wantCol: 9, + wantMsg: "undeclared variable $x", + }, + { + name: "first position", + err: syntax.PosError{Pos: syntax.Pos{Line: 1, Column: 1}, Msg: "unexpected token"}, + wantLine: 0, + wantCol: 0, + wantMsg: "unexpected token", + }, + { + name: "zero pos clamped to zero", + err: syntax.PosError{Pos: syntax.Pos{Line: 0, Column: 0}, Msg: "bad"}, + wantLine: 0, + wantCol: 0, + wantMsg: "bad", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + d := posErrorToDiagnostic(tt.err) + if d.Range.Start.Line != tt.wantLine { + t.Errorf("start line = %d, want %d", d.Range.Start.Line, tt.wantLine) + } + if d.Range.Start.Character != 
tt.wantCol { + t.Errorf("start character = %d, want %d", d.Range.Start.Character, tt.wantCol) + } + if d.Range.End.Line != tt.wantLine { + t.Errorf("end line = %d, want %d", d.Range.End.Line, tt.wantLine) + } + if d.Range.End.Character != tt.wantCol { + t.Errorf("end character = %d, want %d", d.Range.End.Character, tt.wantCol) + } + if d.Severity != severityError { + t.Errorf("severity = %d, want %d", d.Severity, severityError) + } + if d.Source != "bloblang2" { + t.Errorf("source = %q, want %q", d.Source, "bloblang2") + } + if d.Message != tt.wantMsg { + t.Errorf("message = %q, want %q", d.Message, tt.wantMsg) + } + }) + } +} + +func TestDiagnose(t *testing.T) { + tests := []struct { + name string + source string + wantCount int + wantSubstr string // substring expected in first diagnostic message + }{ + { + name: "valid mapping produces no diagnostics", + source: `output = input.name`, + wantCount: 0, + }, + { + name: "valid mapping with variable", + source: "$x = input.name\noutput = $x", + wantCount: 0, + }, + { + name: "parse error on invalid syntax", + source: `output = =`, + wantCount: 1, + wantSubstr: "expected expression", + }, + { + name: "undeclared variable", + source: `output = $missing`, + wantCount: 1, + wantSubstr: "undeclared variable", + }, + { + name: "unknown function", + source: `output = bogus()`, + wantCount: 1, + wantSubstr: "unknown function", + }, + { + name: "arity mismatch", + source: `output = uuid_v4("extra")`, + wantCount: 1, + wantSubstr: "accepts at most", + }, + { + name: "method arity mismatch - missing required arg", + source: `output = input.test.map()`, + wantCount: 1, + wantSubstr: "requires at least 1 arguments", + }, + { + name: "method arity mismatch - too many args", + source: `output = input.test.encode("base64", "extra")`, + wantCount: 1, + wantSubstr: "accepts at most 1 arguments", + }, + { + name: "method with optional args is fine", + source: `output = input.test.format_json()`, + wantCount: 0, + }, + { + name: "input 
inside map body", + source: "map foo(x) {\n input\n}\noutput = foo(1)", + wantCount: 1, + wantSubstr: "cannot access input inside a map body", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + s := newTestServer() + uri := "file:///test.blobl2" + s.docs.open(uri, tt.source) + s.diagnose(uri) + + diags := s.lastDiagnostics(uri) + if len(diags) != tt.wantCount { + t.Errorf("got %d diagnostics, want %d", len(diags), tt.wantCount) + for _, d := range diags { + t.Logf(" diagnostic: %s", d.Message) + } + return + } + if tt.wantCount > 0 && tt.wantSubstr != "" { + msg := diags[0].Message + if !contains(msg, tt.wantSubstr) { + t.Errorf("diagnostic message %q does not contain %q", msg, tt.wantSubstr) + } + } + }) + } +} + +func newTestServer() *testServer { + s := NewServer(nullReader{}, &discardWriter{}) + return &testServer{Server: s, notifications: make(map[string][]diagnostic)} +} + +type testServer struct { + *Server + notifications map[string][]diagnostic +} + +// Override sendNotification to capture diagnostics without writing JSON-RPC. 
+func (ts *testServer) diagnose(uri string) { + text, _, ok := ts.docs.get(uri) + if !ok { + return + } + + var diagnostics []diagnostic + + prog, parseErrs := syntax.Parse(text, "", nil) + if len(parseErrs) > 0 { + for _, e := range parseErrs { + diagnostics = append(diagnostics, posErrorToDiagnostic(e)) + } + ts.notifications[uri] = diagnostics + return + } + + syntax.Optimize(prog) + + resolveErrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: ts.stdlibMethods, + Functions: ts.stdlibFunctions, + }) + for _, e := range resolveErrs { + diagnostics = append(diagnostics, posErrorToDiagnostic(e)) + } + + ts.docs.setProgram(uri, prog) + + if diagnostics == nil { + diagnostics = []diagnostic{} + } + ts.notifications[uri] = diagnostics +} + +func (ts *testServer) lastDiagnostics(uri string) []diagnostic { + return ts.notifications[uri] +} + +type nullReader struct{} + +func (nullReader) Read([]byte) (int, error) { return 0, nil } + +type discardWriter struct{} + +func (discardWriter) Write(p []byte) (int, error) { return len(p), nil } + +func contains(s, substr string) bool { + return len(s) >= len(substr) && searchSubstring(s, substr) +} + +func searchSubstring(s, substr string) bool { + for i := 0; i <= len(s)-len(substr); i++ { + if s[i:i+len(substr)] == substr { + return true + } + } + return false +} diff --git a/internal/bloblang2/go/lsp/document.go b/internal/bloblang2/go/lsp/document.go new file mode 100644 index 000000000..a0546f5f8 --- /dev/null +++ b/internal/bloblang2/go/lsp/document.go @@ -0,0 +1,63 @@ +package lsp + +import ( + "sync" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// documentState holds the text and last successful parse of a document. +type documentState struct { + text string + prog *syntax.Program // nil if never parsed successfully +} + +// documentStore is a thread-safe in-memory store of open documents. 
+type documentStore struct { + mu sync.Mutex + docs map[string]*documentState +} + +func newDocumentStore() *documentStore { + return &documentStore{docs: make(map[string]*documentState)} +} + +func (s *documentStore) open(uri, text string) { + s.mu.Lock() + defer s.mu.Unlock() + s.docs[uri] = &documentState{text: text} +} + +func (s *documentStore) update(uri, text string) { + s.mu.Lock() + defer s.mu.Unlock() + if doc, ok := s.docs[uri]; ok { + doc.text = text + } else { + s.docs[uri] = &documentState{text: text} + } +} + +func (s *documentStore) close(uri string) { + s.mu.Lock() + defer s.mu.Unlock() + delete(s.docs, uri) +} + +func (s *documentStore) get(uri string) (string, *syntax.Program, bool) { + s.mu.Lock() + defer s.mu.Unlock() + doc, ok := s.docs[uri] + if !ok { + return "", nil, false + } + return doc.text, doc.prog, true +} + +func (s *documentStore) setProgram(uri string, prog *syntax.Program) { + s.mu.Lock() + defer s.mu.Unlock() + if doc, ok := s.docs[uri]; ok { + doc.prog = prog + } +} diff --git a/internal/bloblang2/go/lsp/document_test.go b/internal/bloblang2/go/lsp/document_test.go new file mode 100644 index 000000000..35e7f21a6 --- /dev/null +++ b/internal/bloblang2/go/lsp/document_test.go @@ -0,0 +1,130 @@ +package lsp + +import ( + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +func TestDocumentStoreOpenAndGet(t *testing.T) { + s := newDocumentStore() + + s.open("file:///a.blobl2", "output = input") + + text, prog, ok := s.get("file:///a.blobl2") + if !ok { + t.Fatal("expected document to exist after open") + } + if text != "output = input" { + t.Errorf("text = %q, want %q", text, "output = input") + } + if prog != nil { + t.Error("expected nil program before any parse") + } +} + +func TestDocumentStoreGetMissing(t *testing.T) { + s := newDocumentStore() + + _, _, ok := s.get("file:///missing.blobl2") + if ok { + t.Error("expected missing document to return ok=false") + } +} + +func 
TestDocumentStoreUpdate(t *testing.T) { + s := newDocumentStore() + + s.open("file:///a.blobl2", "old content") + s.update("file:///a.blobl2", "new content") + + text, _, ok := s.get("file:///a.blobl2") + if !ok { + t.Fatal("expected document to exist after update") + } + if text != "new content" { + t.Errorf("text = %q, want %q", text, "new content") + } +} + +func TestDocumentStoreUpdateCreatesIfMissing(t *testing.T) { + s := newDocumentStore() + + s.update("file:///new.blobl2", "created via update") + + text, _, ok := s.get("file:///new.blobl2") + if !ok { + t.Fatal("expected document to exist after upsert") + } + if text != "created via update" { + t.Errorf("text = %q, want %q", text, "created via update") + } +} + +func TestDocumentStoreClose(t *testing.T) { + s := newDocumentStore() + + s.open("file:///a.blobl2", "content") + s.close("file:///a.blobl2") + + _, _, ok := s.get("file:///a.blobl2") + if ok { + t.Error("expected document to be gone after close") + } +} + +func TestDocumentStoreCloseNonExistent(t *testing.T) { + s := newDocumentStore() + // Should not panic. + s.close("file:///nope.blobl2") +} + +func TestDocumentStoreSetProgram(t *testing.T) { + s := newDocumentStore() + s.open("file:///a.blobl2", "output = input") + + prog := &syntax.Program{} + s.setProgram("file:///a.blobl2", prog) + + _, got, ok := s.get("file:///a.blobl2") + if !ok { + t.Fatal("expected document to exist") + } + if got != prog { + t.Error("expected setProgram to store the program") + } +} + +func TestDocumentStoreSetProgramMissing(t *testing.T) { + s := newDocumentStore() + // Should not panic when setting program on non-existent document. 
+ s.setProgram("file:///missing.blobl2", &syntax.Program{}) +} + +func TestDocumentStoreMultipleDocuments(t *testing.T) { + s := newDocumentStore() + + s.open("file:///a.blobl2", "aaa") + s.open("file:///b.blobl2", "bbb") + + textA, _, okA := s.get("file:///a.blobl2") + textB, _, okB := s.get("file:///b.blobl2") + + if !okA || textA != "aaa" { + t.Errorf("doc a: ok=%v, text=%q", okA, textA) + } + if !okB || textB != "bbb" { + t.Errorf("doc b: ok=%v, text=%q", okB, textB) + } + + s.close("file:///a.blobl2") + + _, _, okA = s.get("file:///a.blobl2") + _, _, okB = s.get("file:///b.blobl2") + if okA { + t.Error("doc a should be gone after close") + } + if !okB { + t.Error("doc b should still exist") + } +} diff --git a/internal/bloblang2/go/lsp/protocol.go b/internal/bloblang2/go/lsp/protocol.go new file mode 100644 index 000000000..3fe8d1991 --- /dev/null +++ b/internal/bloblang2/go/lsp/protocol.go @@ -0,0 +1,140 @@ +// Package lsp implements a minimal LSP server for Bloblang V2. +package lsp + +import "encoding/json" + +// JSON-RPC message types. + +type jsonrpcMessage struct { + JSONRPC string `json:"jsonrpc"` + ID json.RawMessage `json:"id,omitempty"` + Method string `json:"method,omitempty"` + Params json.RawMessage `json:"params,omitempty"` + Result any `json:"result,omitempty"` + Error *jsonrpcError `json:"error,omitempty"` +} + +type jsonrpcError struct { + Code int `json:"code"` + Message string `json:"message"` +} + +// LSP protocol types — minimal subset. 
+ +type initializeResult struct { + Capabilities serverCapabilities `json:"capabilities"` + ServerInfo serverInfo `json:"serverInfo,omitempty"` +} + +type serverInfo struct { + Name string `json:"name"` + Version string `json:"version,omitempty"` +} + +type serverCapabilities struct { + TextDocumentSync textDocumentSyncOptions `json:"textDocumentSync"` + CompletionProvider *completionOptions `json:"completionProvider,omitempty"` +} + +type textDocumentSyncOptions struct { + OpenClose bool `json:"openClose"` + Change int `json:"change"` // 1 = Full +} + +type completionOptions struct { + TriggerCharacters []string `json:"triggerCharacters,omitempty"` +} + +// Document events. + +type didOpenTextDocumentParams struct { + TextDocument textDocumentItem `json:"textDocument"` +} + +type textDocumentItem struct { + URI string `json:"uri"` + LanguageID string `json:"languageId"` + Version int `json:"version"` + Text string `json:"text"` +} + +type didChangeTextDocumentParams struct { + TextDocument versionedTextDocumentIdentifier `json:"textDocument"` + ContentChanges []textDocumentContentChangeEvent `json:"contentChanges"` +} + +type versionedTextDocumentIdentifier struct { + URI string `json:"uri"` + Version int `json:"version"` +} + +type textDocumentContentChangeEvent struct { + Text string `json:"text"` +} + +type didCloseTextDocumentParams struct { + TextDocument textDocumentIdentifier `json:"textDocument"` +} + +type textDocumentIdentifier struct { + URI string `json:"uri"` +} + +// Diagnostics. 
+ +type publishDiagnosticsParams struct { + URI string `json:"uri"` + Diagnostics []diagnostic `json:"diagnostics"` +} + +type diagnostic struct { + Range lspRange `json:"range"` + Severity int `json:"severity"` // 1=Error, 2=Warning, 3=Info, 4=Hint + Source string `json:"source,omitempty"` + Message string `json:"message"` +} + +type lspRange struct { + Start position `json:"start"` + End position `json:"end"` +} + +type position struct { + Line int `json:"line"` // 0-based + Character int `json:"character"` // 0-based +} + +// Completions. + +type completionParams struct { + TextDocument textDocumentIdentifier `json:"textDocument"` + Position position `json:"position"` + Context *completionContext `json:"context,omitempty"` +} + +type completionContext struct { + TriggerKind int `json:"triggerKind"` // 1=Invoked, 2=TriggerCharacter + TriggerCharacter string `json:"triggerCharacter,omitempty"` +} + +type completionItem struct { + Label string `json:"label"` + Kind int `json:"kind,omitempty"` + Detail string `json:"detail,omitempty"` + InsertText string `json:"insertText,omitempty"` + Documentation string `json:"documentation,omitempty"` +} + +// completionItemKind constants. +const ( + completionKindMethod = 2 + completionKindFunction = 3 + completionKindVariable = 6 + completionKindKeyword = 14 + completionKindValue = 12 +) + +// diagnosticSeverity constants. +const ( + severityError = 1 +) diff --git a/internal/bloblang2/go/lsp/server.go b/internal/bloblang2/go/lsp/server.go new file mode 100644 index 000000000..155a2ad44 --- /dev/null +++ b/internal/bloblang2/go/lsp/server.go @@ -0,0 +1,266 @@ +package lsp + +import ( + "bufio" + "encoding/json" + "errors" + "fmt" + "io" + "log" + "os" + "strconv" + "strings" + "sync" + "time" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// Server is a minimal LSP server for Bloblang V2. 
+type Server struct { + in *bufio.Reader + out io.Writer + mu sync.Mutex // protects writes to out + + docs *documentStore + completion *completionEngine + + // Cached stdlib metadata. + stdlibMethods map[string]syntax.MethodInfo + stdlibFunctions map[string]syntax.FunctionInfo + + // Debounce timers per URI. + timersMu sync.Mutex + timers map[string]*time.Timer + + shutdown bool + logger *log.Logger +} + +// NewServer creates a new LSP server reading from in and writing to out. +func NewServer(in io.Reader, out io.Writer) *Server { + methods, functions := eval.StdlibNames() + s := &Server{ + in: bufio.NewReader(in), + out: out, + docs: newDocumentStore(), + stdlibMethods: methods, + stdlibFunctions: functions, + timers: make(map[string]*time.Timer), + logger: log.New(os.Stderr, "[bloblang2-lsp] ", log.LstdFlags), + } + s.completion = newCompletionEngine(methods, functions) + return s +} + +// Run starts the server loop. It blocks until the connection is closed. +func (s *Server) Run() error { + for { + msg, err := s.readMessage() + if err != nil { + if err == io.EOF { + return nil + } + return fmt.Errorf("read: %w", err) + } + s.handleMessage(msg) + if msg.Method == "exit" { + if s.shutdown { + return nil + } + return errors.New("exit without shutdown") + } + } +} + +func (s *Server) readMessage() (*jsonrpcMessage, error) { + // Read headers. + var contentLength int + for { + line, err := s.in.ReadString('\n') + if err != nil { + return nil, err + } + line = strings.TrimSpace(line) + if line == "" { + break // end of headers + } + if strings.HasPrefix(line, "Content-Length:") { + val := strings.TrimSpace(strings.TrimPrefix(line, "Content-Length:")) + contentLength, err = strconv.Atoi(val) + if err != nil { + return nil, fmt.Errorf("invalid Content-Length: %w", err) + } + } + } + if contentLength == 0 { + return nil, errors.New("missing Content-Length header") + } + + // Read body. 
+ body := make([]byte, contentLength) + if _, err := io.ReadFull(s.in, body); err != nil { + return nil, fmt.Errorf("read body: %w", err) + } + + var msg jsonrpcMessage + if err := json.Unmarshal(body, &msg); err != nil { + return nil, fmt.Errorf("unmarshal: %w", err) + } + return &msg, nil +} + +func (s *Server) sendResponse(id json.RawMessage, result any) { + s.writeMessage(&jsonrpcMessage{ + JSONRPC: "2.0", + ID: id, + Result: result, + }) +} + +func (s *Server) sendNotification(method string, params any) { + raw, _ := json.Marshal(params) + s.writeMessage(&jsonrpcMessage{ + JSONRPC: "2.0", + Method: method, + Params: raw, + }) +} + +func (s *Server) writeMessage(msg *jsonrpcMessage) { + body, err := json.Marshal(msg) + if err != nil { + s.logger.Printf("marshal error: %v", err) + return + } + + s.mu.Lock() + defer s.mu.Unlock() + header := fmt.Sprintf("Content-Length: %d\r\n\r\n", len(body)) + if _, err := io.WriteString(s.out, header); err != nil { + s.logger.Printf("write header error: %v", err) + return + } + if _, err := s.out.Write(body); err != nil { + s.logger.Printf("write body error: %v", err) + } +} + +func (s *Server) handleMessage(msg *jsonrpcMessage) { + switch msg.Method { + case "initialize": + s.handleInitialize(msg) + case "initialized": + // no-op + case "shutdown": + s.shutdown = true + s.sendResponse(msg.ID, nil) + case "exit": + // Handled by Run() loop — exit is signalled via s.shutdown. + case "textDocument/didOpen": + s.handleDidOpen(msg) + case "textDocument/didChange": + s.handleDidChange(msg) + case "textDocument/didClose": + s.handleDidClose(msg) + case "textDocument/completion": + s.handleCompletion(msg) + default: + // Unknown method — if it has an ID it's a request, respond with method not found. 
+ if msg.ID != nil { + s.writeMessage(&jsonrpcMessage{ + JSONRPC: "2.0", + ID: msg.ID, + Error: &jsonrpcError{Code: -32601, Message: "method not found: " + msg.Method}, + }) + } + } +} + +func (s *Server) handleInitialize(msg *jsonrpcMessage) { + s.sendResponse(msg.ID, initializeResult{ + Capabilities: serverCapabilities{ + TextDocumentSync: textDocumentSyncOptions{ + OpenClose: true, + Change: 1, // Full + }, + CompletionProvider: &completionOptions{ + TriggerCharacters: []string{".", "$", "@"}, + }, + }, + ServerInfo: serverInfo{ + Name: "bloblang2-lsp", + Version: "0.1.0", + }, + }) +} + +func (s *Server) handleDidOpen(msg *jsonrpcMessage) { + var params didOpenTextDocumentParams + if err := json.Unmarshal(msg.Params, &params); err != nil { + s.logger.Printf("didOpen unmarshal: %v", err) + return + } + s.docs.open(params.TextDocument.URI, params.TextDocument.Text) + s.diagnose(params.TextDocument.URI) +} + +func (s *Server) handleDidChange(msg *jsonrpcMessage) { + var params didChangeTextDocumentParams + if err := json.Unmarshal(msg.Params, &params); err != nil { + s.logger.Printf("didChange unmarshal: %v", err) + return + } + if len(params.ContentChanges) > 0 { + // Full sync: take the last change event. + text := params.ContentChanges[len(params.ContentChanges)-1].Text + s.docs.update(params.TextDocument.URI, text) + s.debounceDiagnose(params.TextDocument.URI) + } +} + +func (s *Server) handleDidClose(msg *jsonrpcMessage) { + var params didCloseTextDocumentParams + if err := json.Unmarshal(msg.Params, &params); err != nil { + s.logger.Printf("didClose unmarshal: %v", err) + return + } + // Clear diagnostics before closing. 
+ s.sendNotification("textDocument/publishDiagnostics", publishDiagnosticsParams{ + URI: params.TextDocument.URI, + Diagnostics: []diagnostic{}, + }) + s.docs.close(params.TextDocument.URI) +} + +func (s *Server) handleCompletion(msg *jsonrpcMessage) { + var params completionParams + if err := json.Unmarshal(msg.Params, &params); err != nil { + s.logger.Printf("completion unmarshal: %v", err) + s.sendResponse(msg.ID, []completionItem{}) + return + } + + text, prog, ok := s.docs.get(params.TextDocument.URI) + if !ok { + s.sendResponse(msg.ID, []completionItem{}) + return + } + + items := s.completion.complete(text, prog, params.Position, params.Context) + s.sendResponse(msg.ID, items) +} + +// debounceDiagnose schedules a diagnosis for the given URI after a short delay. +func (s *Server) debounceDiagnose(uri string) { + s.timersMu.Lock() + defer s.timersMu.Unlock() + + if t, ok := s.timers[uri]; ok { + t.Stop() + } + s.timers[uri] = time.AfterFunc(80*time.Millisecond, func() { + s.diagnose(uri) + }) +} diff --git a/internal/bloblang2/go/lsp/server_test.go b/internal/bloblang2/go/lsp/server_test.go new file mode 100644 index 000000000..9a12e09c3 --- /dev/null +++ b/internal/bloblang2/go/lsp/server_test.go @@ -0,0 +1,342 @@ +package lsp + +import ( + "bytes" + "encoding/json" + "fmt" + "io" + "strings" + "testing" +) + +// lspMessage builds a JSON-RPC message with Content-Length header. 
+func lspMessage(msg jsonrpcMessage) []byte { + body, _ := json.Marshal(msg) + return []byte(fmt.Sprintf("Content-Length: %d\r\n\r\n%s", len(body), body)) +} + +func requestMsg(id int, method string, params any) jsonrpcMessage { + raw, _ := json.Marshal(params) + return jsonrpcMessage{ + JSONRPC: "2.0", + ID: json.RawMessage(fmt.Sprintf("%d", id)), + Method: method, + Params: raw, + } +} + +func notifyMsg(method string, params any) jsonrpcMessage { + raw, _ := json.Marshal(params) + return jsonrpcMessage{ + JSONRPC: "2.0", + Method: method, + Params: raw, + } +} + +// parseResponses reads all JSON-RPC messages from a buffer. +func parseResponses(data []byte) ([]jsonrpcMessage, error) { + var msgs []jsonrpcMessage + r := bytes.NewReader(data) + for r.Len() > 0 { + // Read Content-Length header. + var header string + for { + b, err := r.ReadByte() + if err != nil { + if err == io.EOF && len(msgs) > 0 { + return msgs, nil + } + return msgs, err + } + header += string(b) + if strings.HasSuffix(header, "\r\n\r\n") { + break + } + } + + var contentLength int + for _, line := range strings.Split(header, "\r\n") { + if strings.HasPrefix(line, "Content-Length:") { + val := strings.TrimSpace(strings.TrimPrefix(line, "Content-Length:")) + if _, err := fmt.Sscanf(val, "%d", &contentLength); err != nil { + return msgs, fmt.Errorf("invalid Content-Length: %w", err) + } + } + } + if contentLength == 0 { + continue + } + + body := make([]byte, contentLength) + if _, err := io.ReadFull(r, body); err != nil { + return msgs, err + } + + var msg jsonrpcMessage + if err := json.Unmarshal(body, &msg); err != nil { + return msgs, err + } + msgs = append(msgs, msg) + } + return msgs, nil +} + +func TestServerInitializeShutdown(t *testing.T) { + var input bytes.Buffer + var output bytes.Buffer + + // Send initialize, initialized, shutdown, exit. 
+ input.Write(lspMessage(requestMsg(1, "initialize", map[string]any{ + "rootUri": "file:///workspace", + }))) + input.Write(lspMessage(notifyMsg("initialized", struct{}{}))) + input.Write(lspMessage(requestMsg(2, "shutdown", nil))) + input.Write(lspMessage(notifyMsg("exit", nil))) + + s := NewServer(&input, &output) + if err := s.Run(); err != nil { + t.Fatalf("Run() error: %v", err) + } + + msgs, err := parseResponses(output.Bytes()) + if err != nil { + t.Fatalf("parseResponses: %v", err) + } + + // Expect initialize response and shutdown response. + if len(msgs) < 2 { + t.Fatalf("expected at least 2 responses, got %d", len(msgs)) + } + + // Check initialize response. + initResp := msgs[0] + if string(initResp.ID) != "1" { + t.Errorf("initialize response ID = %s, want 1", initResp.ID) + } + if initResp.Error != nil { + t.Errorf("initialize returned error: %s", initResp.Error.Message) + } + + // Verify capabilities are present. + raw, _ := json.Marshal(initResp.Result) + var result initializeResult + if err := json.Unmarshal(raw, &result); err != nil { + t.Fatalf("unmarshal init result: %v", err) + } + if result.ServerInfo.Name != "bloblang2-lsp" { + t.Errorf("server name = %q, want %q", result.ServerInfo.Name, "bloblang2-lsp") + } + if !result.Capabilities.TextDocumentSync.OpenClose { + t.Error("expected openClose = true") + } + if result.Capabilities.CompletionProvider == nil { + t.Error("expected completionProvider to be set") + } +} + +func TestServerDidOpenPublishesDiagnostics(t *testing.T) { + var input bytes.Buffer + var output bytes.Buffer + + // Initialize, then open a file with an error. 
+ input.Write(lspMessage(requestMsg(1, "initialize", map[string]any{}))) + input.Write(lspMessage(notifyMsg("initialized", struct{}{}))) + input.Write(lspMessage(notifyMsg("textDocument/didOpen", didOpenTextDocumentParams{ + TextDocument: textDocumentItem{ + URI: "file:///test.blobl2", + LanguageID: "blobl2", + Version: 1, + Text: "output = $undeclared", + }, + }))) + input.Write(lspMessage(requestMsg(2, "shutdown", nil))) + input.Write(lspMessage(notifyMsg("exit", nil))) + + s := NewServer(&input, &output) + if err := s.Run(); err != nil { + t.Fatalf("Run() error: %v", err) + } + + msgs, err := parseResponses(output.Bytes()) + if err != nil { + t.Fatalf("parseResponses: %v", err) + } + + // Find the publishDiagnostics notification. + var diagParams publishDiagnosticsParams + found := false + for _, msg := range msgs { + if msg.Method == "textDocument/publishDiagnostics" { + if err := json.Unmarshal(msg.Params, &diagParams); err != nil { + t.Fatalf("unmarshal diagnostics: %v", err) + } + found = true + break + } + } + + if !found { + t.Fatal("expected publishDiagnostics notification") + } + if diagParams.URI != "file:///test.blobl2" { + t.Errorf("diagnostics URI = %q, want %q", diagParams.URI, "file:///test.blobl2") + } + if len(diagParams.Diagnostics) == 0 { + t.Fatal("expected at least one diagnostic for undeclared variable") + } + if !strings.Contains(diagParams.Diagnostics[0].Message, "undeclared") { + t.Errorf("diagnostic message = %q, expected it to contain 'undeclared'", diagParams.Diagnostics[0].Message) + } +} + +func TestServerDidOpenCleanFile(t *testing.T) { + var input bytes.Buffer + var output bytes.Buffer + + input.Write(lspMessage(requestMsg(1, "initialize", map[string]any{}))) + input.Write(lspMessage(notifyMsg("initialized", struct{}{}))) + input.Write(lspMessage(notifyMsg("textDocument/didOpen", didOpenTextDocumentParams{ + TextDocument: textDocumentItem{ + URI: "file:///clean.blobl2", + LanguageID: "blobl2", + Version: 1, + Text: "output = 
input.name.uppercase()", + }, + }))) + input.Write(lspMessage(requestMsg(2, "shutdown", nil))) + input.Write(lspMessage(notifyMsg("exit", nil))) + + s := NewServer(&input, &output) + if err := s.Run(); err != nil { + t.Fatalf("Run() error: %v", err) + } + + msgs, err := parseResponses(output.Bytes()) + if err != nil { + t.Fatalf("parseResponses: %v", err) + } + + for _, msg := range msgs { + if msg.Method == "textDocument/publishDiagnostics" { + var diagParams publishDiagnosticsParams + if err := json.Unmarshal(msg.Params, &diagParams); err != nil { + t.Fatalf("unmarshal: %v", err) + } + if len(diagParams.Diagnostics) != 0 { + t.Errorf("expected 0 diagnostics for clean file, got %d", len(diagParams.Diagnostics)) + for _, d := range diagParams.Diagnostics { + t.Logf(" diagnostic: %s", d.Message) + } + } + return + } + } + t.Fatal("expected publishDiagnostics notification for clean file") +} + +func TestServerCompletion(t *testing.T) { + var input bytes.Buffer + var output bytes.Buffer + + input.Write(lspMessage(requestMsg(1, "initialize", map[string]any{}))) + input.Write(lspMessage(notifyMsg("initialized", struct{}{}))) + // Open a file so the document store has content. + input.Write(lspMessage(notifyMsg("textDocument/didOpen", didOpenTextDocumentParams{ + TextDocument: textDocumentItem{ + URI: "file:///comp.blobl2", + LanguageID: "blobl2", + Version: 1, + Text: "$foo = 1\noutput = input.", + }, + }))) + // Request completion after the dot. 
+ input.Write(lspMessage(requestMsg(2, "textDocument/completion", completionParams{ + TextDocument: textDocumentIdentifier{URI: "file:///comp.blobl2"}, + Position: position{Line: 1, Character: 15}, + Context: &completionContext{TriggerKind: 2, TriggerCharacter: "."}, + }))) + input.Write(lspMessage(requestMsg(3, "shutdown", nil))) + input.Write(lspMessage(notifyMsg("exit", nil))) + + s := NewServer(&input, &output) + if err := s.Run(); err != nil { + t.Fatalf("Run() error: %v", err) + } + + msgs, err := parseResponses(output.Bytes()) + if err != nil { + t.Fatalf("parseResponses: %v", err) + } + + // Find the completion response (id=2). + var completionResp jsonrpcMessage + found := false + for _, msg := range msgs { + if string(msg.ID) == "2" { + completionResp = msg + found = true + break + } + } + if !found { + t.Fatal("expected completion response with id=2") + } + + raw, _ := json.Marshal(completionResp.Result) + var items []completionItem + if err := json.Unmarshal(raw, &items); err != nil { + t.Fatalf("unmarshal completion items: %v", err) + } + if len(items) == 0 { + t.Fatal("expected completion items after dot") + } + + // All items should be methods. 
+ labels := make(map[string]bool) + for _, item := range items { + labels[item.Label] = true + if item.Kind != completionKindMethod { + t.Errorf("completion item %q has kind %d, want %d (method)", item.Label, item.Kind, completionKindMethod) + } + } + if !labels["uppercase"] { + t.Error("expected 'uppercase' in method completions") + } + if !labels["filter"] { + t.Error("expected 'filter' in method completions") + } +} + +func TestServerUnknownMethodReturnsError(t *testing.T) { + var input bytes.Buffer + var output bytes.Buffer + + input.Write(lspMessage(requestMsg(1, "initialize", map[string]any{}))) + input.Write(lspMessage(notifyMsg("initialized", struct{}{}))) + input.Write(lspMessage(requestMsg(2, "textDocument/bogus", nil))) + input.Write(lspMessage(requestMsg(3, "shutdown", nil))) + input.Write(lspMessage(notifyMsg("exit", nil))) + + s := NewServer(&input, &output) + if err := s.Run(); err != nil { + t.Fatalf("Run() error: %v", err) + } + + msgs, err := parseResponses(output.Bytes()) + if err != nil { + t.Fatalf("parseResponses: %v", err) + } + + for _, msg := range msgs { + if string(msg.ID) == "2" { + if msg.Error == nil { + t.Error("expected error response for unknown method") + } else if msg.Error.Code != -32601 { + t.Errorf("error code = %d, want -32601", msg.Error.Code) + } + return + } + } + t.Fatal("expected response for unknown method request") +} diff --git a/internal/bloblang2/plugins/nvim/.gitignore b/internal/bloblang2/plugins/nvim/.gitignore new file mode 100644 index 000000000..fbf44fd86 --- /dev/null +++ b/internal/bloblang2/plugins/nvim/.gitignore @@ -0,0 +1,3 @@ +# Build artifacts +parser/ +bin/ diff --git a/internal/bloblang2/plugins/nvim/Taskfile.yml b/internal/bloblang2/plugins/nvim/Taskfile.yml new file mode 100644 index 000000000..b8370397d --- /dev/null +++ b/internal/bloblang2/plugins/nvim/Taskfile.yml @@ -0,0 +1,53 @@ +version: "3" + +vars: + TS_SRC: "{{.TASKFILE_DIR}}/../../tree-sitter/src" + PARSER_OUT: 
"{{.TASKFILE_DIR}}/parser/bloblang2.so" + LSP_OUT: "{{.TASKFILE_DIR}}/bin/bloblang2-lsp" + # Use -dynamiclib on macOS, -shared on Linux. + SHARED_FLAG: + sh: | + if [ "$(uname)" = "Darwin" ]; then + echo "-dynamiclib" + else + echo "-shared" + fi + +tasks: + default: + desc: Build tree-sitter parser and LSP binary + deps: [parser, lsp] + + parser: + desc: Compile the tree-sitter parser into a shared library + cmds: + - mkdir -p parser + - >- + cc {{.SHARED_FLAG}} -fPIC -O2 + -o {{.PARSER_OUT}} + {{.TS_SRC}}/parser.c {{.TS_SRC}}/scanner.c + -I{{.TS_SRC}} + sources: + - "{{.TS_SRC}}/parser.c" + - "{{.TS_SRC}}/scanner.c" + generates: + - "{{.PARSER_OUT}}" + + lsp: + desc: Build the bloblang2-lsp binary + dir: "{{.TASKFILE_DIR}}/../../../.." + vars: + LSP_SRC: "{{.TASKFILE_DIR}}/../../go/lsp" + cmds: + - mkdir -p {{.TASKFILE_DIR}}/bin + - go build -o {{.TASKFILE_DIR}}/bin/bloblang2-lsp ./internal/bloblang2/go/lsp/cmd/bloblang2-lsp + sources: + - "{{.LSP_SRC}}/*.go" + - "{{.LSP_SRC}}/cmd/bloblang2-lsp/*.go" + generates: + - "{{.LSP_OUT}}" + + clean: + desc: Remove build artifacts + cmds: + - rm -f {{.PARSER_OUT}} {{.LSP_OUT}} diff --git a/internal/bloblang2/plugins/nvim/ftdetect/bloblang2.lua b/internal/bloblang2/plugins/nvim/ftdetect/bloblang2.lua new file mode 100644 index 000000000..8bbf6f870 --- /dev/null +++ b/internal/bloblang2/plugins/nvim/ftdetect/bloblang2.lua @@ -0,0 +1,5 @@ +vim.filetype.add({ + extension = { + blobl2 = "blobl2", + }, +}) diff --git a/internal/bloblang2/plugins/nvim/ftplugin/blobl2.lua b/internal/bloblang2/plugins/nvim/ftplugin/blobl2.lua new file mode 100644 index 000000000..ea55aa57c --- /dev/null +++ b/internal/bloblang2/plugins/nvim/ftplugin/blobl2.lua @@ -0,0 +1 @@ +vim.bo.commentstring = "# %s" diff --git a/internal/bloblang2/plugins/nvim/lua/bloblang2/health.lua b/internal/bloblang2/plugins/nvim/lua/bloblang2/health.lua new file mode 100644 index 000000000..795b6c5b4 --- /dev/null +++ 
b/internal/bloblang2/plugins/nvim/lua/bloblang2/health.lua @@ -0,0 +1,40 @@ +local M = {} + +local function plugin_dir() + local source = debug.getinfo(1, "S").source:sub(2) + return vim.fn.fnamemodify(source, ":h:h:h") +end + +M.check = function() + vim.health.start("bloblang2") + + -- Check Neovim version. + if vim.fn.has("nvim-0.10") == 1 then + vim.health.ok("Neovim >= 0.10") + else + vim.health.error("Neovim >= 0.10 required for vim.treesitter.language.add()") + end + + local dir = plugin_dir() + + -- Check tree-sitter parser. + local so_path = dir .. "/parser/bloblang2.so" + local dylib_path = dir .. "/parser/bloblang2.dylib" + if vim.fn.filereadable(so_path) == 1 then + vim.health.ok("Tree-sitter parser found: " .. so_path) + elseif vim.fn.filereadable(dylib_path) == 1 then + vim.health.ok("Tree-sitter parser found: " .. dylib_path) + else + vim.health.warn("Tree-sitter parser not found. Run 'task parser' in " .. dir) + end + + -- Check LSP binary. + local lsp_path = dir .. "/bin/bloblang2-lsp" + if vim.fn.executable(lsp_path) == 1 then + vim.health.ok("LSP binary found: " .. lsp_path) + else + vim.health.warn("LSP binary not found. Run 'task lsp' in " .. dir) + end +end + +return M diff --git a/internal/bloblang2/plugins/nvim/lua/bloblang2/init.lua b/internal/bloblang2/plugins/nvim/lua/bloblang2/init.lua new file mode 100644 index 000000000..85517c6e4 --- /dev/null +++ b/internal/bloblang2/plugins/nvim/lua/bloblang2/init.lua @@ -0,0 +1,83 @@ +local M = {} + +--- Resolve the root directory of this plugin. +local function plugin_dir() + local source = debug.getinfo(1, "S").source:sub(2) -- strip leading @ + -- .../plugins/nvim/lua/bloblang2/init.lua -> .../plugins/nvim + return vim.fn.fnamemodify(source, ":h:h:h") +end + +local _dir = plugin_dir() + +--- Default options. +local defaults = { + lsp = { + cmd = nil, -- defaults to /bin/bloblang2-lsp + enabled = true, + }, +} + +--- Merged user options. 
+local opts = {} + +--- Register the tree-sitter parser and enable highlighting. +local function setup_treesitter() + local parser_path = _dir .. "/parser/bloblang2.so" + if vim.fn.filereadable(parser_path) == 0 then + -- Try .dylib for macOS builds. + parser_path = _dir .. "/parser/bloblang2.dylib" + end + if vim.fn.filereadable(parser_path) == 0 then + vim.notify("[bloblang2] tree-sitter parser not found. Run 'task parser' in the plugin directory.", vim.log.levels.WARN) + return + end + + vim.treesitter.language.add("bloblang2", { + path = parser_path, + filetype = "blobl2", + }) +end + +--- Start the LSP client for a buffer. +local function start_lsp(buf) + if not opts.lsp.enabled then + return + end + + local cmd = opts.lsp.cmd + if not cmd then + local bin = _dir .. "/bin/bloblang2-lsp" + if vim.fn.executable(bin) == 0 then + vim.notify("[bloblang2] LSP binary not found. Run 'task lsp' in the plugin directory.", vim.log.levels.WARN) + return + end + cmd = { bin } + end + + vim.lsp.start({ + name = "bloblang2-lsp", + cmd = cmd, + root_dir = vim.fs.root(buf, { ".git" }) or vim.fn.getcwd(), + }) +end + +--- Setup the bloblang2 plugin. +---@param user_opts? table +function M.setup(user_opts) + opts = vim.tbl_deep_extend("force", defaults, user_opts or {}) + + setup_treesitter() + + vim.api.nvim_create_autocmd("FileType", { + pattern = "blobl2", + callback = function(args) + -- Enable tree-sitter highlighting. + vim.treesitter.start(args.buf, "bloblang2") + + -- Start the LSP client. 
+ start_lsp(args.buf) + end, + }) +end + +return M diff --git a/internal/bloblang2/plugins/nvim/queries/bloblang2/highlights.scm b/internal/bloblang2/plugins/nvim/queries/bloblang2/highlights.scm new file mode 100644 index 000000000..f3f3b8fff --- /dev/null +++ b/internal/bloblang2/plugins/nvim/queries/bloblang2/highlights.scm @@ -0,0 +1,77 @@ +; Keywords +["if" "else" "match" "as" "map" "import"] @keyword + +; Context roots +["input" "output"] @keyword.builtin + +; Constants +["true" "false"] @constant.builtin +"null" @constant.builtin + +; Discard +"_" @variable.builtin + +; Literals +(integer) @number +(float) @number.float +(string) @string +(raw_string) @string +(string_content) @string +(escape_sequence) @string.escape + +; Variables ($name) +(variable) @variable + +; Map declarations +(map_declaration + name: (identifier) @function) + +; Function calls +(call_expression + name: (identifier) @function.call) +(call_expression + name: (qualified_name + namespace: (identifier) @module + name: (identifier) @function.call)) +"deleted" @function.builtin +"throw" @function.builtin +"void" @function.builtin + +; Method calls +(method_call + method: (_) @function.method) +(null_safe_method_call + method: (_) @function.method) + +; Field access +(field_access + field: (_) @property) +(null_safe_field_access + field: (_) @property) + +; Parameters +(parameter + (identifier) @variable.parameter) + +; Named arguments +(named_argument + name: (identifier) @variable.parameter) + +; Metadata +(metadata_access) @attribute + +; Qualified name namespace +(qualified_name + namespace: (identifier) @module) + +; Operators +["+" "-" "*" "/" "%" "==" "!=" ">" ">=" "<" "<=" "&&" "||" "!"] @operator +["=" "=>" "->"] @operator + +; Punctuation +["." "?." 
"::"] @punctuation.delimiter +["(" ")" "[" "]" "?[" "{" "}"] @punctuation.bracket +["," ":"] @punctuation.delimiter + +; Comments +(comment) @comment From 87ccaa6b98e1e36d55d85b79acedf80b4dffba60 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Thu, 30 Apr 2026 14:45:48 +0100 Subject: [PATCH 08/20] bloblang(v2): Add tree-sitter grammar and demo web UI Adds internal/bloblang2/tree-sitter/, a tree-sitter grammar for V2 syntax with corpus tests, plus the syntax highlighting query used by the Neovim plugin. Also adds internal/bloblang2/demo/, a small Go-served web playground with a Monaco-style editor, a case-study dropdown of real-world mappings, an engine selector for switching between server-side and browser-side execution (the latter via the TypeScript runtime), and syntax highlighting backed by tree-sitter. --- internal/bloblang2/demo/.gitignore | 7 + internal/bloblang2/demo/main.go | 329 + internal/bloblang2/demo/page.html | 668 + internal/bloblang2/tree-sitter/.gitignore | 2 + internal/bloblang2/tree-sitter/Taskfile.yml | 44 + internal/bloblang2/tree-sitter/grammar.js | 476 + .../bloblang2/tree-sitter/package-lock.json | 29 + internal/bloblang2/tree-sitter/package.json | 13 + .../tree-sitter/queries/highlights.scm | 77 + .../bloblang2/tree-sitter/src/grammar.json | 2451 ++ .../bloblang2/tree-sitter/src/node-types.json | 4553 ++++ internal/bloblang2/tree-sitter/src/parser.c | 22228 ++++++++++++++++ internal/bloblang2/tree-sitter/src/scanner.c | 145 + .../tree-sitter/src/tree_sitter/alloc.h | 54 + .../tree-sitter/src/tree_sitter/array.h | 291 + .../tree-sitter/src/tree_sitter/parser.h | 266 + .../tree-sitter/test/corpus/assignments.txt | 126 + .../tree-sitter/test/corpus/comments.txt | 28 + .../tree-sitter/test/corpus/control_flow.txt | 182 + .../tree-sitter/test/corpus/imports.txt | 39 + .../tree-sitter/test/corpus/lambdas.txt | 107 + .../tree-sitter/test/corpus/literals.txt | 212 + .../tree-sitter/test/corpus/maps.txt | 116 + 
.../tree-sitter/test/corpus/methods.txt | 196 + .../tree-sitter/test/corpus/multiline.txt | 341 + .../tree-sitter/test/corpus/operators.txt | 139 + .../test/speccompat/speccompat_test.go | 186 + .../bloblang2/tree-sitter/tree-sitter.json | 17 + 28 files changed, 33322 insertions(+) create mode 100644 internal/bloblang2/demo/.gitignore create mode 100644 internal/bloblang2/demo/main.go create mode 100644 internal/bloblang2/demo/page.html create mode 100644 internal/bloblang2/tree-sitter/.gitignore create mode 100644 internal/bloblang2/tree-sitter/Taskfile.yml create mode 100644 internal/bloblang2/tree-sitter/grammar.js create mode 100644 internal/bloblang2/tree-sitter/package-lock.json create mode 100644 internal/bloblang2/tree-sitter/package.json create mode 100644 internal/bloblang2/tree-sitter/queries/highlights.scm create mode 100644 internal/bloblang2/tree-sitter/src/grammar.json create mode 100644 internal/bloblang2/tree-sitter/src/node-types.json create mode 100644 internal/bloblang2/tree-sitter/src/parser.c create mode 100644 internal/bloblang2/tree-sitter/src/scanner.c create mode 100644 internal/bloblang2/tree-sitter/src/tree_sitter/alloc.h create mode 100644 internal/bloblang2/tree-sitter/src/tree_sitter/array.h create mode 100644 internal/bloblang2/tree-sitter/src/tree_sitter/parser.h create mode 100644 internal/bloblang2/tree-sitter/test/corpus/assignments.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/comments.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/control_flow.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/imports.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/lambdas.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/literals.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/maps.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/methods.txt create mode 100644 internal/bloblang2/tree-sitter/test/corpus/multiline.txt create 
mode 100644 internal/bloblang2/tree-sitter/test/corpus/operators.txt create mode 100644 internal/bloblang2/tree-sitter/test/speccompat/speccompat_test.go create mode 100644 internal/bloblang2/tree-sitter/tree-sitter.json diff --git a/internal/bloblang2/demo/.gitignore b/internal/bloblang2/demo/.gitignore new file mode 100644 index 000000000..7f7fd2caa --- /dev/null +++ b/internal/bloblang2/demo/.gitignore @@ -0,0 +1,7 @@ +# Built by: task sync-demo (in ../tree-sitter/) +tree-sitter-bloblang2.wasm +highlights.scm + +# Built by: npm run bundle (in ../ts/) +bloblang2.mjs +bloblang2.mjs.map diff --git a/internal/bloblang2/demo/main.go b/internal/bloblang2/demo/main.go new file mode 100644 index 000000000..bc0ab25ab --- /dev/null +++ b/internal/bloblang2/demo/main.go @@ -0,0 +1,329 @@ +package main + +import ( + "context" + "encoding/json" + "flag" + "fmt" + "log" + "net/http" + "os" + "os/exec" + "os/signal" + "path/filepath" + "runtime" + "sort" + "strings" + "syscall" + "time" + + _ "embed" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + + "gopkg.in/yaml.v3" +) + +//go:embed page.html +var pageHTML []byte + +// Shared browser assets (tree-sitter WASM, highlights query, TS bundle) are +// generated by the V2 build pipeline (ts/bundle.mjs, tree-sitter sync-demo) +// and written into this directory. They are not checked in, so we serve them +// from disk at request time rather than embedding — that way `go test ./...` +// on a fresh clone still compiles this package without those artefacts +// present. +var assetsDir = func() string { + _, thisFile, _, _ := runtime.Caller(0) + return filepath.Dir(thisFile) +}() + +// Cached at startup since they don't change. 
+var ( + stdlibMethods map[string]syntax.MethodInfo + stdlibFunctions map[string]syntax.FunctionInfo + stdlibMethodOpcodes map[string]uint16 + stdlibFunctionOpcodes map[string]uint16 +) + +func init() { + stdlibMethods, stdlibFunctions = eval.StdlibNames() + stdlibMethodOpcodes, stdlibFunctionOpcodes = eval.StdlibOpcodes() +} + +type executeRequest struct { + Mapping string `json:"mapping"` + Input string `json:"input"` +} + +type posError struct { + Line int `json:"line"` + Column int `json:"column"` + Message string `json:"message"` +} + +type executeResponse struct { + Result string `json:"result,omitempty"` + ParseErrors []posError `json:"parse_errors,omitempty"` + RuntimeError string `json:"runtime_error,omitempty"` +} + +func handleExecute(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + return + } + + var req executeRequest + if err := json.NewDecoder(r.Body).Decode(&req); err != nil { + http.Error(w, err.Error(), http.StatusBadRequest) + return + } + + var resp executeResponse + defer func() { + w.Header().Set("Content-Type", "application/json") + _ = json.NewEncoder(w).Encode(resp) + }() + + // 1. Parse. + prog, errs := syntax.Parse(req.Mapping, "", nil) + if len(errs) > 0 { + resp.ParseErrors = posErrorsFromSyntax(errs) + return + } + + // 2. Optimize. + syntax.Optimize(prog) + + // 3. Resolve. + if resolveErrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: stdlibMethods, + Functions: stdlibFunctions, + MethodOpcodes: stdlibMethodOpcodes, + FunctionOpcodes: stdlibFunctionOpcodes, + }); len(resolveErrs) > 0 { + resp.ParseErrors = posErrorsFromSyntax(resolveErrs) + return + } + + // 4. Parse input JSON. + var inputVal any + if err := json.Unmarshal([]byte(req.Input), &inputVal); err != nil { + resp.RuntimeError = fmt.Sprintf("invalid input JSON: %v", err) + return + } + + // 5. Execute. 
+ interp := eval.New(prog) + interp.RegisterStdlib() + interp.RegisterLambdaMethods() + + output, _, deleted, err := interp.Run(inputVal, map[string]any{}) + if err != nil { + resp.RuntimeError = err.Error() + return + } + if deleted { + resp.Result = "< message deleted >" + return + } + + outBytes, err := json.MarshalIndent(output, "", " ") + if err != nil { + resp.RuntimeError = fmt.Sprintf("failed to marshal output: %v", err) + return + } + resp.Result = string(outBytes) +} + +type completionItem struct { + Label string `json:"label"` + Kind string `json:"kind"` // "method" or "function" +} + +var cachedCompletions []byte + +func init() { + var items []completionItem + for name := range stdlibMethods { + items = append(items, completionItem{Label: name, Kind: "method"}) + } + for name := range stdlibFunctions { + items = append(items, completionItem{Label: name, Kind: "function"}) + } + cachedCompletions, _ = json.Marshal(items) +} + +func handleCompletions(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "application/json") + w.Header().Set("Cache-Control", "public, max-age=3600") + _, _ = w.Write(cachedCompletions) +} + +// caseStudiesDir returns the absolute path to the case_studies directory, +// derived from the source file location so it works with `go run`. 
+func caseStudiesDir() string { + _, thisFile, _, _ := runtime.Caller(0) + return filepath.Join(filepath.Dir(thisFile), "..", "spec", "tests", "case_studies") +} + +type caseStudySpec struct { + Description string `yaml:"description"` + Tests []caseStudyTest `yaml:"tests"` +} + +type caseStudyTest struct { + Name string `yaml:"name"` + Mapping string `yaml:"mapping"` + Input any `yaml:"input"` +} + +type caseStudyItem struct { + File string `json:"file"` + Name string `json:"name"` + Description string `json:"description"` + Mapping string `json:"mapping"` + Input string `json:"input"` +} + +func handleCaseStudies(w http.ResponseWriter, r *http.Request) { + dir := caseStudiesDir() + entries, err := os.ReadDir(dir) + if err != nil { + http.Error(w, "case studies not found", http.StatusNotFound) + return + } + + var items []caseStudyItem + for _, entry := range entries { + if entry.IsDir() || !strings.HasSuffix(entry.Name(), ".yaml") { + continue + } + data, err := os.ReadFile(filepath.Join(dir, entry.Name())) + if err != nil { + continue + } + var spec caseStudySpec + if err := yaml.Unmarshal(data, &spec); err != nil { + continue + } + for _, t := range spec.Tests { + if t.Mapping == "" { + continue + } + inputJSON, err := json.MarshalIndent(t.Input, "", " ") + if err != nil { + continue + } + items = append(items, caseStudyItem{ + File: entry.Name(), + Name: t.Name, + Description: strings.TrimSpace(spec.Description), + Mapping: t.Mapping, + Input: string(inputJSON), + }) + } + } + + sort.Slice(items, func(i, j int) bool { + return items[i].Name < items[j].Name + }) + + w.Header().Set("Content-Type", "application/json") + _ = json.NewEncoder(w).Encode(items) +} + +// serveAsset serves a generated asset from this package's directory. If the +// asset is missing we return a 404 with a hint pointing at the build task, +// rather than crashing at startup. 
+func serveAsset(name, contentType string) http.HandlerFunc { + return func(w http.ResponseWriter, r *http.Request) { + path := filepath.Join(assetsDir, name) + data, err := os.ReadFile(path) + if err != nil { + http.Error(w, fmt.Sprintf("%s not available (run `task -d internal/bloblang2 build:demo` first): %v", name, err), http.StatusNotFound) + return + } + w.Header().Set("Content-Type", contentType) + w.Header().Set("Cache-Control", "public, max-age=3600") + _, _ = w.Write(data) + } +} + +func posErrorsFromSyntax(errs []syntax.PosError) []posError { + out := make([]posError, len(errs)) + for i, e := range errs { + out[i] = posError{ + Line: e.Pos.Line, + Column: e.Pos.Column, + Message: e.Msg, + } + } + return out +} + +func openBrowser(url string) { + var cmd string + switch runtime.GOOS { + case "darwin": + cmd = "open" + case "linux": + cmd = "xdg-open" + case "windows": + cmd = "rundll32" + _ = exec.Command(cmd, "url.dll,FileProtocolHandler", url).Start() + return + default: + return + } + _ = exec.Command(cmd, url).Start() +} + +func main() { + addr := flag.String("addr", ":4195", "listen address") + noOpen := flag.Bool("no-open", false, "don't open browser automatically") + flag.Parse() + + mux := http.NewServeMux() + mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "text/html; charset=utf-8") + _, _ = w.Write(pageHTML) + }) + mux.HandleFunc("/execute", handleExecute) + mux.HandleFunc("/completions", handleCompletions) + mux.HandleFunc("/case-studies", handleCaseStudies) + mux.HandleFunc("/tree-sitter-bloblang2.wasm", serveAsset("tree-sitter-bloblang2.wasm", "application/wasm")) + mux.HandleFunc("/highlights.scm", serveAsset("highlights.scm", "text/plain; charset=utf-8")) + mux.HandleFunc("/bloblang2.mjs", serveAsset("bloblang2.mjs", "application/javascript; charset=utf-8")) + mux.HandleFunc("/bloblang2.mjs.map", serveAsset("bloblang2.mjs.map", "application/json; charset=utf-8")) + + server := 
&http.Server{ + Addr: *addr, + Handler: mux, + ReadTimeout: 10 * time.Second, + WriteTimeout: 10 * time.Second, + } + + if !*noOpen { + openBrowser("http://localhost" + *addr) + } + + log.Printf("Bloblang V2 demo server listening on http://localhost%s", *addr) + log.Printf("WARNING: This server is for local demo purposes only. Do not expose to the internet.") + + go func() { + sigChan := make(chan os.Signal, 1) + signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM) + <-sigChan + log.Println("Shutting down...") + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + _ = server.Shutdown(ctx) + }() + + if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed { + log.Fatalf("Server error: %v", err) + } +} diff --git a/internal/bloblang2/demo/page.html b/internal/bloblang2/demo/page.html new file mode 100644 index 000000000..b09ca40ec --- /dev/null +++ b/internal/bloblang2/demo/page.html @@ -0,0 +1,668 @@ + + + + + Bloblang V2 Editor + + + +
+ [page.html markup not recoverable from this extraction; surviving panel labels: Input (JSON), Output, Mapping (Bloblang V2)]
+ + + + + + + + + + diff --git a/internal/bloblang2/tree-sitter/.gitignore b/internal/bloblang2/tree-sitter/.gitignore new file mode 100644 index 000000000..1dee5cb08 --- /dev/null +++ b/internal/bloblang2/tree-sitter/.gitignore @@ -0,0 +1,2 @@ +node_modules +tree-sitter-bloblang2.wasm diff --git a/internal/bloblang2/tree-sitter/Taskfile.yml b/internal/bloblang2/tree-sitter/Taskfile.yml new file mode 100644 index 000000000..d00da3cb5 --- /dev/null +++ b/internal/bloblang2/tree-sitter/Taskfile.yml @@ -0,0 +1,44 @@ +version: "3" + +tasks: + install: + desc: Install npm dependencies + cmds: + - npm install + sources: + - package.json + generates: + - node_modules/.package-lock.json + + generate: + desc: Generate the tree-sitter parser from grammar.js + deps: [install] + cmds: + - npx tree-sitter generate + + test: + desc: Run the tree-sitter corpus tests + deps: [generate] + cmds: + - npx tree-sitter test + + build-wasm: + desc: Compile the grammar to WASM (requires Docker or emscripten) + deps: [generate] + cmds: + - npx tree-sitter build --wasm -o tree-sitter-bloblang2.wasm + + sync-demo: + desc: Copy WASM and highlights.scm into the demo directory + deps: [build-wasm] + cmds: + - cp tree-sitter-bloblang2.wasm ../demo/tree-sitter-bloblang2.wasm + - cp queries/highlights.scm ../demo/highlights.scm + + all: + desc: Full rebuild — generate, test, build WASM, sync to demo + cmds: + - task: generate + - task: test + - task: build-wasm + - task: sync-demo diff --git a/internal/bloblang2/tree-sitter/grammar.js b/internal/bloblang2/tree-sitter/grammar.js new file mode 100644 index 000000000..675f1cc8a --- /dev/null +++ b/internal/bloblang2/tree-sitter/grammar.js @@ -0,0 +1,476 @@ +/// +// @ts-check + +const PREC = { + or: 10, + and: 20, + equality: 40, + comparison: 60, + additive: 80, + multiplicative: 100, + unary: 120, + postfix: 140, +}; + +module.exports = grammar({ + name: "bloblang2", + + // Two external tokens for newline handling: + // - _newline: significant 
newline (statement separator) + // - _nl_skip: newline consumed as whitespace (inside parens, brackets, etc.) + // + // When the parser is in a state where _newline is valid (between statements), + // the scanner checks for postfix continuation before emitting it. + // When _newline is not valid (inside expressions, argument lists, etc.), + // the scanner emits _nl_skip which is in extras and silently consumed. + externals: ($) => [$._newline, $._nl_skip], + + // Newlines are NOT in extras — they're handled by the external scanner. + // _nl_skip is in extras so newlines inside (), [], and expression contexts + // are silently consumed when the parser doesn't expect a statement separator. + extras: ($) => [/[ \t\r]/, $.comment, $._nl_skip], + + word: ($) => $.identifier, + + conflicts: ($) => [ + // '(' identifier ')' could be lambda params or parenthesized expression + [$.parameter, $._primary], + // 'match' '{' — boolean match body or object literal as subject + [$.match_expression, $.object], + // $var followed by '.' could be var_assignment path or field_access expression + [$.var_assignment, $._primary], + ], + + rules: { + // Statements separated by newlines. _newline is required between + // statements (enforcing "one statement per line"). Each _source_item + // is either a statement or a bare newline, but consecutive statements + // without an intervening newline will produce a parse error. 
+ source_file: ($) => repeat($._source_item), + + _source_item: ($) => choice($._top_level_statement, $._newline), + + _top_level_statement: ($) => + choice( + $.assignment, + $.if_statement, + $.match_statement, + $.map_declaration, + $.import_statement, + ), + + // --- Assignments --- + + assignment: ($) => seq($.assign_target, "=", $._expression), + + assign_target: ($) => + choice( + seq("output", optional($.metadata_access), repeat($.target_path_segment)), + seq($.variable, repeat($.target_path_segment)), + ), + + metadata_access: (_) => "@", + + target_path_segment: ($) => + choice( + seq(".", $._field_name), + seq(".", $._field_name, "(", optional($.argument_list), ")"), + seq("[", $._expression, "]"), + ), + + // --- Map declarations --- + + // map_decl := 'map' id '(' params ')' '{' NL? (var_assignment NL)* expression NL? '}' + map_declaration: ($) => + seq( + "map", + field("name", $.identifier), + "(", + optional($.parameter_list), + ")", + "{", + optional($._newline), + $.expr_body, + optional($._newline), + "}", + ), + + parameter_list: ($) => seq($.parameter, repeat(seq(",", $.parameter))), + + parameter: ($) => + choice( + seq($.identifier, "=", $._literal), + $.identifier, + "_", + ), + + // expr_body := (var_assignment NL)* expression + expr_body: ($) => seq(repeat(seq($.var_assignment, $._newline)), $._expression), + + var_assignment: ($) => + seq($.variable, repeat($.target_path_segment), "=", $._expression), + + // --- Imports --- + + import_statement: ($) => seq("import", $.string, "as", $.identifier), + + // --- Expressions --- + + _expression: ($) => + choice( + $._primary, + $.unary_expression, + $.binary_expression, + $.lambda_expression, + $.if_expression, + $.match_expression, + $.field_access, + $.null_safe_field_access, + $.method_call, + $.null_safe_method_call, + $.index, + $.null_safe_index, + ), + + _primary: ($) => + choice( + $.integer, + $.float, + $.string, + $.raw_string, + $.boolean, + $.null, + $.array, + $.object, + $.input, 
+ $.output, + $.variable, + $.identifier, + $.qualified_name, + $.call_expression, + $.parenthesized_expression, + ), + + input: ($) => seq("input", optional($.metadata_access)), + output: ($) => seq("output", optional($.metadata_access)), + + // --- Postfix operations --- + + field_access: ($) => + prec.left( + PREC.postfix, + seq(field("receiver", $._expression), ".", field("field", $._field_name)), + ), + + null_safe_field_access: ($) => + prec.left( + PREC.postfix, + seq(field("receiver", $._expression), "?.", field("field", $._field_name)), + ), + + method_call: ($) => + prec.left( + PREC.postfix + 1, + seq( + field("receiver", $._expression), + ".", + field("method", $._field_name), + "(", + optional($.argument_list), + ")", + ), + ), + + null_safe_method_call: ($) => + prec.left( + PREC.postfix + 1, + seq( + field("receiver", $._expression), + "?.", + field("method", $._field_name), + "(", + optional($.argument_list), + ")", + ), + ), + + index: ($) => + prec.left( + PREC.postfix, + seq(field("receiver", $._expression), "[", field("index", $._expression), "]"), + ), + + null_safe_index: ($) => + prec.left( + PREC.postfix, + seq(field("receiver", $._expression), "?[", field("index", $._expression), "]"), + ), + + // Field names can be identifiers (including keywords) or quoted strings. + // e.g., input.name, input.map, input."field with spaces" + _field_name: ($) => choice($._word, $.string), + + // _word includes keywords — used for field/method names after '.' and '?.' 
+ _word: ($) => + choice( + $.identifier, + alias("input", $.identifier), + alias("output", $.identifier), + alias("if", $.identifier), + alias("else", $.identifier), + alias("match", $.identifier), + alias("as", $.identifier), + alias("map", $.identifier), + alias("import", $.identifier), + alias("true", $.identifier), + alias("false", $.identifier), + alias("null", $.identifier), + alias("deleted", $.identifier), + alias("throw", $.identifier), + alias("void", $.identifier), + ), + + // --- Function/method calls --- + + call_expression: ($) => + prec( + PREC.postfix, + seq( + field("name", choice($.identifier, $.qualified_name, "deleted", "throw", "void")), + "(", + optional($.argument_list), + ")", + ), + ), + + qualified_name: ($) => + seq(field("namespace", $.identifier), "::", field("name", $.identifier)), + + argument_list: ($) => choice($.positional_arguments, $.named_arguments), + + positional_arguments: ($) => + seq($._expression, repeat(seq(",", $._expression)), optional(",")), + + named_arguments: ($) => + seq($.named_argument, repeat(seq(",", $.named_argument)), optional(",")), + + named_argument: ($) => + seq(field("name", $.identifier), ":", field("value", $._expression)), + + // --- Unary expressions --- + + unary_expression: ($) => + prec( + PREC.unary, + seq(field("operator", choice("!", "-")), field("operand", $._expression)), + ), + + // --- Binary expressions --- + + binary_expression: ($) => + choice( + prec.left(PREC.or, seq(field("left", $._expression), field("operator", "||"), field("right", $._expression))), + prec.left(PREC.and, seq(field("left", $._expression), field("operator", "&&"), field("right", $._expression))), + prec.left(PREC.additive, seq(field("left", $._expression), field("operator", choice("+", "-")), field("right", $._expression))), + prec.left(PREC.multiplicative, seq(field("left", $._expression), field("operator", choice("*", "/", "%")), field("right", $._expression))), + // Non-associative in spec, left-associative here 
— semantic analysis rejects chaining. + prec.left(PREC.equality, seq(field("left", $._expression), field("operator", choice("==", "!=")), field("right", $._expression))), + prec.left(PREC.comparison, seq(field("left", $._expression), field("operator", choice(">", ">=", "<", "<=")), field("right", $._expression))), + ), + + // --- Control flow --- + + // if_expr := 'if' expr '{' NL? expr_body NL? '}' (else_if)* (else)? + if_expression: ($) => + prec.right( + seq( + "if", + field("condition", $._expression), + "{", + optional($._newline), + field("consequence", $.expr_body), + optional($._newline), + "}", + repeat($.else_if_clause), + optional($.else_clause), + ), + ), + + // if_stmt := 'if' expr stmt_block (else_if_stmt)* (else_stmt)? + if_statement: ($) => + prec.right( + seq( + "if", + field("condition", $._expression), + $.statement_block, + repeat($.else_if_statement_clause), + optional($.else_statement_clause), + ), + ), + + else_if_clause: ($) => + seq( + "else", "if", + field("condition", $._expression), + "{", + optional($._newline), + field("consequence", $.expr_body), + optional($._newline), + "}", + ), + + else_clause: ($) => + seq( + "else", "{", + optional($._newline), + field("alternative", $.expr_body), + optional($._newline), + "}", + ), + + else_if_statement_clause: ($) => + seq("else", "if", field("condition", $._expression), $.statement_block), + + else_statement_clause: ($) => seq("else", $.statement_block), + + // stmt_block := '{' (NL | statement)* '}' + statement_block: ($) => + seq("{", repeat(choice($._statement, $._newline)), "}"), + + _statement: ($) => + choice($.assignment, $.if_statement, $.match_statement), + + // match_expr := 'match' expr? ('as' id)? '{' match_cases '}' + // No _newline inside — match cases are comma-separated, so newlines + // between them are whitespace (consumed by _nl_skip in extras). 
+ match_expression: ($) => + prec.right( + seq( + "match", + optional(seq( + field("subject", $._expression), + optional(seq("as", field("binding", $.identifier))), + )), + "{", + optional($.match_cases), + "}", + ), + ), + + match_statement: ($) => + prec.right( + seq( + "match", + optional(seq( + field("subject", $._expression), + optional(seq("as", field("binding", $.identifier))), + )), + "{", + optional($.match_statement_cases), + "}", + ), + ), + + // Match cases are comma-separated with optional newlines around commas. + match_cases: ($) => repeat1(seq($.match_case, optional(","))), + + match_case: ($) => + seq( + field("pattern", choice($._expression, "_")), + "=>", + field("body", choice($._expression, $.match_block)), + ), + + // Same disambiguation rationale as lambda_block — see comment there. + match_block: ($) => seq("{", $.expr_body, "}"), + + match_statement_cases: ($) => repeat1(seq($.match_statement_case, optional(","))), + + match_statement_case: ($) => + seq( + field("pattern", choice($._expression, "_")), + "=>", + field("body", $.statement_block), + ), + + // --- Lambda expressions --- + + lambda_expression: ($) => + prec.right( + seq( + field("parameters", $._lambda_params), + "->", + field("body", choice($._expression, $.lambda_block)), + ), + ), + + _lambda_params: ($) => + choice($.identifier, "_", seq("(", $.parameter_list, ")")), + + // lambda_block := '{' (var_assignment NL)* expression '}' + // Note: no optional(_newline) around expr_body — this is critical for + // disambiguation with object literals. When a lambda body starts with '{', + // the parser must try both lambda_block and object. If lambda_block consumed + // _newline after '{', the scanner would emit NEWLINE (since it's valid), + // which kills the object path (objects don't expect NEWLINE). By omitting + // _newline here, the scanner emits NL_SKIP (whitespace) and both paths + // survive until ':' (object) or '=' (block) disambiguates. 
+ lambda_block: ($) => seq("{", $.expr_body, "}"), + + // --- Grouped expression --- + + parenthesized_expression: ($) => seq("(", $._expression, ")"), + + // --- Literals --- + + _literal: ($) => + choice($.integer, $.float, $.string, $.raw_string, $.boolean, $.null), + + integer: (_) => /[0-9]+/, + + float: (_) => /[0-9]+\.[0-9]+/, + + string: ($) => + seq('"', repeat(choice($.escape_sequence, $.string_content)), '"'), + + string_content: (_) => token.immediate(prec(1, /[^"\\\n]+/)), + + escape_sequence: (_) => + token.immediate(seq("\\", choice( + /["\\ntr]/, + /u[0-9a-fA-F]{4}/, + /u\{[0-9a-fA-F]{1,6}\}/, + ))), + + raw_string: (_) => /`[^`]*`/, + + boolean: (_) => choice("true", "false"), + + null: (_) => "null", + + array: ($) => + seq("[", optional(seq($._expression, repeat(seq(",", $._expression)), optional(","))), "]"), + + object: ($) => + seq( + "{", + optional(seq($.object_entry, repeat(seq(",", $.object_entry)), optional(","))), + "}", + ), + + object_entry: ($) => + seq(field("key", $._expression), ":", field("value", $._expression)), + + // --- Variables --- + + variable: (_) => /\$[a-zA-Z_][a-zA-Z0-9_]*/, + + // --- Identifiers --- + + identifier: (_) => /[a-zA-Z_][a-zA-Z0-9_]*/, + + // --- Comments --- + + comment: (_) => token(seq("#", /.*/)), + }, +}); diff --git a/internal/bloblang2/tree-sitter/package-lock.json b/internal/bloblang2/tree-sitter/package-lock.json new file mode 100644 index 000000000..5d0429b19 --- /dev/null +++ b/internal/bloblang2/tree-sitter/package-lock.json @@ -0,0 +1,29 @@ +{ + "name": "tree-sitter-bloblang2", + "version": "0.1.0", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "tree-sitter-bloblang2", + "version": "0.1.0", + "devDependencies": { + "tree-sitter-cli": "^0.24.0" + } + }, + "node_modules/tree-sitter-cli": { + "version": "0.24.7", + "resolved": "https://registry.npmjs.org/tree-sitter-cli/-/tree-sitter-cli-0.24.7.tgz", + "integrity": 
"sha512-o4gnE82pVmMMhJbWwD6+I9yr4lXii5Ci5qEQ2pFpUbVy1YiD8cizTJaqdcznA0qEbo7l2OneI1GocChPrI4YGQ==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "bin": { + "tree-sitter": "cli.js" + }, + "engines": { + "node": ">=12.0.0" + } + } + } +} diff --git a/internal/bloblang2/tree-sitter/package.json b/internal/bloblang2/tree-sitter/package.json new file mode 100644 index 000000000..d1b8dc560 --- /dev/null +++ b/internal/bloblang2/tree-sitter/package.json @@ -0,0 +1,13 @@ +{ + "name": "tree-sitter-bloblang2", + "version": "0.1.0", + "main": "bindings/node", + "devDependencies": { + "tree-sitter-cli": "^0.24.0" + }, + "scripts": { + "generate": "tree-sitter generate", + "test": "tree-sitter test", + "build-wasm": "tree-sitter build --wasm -o tree-sitter-bloblang2.wasm" + } +} diff --git a/internal/bloblang2/tree-sitter/queries/highlights.scm b/internal/bloblang2/tree-sitter/queries/highlights.scm new file mode 100644 index 000000000..f3f3b8fff --- /dev/null +++ b/internal/bloblang2/tree-sitter/queries/highlights.scm @@ -0,0 +1,77 @@ +; Keywords +["if" "else" "match" "as" "map" "import"] @keyword + +; Context roots +["input" "output"] @keyword.builtin + +; Constants +["true" "false"] @constant.builtin +"null" @constant.builtin + +; Discard +"_" @variable.builtin + +; Literals +(integer) @number +(float) @number.float +(string) @string +(raw_string) @string +(string_content) @string +(escape_sequence) @string.escape + +; Variables ($name) +(variable) @variable + +; Map declarations +(map_declaration + name: (identifier) @function) + +; Function calls +(call_expression + name: (identifier) @function.call) +(call_expression + name: (qualified_name + namespace: (identifier) @module + name: (identifier) @function.call)) +"deleted" @function.builtin +"throw" @function.builtin +"void" @function.builtin + +; Method calls +(method_call + method: (_) @function.method) +(null_safe_method_call + method: (_) @function.method) + +; Field access +(field_access + field: 
(_) @property) +(null_safe_field_access + field: (_) @property) + +; Parameters +(parameter + (identifier) @variable.parameter) + +; Named arguments +(named_argument + name: (identifier) @variable.parameter) + +; Metadata +(metadata_access) @attribute + +; Qualified name namespace +(qualified_name + namespace: (identifier) @module) + +; Operators +["+" "-" "*" "/" "%" "==" "!=" ">" ">=" "<" "<=" "&&" "||" "!"] @operator +["=" "=>" "->"] @operator + +; Punctuation +["." "?." "::"] @punctuation.delimiter +["(" ")" "[" "]" "?[" "{" "}"] @punctuation.bracket +["," ":"] @punctuation.delimiter + +; Comments +(comment) @comment diff --git a/internal/bloblang2/tree-sitter/src/grammar.json b/internal/bloblang2/tree-sitter/src/grammar.json new file mode 100644 index 000000000..af7a4def5 --- /dev/null +++ b/internal/bloblang2/tree-sitter/src/grammar.json @@ -0,0 +1,2451 @@ +{ + "$schema": "https://tree-sitter.github.io/tree-sitter/assets/schemas/grammar.schema.json", + "name": "bloblang2", + "word": "identifier", + "rules": { + "source_file": { + "type": "REPEAT", + "content": { + "type": "SYMBOL", + "name": "_source_item" + } + }, + "_source_item": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_top_level_statement" + }, + { + "type": "SYMBOL", + "name": "_newline" + } + ] + }, + "_top_level_statement": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "assignment" + }, + { + "type": "SYMBOL", + "name": "if_statement" + }, + { + "type": "SYMBOL", + "name": "match_statement" + }, + { + "type": "SYMBOL", + "name": "map_declaration" + }, + { + "type": "SYMBOL", + "name": "import_statement" + } + ] + }, + "assignment": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "assign_target" + }, + { + "type": "STRING", + "value": "=" + }, + { + "type": "SYMBOL", + "name": "_expression" + } + ] + }, + "assign_target": { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": 
"output" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "metadata_access" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "REPEAT", + "content": { + "type": "SYMBOL", + "name": "target_path_segment" + } + } + ] + }, + { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "variable" + }, + { + "type": "REPEAT", + "content": { + "type": "SYMBOL", + "name": "target_path_segment" + } + } + ] + } + ] + }, + "metadata_access": { + "type": "STRING", + "value": "@" + }, + "target_path_segment": { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "." + }, + { + "type": "SYMBOL", + "name": "_field_name" + } + ] + }, + { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "." + }, + { + "type": "SYMBOL", + "name": "_field_name" + }, + { + "type": "STRING", + "value": "(" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "argument_list" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": ")" + } + ] + }, + { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "[" + }, + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "STRING", + "value": "]" + } + ] + } + ] + }, + "map_declaration": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "map" + }, + { + "type": "FIELD", + "name": "name", + "content": { + "type": "SYMBOL", + "name": "identifier" + } + }, + { + "type": "STRING", + "value": "(" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "parameter_list" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": ")" + }, + { + "type": "STRING", + "value": "{" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "SYMBOL", + "name": "expr_body" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": 
"_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "}" + } + ] + }, + "parameter_list": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "parameter" + }, + { + "type": "REPEAT", + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "SYMBOL", + "name": "parameter" + } + ] + } + } + ] + }, + "parameter": { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "identifier" + }, + { + "type": "STRING", + "value": "=" + }, + { + "type": "SYMBOL", + "name": "_literal" + } + ] + }, + { + "type": "SYMBOL", + "name": "identifier" + }, + { + "type": "STRING", + "value": "_" + } + ] + }, + "expr_body": { + "type": "SEQ", + "members": [ + { + "type": "REPEAT", + "content": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "var_assignment" + }, + { + "type": "SYMBOL", + "name": "_newline" + } + ] + } + }, + { + "type": "SYMBOL", + "name": "_expression" + } + ] + }, + "var_assignment": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "variable" + }, + { + "type": "REPEAT", + "content": { + "type": "SYMBOL", + "name": "target_path_segment" + } + }, + { + "type": "STRING", + "value": "=" + }, + { + "type": "SYMBOL", + "name": "_expression" + } + ] + }, + "import_statement": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "import" + }, + { + "type": "SYMBOL", + "name": "string" + }, + { + "type": "STRING", + "value": "as" + }, + { + "type": "SYMBOL", + "name": "identifier" + } + ] + }, + "_expression": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_primary" + }, + { + "type": "SYMBOL", + "name": "unary_expression" + }, + { + "type": "SYMBOL", + "name": "binary_expression" + }, + { + "type": "SYMBOL", + "name": "lambda_expression" + }, + { + "type": "SYMBOL", + "name": "if_expression" + }, + { + "type": "SYMBOL", + "name": "match_expression" + }, 
+ { + "type": "SYMBOL", + "name": "field_access" + }, + { + "type": "SYMBOL", + "name": "null_safe_field_access" + }, + { + "type": "SYMBOL", + "name": "method_call" + }, + { + "type": "SYMBOL", + "name": "null_safe_method_call" + }, + { + "type": "SYMBOL", + "name": "index" + }, + { + "type": "SYMBOL", + "name": "null_safe_index" + } + ] + }, + "_primary": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "integer" + }, + { + "type": "SYMBOL", + "name": "float" + }, + { + "type": "SYMBOL", + "name": "string" + }, + { + "type": "SYMBOL", + "name": "raw_string" + }, + { + "type": "SYMBOL", + "name": "boolean" + }, + { + "type": "SYMBOL", + "name": "null" + }, + { + "type": "SYMBOL", + "name": "array" + }, + { + "type": "SYMBOL", + "name": "object" + }, + { + "type": "SYMBOL", + "name": "input" + }, + { + "type": "SYMBOL", + "name": "output" + }, + { + "type": "SYMBOL", + "name": "variable" + }, + { + "type": "SYMBOL", + "name": "identifier" + }, + { + "type": "SYMBOL", + "name": "qualified_name" + }, + { + "type": "SYMBOL", + "name": "call_expression" + }, + { + "type": "SYMBOL", + "name": "parenthesized_expression" + } + ] + }, + "input": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "input" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "metadata_access" + }, + { + "type": "BLANK" + } + ] + } + ] + }, + "output": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "output" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "metadata_access" + }, + { + "type": "BLANK" + } + ] + } + ] + }, + "field_access": { + "type": "PREC_LEFT", + "value": 140, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "receiver", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "." 
+ }, + { + "type": "FIELD", + "name": "field", + "content": { + "type": "SYMBOL", + "name": "_field_name" + } + } + ] + } + }, + "null_safe_field_access": { + "type": "PREC_LEFT", + "value": 140, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "receiver", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "?." + }, + { + "type": "FIELD", + "name": "field", + "content": { + "type": "SYMBOL", + "name": "_field_name" + } + } + ] + } + }, + "method_call": { + "type": "PREC_LEFT", + "value": 141, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "receiver", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "." + }, + { + "type": "FIELD", + "name": "method", + "content": { + "type": "SYMBOL", + "name": "_field_name" + } + }, + { + "type": "STRING", + "value": "(" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "argument_list" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": ")" + } + ] + } + }, + "null_safe_method_call": { + "type": "PREC_LEFT", + "value": 141, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "receiver", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "?." 
+ }, + { + "type": "FIELD", + "name": "method", + "content": { + "type": "SYMBOL", + "name": "_field_name" + } + }, + { + "type": "STRING", + "value": "(" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "argument_list" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": ")" + } + ] + } + }, + "index": { + "type": "PREC_LEFT", + "value": 140, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "receiver", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "[" + }, + { + "type": "FIELD", + "name": "index", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "]" + } + ] + } + }, + "null_safe_index": { + "type": "PREC_LEFT", + "value": 140, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "receiver", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "?[" + }, + { + "type": "FIELD", + "name": "index", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "]" + } + ] + } + }, + "_field_name": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_word" + }, + { + "type": "SYMBOL", + "name": "string" + } + ] + }, + "_word": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "input" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "output" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "if" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "else" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + 
"content": { + "type": "STRING", + "value": "match" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "as" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "map" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "import" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "true" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "false" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "null" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "deleted" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "throw" + }, + "named": true, + "value": "identifier" + }, + { + "type": "ALIAS", + "content": { + "type": "STRING", + "value": "void" + }, + "named": true, + "value": "identifier" + } + ] + }, + "call_expression": { + "type": "PREC", + "value": 140, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "name", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "identifier" + }, + { + "type": "SYMBOL", + "name": "qualified_name" + }, + { + "type": "STRING", + "value": "deleted" + }, + { + "type": "STRING", + "value": "throw" + }, + { + "type": "STRING", + "value": "void" + } + ] + } + }, + { + "type": "STRING", + "value": "(" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "argument_list" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": ")" + } + ] + } + }, + "qualified_name": { + "type": "SEQ", + "members": 
[ + { + "type": "FIELD", + "name": "namespace", + "content": { + "type": "SYMBOL", + "name": "identifier" + } + }, + { + "type": "STRING", + "value": "::" + }, + { + "type": "FIELD", + "name": "name", + "content": { + "type": "SYMBOL", + "name": "identifier" + } + } + ] + }, + "argument_list": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "positional_arguments" + }, + { + "type": "SYMBOL", + "name": "named_arguments" + } + ] + }, + "positional_arguments": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "REPEAT", + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "SYMBOL", + "name": "_expression" + } + ] + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "BLANK" + } + ] + } + ] + }, + "named_arguments": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "named_argument" + }, + { + "type": "REPEAT", + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "SYMBOL", + "name": "named_argument" + } + ] + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "BLANK" + } + ] + } + ] + }, + "named_argument": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "name", + "content": { + "type": "SYMBOL", + "name": "identifier" + } + }, + { + "type": "STRING", + "value": ":" + }, + { + "type": "FIELD", + "name": "value", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + }, + "unary_expression": { + "type": "PREC", + "value": 120, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "operator", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "!" 
+ }, + { + "type": "STRING", + "value": "-" + } + ] + } + }, + { + "type": "FIELD", + "name": "operand", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + } + }, + "binary_expression": { + "type": "CHOICE", + "members": [ + { + "type": "PREC_LEFT", + "value": 10, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "left", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "FIELD", + "name": "operator", + "content": { + "type": "STRING", + "value": "||" + } + }, + { + "type": "FIELD", + "name": "right", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + } + }, + { + "type": "PREC_LEFT", + "value": 20, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "left", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "FIELD", + "name": "operator", + "content": { + "type": "STRING", + "value": "&&" + } + }, + { + "type": "FIELD", + "name": "right", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + } + }, + { + "type": "PREC_LEFT", + "value": 80, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "left", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "FIELD", + "name": "operator", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "+" + }, + { + "type": "STRING", + "value": "-" + } + ] + } + }, + { + "type": "FIELD", + "name": "right", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + } + }, + { + "type": "PREC_LEFT", + "value": 100, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "left", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "FIELD", + "name": "operator", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "*" + }, + { + "type": "STRING", + "value": "/" + }, + { + 
"type": "STRING", + "value": "%" + } + ] + } + }, + { + "type": "FIELD", + "name": "right", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + } + }, + { + "type": "PREC_LEFT", + "value": 40, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "left", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "FIELD", + "name": "operator", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "==" + }, + { + "type": "STRING", + "value": "!=" + } + ] + } + }, + { + "type": "FIELD", + "name": "right", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + } + }, + { + "type": "PREC_LEFT", + "value": 60, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "left", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "FIELD", + "name": "operator", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": ">" + }, + { + "type": "STRING", + "value": ">=" + }, + { + "type": "STRING", + "value": "<" + }, + { + "type": "STRING", + "value": "<=" + } + ] + } + }, + { + "type": "FIELD", + "name": "right", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + } + } + ] + }, + "if_expression": { + "type": "PREC_RIGHT", + "value": 0, + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "if" + }, + { + "type": "FIELD", + "name": "condition", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "{" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "FIELD", + "name": "consequence", + "content": { + "type": "SYMBOL", + "name": "expr_body" + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + 
"value": "}" + }, + { + "type": "REPEAT", + "content": { + "type": "SYMBOL", + "name": "else_if_clause" + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "else_clause" + }, + { + "type": "BLANK" + } + ] + } + ] + } + }, + "if_statement": { + "type": "PREC_RIGHT", + "value": 0, + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "if" + }, + { + "type": "FIELD", + "name": "condition", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "SYMBOL", + "name": "statement_block" + }, + { + "type": "REPEAT", + "content": { + "type": "SYMBOL", + "name": "else_if_statement_clause" + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "else_statement_clause" + }, + { + "type": "BLANK" + } + ] + } + ] + } + }, + "else_if_clause": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "else" + }, + { + "type": "STRING", + "value": "if" + }, + { + "type": "FIELD", + "name": "condition", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": "{" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "FIELD", + "name": "consequence", + "content": { + "type": "SYMBOL", + "name": "expr_body" + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "}" + } + ] + }, + "else_clause": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "else" + }, + { + "type": "STRING", + "value": "{" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "FIELD", + "name": "alternative", + "content": { + "type": "SYMBOL", + "name": "expr_body" + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": 
"_newline" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "}" + } + ] + }, + "else_if_statement_clause": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "else" + }, + { + "type": "STRING", + "value": "if" + }, + { + "type": "FIELD", + "name": "condition", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "SYMBOL", + "name": "statement_block" + } + ] + }, + "else_statement_clause": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "else" + }, + { + "type": "SYMBOL", + "name": "statement_block" + } + ] + }, + "statement_block": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "{" + }, + { + "type": "REPEAT", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_statement" + }, + { + "type": "SYMBOL", + "name": "_newline" + } + ] + } + }, + { + "type": "STRING", + "value": "}" + } + ] + }, + "_statement": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "assignment" + }, + { + "type": "SYMBOL", + "name": "if_statement" + }, + { + "type": "SYMBOL", + "name": "match_statement" + } + ] + }, + "match_expression": { + "type": "PREC_RIGHT", + "value": 0, + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "match" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "subject", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "as" + }, + { + "type": "FIELD", + "name": "binding", + "content": { + "type": "SYMBOL", + "name": "identifier" + } + } + ] + }, + { + "type": "BLANK" + } + ] + } + ] + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "{" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "match_cases" + }, + { + "type": "BLANK" + } + 
] + }, + { + "type": "STRING", + "value": "}" + } + ] + } + }, + "match_statement": { + "type": "PREC_RIGHT", + "value": 0, + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "match" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "subject", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "as" + }, + { + "type": "FIELD", + "name": "binding", + "content": { + "type": "SYMBOL", + "name": "identifier" + } + } + ] + }, + { + "type": "BLANK" + } + ] + } + ] + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "{" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "match_statement_cases" + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "}" + } + ] + } + }, + "match_cases": { + "type": "REPEAT1", + "content": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "match_case" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "BLANK" + } + ] + } + ] + } + }, + "match_case": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "pattern", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "STRING", + "value": "_" + } + ] + } + }, + { + "type": "STRING", + "value": "=>" + }, + { + "type": "FIELD", + "name": "body", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "SYMBOL", + "name": "match_block" + } + ] + } + } + ] + }, + "match_block": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "{" + }, + { + "type": "SYMBOL", + "name": "expr_body" + }, + { + "type": "STRING", + "value": "}" + } + ] + }, + "match_statement_cases": { + "type": "REPEAT1", + 
"content": { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "match_statement_case" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "BLANK" + } + ] + } + ] + } + }, + "match_statement_case": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "pattern", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "STRING", + "value": "_" + } + ] + } + }, + { + "type": "STRING", + "value": "=>" + }, + { + "type": "FIELD", + "name": "body", + "content": { + "type": "SYMBOL", + "name": "statement_block" + } + } + ] + }, + "lambda_expression": { + "type": "PREC_RIGHT", + "value": 0, + "content": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "parameters", + "content": { + "type": "SYMBOL", + "name": "_lambda_params" + } + }, + { + "type": "STRING", + "value": "->" + }, + { + "type": "FIELD", + "name": "body", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "SYMBOL", + "name": "lambda_block" + } + ] + } + } + ] + } + }, + "_lambda_params": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "identifier" + }, + { + "type": "STRING", + "value": "_" + }, + { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "(" + }, + { + "type": "SYMBOL", + "name": "parameter_list" + }, + { + "type": "STRING", + "value": ")" + } + ] + } + ] + }, + "lambda_block": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "{" + }, + { + "type": "SYMBOL", + "name": "expr_body" + }, + { + "type": "STRING", + "value": "}" + } + ] + }, + "parenthesized_expression": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "(" + }, + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "STRING", + "value": ")" + } + ] + }, + "_literal": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + 
"name": "integer" + }, + { + "type": "SYMBOL", + "name": "float" + }, + { + "type": "SYMBOL", + "name": "string" + }, + { + "type": "SYMBOL", + "name": "raw_string" + }, + { + "type": "SYMBOL", + "name": "boolean" + }, + { + "type": "SYMBOL", + "name": "null" + } + ] + }, + "integer": { + "type": "PATTERN", + "value": "[0-9]+" + }, + "float": { + "type": "PATTERN", + "value": "[0-9]+\\.[0-9]+" + }, + "string": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "\"" + }, + { + "type": "REPEAT", + "content": { + "type": "CHOICE", + "members": [ + { + "type": "SYMBOL", + "name": "escape_sequence" + }, + { + "type": "SYMBOL", + "name": "string_content" + } + ] + } + }, + { + "type": "STRING", + "value": "\"" + } + ] + }, + "string_content": { + "type": "IMMEDIATE_TOKEN", + "content": { + "type": "PREC", + "value": 1, + "content": { + "type": "PATTERN", + "value": "[^\"\\\\\\n]+" + } + } + }, + "escape_sequence": { + "type": "IMMEDIATE_TOKEN", + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "\\" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "PATTERN", + "value": "[\"\\\\ntr]" + }, + { + "type": "PATTERN", + "value": "u[0-9a-fA-F]{4}" + }, + { + "type": "PATTERN", + "value": "u\\{[0-9a-fA-F]{1,6}\\}" + } + ] + } + ] + } + }, + "raw_string": { + "type": "PATTERN", + "value": "`[^`]*`" + }, + "boolean": { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "true" + }, + { + "type": "STRING", + "value": "false" + } + ] + }, + "null": { + "type": "STRING", + "value": "null" + }, + "array": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "[" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "_expression" + }, + { + "type": "REPEAT", + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "SYMBOL", + "name": "_expression" + } + ] + } + }, + { + "type": "CHOICE", + 
"members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "BLANK" + } + ] + } + ] + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "]" + } + ] + }, + "object": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "{" + }, + { + "type": "CHOICE", + "members": [ + { + "type": "SEQ", + "members": [ + { + "type": "SYMBOL", + "name": "object_entry" + }, + { + "type": "REPEAT", + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "SYMBOL", + "name": "object_entry" + } + ] + } + }, + { + "type": "CHOICE", + "members": [ + { + "type": "STRING", + "value": "," + }, + { + "type": "BLANK" + } + ] + } + ] + }, + { + "type": "BLANK" + } + ] + }, + { + "type": "STRING", + "value": "}" + } + ] + }, + "object_entry": { + "type": "SEQ", + "members": [ + { + "type": "FIELD", + "name": "key", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + }, + { + "type": "STRING", + "value": ":" + }, + { + "type": "FIELD", + "name": "value", + "content": { + "type": "SYMBOL", + "name": "_expression" + } + } + ] + }, + "variable": { + "type": "PATTERN", + "value": "\\$[a-zA-Z_][a-zA-Z0-9_]*" + }, + "identifier": { + "type": "PATTERN", + "value": "[a-zA-Z_][a-zA-Z0-9_]*" + }, + "comment": { + "type": "TOKEN", + "content": { + "type": "SEQ", + "members": [ + { + "type": "STRING", + "value": "#" + }, + { + "type": "PATTERN", + "value": ".*" + } + ] + } + } + }, + "extras": [ + { + "type": "PATTERN", + "value": "[ \\t\\r]" + }, + { + "type": "SYMBOL", + "name": "comment" + }, + { + "type": "SYMBOL", + "name": "_nl_skip" + } + ], + "conflicts": [ + [ + "parameter", + "_primary" + ], + [ + "match_expression", + "object" + ], + [ + "var_assignment", + "_primary" + ] + ], + "precedences": [], + "externals": [ + { + "type": "SYMBOL", + "name": "_newline" + }, + { + "type": "SYMBOL", + "name": "_nl_skip" + } + ], + "inline": [], + "supertypes": [] +} diff --git 
a/internal/bloblang2/tree-sitter/src/node-types.json b/internal/bloblang2/tree-sitter/src/node-types.json new file mode 100644 index 000000000..65fb26f21 --- /dev/null +++ b/internal/bloblang2/tree-sitter/src/node-types.json @@ -0,0 +1,4553 @@ +[ + { + "type": "argument_list", + "named": true, + "fields": {}, + "children": { + "multiple": false, + "required": true, + "types": [ + { + "type": "named_arguments", + "named": true + }, + { + "type": "positional_arguments", + "named": true + } + ] + } + }, + { + "type": "array", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "assign_target", + "named": 
true, + "fields": {}, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "metadata_access", + "named": true + }, + { + "type": "target_path_segment", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "assignment", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "assign_target", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "binary_expression", + "named": true, + "fields": { + "left": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, 
+ { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "operator": { + "multiple": false, + "required": true, + "types": [ + { + "type": "!=", + "named": false + }, + { + "type": "%", + "named": false + }, + { + "type": "&&", + "named": false + }, + { + "type": "*", + "named": false + }, + { + "type": "+", + "named": false + }, + { + "type": "-", + "named": false + }, + { + "type": "/", + "named": false + }, + { + "type": "<", + "named": false + }, + { + "type": "<=", + "named": false + }, + { + "type": "==", + "named": false + }, + { + "type": ">", + "named": false + }, + { + "type": ">=", + "named": false + }, + { + "type": "||", + "named": false + } + ] + }, + "right": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + 
}, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "boolean", + "named": true, + "fields": {} + }, + { + "type": "call_expression", + "named": true, + "fields": { + "name": { + "multiple": false, + "required": true, + "types": [ + { + "type": "deleted", + "named": false + }, + { + "type": "identifier", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "throw", + "named": false + }, + { + "type": "void", + "named": false + } + ] + } + }, + "children": { + "multiple": false, + "required": false, + "types": [ + { + "type": "argument_list", + "named": true + } + ] + } + }, + { + "type": "else_clause", + "named": true, + "fields": { + "alternative": { + "multiple": false, + "required": true, + "types": [ + { + "type": "expr_body", + "named": true + } + ] + } + } 
+ }, + { + "type": "else_if_clause", + "named": true, + "fields": { + "condition": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "consequence": { + "multiple": false, + "required": true, + "types": [ + { + "type": "expr_body", + "named": true + } + ] + } + } + }, + { + "type": "else_if_statement_clause", + "named": true, + "fields": { + "condition": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + 
"type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + "children": { + "multiple": false, + "required": true, + "types": [ + { + "type": "statement_block", + "named": true + } + ] + } + }, + { + "type": "else_statement_clause", + "named": true, + "fields": {}, + "children": { + "multiple": false, + "required": true, + "types": [ + { + "type": "statement_block", + "named": true + } + ] + } + }, + { + "type": "expr_body", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": 
"input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "var_assignment", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "field_access", + "named": true, + "fields": { + "field": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + }, + { + "type": "string", + "named": true + } + ] + }, + "receiver": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + 
"named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "if_expression", + "named": true, + "fields": { + "condition": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + 
}, + "consequence": { + "multiple": false, + "required": true, + "types": [ + { + "type": "expr_body", + "named": true + } + ] + } + }, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "else_clause", + "named": true + }, + { + "type": "else_if_clause", + "named": true + } + ] + } + }, + { + "type": "if_statement", + "named": true, + "fields": { + "condition": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "else_if_statement_clause", + "named": true + }, + { + "type": "else_statement_clause", + "named": true + }, + { + "type": 
"statement_block", + "named": true + } + ] + } + }, + { + "type": "import_statement", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + }, + { + "type": "string", + "named": true + } + ] + } + }, + { + "type": "index", + "named": true, + "fields": { + "index": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "receiver": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true 
+ }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "input", + "named": true, + "fields": {}, + "children": { + "multiple": false, + "required": false, + "types": [ + { + "type": "metadata_access", + "named": true + } + ] + } + }, + { + "type": "lambda_block", + "named": true, + "fields": {}, + "children": { + "multiple": false, + "required": true, + "types": [ + { + "type": "expr_body", + "named": true + } + ] + } + }, + { + "type": "lambda_expression", + "named": true, + "fields": { + "body": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + 
"type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_block", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "parameters": { + "multiple": true, + "required": true, + "types": [ + { + "type": "(", + "named": false + }, + { + "type": ")", + "named": false + }, + { + "type": "_", + "named": false + }, + { + "type": "identifier", + "named": true + }, + { + "type": "parameter_list", + "named": true + } + ] + } + } + }, + { + "type": "map_declaration", + "named": true, + "fields": { + "name": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + } + ] + } + }, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "expr_body", + "named": true + }, + { + "type": "parameter_list", + "named": true + } + ] + } + }, + { + "type": "match_block", + "named": true, + "fields": {}, + "children": { + "multiple": false, + "required": true, + "types": [ + { + "type": "expr_body", + "named": true + } + ] + } + }, + { + "type": "match_case", + "named": true, + "fields": { + "body": { + "multiple": false, + 
"required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_block", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "pattern": { + "multiple": false, + "required": true, + "types": [ + { + "type": "_", + "named": false + }, + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + 
"type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "match_cases", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "match_case", + "named": true + } + ] + } + }, + { + "type": "match_expression", + "named": true, + "fields": { + "binding": { + "multiple": false, + "required": false, + "types": [ + { + "type": "identifier", + "named": true + } + ] + }, + "subject": { + "multiple": false, + "required": false, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": 
"null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + "children": { + "multiple": false, + "required": false, + "types": [ + { + "type": "match_cases", + "named": true + } + ] + } + }, + { + "type": "match_statement", + "named": true, + "fields": { + "binding": { + "multiple": false, + "required": false, + "types": [ + { + "type": "identifier", + "named": true + } + ] + }, + "subject": { + "multiple": false, + "required": false, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + 
"named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + "children": { + "multiple": false, + "required": false, + "types": [ + { + "type": "match_statement_cases", + "named": true + } + ] + } + }, + { + "type": "match_statement_case", + "named": true, + "fields": { + "body": { + "multiple": false, + "required": true, + "types": [ + { + "type": "statement_block", + "named": true + } + ] + }, + "pattern": { + "multiple": false, + "required": true, + "types": [ + { + "type": "_", + "named": false + }, + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + 
}, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "match_statement_cases", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "match_statement_case", + "named": true + } + ] + } + }, + { + "type": "method_call", + "named": true, + "fields": { + "method": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + }, + { + "type": "string", + "named": true + } + ] + }, + "receiver": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + "children": { + "multiple": false, + 
"required": false, + "types": [ + { + "type": "argument_list", + "named": true + } + ] + } + }, + { + "type": "named_argument", + "named": true, + "fields": { + "name": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + } + ] + }, + "value": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "named_arguments", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "named_argument", + "named": true + } + ] + } + }, + { + "type": "null", + "named": true, + "fields": {} + }, + { + "type": "null_safe_field_access", + "named": true, + 
"fields": { + "field": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + }, + { + "type": "string", + "named": true + } + ] + }, + "receiver": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "null_safe_index", + "named": true, + "fields": { + "index": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true 
+ }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "receiver": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": 
true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "null_safe_method_call", + "named": true, + "fields": { + "method": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + }, + { + "type": "string", + "named": true + } + ] + }, + "receiver": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + 
}, + { + "type": "variable", + "named": true + } + ] + } + }, + "children": { + "multiple": false, + "required": false, + "types": [ + { + "type": "argument_list", + "named": true + } + ] + } + }, + { + "type": "object", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "object_entry", + "named": true + } + ] + } + }, + { + "type": "object_entry", + "named": true, + "fields": { + "key": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "value": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, 
+ { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + } + }, + { + "type": "output", + "named": true, + "fields": {}, + "children": { + "multiple": false, + "required": false, + "types": [ + { + "type": "metadata_access", + "named": true + } + ] + } + }, + { + "type": "parameter", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "boolean", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + } + ] + } + }, + { + "type": "parameter_list", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": 
true, + "types": [ + { + "type": "parameter", + "named": true + } + ] + } + }, + { + "type": "parenthesized_expression", + "named": true, + "fields": {}, + "children": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "positional_arguments", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": 
"identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "qualified_name", + "named": true, + "fields": { + "name": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + } + ] + }, + "namespace": { + "multiple": false, + "required": true, + "types": [ + { + "type": "identifier", + "named": true + } + ] + } + } + }, + { + "type": "source_file", + "named": true, + "root": true, + "fields": {}, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "assignment", + "named": true + }, + { + "type": "if_statement", + "named": true + }, + { + "type": "import_statement", + "named": true + }, + { + "type": "map_declaration", + "named": true + }, + { + "type": "match_statement", + "named": true + } + ] + } + }, + { + "type": "statement_block", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "assignment", + "named": true + }, + { + "type": "if_statement", + "named": true + }, + { 
+ "type": "match_statement", + "named": true + } + ] + } + }, + { + "type": "string", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": false, + "types": [ + { + "type": "escape_sequence", + "named": true + }, + { + "type": "string_content", + "named": true + } + ] + } + }, + { + "type": "target_path_segment", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "argument_list", + "named": true + }, + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "unary_expression", + "named": true, + "fields": { + "operand": { + "multiple": false, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, 
+ { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + "named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + }, + "operator": { + "multiple": false, + "required": true, + "types": [ + { + "type": "!", + "named": false + }, + { + "type": "-", + "named": false + } + ] + } + } + }, + { + "type": "var_assignment", + "named": true, + "fields": {}, + "children": { + "multiple": true, + "required": true, + "types": [ + { + "type": "array", + "named": true + }, + { + "type": "binary_expression", + "named": true + }, + { + "type": "boolean", + "named": true + }, + { + "type": "call_expression", + "named": true + }, + { + "type": "field_access", + "named": true + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if_expression", + "named": true + }, + { + "type": "index", + 
"named": true + }, + { + "type": "input", + "named": true + }, + { + "type": "integer", + "named": true + }, + { + "type": "lambda_expression", + "named": true + }, + { + "type": "match_expression", + "named": true + }, + { + "type": "method_call", + "named": true + }, + { + "type": "null", + "named": true + }, + { + "type": "null_safe_field_access", + "named": true + }, + { + "type": "null_safe_index", + "named": true + }, + { + "type": "null_safe_method_call", + "named": true + }, + { + "type": "object", + "named": true + }, + { + "type": "output", + "named": true + }, + { + "type": "parenthesized_expression", + "named": true + }, + { + "type": "qualified_name", + "named": true + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string", + "named": true + }, + { + "type": "target_path_segment", + "named": true + }, + { + "type": "unary_expression", + "named": true + }, + { + "type": "variable", + "named": true + } + ] + } + }, + { + "type": "!", + "named": false + }, + { + "type": "!=", + "named": false + }, + { + "type": "\"", + "named": false + }, + { + "type": "%", + "named": false + }, + { + "type": "&&", + "named": false + }, + { + "type": "(", + "named": false + }, + { + "type": ")", + "named": false + }, + { + "type": "*", + "named": false + }, + { + "type": "+", + "named": false + }, + { + "type": ",", + "named": false + }, + { + "type": "-", + "named": false + }, + { + "type": "->", + "named": false + }, + { + "type": ".", + "named": false + }, + { + "type": "/", + "named": false + }, + { + "type": ":", + "named": false + }, + { + "type": "::", + "named": false + }, + { + "type": "<", + "named": false + }, + { + "type": "<=", + "named": false + }, + { + "type": "=", + "named": false + }, + { + "type": "==", + "named": false + }, + { + "type": "=>", + "named": false + }, + { + "type": ">", + "named": false + }, + { + "type": ">=", + "named": false + }, + { + "type": "?.", + "named": false + }, + { + "type": "?[", + "named": false + }, + { 
+ "type": "[", + "named": false + }, + { + "type": "]", + "named": false + }, + { + "type": "_", + "named": false + }, + { + "type": "as", + "named": false + }, + { + "type": "comment", + "named": true + }, + { + "type": "deleted", + "named": false + }, + { + "type": "else", + "named": false + }, + { + "type": "escape_sequence", + "named": true + }, + { + "type": "false", + "named": false + }, + { + "type": "float", + "named": true + }, + { + "type": "identifier", + "named": true + }, + { + "type": "if", + "named": false + }, + { + "type": "import", + "named": false + }, + { + "type": "input", + "named": false + }, + { + "type": "integer", + "named": true + }, + { + "type": "map", + "named": false + }, + { + "type": "match", + "named": false + }, + { + "type": "metadata_access", + "named": true + }, + { + "type": "null", + "named": false + }, + { + "type": "output", + "named": false + }, + { + "type": "raw_string", + "named": true + }, + { + "type": "string_content", + "named": true + }, + { + "type": "throw", + "named": false + }, + { + "type": "true", + "named": false + }, + { + "type": "variable", + "named": true + }, + { + "type": "void", + "named": false + }, + { + "type": "{", + "named": false + }, + { + "type": "||", + "named": false + }, + { + "type": "}", + "named": false + } +] \ No newline at end of file diff --git a/internal/bloblang2/tree-sitter/src/parser.c b/internal/bloblang2/tree-sitter/src/parser.c new file mode 100644 index 000000000..6d306cc4c --- /dev/null +++ b/internal/bloblang2/tree-sitter/src/parser.c @@ -0,0 +1,22228 @@ +#include "tree_sitter/parser.h" + +#if defined(__GNUC__) || defined(__clang__) +#pragma GCC diagnostic ignored "-Wmissing-field-initializers" +#endif + +#define LANGUAGE_VERSION 14 +#define STATE_COUNT 485 +#define LARGE_STATE_COUNT 2 +#define SYMBOL_COUNT 128 +#define ALIAS_COUNT 0 +#define TOKEN_COUNT 57 +#define EXTERNAL_TOKEN_COUNT 2 +#define FIELD_COUNT 20 +#define MAX_ALIAS_SEQUENCE_LENGTH 10 +#define 
PRODUCTION_ID_COUNT 24 + +enum ts_symbol_identifiers { + sym_identifier = 1, + anon_sym_EQ = 2, + anon_sym_output = 3, + sym_metadata_access = 4, + anon_sym_DOT = 5, + anon_sym_LPAREN = 6, + anon_sym_RPAREN = 7, + anon_sym_LBRACK = 8, + anon_sym_RBRACK = 9, + anon_sym_map = 10, + anon_sym_LBRACE = 11, + anon_sym_RBRACE = 12, + anon_sym_COMMA = 13, + anon_sym__ = 14, + anon_sym_import = 15, + anon_sym_as = 16, + anon_sym_input = 17, + anon_sym_QMARK_DOT = 18, + anon_sym_QMARK_LBRACK = 19, + anon_sym_if = 20, + anon_sym_else = 21, + anon_sym_match = 22, + anon_sym_true = 23, + anon_sym_false = 24, + anon_sym_null = 25, + anon_sym_deleted = 26, + anon_sym_throw = 27, + anon_sym_void = 28, + anon_sym_COLON_COLON = 29, + anon_sym_COLON = 30, + anon_sym_BANG = 31, + anon_sym_DASH = 32, + anon_sym_PIPE_PIPE = 33, + anon_sym_AMP_AMP = 34, + anon_sym_PLUS = 35, + anon_sym_STAR = 36, + anon_sym_SLASH = 37, + anon_sym_PERCENT = 38, + anon_sym_EQ_EQ = 39, + anon_sym_BANG_EQ = 40, + anon_sym_GT = 41, + anon_sym_GT_EQ = 42, + anon_sym_LT = 43, + anon_sym_LT_EQ = 44, + anon_sym_EQ_GT = 45, + anon_sym_DASH_GT = 46, + sym_integer = 47, + sym_float = 48, + anon_sym_DQUOTE = 49, + sym_string_content = 50, + sym_escape_sequence = 51, + sym_raw_string = 52, + sym_variable = 53, + sym_comment = 54, + sym__newline = 55, + sym__nl_skip = 56, + sym_source_file = 57, + sym__source_item = 58, + sym__top_level_statement = 59, + sym_assignment = 60, + sym_assign_target = 61, + sym_target_path_segment = 62, + sym_map_declaration = 63, + sym_parameter_list = 64, + sym_parameter = 65, + sym_expr_body = 66, + sym_var_assignment = 67, + sym_import_statement = 68, + sym__expression = 69, + sym__primary = 70, + sym_input = 71, + sym_output = 72, + sym_field_access = 73, + sym_null_safe_field_access = 74, + sym_method_call = 75, + sym_null_safe_method_call = 76, + sym_index = 77, + sym_null_safe_index = 78, + sym__field_name = 79, + sym__word = 80, + sym_call_expression = 81, + sym_qualified_name = 
82, + sym_argument_list = 83, + sym_positional_arguments = 84, + sym_named_arguments = 85, + sym_named_argument = 86, + sym_unary_expression = 87, + sym_binary_expression = 88, + sym_if_expression = 89, + sym_if_statement = 90, + sym_else_if_clause = 91, + sym_else_clause = 92, + sym_else_if_statement_clause = 93, + sym_else_statement_clause = 94, + sym_statement_block = 95, + sym__statement = 96, + sym_match_expression = 97, + sym_match_statement = 98, + sym_match_cases = 99, + sym_match_case = 100, + sym_match_block = 101, + sym_match_statement_cases = 102, + sym_match_statement_case = 103, + sym_lambda_expression = 104, + sym__lambda_params = 105, + sym_lambda_block = 106, + sym_parenthesized_expression = 107, + sym__literal = 108, + sym_string = 109, + sym_boolean = 110, + sym_null = 111, + sym_array = 112, + sym_object = 113, + sym_object_entry = 114, + aux_sym_source_file_repeat1 = 115, + aux_sym_assign_target_repeat1 = 116, + aux_sym_parameter_list_repeat1 = 117, + aux_sym_expr_body_repeat1 = 118, + aux_sym_positional_arguments_repeat1 = 119, + aux_sym_named_arguments_repeat1 = 120, + aux_sym_if_expression_repeat1 = 121, + aux_sym_if_statement_repeat1 = 122, + aux_sym_statement_block_repeat1 = 123, + aux_sym_match_cases_repeat1 = 124, + aux_sym_match_statement_cases_repeat1 = 125, + aux_sym_string_repeat1 = 126, + aux_sym_object_repeat1 = 127, +}; + +static const char * const ts_symbol_names[] = { + [ts_builtin_sym_end] = "end", + [sym_identifier] = "identifier", + [anon_sym_EQ] = "=", + [anon_sym_output] = "output", + [sym_metadata_access] = "metadata_access", + [anon_sym_DOT] = ".", + [anon_sym_LPAREN] = "(", + [anon_sym_RPAREN] = ")", + [anon_sym_LBRACK] = "[", + [anon_sym_RBRACK] = "]", + [anon_sym_map] = "map", + [anon_sym_LBRACE] = "{", + [anon_sym_RBRACE] = "}", + [anon_sym_COMMA] = ",", + [anon_sym__] = "_", + [anon_sym_import] = "import", + [anon_sym_as] = "as", + [anon_sym_input] = "input", + [anon_sym_QMARK_DOT] = "\?.", + [anon_sym_QMARK_LBRACK] 
= "\?[", + [anon_sym_if] = "if", + [anon_sym_else] = "else", + [anon_sym_match] = "match", + [anon_sym_true] = "true", + [anon_sym_false] = "false", + [anon_sym_null] = "null", + [anon_sym_deleted] = "deleted", + [anon_sym_throw] = "throw", + [anon_sym_void] = "void", + [anon_sym_COLON_COLON] = "::", + [anon_sym_COLON] = ":", + [anon_sym_BANG] = "!", + [anon_sym_DASH] = "-", + [anon_sym_PIPE_PIPE] = "||", + [anon_sym_AMP_AMP] = "&&", + [anon_sym_PLUS] = "+", + [anon_sym_STAR] = "*", + [anon_sym_SLASH] = "/", + [anon_sym_PERCENT] = "%", + [anon_sym_EQ_EQ] = "==", + [anon_sym_BANG_EQ] = "!=", + [anon_sym_GT] = ">", + [anon_sym_GT_EQ] = ">=", + [anon_sym_LT] = "<", + [anon_sym_LT_EQ] = "<=", + [anon_sym_EQ_GT] = "=>", + [anon_sym_DASH_GT] = "->", + [sym_integer] = "integer", + [sym_float] = "float", + [anon_sym_DQUOTE] = "\"", + [sym_string_content] = "string_content", + [sym_escape_sequence] = "escape_sequence", + [sym_raw_string] = "raw_string", + [sym_variable] = "variable", + [sym_comment] = "comment", + [sym__newline] = "_newline", + [sym__nl_skip] = "_nl_skip", + [sym_source_file] = "source_file", + [sym__source_item] = "_source_item", + [sym__top_level_statement] = "_top_level_statement", + [sym_assignment] = "assignment", + [sym_assign_target] = "assign_target", + [sym_target_path_segment] = "target_path_segment", + [sym_map_declaration] = "map_declaration", + [sym_parameter_list] = "parameter_list", + [sym_parameter] = "parameter", + [sym_expr_body] = "expr_body", + [sym_var_assignment] = "var_assignment", + [sym_import_statement] = "import_statement", + [sym__expression] = "_expression", + [sym__primary] = "_primary", + [sym_input] = "input", + [sym_output] = "output", + [sym_field_access] = "field_access", + [sym_null_safe_field_access] = "null_safe_field_access", + [sym_method_call] = "method_call", + [sym_null_safe_method_call] = "null_safe_method_call", + [sym_index] = "index", + [sym_null_safe_index] = "null_safe_index", + [sym__field_name] = 
"_field_name", + [sym__word] = "_word", + [sym_call_expression] = "call_expression", + [sym_qualified_name] = "qualified_name", + [sym_argument_list] = "argument_list", + [sym_positional_arguments] = "positional_arguments", + [sym_named_arguments] = "named_arguments", + [sym_named_argument] = "named_argument", + [sym_unary_expression] = "unary_expression", + [sym_binary_expression] = "binary_expression", + [sym_if_expression] = "if_expression", + [sym_if_statement] = "if_statement", + [sym_else_if_clause] = "else_if_clause", + [sym_else_clause] = "else_clause", + [sym_else_if_statement_clause] = "else_if_statement_clause", + [sym_else_statement_clause] = "else_statement_clause", + [sym_statement_block] = "statement_block", + [sym__statement] = "_statement", + [sym_match_expression] = "match_expression", + [sym_match_statement] = "match_statement", + [sym_match_cases] = "match_cases", + [sym_match_case] = "match_case", + [sym_match_block] = "match_block", + [sym_match_statement_cases] = "match_statement_cases", + [sym_match_statement_case] = "match_statement_case", + [sym_lambda_expression] = "lambda_expression", + [sym__lambda_params] = "_lambda_params", + [sym_lambda_block] = "lambda_block", + [sym_parenthesized_expression] = "parenthesized_expression", + [sym__literal] = "_literal", + [sym_string] = "string", + [sym_boolean] = "boolean", + [sym_null] = "null", + [sym_array] = "array", + [sym_object] = "object", + [sym_object_entry] = "object_entry", + [aux_sym_source_file_repeat1] = "source_file_repeat1", + [aux_sym_assign_target_repeat1] = "assign_target_repeat1", + [aux_sym_parameter_list_repeat1] = "parameter_list_repeat1", + [aux_sym_expr_body_repeat1] = "expr_body_repeat1", + [aux_sym_positional_arguments_repeat1] = "positional_arguments_repeat1", + [aux_sym_named_arguments_repeat1] = "named_arguments_repeat1", + [aux_sym_if_expression_repeat1] = "if_expression_repeat1", + [aux_sym_if_statement_repeat1] = "if_statement_repeat1", + 
[aux_sym_statement_block_repeat1] = "statement_block_repeat1", + [aux_sym_match_cases_repeat1] = "match_cases_repeat1", + [aux_sym_match_statement_cases_repeat1] = "match_statement_cases_repeat1", + [aux_sym_string_repeat1] = "string_repeat1", + [aux_sym_object_repeat1] = "object_repeat1", +}; + +static const TSSymbol ts_symbol_map[] = { + [ts_builtin_sym_end] = ts_builtin_sym_end, + [sym_identifier] = sym_identifier, + [anon_sym_EQ] = anon_sym_EQ, + [anon_sym_output] = anon_sym_output, + [sym_metadata_access] = sym_metadata_access, + [anon_sym_DOT] = anon_sym_DOT, + [anon_sym_LPAREN] = anon_sym_LPAREN, + [anon_sym_RPAREN] = anon_sym_RPAREN, + [anon_sym_LBRACK] = anon_sym_LBRACK, + [anon_sym_RBRACK] = anon_sym_RBRACK, + [anon_sym_map] = anon_sym_map, + [anon_sym_LBRACE] = anon_sym_LBRACE, + [anon_sym_RBRACE] = anon_sym_RBRACE, + [anon_sym_COMMA] = anon_sym_COMMA, + [anon_sym__] = anon_sym__, + [anon_sym_import] = anon_sym_import, + [anon_sym_as] = anon_sym_as, + [anon_sym_input] = anon_sym_input, + [anon_sym_QMARK_DOT] = anon_sym_QMARK_DOT, + [anon_sym_QMARK_LBRACK] = anon_sym_QMARK_LBRACK, + [anon_sym_if] = anon_sym_if, + [anon_sym_else] = anon_sym_else, + [anon_sym_match] = anon_sym_match, + [anon_sym_true] = anon_sym_true, + [anon_sym_false] = anon_sym_false, + [anon_sym_null] = anon_sym_null, + [anon_sym_deleted] = anon_sym_deleted, + [anon_sym_throw] = anon_sym_throw, + [anon_sym_void] = anon_sym_void, + [anon_sym_COLON_COLON] = anon_sym_COLON_COLON, + [anon_sym_COLON] = anon_sym_COLON, + [anon_sym_BANG] = anon_sym_BANG, + [anon_sym_DASH] = anon_sym_DASH, + [anon_sym_PIPE_PIPE] = anon_sym_PIPE_PIPE, + [anon_sym_AMP_AMP] = anon_sym_AMP_AMP, + [anon_sym_PLUS] = anon_sym_PLUS, + [anon_sym_STAR] = anon_sym_STAR, + [anon_sym_SLASH] = anon_sym_SLASH, + [anon_sym_PERCENT] = anon_sym_PERCENT, + [anon_sym_EQ_EQ] = anon_sym_EQ_EQ, + [anon_sym_BANG_EQ] = anon_sym_BANG_EQ, + [anon_sym_GT] = anon_sym_GT, + [anon_sym_GT_EQ] = anon_sym_GT_EQ, + [anon_sym_LT] = anon_sym_LT, + 
[anon_sym_LT_EQ] = anon_sym_LT_EQ, + [anon_sym_EQ_GT] = anon_sym_EQ_GT, + [anon_sym_DASH_GT] = anon_sym_DASH_GT, + [sym_integer] = sym_integer, + [sym_float] = sym_float, + [anon_sym_DQUOTE] = anon_sym_DQUOTE, + [sym_string_content] = sym_string_content, + [sym_escape_sequence] = sym_escape_sequence, + [sym_raw_string] = sym_raw_string, + [sym_variable] = sym_variable, + [sym_comment] = sym_comment, + [sym__newline] = sym__newline, + [sym__nl_skip] = sym__nl_skip, + [sym_source_file] = sym_source_file, + [sym__source_item] = sym__source_item, + [sym__top_level_statement] = sym__top_level_statement, + [sym_assignment] = sym_assignment, + [sym_assign_target] = sym_assign_target, + [sym_target_path_segment] = sym_target_path_segment, + [sym_map_declaration] = sym_map_declaration, + [sym_parameter_list] = sym_parameter_list, + [sym_parameter] = sym_parameter, + [sym_expr_body] = sym_expr_body, + [sym_var_assignment] = sym_var_assignment, + [sym_import_statement] = sym_import_statement, + [sym__expression] = sym__expression, + [sym__primary] = sym__primary, + [sym_input] = sym_input, + [sym_output] = sym_output, + [sym_field_access] = sym_field_access, + [sym_null_safe_field_access] = sym_null_safe_field_access, + [sym_method_call] = sym_method_call, + [sym_null_safe_method_call] = sym_null_safe_method_call, + [sym_index] = sym_index, + [sym_null_safe_index] = sym_null_safe_index, + [sym__field_name] = sym__field_name, + [sym__word] = sym__word, + [sym_call_expression] = sym_call_expression, + [sym_qualified_name] = sym_qualified_name, + [sym_argument_list] = sym_argument_list, + [sym_positional_arguments] = sym_positional_arguments, + [sym_named_arguments] = sym_named_arguments, + [sym_named_argument] = sym_named_argument, + [sym_unary_expression] = sym_unary_expression, + [sym_binary_expression] = sym_binary_expression, + [sym_if_expression] = sym_if_expression, + [sym_if_statement] = sym_if_statement, + [sym_else_if_clause] = sym_else_if_clause, + [sym_else_clause] = 
sym_else_clause, + [sym_else_if_statement_clause] = sym_else_if_statement_clause, + [sym_else_statement_clause] = sym_else_statement_clause, + [sym_statement_block] = sym_statement_block, + [sym__statement] = sym__statement, + [sym_match_expression] = sym_match_expression, + [sym_match_statement] = sym_match_statement, + [sym_match_cases] = sym_match_cases, + [sym_match_case] = sym_match_case, + [sym_match_block] = sym_match_block, + [sym_match_statement_cases] = sym_match_statement_cases, + [sym_match_statement_case] = sym_match_statement_case, + [sym_lambda_expression] = sym_lambda_expression, + [sym__lambda_params] = sym__lambda_params, + [sym_lambda_block] = sym_lambda_block, + [sym_parenthesized_expression] = sym_parenthesized_expression, + [sym__literal] = sym__literal, + [sym_string] = sym_string, + [sym_boolean] = sym_boolean, + [sym_null] = sym_null, + [sym_array] = sym_array, + [sym_object] = sym_object, + [sym_object_entry] = sym_object_entry, + [aux_sym_source_file_repeat1] = aux_sym_source_file_repeat1, + [aux_sym_assign_target_repeat1] = aux_sym_assign_target_repeat1, + [aux_sym_parameter_list_repeat1] = aux_sym_parameter_list_repeat1, + [aux_sym_expr_body_repeat1] = aux_sym_expr_body_repeat1, + [aux_sym_positional_arguments_repeat1] = aux_sym_positional_arguments_repeat1, + [aux_sym_named_arguments_repeat1] = aux_sym_named_arguments_repeat1, + [aux_sym_if_expression_repeat1] = aux_sym_if_expression_repeat1, + [aux_sym_if_statement_repeat1] = aux_sym_if_statement_repeat1, + [aux_sym_statement_block_repeat1] = aux_sym_statement_block_repeat1, + [aux_sym_match_cases_repeat1] = aux_sym_match_cases_repeat1, + [aux_sym_match_statement_cases_repeat1] = aux_sym_match_statement_cases_repeat1, + [aux_sym_string_repeat1] = aux_sym_string_repeat1, + [aux_sym_object_repeat1] = aux_sym_object_repeat1, +}; + +static const TSSymbolMetadata ts_symbol_metadata[] = { + [ts_builtin_sym_end] = { + .visible = false, + .named = true, + }, + [sym_identifier] = { + .visible 
= true, + .named = true, + }, + [anon_sym_EQ] = { + .visible = true, + .named = false, + }, + [anon_sym_output] = { + .visible = true, + .named = false, + }, + [sym_metadata_access] = { + .visible = true, + .named = true, + }, + [anon_sym_DOT] = { + .visible = true, + .named = false, + }, + [anon_sym_LPAREN] = { + .visible = true, + .named = false, + }, + [anon_sym_RPAREN] = { + .visible = true, + .named = false, + }, + [anon_sym_LBRACK] = { + .visible = true, + .named = false, + }, + [anon_sym_RBRACK] = { + .visible = true, + .named = false, + }, + [anon_sym_map] = { + .visible = true, + .named = false, + }, + [anon_sym_LBRACE] = { + .visible = true, + .named = false, + }, + [anon_sym_RBRACE] = { + .visible = true, + .named = false, + }, + [anon_sym_COMMA] = { + .visible = true, + .named = false, + }, + [anon_sym__] = { + .visible = true, + .named = false, + }, + [anon_sym_import] = { + .visible = true, + .named = false, + }, + [anon_sym_as] = { + .visible = true, + .named = false, + }, + [anon_sym_input] = { + .visible = true, + .named = false, + }, + [anon_sym_QMARK_DOT] = { + .visible = true, + .named = false, + }, + [anon_sym_QMARK_LBRACK] = { + .visible = true, + .named = false, + }, + [anon_sym_if] = { + .visible = true, + .named = false, + }, + [anon_sym_else] = { + .visible = true, + .named = false, + }, + [anon_sym_match] = { + .visible = true, + .named = false, + }, + [anon_sym_true] = { + .visible = true, + .named = false, + }, + [anon_sym_false] = { + .visible = true, + .named = false, + }, + [anon_sym_null] = { + .visible = true, + .named = false, + }, + [anon_sym_deleted] = { + .visible = true, + .named = false, + }, + [anon_sym_throw] = { + .visible = true, + .named = false, + }, + [anon_sym_void] = { + .visible = true, + .named = false, + }, + [anon_sym_COLON_COLON] = { + .visible = true, + .named = false, + }, + [anon_sym_COLON] = { + .visible = true, + .named = false, + }, + [anon_sym_BANG] = { + .visible = true, + .named = false, + }, + 
[anon_sym_DASH] = { + .visible = true, + .named = false, + }, + [anon_sym_PIPE_PIPE] = { + .visible = true, + .named = false, + }, + [anon_sym_AMP_AMP] = { + .visible = true, + .named = false, + }, + [anon_sym_PLUS] = { + .visible = true, + .named = false, + }, + [anon_sym_STAR] = { + .visible = true, + .named = false, + }, + [anon_sym_SLASH] = { + .visible = true, + .named = false, + }, + [anon_sym_PERCENT] = { + .visible = true, + .named = false, + }, + [anon_sym_EQ_EQ] = { + .visible = true, + .named = false, + }, + [anon_sym_BANG_EQ] = { + .visible = true, + .named = false, + }, + [anon_sym_GT] = { + .visible = true, + .named = false, + }, + [anon_sym_GT_EQ] = { + .visible = true, + .named = false, + }, + [anon_sym_LT] = { + .visible = true, + .named = false, + }, + [anon_sym_LT_EQ] = { + .visible = true, + .named = false, + }, + [anon_sym_EQ_GT] = { + .visible = true, + .named = false, + }, + [anon_sym_DASH_GT] = { + .visible = true, + .named = false, + }, + [sym_integer] = { + .visible = true, + .named = true, + }, + [sym_float] = { + .visible = true, + .named = true, + }, + [anon_sym_DQUOTE] = { + .visible = true, + .named = false, + }, + [sym_string_content] = { + .visible = true, + .named = true, + }, + [sym_escape_sequence] = { + .visible = true, + .named = true, + }, + [sym_raw_string] = { + .visible = true, + .named = true, + }, + [sym_variable] = { + .visible = true, + .named = true, + }, + [sym_comment] = { + .visible = true, + .named = true, + }, + [sym__newline] = { + .visible = false, + .named = true, + }, + [sym__nl_skip] = { + .visible = false, + .named = true, + }, + [sym_source_file] = { + .visible = true, + .named = true, + }, + [sym__source_item] = { + .visible = false, + .named = true, + }, + [sym__top_level_statement] = { + .visible = false, + .named = true, + }, + [sym_assignment] = { + .visible = true, + .named = true, + }, + [sym_assign_target] = { + .visible = true, + .named = true, + }, + [sym_target_path_segment] = { + .visible = 
true, + .named = true, + }, + [sym_map_declaration] = { + .visible = true, + .named = true, + }, + [sym_parameter_list] = { + .visible = true, + .named = true, + }, + [sym_parameter] = { + .visible = true, + .named = true, + }, + [sym_expr_body] = { + .visible = true, + .named = true, + }, + [sym_var_assignment] = { + .visible = true, + .named = true, + }, + [sym_import_statement] = { + .visible = true, + .named = true, + }, + [sym__expression] = { + .visible = false, + .named = true, + }, + [sym__primary] = { + .visible = false, + .named = true, + }, + [sym_input] = { + .visible = true, + .named = true, + }, + [sym_output] = { + .visible = true, + .named = true, + }, + [sym_field_access] = { + .visible = true, + .named = true, + }, + [sym_null_safe_field_access] = { + .visible = true, + .named = true, + }, + [sym_method_call] = { + .visible = true, + .named = true, + }, + [sym_null_safe_method_call] = { + .visible = true, + .named = true, + }, + [sym_index] = { + .visible = true, + .named = true, + }, + [sym_null_safe_index] = { + .visible = true, + .named = true, + }, + [sym__field_name] = { + .visible = false, + .named = true, + }, + [sym__word] = { + .visible = false, + .named = true, + }, + [sym_call_expression] = { + .visible = true, + .named = true, + }, + [sym_qualified_name] = { + .visible = true, + .named = true, + }, + [sym_argument_list] = { + .visible = true, + .named = true, + }, + [sym_positional_arguments] = { + .visible = true, + .named = true, + }, + [sym_named_arguments] = { + .visible = true, + .named = true, + }, + [sym_named_argument] = { + .visible = true, + .named = true, + }, + [sym_unary_expression] = { + .visible = true, + .named = true, + }, + [sym_binary_expression] = { + .visible = true, + .named = true, + }, + [sym_if_expression] = { + .visible = true, + .named = true, + }, + [sym_if_statement] = { + .visible = true, + .named = true, + }, + [sym_else_if_clause] = { + .visible = true, + .named = true, + }, + [sym_else_clause] = { + 
.visible = true, + .named = true, + }, + [sym_else_if_statement_clause] = { + .visible = true, + .named = true, + }, + [sym_else_statement_clause] = { + .visible = true, + .named = true, + }, + [sym_statement_block] = { + .visible = true, + .named = true, + }, + [sym__statement] = { + .visible = false, + .named = true, + }, + [sym_match_expression] = { + .visible = true, + .named = true, + }, + [sym_match_statement] = { + .visible = true, + .named = true, + }, + [sym_match_cases] = { + .visible = true, + .named = true, + }, + [sym_match_case] = { + .visible = true, + .named = true, + }, + [sym_match_block] = { + .visible = true, + .named = true, + }, + [sym_match_statement_cases] = { + .visible = true, + .named = true, + }, + [sym_match_statement_case] = { + .visible = true, + .named = true, + }, + [sym_lambda_expression] = { + .visible = true, + .named = true, + }, + [sym__lambda_params] = { + .visible = false, + .named = true, + }, + [sym_lambda_block] = { + .visible = true, + .named = true, + }, + [sym_parenthesized_expression] = { + .visible = true, + .named = true, + }, + [sym__literal] = { + .visible = false, + .named = true, + }, + [sym_string] = { + .visible = true, + .named = true, + }, + [sym_boolean] = { + .visible = true, + .named = true, + }, + [sym_null] = { + .visible = true, + .named = true, + }, + [sym_array] = { + .visible = true, + .named = true, + }, + [sym_object] = { + .visible = true, + .named = true, + }, + [sym_object_entry] = { + .visible = true, + .named = true, + }, + [aux_sym_source_file_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_assign_target_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_parameter_list_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_expr_body_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_positional_arguments_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_named_arguments_repeat1] = { + .visible = false, + .named = 
false, + }, + [aux_sym_if_expression_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_if_statement_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_statement_block_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_match_cases_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_match_statement_cases_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_string_repeat1] = { + .visible = false, + .named = false, + }, + [aux_sym_object_repeat1] = { + .visible = false, + .named = false, + }, +}; + +enum ts_field_identifiers { + field_alternative = 1, + field_binding = 2, + field_body = 3, + field_condition = 4, + field_consequence = 5, + field_field = 6, + field_index = 7, + field_key = 8, + field_left = 9, + field_method = 10, + field_name = 11, + field_namespace = 12, + field_operand = 13, + field_operator = 14, + field_parameters = 15, + field_pattern = 16, + field_receiver = 17, + field_right = 18, + field_subject = 19, + field_value = 20, +}; + +static const char * const ts_field_names[] = { + [0] = NULL, + [field_alternative] = "alternative", + [field_binding] = "binding", + [field_body] = "body", + [field_condition] = "condition", + [field_consequence] = "consequence", + [field_field] = "field", + [field_index] = "index", + [field_key] = "key", + [field_left] = "left", + [field_method] = "method", + [field_name] = "name", + [field_namespace] = "namespace", + [field_operand] = "operand", + [field_operator] = "operator", + [field_parameters] = "parameters", + [field_pattern] = "pattern", + [field_receiver] = "receiver", + [field_right] = "right", + [field_subject] = "subject", + [field_value] = "value", +}; + +static const TSFieldMapSlice ts_field_map_slices[PRODUCTION_ID_COUNT] = { + [2] = {.index = 0, .length = 2}, + [3] = {.index = 2, .length = 1}, + [4] = {.index = 3, .length = 1}, + [5] = {.index = 4, .length = 2}, + [6] = {.index = 6, .length = 2}, + [7] = {.index = 8, .length = 
3}, + [8] = {.index = 11, .length = 2}, + [9] = {.index = 13, .length = 1}, + [10] = {.index = 14, .length = 2}, + [11] = {.index = 16, .length = 2}, + [12] = {.index = 18, .length = 2}, + [13] = {.index = 20, .length = 2}, + [14] = {.index = 22, .length = 2}, + [15] = {.index = 24, .length = 2}, + [16] = {.index = 26, .length = 2}, + [17] = {.index = 28, .length = 1}, + [18] = {.index = 29, .length = 2}, + [19] = {.index = 31, .length = 1}, + [20] = {.index = 32, .length = 1}, + [21] = {.index = 33, .length = 1}, + [22] = {.index = 34, .length = 2}, + [23] = {.index = 36, .length = 2}, +}; + +static const TSFieldMapEntry ts_field_map_entries[] = { + [0] = + {field_operand, 1}, + {field_operator, 0}, + [2] = + {field_condition, 1}, + [3] = + {field_name, 0}, + [4] = + {field_name, 2}, + {field_namespace, 0}, + [6] = + {field_field, 2}, + {field_receiver, 0}, + [8] = + {field_left, 0}, + {field_operator, 1}, + {field_right, 2}, + [11] = + {field_body, 2}, + {field_parameters, 0}, + [13] = + {field_subject, 1}, + [14] = + {field_key, 0}, + {field_value, 2}, + [16] = + {field_index, 2}, + {field_receiver, 0}, + [18] = + {field_body, 2}, + {field_pattern, 0}, + [20] = + {field_condition, 1}, + {field_consequence, 3}, + [22] = + {field_name, 0}, + {field_value, 2}, + [24] = + {field_method, 2}, + {field_receiver, 0}, + [26] = + {field_binding, 3}, + {field_subject, 1}, + [28] = + {field_name, 1}, + [29] = + {field_condition, 1}, + {field_consequence, 4}, + [31] = + {field_condition, 2}, + [32] = + {field_alternative, 2}, + [33] = + {field_alternative, 3}, + [34] = + {field_condition, 2}, + {field_consequence, 4}, + [36] = + {field_condition, 2}, + {field_consequence, 5}, +}; + +static const TSSymbol ts_alias_sequences[PRODUCTION_ID_COUNT][MAX_ALIAS_SEQUENCE_LENGTH] = { + [0] = {0}, + [1] = { + [0] = sym_identifier, + }, +}; + +static const uint16_t ts_non_terminal_alias_map[] = { + 0, +}; + +static const TSStateId ts_primary_state_ids[STATE_COUNT] = { + [0] = 0, + [1] = 
1, + [2] = 2, + [3] = 3, + [4] = 4, + [5] = 5, + [6] = 6, + [7] = 7, + [8] = 8, + [9] = 9, + [10] = 5, + [11] = 2, + [12] = 4, + [13] = 8, + [14] = 4, + [15] = 4, + [16] = 6, + [17] = 17, + [18] = 18, + [19] = 19, + [20] = 20, + [21] = 21, + [22] = 22, + [23] = 23, + [24] = 21, + [25] = 25, + [26] = 26, + [27] = 17, + [28] = 25, + [29] = 18, + [30] = 26, + [31] = 31, + [32] = 32, + [33] = 33, + [34] = 34, + [35] = 35, + [36] = 36, + [37] = 32, + [38] = 31, + [39] = 39, + [40] = 40, + [41] = 41, + [42] = 36, + [43] = 43, + [44] = 44, + [45] = 45, + [46] = 46, + [47] = 47, + [48] = 45, + [49] = 49, + [50] = 50, + [51] = 51, + [52] = 52, + [53] = 43, + [54] = 54, + [55] = 51, + [56] = 56, + [57] = 52, + [58] = 47, + [59] = 59, + [60] = 60, + [61] = 61, + [62] = 62, + [63] = 62, + [64] = 64, + [65] = 65, + [66] = 66, + [67] = 67, + [68] = 68, + [69] = 64, + [70] = 64, + [71] = 66, + [72] = 61, + [73] = 73, + [74] = 74, + [75] = 64, + [76] = 76, + [77] = 77, + [78] = 78, + [79] = 79, + [80] = 77, + [81] = 81, + [82] = 82, + [83] = 83, + [84] = 84, + [85] = 85, + [86] = 86, + [87] = 87, + [88] = 88, + [89] = 89, + [90] = 90, + [91] = 91, + [92] = 76, + [93] = 93, + [94] = 94, + [95] = 95, + [96] = 96, + [97] = 97, + [98] = 98, + [99] = 99, + [100] = 100, + [101] = 81, + [102] = 76, + [103] = 99, + [104] = 100, + [105] = 93, + [106] = 94, + [107] = 95, + [108] = 96, + [109] = 97, + [110] = 98, + [111] = 93, + [112] = 94, + [113] = 95, + [114] = 96, + [115] = 97, + [116] = 98, + [117] = 81, + [118] = 76, + [119] = 93, + [120] = 94, + [121] = 95, + [122] = 96, + [123] = 97, + [124] = 98, + [125] = 81, + [126] = 91, + [127] = 127, + [128] = 128, + [129] = 129, + [130] = 130, + [131] = 131, + [132] = 132, + [133] = 133, + [134] = 134, + [135] = 135, + [136] = 136, + [137] = 137, + [138] = 138, + [139] = 139, + [140] = 140, + [141] = 141, + [142] = 142, + [143] = 143, + [144] = 144, + [145] = 145, + [146] = 146, + [147] = 147, + [148] = 148, + [149] = 149, + [150] = 150, + 
[151] = 151, + [152] = 152, + [153] = 153, + [154] = 154, + [155] = 155, + [156] = 156, + [157] = 157, + [158] = 158, + [159] = 159, + [160] = 160, + [161] = 161, + [162] = 162, + [163] = 163, + [164] = 164, + [165] = 165, + [166] = 166, + [167] = 167, + [168] = 168, + [169] = 169, + [170] = 170, + [171] = 171, + [172] = 172, + [173] = 173, + [174] = 174, + [175] = 175, + [176] = 176, + [177] = 177, + [178] = 178, + [179] = 179, + [180] = 180, + [181] = 181, + [182] = 182, + [183] = 183, + [184] = 184, + [185] = 185, + [186] = 186, + [187] = 187, + [188] = 46, + [189] = 50, + [190] = 44, + [191] = 49, + [192] = 56, + [193] = 60, + [194] = 59, + [195] = 54, + [196] = 84, + [197] = 65, + [198] = 198, + [199] = 180, + [200] = 134, + [201] = 135, + [202] = 169, + [203] = 127, + [204] = 128, + [205] = 129, + [206] = 168, + [207] = 171, + [208] = 173, + [209] = 130, + [210] = 131, + [211] = 132, + [212] = 133, + [213] = 149, + [214] = 157, + [215] = 158, + [216] = 163, + [217] = 166, + [218] = 159, + [219] = 184, + [220] = 186, + [221] = 187, + [222] = 148, + [223] = 181, + [224] = 183, + [225] = 182, + [226] = 150, + [227] = 151, + [228] = 160, + [229] = 162, + [230] = 164, + [231] = 165, + [232] = 167, + [233] = 161, + [234] = 172, + [235] = 174, + [236] = 175, + [237] = 176, + [238] = 177, + [239] = 179, + [240] = 146, + [241] = 147, + [242] = 144, + [243] = 145, + [244] = 136, + [245] = 178, + [246] = 137, + [247] = 138, + [248] = 139, + [249] = 140, + [250] = 141, + [251] = 142, + [252] = 143, + [253] = 156, + [254] = 152, + [255] = 153, + [256] = 154, + [257] = 155, + [258] = 258, + [259] = 170, + [260] = 180, + [261] = 261, + [262] = 262, + [263] = 263, + [264] = 264, + [265] = 265, + [266] = 266, + [267] = 267, + [268] = 268, + [269] = 181, + [270] = 182, + [271] = 271, + [272] = 272, + [273] = 273, + [274] = 184, + [275] = 186, + [276] = 187, + [277] = 183, + [278] = 278, + [279] = 279, + [280] = 279, + [281] = 281, + [282] = 282, + [283] = 283, + [284] = 282, + 
[285] = 285, + [286] = 286, + [287] = 180, + [288] = 288, + [289] = 289, + [290] = 290, + [291] = 291, + [292] = 292, + [293] = 293, + [294] = 294, + [295] = 295, + [296] = 296, + [297] = 184, + [298] = 186, + [299] = 187, + [300] = 181, + [301] = 183, + [302] = 182, + [303] = 303, + [304] = 304, + [305] = 293, + [306] = 306, + [307] = 294, + [308] = 295, + [309] = 309, + [310] = 310, + [311] = 311, + [312] = 312, + [313] = 313, + [314] = 314, + [315] = 315, + [316] = 316, + [317] = 316, + [318] = 306, + [319] = 319, + [320] = 320, + [321] = 321, + [322] = 315, + [323] = 321, + [324] = 324, + [325] = 319, + [326] = 320, + [327] = 327, + [328] = 328, + [329] = 329, + [330] = 330, + [331] = 331, + [332] = 332, + [333] = 333, + [334] = 334, + [335] = 332, + [336] = 331, + [337] = 337, + [338] = 261, + [339] = 339, + [340] = 265, + [341] = 341, + [342] = 342, + [343] = 343, + [344] = 344, + [345] = 345, + [346] = 346, + [347] = 347, + [348] = 348, + [349] = 349, + [350] = 350, + [351] = 351, + [352] = 352, + [353] = 353, + [354] = 354, + [355] = 355, + [356] = 356, + [357] = 357, + [358] = 358, + [359] = 359, + [360] = 360, + [361] = 361, + [362] = 362, + [363] = 363, + [364] = 363, + [365] = 365, + [366] = 366, + [367] = 365, + [368] = 368, + [369] = 369, + [370] = 370, + [371] = 371, + [372] = 372, + [373] = 373, + [374] = 374, + [375] = 375, + [376] = 376, + [377] = 377, + [378] = 370, + [379] = 379, + [380] = 380, + [381] = 371, + [382] = 382, + [383] = 383, + [384] = 384, + [385] = 385, + [386] = 383, + [387] = 387, + [388] = 388, + [389] = 389, + [390] = 390, + [391] = 391, + [392] = 392, + [393] = 393, + [394] = 394, + [395] = 395, + [396] = 396, + [397] = 397, + [398] = 398, + [399] = 399, + [400] = 394, + [401] = 401, + [402] = 397, + [403] = 403, + [404] = 396, + [405] = 405, + [406] = 406, + [407] = 407, + [408] = 403, + [409] = 409, + [410] = 410, + [411] = 411, + [412] = 412, + [413] = 401, + [414] = 414, + [415] = 405, + [416] = 416, + [417] = 417, + 
[418] = 418, + [419] = 414, + [420] = 420, + [421] = 421, + [422] = 422, + [423] = 423, + [424] = 424, + [425] = 425, + [426] = 426, + [427] = 427, + [428] = 428, + [429] = 423, + [430] = 430, + [431] = 431, + [432] = 432, + [433] = 433, + [434] = 434, + [435] = 435, + [436] = 436, + [437] = 437, + [438] = 438, + [439] = 439, + [440] = 440, + [441] = 441, + [442] = 434, + [443] = 443, + [444] = 444, + [445] = 439, + [446] = 446, + [447] = 447, + [448] = 426, + [449] = 449, + [450] = 450, + [451] = 451, + [452] = 452, + [453] = 453, + [454] = 440, + [455] = 422, + [456] = 450, + [457] = 424, + [458] = 458, + [459] = 446, + [460] = 460, + [461] = 461, + [462] = 449, + [463] = 428, + [464] = 464, + [465] = 460, + [466] = 425, + [467] = 467, + [468] = 458, + [469] = 469, + [470] = 470, + [471] = 471, + [472] = 421, + [473] = 473, + [474] = 474, + [475] = 446, + [476] = 427, + [477] = 473, + [478] = 433, + [479] = 479, + [480] = 446, + [481] = 481, + [482] = 482, + [483] = 479, + [484] = 484, +}; + +static bool ts_lex(TSLexer *lexer, TSStateId state) { + START_LEXER(); + eof = lexer->eof(lexer); + switch (state) { + case 0: + if (eof) ADVANCE(25); + ADVANCE_MAP( + '!', 41, + '"', 59, + '#', 66, + '$', 23, + '%', 48, + '&', 4, + '(', 30, + ')', 31, + '*', 46, + '+', 45, + ',', 36, + '-', 42, + '.', 29, + '/', 47, + ':', 40, + '<', 53, + '=', 27, + '>', 51, + '?', 5, + '@', 28, + '[', 32, + '\\', 9, + ']', 33, + '`', 8, + '{', 34, + '|', 11, + '}', 35, + ); + if (lookahead == '\t' || + lookahead == '\r' || + lookahead == ' ') SKIP(24); + if (('0' <= lookahead && lookahead <= '9')) ADVANCE(57); + if (('A' <= lookahead && lookahead <= 'Z') || + ('_' <= lookahead && lookahead <= 'z')) ADVANCE(65); + END_STATE(); + case 1: + ADVANCE_MAP( + '!', 41, + '"', 59, + '#', 66, + '$', 23, + '%', 48, + '&', 4, + '(', 30, + ')', 31, + '*', 46, + '+', 45, + ',', 36, + '-', 42, + '.', 29, + '/', 47, + ':', 40, + '<', 53, + '=', 7, + '>', 51, + '?', 5, + '@', 28, + '[', 32, + ']', 33, + 
'`', 8, + '{', 34, + '|', 11, + '}', 35, + ); + if (lookahead == '\t' || + lookahead == '\r' || + lookahead == ' ') SKIP(1); + if (('0' <= lookahead && lookahead <= '9')) ADVANCE(57); + if (('A' <= lookahead && lookahead <= 'Z') || + ('_' <= lookahead && lookahead <= 'z')) ADVANCE(65); + END_STATE(); + case 2: + ADVANCE_MAP( + '!', 6, + '#', 66, + '%', 48, + '&', 4, + '(', 30, + ')', 31, + '*', 46, + '+', 45, + ',', 36, + '-', 42, + '.', 29, + '/', 47, + ':', 40, + '<', 53, + '=', 26, + '>', 51, + '?', 5, + '@', 28, + '[', 32, + '|', 11, + '}', 35, + ); + if (lookahead == '\t' || + lookahead == '\r' || + lookahead == ' ') SKIP(2); + END_STATE(); + case 3: + if (lookahead == '"') ADVANCE(59); + if (lookahead == '#') ADVANCE(61); + if (lookahead == '\\') ADVANCE(9); + if (lookahead == '\t' || + lookahead == '\r' || + lookahead == ' ') ADVANCE(60); + if (lookahead != 0 && + lookahead != '\t' && + lookahead != '\n') ADVANCE(61); + END_STATE(); + case 4: + if (lookahead == '&') ADVANCE(44); + END_STATE(); + case 5: + if (lookahead == '.') ADVANCE(37); + if (lookahead == '[') ADVANCE(38); + END_STATE(); + case 6: + if (lookahead == '=') ADVANCE(50); + END_STATE(); + case 7: + if (lookahead == '=') ADVANCE(49); + if (lookahead == '>') ADVANCE(55); + END_STATE(); + case 8: + if (lookahead == '`') ADVANCE(63); + if (lookahead != 0) ADVANCE(8); + END_STATE(); + case 9: + if (lookahead == 'u') ADVANCE(10); + if (lookahead == '"' || + lookahead == '\\' || + lookahead == 'n' || + lookahead == 'r' || + lookahead == 't') ADVANCE(62); + END_STATE(); + case 10: + if (lookahead == '{') ADVANCE(20); + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(22); + END_STATE(); + case 11: + if (lookahead == '|') ADVANCE(43); + END_STATE(); + case 12: + if (lookahead == '}') ADVANCE(62); + END_STATE(); + case 13: + if (lookahead == '}') ADVANCE(62); + if (('0' <= lookahead && lookahead <= '9') || + ('A' 
<= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(12); + END_STATE(); + case 14: + if (lookahead == '}') ADVANCE(62); + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(13); + END_STATE(); + case 15: + if (lookahead == '}') ADVANCE(62); + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(14); + END_STATE(); + case 16: + if (lookahead == '}') ADVANCE(62); + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(15); + END_STATE(); + case 17: + if (lookahead == '}') ADVANCE(62); + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(16); + END_STATE(); + case 18: + if (('0' <= lookahead && lookahead <= '9')) ADVANCE(58); + END_STATE(); + case 19: + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(62); + END_STATE(); + case 20: + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(17); + END_STATE(); + case 21: + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(19); + END_STATE(); + case 22: + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'F') || + ('a' <= lookahead && lookahead <= 'f')) ADVANCE(21); + END_STATE(); + case 23: + if (('A' <= lookahead && lookahead <= 'Z') || + lookahead == '_' || + ('a' <= lookahead && lookahead <= 'z')) ADVANCE(64); + END_STATE(); + case 24: + if (eof) ADVANCE(25); + ADVANCE_MAP( + '!', 41, + '"', 59, + '#', 66, + '$', 23, + '%', 48, + '&', 4, + '(', 
30, + ')', 31, + '*', 46, + '+', 45, + ',', 36, + '-', 42, + '.', 29, + '/', 47, + ':', 40, + '<', 53, + '=', 27, + '>', 51, + '?', 5, + '@', 28, + '[', 32, + ']', 33, + '`', 8, + '{', 34, + '|', 11, + '}', 35, + ); + if (lookahead == '\t' || + lookahead == '\r' || + lookahead == ' ') SKIP(24); + if (('0' <= lookahead && lookahead <= '9')) ADVANCE(57); + if (('A' <= lookahead && lookahead <= 'Z') || + ('_' <= lookahead && lookahead <= 'z')) ADVANCE(65); + END_STATE(); + case 25: + ACCEPT_TOKEN(ts_builtin_sym_end); + END_STATE(); + case 26: + ACCEPT_TOKEN(anon_sym_EQ); + if (lookahead == '=') ADVANCE(49); + END_STATE(); + case 27: + ACCEPT_TOKEN(anon_sym_EQ); + if (lookahead == '=') ADVANCE(49); + if (lookahead == '>') ADVANCE(55); + END_STATE(); + case 28: + ACCEPT_TOKEN(sym_metadata_access); + END_STATE(); + case 29: + ACCEPT_TOKEN(anon_sym_DOT); + END_STATE(); + case 30: + ACCEPT_TOKEN(anon_sym_LPAREN); + END_STATE(); + case 31: + ACCEPT_TOKEN(anon_sym_RPAREN); + END_STATE(); + case 32: + ACCEPT_TOKEN(anon_sym_LBRACK); + END_STATE(); + case 33: + ACCEPT_TOKEN(anon_sym_RBRACK); + END_STATE(); + case 34: + ACCEPT_TOKEN(anon_sym_LBRACE); + END_STATE(); + case 35: + ACCEPT_TOKEN(anon_sym_RBRACE); + END_STATE(); + case 36: + ACCEPT_TOKEN(anon_sym_COMMA); + END_STATE(); + case 37: + ACCEPT_TOKEN(anon_sym_QMARK_DOT); + END_STATE(); + case 38: + ACCEPT_TOKEN(anon_sym_QMARK_LBRACK); + END_STATE(); + case 39: + ACCEPT_TOKEN(anon_sym_COLON_COLON); + END_STATE(); + case 40: + ACCEPT_TOKEN(anon_sym_COLON); + if (lookahead == ':') ADVANCE(39); + END_STATE(); + case 41: + ACCEPT_TOKEN(anon_sym_BANG); + if (lookahead == '=') ADVANCE(50); + END_STATE(); + case 42: + ACCEPT_TOKEN(anon_sym_DASH); + if (lookahead == '>') ADVANCE(56); + END_STATE(); + case 43: + ACCEPT_TOKEN(anon_sym_PIPE_PIPE); + END_STATE(); + case 44: + ACCEPT_TOKEN(anon_sym_AMP_AMP); + END_STATE(); + case 45: + ACCEPT_TOKEN(anon_sym_PLUS); + END_STATE(); + case 46: + ACCEPT_TOKEN(anon_sym_STAR); + END_STATE(); + 
case 47: + ACCEPT_TOKEN(anon_sym_SLASH); + END_STATE(); + case 48: + ACCEPT_TOKEN(anon_sym_PERCENT); + END_STATE(); + case 49: + ACCEPT_TOKEN(anon_sym_EQ_EQ); + END_STATE(); + case 50: + ACCEPT_TOKEN(anon_sym_BANG_EQ); + END_STATE(); + case 51: + ACCEPT_TOKEN(anon_sym_GT); + if (lookahead == '=') ADVANCE(52); + END_STATE(); + case 52: + ACCEPT_TOKEN(anon_sym_GT_EQ); + END_STATE(); + case 53: + ACCEPT_TOKEN(anon_sym_LT); + if (lookahead == '=') ADVANCE(54); + END_STATE(); + case 54: + ACCEPT_TOKEN(anon_sym_LT_EQ); + END_STATE(); + case 55: + ACCEPT_TOKEN(anon_sym_EQ_GT); + END_STATE(); + case 56: + ACCEPT_TOKEN(anon_sym_DASH_GT); + END_STATE(); + case 57: + ACCEPT_TOKEN(sym_integer); + if (lookahead == '.') ADVANCE(18); + if (('0' <= lookahead && lookahead <= '9')) ADVANCE(57); + END_STATE(); + case 58: + ACCEPT_TOKEN(sym_float); + if (('0' <= lookahead && lookahead <= '9')) ADVANCE(58); + END_STATE(); + case 59: + ACCEPT_TOKEN(anon_sym_DQUOTE); + END_STATE(); + case 60: + ACCEPT_TOKEN(sym_string_content); + if (lookahead == '#') ADVANCE(61); + if (lookahead == '\t' || + lookahead == '\r' || + lookahead == ' ') ADVANCE(60); + if (lookahead != 0 && + lookahead != '\t' && + lookahead != '\n' && + lookahead != '"' && + lookahead != '#' && + lookahead != '\\') ADVANCE(61); + END_STATE(); + case 61: + ACCEPT_TOKEN(sym_string_content); + if (lookahead != 0 && + lookahead != '\n' && + lookahead != '"' && + lookahead != '\\') ADVANCE(61); + END_STATE(); + case 62: + ACCEPT_TOKEN(sym_escape_sequence); + END_STATE(); + case 63: + ACCEPT_TOKEN(sym_raw_string); + END_STATE(); + case 64: + ACCEPT_TOKEN(sym_variable); + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'Z') || + lookahead == '_' || + ('a' <= lookahead && lookahead <= 'z')) ADVANCE(64); + END_STATE(); + case 65: + ACCEPT_TOKEN(sym_identifier); + if (('0' <= lookahead && lookahead <= '9') || + ('A' <= lookahead && lookahead <= 'Z') || + lookahead == '_' || + ('a' <= lookahead && 
lookahead <= 'z')) ADVANCE(65); + END_STATE(); + case 66: + ACCEPT_TOKEN(sym_comment); + if (lookahead != 0 && + lookahead != '\n') ADVANCE(66); + END_STATE(); + default: + return false; + } +} + +static bool ts_lex_keywords(TSLexer *lexer, TSStateId state) { + START_LEXER(); + eof = lexer->eof(lexer); + switch (state) { + case 0: + ADVANCE_MAP( + '_', 1, + 'a', 2, + 'd', 3, + 'e', 4, + 'f', 5, + 'i', 6, + 'm', 7, + 'n', 8, + 'o', 9, + 't', 10, + 'v', 11, + ); + if (lookahead == '\t' || + lookahead == '\r' || + lookahead == ' ') SKIP(0); + END_STATE(); + case 1: + ACCEPT_TOKEN(anon_sym__); + END_STATE(); + case 2: + if (lookahead == 's') ADVANCE(12); + END_STATE(); + case 3: + if (lookahead == 'e') ADVANCE(13); + END_STATE(); + case 4: + if (lookahead == 'l') ADVANCE(14); + END_STATE(); + case 5: + if (lookahead == 'a') ADVANCE(15); + END_STATE(); + case 6: + if (lookahead == 'f') ADVANCE(16); + if (lookahead == 'm') ADVANCE(17); + if (lookahead == 'n') ADVANCE(18); + END_STATE(); + case 7: + if (lookahead == 'a') ADVANCE(19); + END_STATE(); + case 8: + if (lookahead == 'u') ADVANCE(20); + END_STATE(); + case 9: + if (lookahead == 'u') ADVANCE(21); + END_STATE(); + case 10: + if (lookahead == 'h') ADVANCE(22); + if (lookahead == 'r') ADVANCE(23); + END_STATE(); + case 11: + if (lookahead == 'o') ADVANCE(24); + END_STATE(); + case 12: + ACCEPT_TOKEN(anon_sym_as); + END_STATE(); + case 13: + if (lookahead == 'l') ADVANCE(25); + END_STATE(); + case 14: + if (lookahead == 's') ADVANCE(26); + END_STATE(); + case 15: + if (lookahead == 'l') ADVANCE(27); + END_STATE(); + case 16: + ACCEPT_TOKEN(anon_sym_if); + END_STATE(); + case 17: + if (lookahead == 'p') ADVANCE(28); + END_STATE(); + case 18: + if (lookahead == 'p') ADVANCE(29); + END_STATE(); + case 19: + if (lookahead == 'p') ADVANCE(30); + if (lookahead == 't') ADVANCE(31); + END_STATE(); + case 20: + if (lookahead == 'l') ADVANCE(32); + END_STATE(); + case 21: + if (lookahead == 't') ADVANCE(33); + END_STATE(); + 
case 22: + if (lookahead == 'r') ADVANCE(34); + END_STATE(); + case 23: + if (lookahead == 'u') ADVANCE(35); + END_STATE(); + case 24: + if (lookahead == 'i') ADVANCE(36); + END_STATE(); + case 25: + if (lookahead == 'e') ADVANCE(37); + END_STATE(); + case 26: + if (lookahead == 'e') ADVANCE(38); + END_STATE(); + case 27: + if (lookahead == 's') ADVANCE(39); + END_STATE(); + case 28: + if (lookahead == 'o') ADVANCE(40); + END_STATE(); + case 29: + if (lookahead == 'u') ADVANCE(41); + END_STATE(); + case 30: + ACCEPT_TOKEN(anon_sym_map); + END_STATE(); + case 31: + if (lookahead == 'c') ADVANCE(42); + END_STATE(); + case 32: + if (lookahead == 'l') ADVANCE(43); + END_STATE(); + case 33: + if (lookahead == 'p') ADVANCE(44); + END_STATE(); + case 34: + if (lookahead == 'o') ADVANCE(45); + END_STATE(); + case 35: + if (lookahead == 'e') ADVANCE(46); + END_STATE(); + case 36: + if (lookahead == 'd') ADVANCE(47); + END_STATE(); + case 37: + if (lookahead == 't') ADVANCE(48); + END_STATE(); + case 38: + ACCEPT_TOKEN(anon_sym_else); + END_STATE(); + case 39: + if (lookahead == 'e') ADVANCE(49); + END_STATE(); + case 40: + if (lookahead == 'r') ADVANCE(50); + END_STATE(); + case 41: + if (lookahead == 't') ADVANCE(51); + END_STATE(); + case 42: + if (lookahead == 'h') ADVANCE(52); + END_STATE(); + case 43: + ACCEPT_TOKEN(anon_sym_null); + END_STATE(); + case 44: + if (lookahead == 'u') ADVANCE(53); + END_STATE(); + case 45: + if (lookahead == 'w') ADVANCE(54); + END_STATE(); + case 46: + ACCEPT_TOKEN(anon_sym_true); + END_STATE(); + case 47: + ACCEPT_TOKEN(anon_sym_void); + END_STATE(); + case 48: + if (lookahead == 'e') ADVANCE(55); + END_STATE(); + case 49: + ACCEPT_TOKEN(anon_sym_false); + END_STATE(); + case 50: + if (lookahead == 't') ADVANCE(56); + END_STATE(); + case 51: + ACCEPT_TOKEN(anon_sym_input); + END_STATE(); + case 52: + ACCEPT_TOKEN(anon_sym_match); + END_STATE(); + case 53: + if (lookahead == 't') ADVANCE(57); + END_STATE(); + case 54: + 
ACCEPT_TOKEN(anon_sym_throw); + END_STATE(); + case 55: + if (lookahead == 'd') ADVANCE(58); + END_STATE(); + case 56: + ACCEPT_TOKEN(anon_sym_import); + END_STATE(); + case 57: + ACCEPT_TOKEN(anon_sym_output); + END_STATE(); + case 58: + ACCEPT_TOKEN(anon_sym_deleted); + END_STATE(); + default: + return false; + } +} + +static const TSLexMode ts_lex_modes[STATE_COUNT] = { + [0] = {.lex_state = 0, .external_lex_state = 1}, + [1] = {.lex_state = 0, .external_lex_state = 1}, + [2] = {.lex_state = 0, .external_lex_state = 2}, + [3] = {.lex_state = 0, .external_lex_state = 2}, + [4] = {.lex_state = 0, .external_lex_state = 2}, + [5] = {.lex_state = 0, .external_lex_state = 2}, + [6] = {.lex_state = 0, .external_lex_state = 2}, + [7] = {.lex_state = 0, .external_lex_state = 2}, + [8] = {.lex_state = 0, .external_lex_state = 2}, + [9] = {.lex_state = 0, .external_lex_state = 2}, + [10] = {.lex_state = 0, .external_lex_state = 2}, + [11] = {.lex_state = 0, .external_lex_state = 2}, + [12] = {.lex_state = 0, .external_lex_state = 2}, + [13] = {.lex_state = 0, .external_lex_state = 2}, + [14] = {.lex_state = 0, .external_lex_state = 2}, + [15] = {.lex_state = 0, .external_lex_state = 2}, + [16] = {.lex_state = 0, .external_lex_state = 2}, + [17] = {.lex_state = 0, .external_lex_state = 2}, + [18] = {.lex_state = 0, .external_lex_state = 1}, + [19] = {.lex_state = 0, .external_lex_state = 1}, + [20] = {.lex_state = 0, .external_lex_state = 2}, + [21] = {.lex_state = 0, .external_lex_state = 1}, + [22] = {.lex_state = 0, .external_lex_state = 2}, + [23] = {.lex_state = 0, .external_lex_state = 1}, + [24] = {.lex_state = 0, .external_lex_state = 1}, + [25] = {.lex_state = 0, .external_lex_state = 2}, + [26] = {.lex_state = 0, .external_lex_state = 1}, + [27] = {.lex_state = 0, .external_lex_state = 2}, + [28] = {.lex_state = 0, .external_lex_state = 2}, + [29] = {.lex_state = 0, .external_lex_state = 1}, + [30] = {.lex_state = 0, .external_lex_state = 1}, + [31] = {.lex_state 
= 0, .external_lex_state = 2}, + [32] = {.lex_state = 0, .external_lex_state = 2}, + [33] = {.lex_state = 0, .external_lex_state = 2}, + [34] = {.lex_state = 0, .external_lex_state = 2}, + [35] = {.lex_state = 0, .external_lex_state = 2}, + [36] = {.lex_state = 0, .external_lex_state = 2}, + [37] = {.lex_state = 0, .external_lex_state = 2}, + [38] = {.lex_state = 0, .external_lex_state = 2}, + [39] = {.lex_state = 0, .external_lex_state = 2}, + [40] = {.lex_state = 0, .external_lex_state = 2}, + [41] = {.lex_state = 0, .external_lex_state = 2}, + [42] = {.lex_state = 0, .external_lex_state = 2}, + [43] = {.lex_state = 0, .external_lex_state = 2}, + [44] = {.lex_state = 1, .external_lex_state = 2}, + [45] = {.lex_state = 0, .external_lex_state = 2}, + [46] = {.lex_state = 1, .external_lex_state = 2}, + [47] = {.lex_state = 0, .external_lex_state = 2}, + [48] = {.lex_state = 0, .external_lex_state = 2}, + [49] = {.lex_state = 1, .external_lex_state = 2}, + [50] = {.lex_state = 1, .external_lex_state = 2}, + [51] = {.lex_state = 0, .external_lex_state = 2}, + [52] = {.lex_state = 0, .external_lex_state = 2}, + [53] = {.lex_state = 0, .external_lex_state = 2}, + [54] = {.lex_state = 1, .external_lex_state = 2}, + [55] = {.lex_state = 0, .external_lex_state = 2}, + [56] = {.lex_state = 1, .external_lex_state = 2}, + [57] = {.lex_state = 0, .external_lex_state = 2}, + [58] = {.lex_state = 0, .external_lex_state = 2}, + [59] = {.lex_state = 1, .external_lex_state = 2}, + [60] = {.lex_state = 1, .external_lex_state = 2}, + [61] = {.lex_state = 0, .external_lex_state = 2}, + [62] = {.lex_state = 0, .external_lex_state = 2}, + [63] = {.lex_state = 0, .external_lex_state = 2}, + [64] = {.lex_state = 0, .external_lex_state = 2}, + [65] = {.lex_state = 1, .external_lex_state = 2}, + [66] = {.lex_state = 0, .external_lex_state = 2}, + [67] = {.lex_state = 0, .external_lex_state = 2}, + [68] = {.lex_state = 0, .external_lex_state = 2}, + [69] = {.lex_state = 0, 
.external_lex_state = 2}, + [70] = {.lex_state = 0, .external_lex_state = 2}, + [71] = {.lex_state = 0, .external_lex_state = 2}, + [72] = {.lex_state = 0, .external_lex_state = 2}, + [73] = {.lex_state = 0, .external_lex_state = 2}, + [74] = {.lex_state = 0, .external_lex_state = 2}, + [75] = {.lex_state = 0, .external_lex_state = 2}, + [76] = {.lex_state = 0, .external_lex_state = 2}, + [77] = {.lex_state = 0, .external_lex_state = 2}, + [78] = {.lex_state = 0, .external_lex_state = 2}, + [79] = {.lex_state = 0, .external_lex_state = 2}, + [80] = {.lex_state = 0, .external_lex_state = 2}, + [81] = {.lex_state = 0, .external_lex_state = 2}, + [82] = {.lex_state = 0, .external_lex_state = 2}, + [83] = {.lex_state = 0, .external_lex_state = 2}, + [84] = {.lex_state = 1, .external_lex_state = 2}, + [85] = {.lex_state = 0, .external_lex_state = 2}, + [86] = {.lex_state = 0, .external_lex_state = 2}, + [87] = {.lex_state = 0, .external_lex_state = 2}, + [88] = {.lex_state = 0, .external_lex_state = 2}, + [89] = {.lex_state = 0, .external_lex_state = 2}, + [90] = {.lex_state = 0, .external_lex_state = 2}, + [91] = {.lex_state = 0, .external_lex_state = 2}, + [92] = {.lex_state = 0, .external_lex_state = 2}, + [93] = {.lex_state = 0, .external_lex_state = 2}, + [94] = {.lex_state = 0, .external_lex_state = 2}, + [95] = {.lex_state = 0, .external_lex_state = 2}, + [96] = {.lex_state = 0, .external_lex_state = 2}, + [97] = {.lex_state = 0, .external_lex_state = 2}, + [98] = {.lex_state = 0, .external_lex_state = 2}, + [99] = {.lex_state = 0, .external_lex_state = 2}, + [100] = {.lex_state = 0, .external_lex_state = 2}, + [101] = {.lex_state = 0, .external_lex_state = 2}, + [102] = {.lex_state = 0, .external_lex_state = 2}, + [103] = {.lex_state = 0, .external_lex_state = 2}, + [104] = {.lex_state = 0, .external_lex_state = 2}, + [105] = {.lex_state = 0, .external_lex_state = 2}, + [106] = {.lex_state = 0, .external_lex_state = 2}, + [107] = {.lex_state = 0, 
.external_lex_state = 2}, + [108] = {.lex_state = 0, .external_lex_state = 2}, + [109] = {.lex_state = 0, .external_lex_state = 2}, + [110] = {.lex_state = 0, .external_lex_state = 2}, + [111] = {.lex_state = 0, .external_lex_state = 2}, + [112] = {.lex_state = 0, .external_lex_state = 2}, + [113] = {.lex_state = 0, .external_lex_state = 2}, + [114] = {.lex_state = 0, .external_lex_state = 2}, + [115] = {.lex_state = 0, .external_lex_state = 2}, + [116] = {.lex_state = 0, .external_lex_state = 2}, + [117] = {.lex_state = 0, .external_lex_state = 2}, + [118] = {.lex_state = 0, .external_lex_state = 2}, + [119] = {.lex_state = 0, .external_lex_state = 2}, + [120] = {.lex_state = 0, .external_lex_state = 2}, + [121] = {.lex_state = 0, .external_lex_state = 2}, + [122] = {.lex_state = 0, .external_lex_state = 2}, + [123] = {.lex_state = 0, .external_lex_state = 2}, + [124] = {.lex_state = 0, .external_lex_state = 2}, + [125] = {.lex_state = 0, .external_lex_state = 2}, + [126] = {.lex_state = 0, .external_lex_state = 2}, + [127] = {.lex_state = 0, .external_lex_state = 2}, + [128] = {.lex_state = 0, .external_lex_state = 2}, + [129] = {.lex_state = 0, .external_lex_state = 2}, + [130] = {.lex_state = 1, .external_lex_state = 2}, + [131] = {.lex_state = 1, .external_lex_state = 2}, + [132] = {.lex_state = 1, .external_lex_state = 2}, + [133] = {.lex_state = 1, .external_lex_state = 2}, + [134] = {.lex_state = 1, .external_lex_state = 2}, + [135] = {.lex_state = 1, .external_lex_state = 2}, + [136] = {.lex_state = 1, .external_lex_state = 2}, + [137] = {.lex_state = 1, .external_lex_state = 2}, + [138] = {.lex_state = 1, .external_lex_state = 2}, + [139] = {.lex_state = 1, .external_lex_state = 2}, + [140] = {.lex_state = 1, .external_lex_state = 2}, + [141] = {.lex_state = 1, .external_lex_state = 2}, + [142] = {.lex_state = 1, .external_lex_state = 2}, + [143] = {.lex_state = 1, .external_lex_state = 2}, + [144] = {.lex_state = 1, .external_lex_state = 2}, + [145] = 
{.lex_state = 1, .external_lex_state = 2}, + [146] = {.lex_state = 1, .external_lex_state = 2}, + [147] = {.lex_state = 1, .external_lex_state = 2}, + [148] = {.lex_state = 1, .external_lex_state = 2}, + [149] = {.lex_state = 1, .external_lex_state = 2}, + [150] = {.lex_state = 1, .external_lex_state = 2}, + [151] = {.lex_state = 1, .external_lex_state = 2}, + [152] = {.lex_state = 1, .external_lex_state = 2}, + [153] = {.lex_state = 1, .external_lex_state = 2}, + [154] = {.lex_state = 1, .external_lex_state = 2}, + [155] = {.lex_state = 1, .external_lex_state = 2}, + [156] = {.lex_state = 1, .external_lex_state = 2}, + [157] = {.lex_state = 1, .external_lex_state = 2}, + [158] = {.lex_state = 1, .external_lex_state = 2}, + [159] = {.lex_state = 1, .external_lex_state = 2}, + [160] = {.lex_state = 1, .external_lex_state = 2}, + [161] = {.lex_state = 1, .external_lex_state = 2}, + [162] = {.lex_state = 1, .external_lex_state = 2}, + [163] = {.lex_state = 1, .external_lex_state = 2}, + [164] = {.lex_state = 1, .external_lex_state = 2}, + [165] = {.lex_state = 1, .external_lex_state = 2}, + [166] = {.lex_state = 1, .external_lex_state = 2}, + [167] = {.lex_state = 1, .external_lex_state = 2}, + [168] = {.lex_state = 1, .external_lex_state = 2}, + [169] = {.lex_state = 1, .external_lex_state = 2}, + [170] = {.lex_state = 1, .external_lex_state = 2}, + [171] = {.lex_state = 1, .external_lex_state = 2}, + [172] = {.lex_state = 1, .external_lex_state = 2}, + [173] = {.lex_state = 1, .external_lex_state = 2}, + [174] = {.lex_state = 1, .external_lex_state = 2}, + [175] = {.lex_state = 1, .external_lex_state = 2}, + [176] = {.lex_state = 1, .external_lex_state = 2}, + [177] = {.lex_state = 1, .external_lex_state = 2}, + [178] = {.lex_state = 1, .external_lex_state = 2}, + [179] = {.lex_state = 1, .external_lex_state = 2}, + [180] = {.lex_state = 0, .external_lex_state = 2}, + [181] = {.lex_state = 0, .external_lex_state = 2}, + [182] = {.lex_state = 0, .external_lex_state = 
2}, + [183] = {.lex_state = 0, .external_lex_state = 2}, + [184] = {.lex_state = 0, .external_lex_state = 2}, + [185] = {.lex_state = 0, .external_lex_state = 2}, + [186] = {.lex_state = 0, .external_lex_state = 2}, + [187] = {.lex_state = 0, .external_lex_state = 2}, + [188] = {.lex_state = 0, .external_lex_state = 1}, + [189] = {.lex_state = 0, .external_lex_state = 1}, + [190] = {.lex_state = 0, .external_lex_state = 1}, + [191] = {.lex_state = 0, .external_lex_state = 1}, + [192] = {.lex_state = 0, .external_lex_state = 1}, + [193] = {.lex_state = 0, .external_lex_state = 1}, + [194] = {.lex_state = 0, .external_lex_state = 1}, + [195] = {.lex_state = 0, .external_lex_state = 1}, + [196] = {.lex_state = 0, .external_lex_state = 1}, + [197] = {.lex_state = 0, .external_lex_state = 1}, + [198] = {.lex_state = 0, .external_lex_state = 1}, + [199] = {.lex_state = 0, .external_lex_state = 1}, + [200] = {.lex_state = 0, .external_lex_state = 1}, + [201] = {.lex_state = 0, .external_lex_state = 1}, + [202] = {.lex_state = 0, .external_lex_state = 1}, + [203] = {.lex_state = 0, .external_lex_state = 1}, + [204] = {.lex_state = 0, .external_lex_state = 1}, + [205] = {.lex_state = 0, .external_lex_state = 1}, + [206] = {.lex_state = 0, .external_lex_state = 1}, + [207] = {.lex_state = 0, .external_lex_state = 1}, + [208] = {.lex_state = 0, .external_lex_state = 1}, + [209] = {.lex_state = 0, .external_lex_state = 1}, + [210] = {.lex_state = 0, .external_lex_state = 1}, + [211] = {.lex_state = 0, .external_lex_state = 1}, + [212] = {.lex_state = 0, .external_lex_state = 1}, + [213] = {.lex_state = 0, .external_lex_state = 1}, + [214] = {.lex_state = 0, .external_lex_state = 1}, + [215] = {.lex_state = 0, .external_lex_state = 1}, + [216] = {.lex_state = 0, .external_lex_state = 1}, + [217] = {.lex_state = 0, .external_lex_state = 1}, + [218] = {.lex_state = 0, .external_lex_state = 1}, + [219] = {.lex_state = 0, .external_lex_state = 1}, + [220] = {.lex_state = 0, 
.external_lex_state = 1}, + [221] = {.lex_state = 0, .external_lex_state = 1}, + [222] = {.lex_state = 0, .external_lex_state = 1}, + [223] = {.lex_state = 0, .external_lex_state = 1}, + [224] = {.lex_state = 0, .external_lex_state = 1}, + [225] = {.lex_state = 0, .external_lex_state = 1}, + [226] = {.lex_state = 0, .external_lex_state = 1}, + [227] = {.lex_state = 0, .external_lex_state = 1}, + [228] = {.lex_state = 0, .external_lex_state = 1}, + [229] = {.lex_state = 0, .external_lex_state = 1}, + [230] = {.lex_state = 0, .external_lex_state = 1}, + [231] = {.lex_state = 0, .external_lex_state = 1}, + [232] = {.lex_state = 0, .external_lex_state = 1}, + [233] = {.lex_state = 0, .external_lex_state = 1}, + [234] = {.lex_state = 0, .external_lex_state = 1}, + [235] = {.lex_state = 0, .external_lex_state = 1}, + [236] = {.lex_state = 0, .external_lex_state = 1}, + [237] = {.lex_state = 0, .external_lex_state = 1}, + [238] = {.lex_state = 0, .external_lex_state = 1}, + [239] = {.lex_state = 0, .external_lex_state = 1}, + [240] = {.lex_state = 0, .external_lex_state = 1}, + [241] = {.lex_state = 0, .external_lex_state = 1}, + [242] = {.lex_state = 0, .external_lex_state = 1}, + [243] = {.lex_state = 0, .external_lex_state = 1}, + [244] = {.lex_state = 0, .external_lex_state = 1}, + [245] = {.lex_state = 0, .external_lex_state = 1}, + [246] = {.lex_state = 0, .external_lex_state = 1}, + [247] = {.lex_state = 0, .external_lex_state = 1}, + [248] = {.lex_state = 0, .external_lex_state = 1}, + [249] = {.lex_state = 0, .external_lex_state = 1}, + [250] = {.lex_state = 0, .external_lex_state = 1}, + [251] = {.lex_state = 0, .external_lex_state = 1}, + [252] = {.lex_state = 0, .external_lex_state = 1}, + [253] = {.lex_state = 0, .external_lex_state = 1}, + [254] = {.lex_state = 0, .external_lex_state = 1}, + [255] = {.lex_state = 0, .external_lex_state = 1}, + [256] = {.lex_state = 0, .external_lex_state = 1}, + [257] = {.lex_state = 0, .external_lex_state = 1}, + [258] = 
{.lex_state = 0, .external_lex_state = 1}, + [259] = {.lex_state = 0, .external_lex_state = 1}, + [260] = {.lex_state = 1, .external_lex_state = 2}, + [261] = {.lex_state = 0, .external_lex_state = 2}, + [262] = {.lex_state = 0, .external_lex_state = 2}, + [263] = {.lex_state = 0, .external_lex_state = 2}, + [264] = {.lex_state = 0, .external_lex_state = 2}, + [265] = {.lex_state = 0, .external_lex_state = 2}, + [266] = {.lex_state = 0, .external_lex_state = 2}, + [267] = {.lex_state = 0, .external_lex_state = 2}, + [268] = {.lex_state = 0, .external_lex_state = 2}, + [269] = {.lex_state = 1, .external_lex_state = 2}, + [270] = {.lex_state = 1, .external_lex_state = 2}, + [271] = {.lex_state = 0, .external_lex_state = 2}, + [272] = {.lex_state = 0, .external_lex_state = 2}, + [273] = {.lex_state = 0, .external_lex_state = 2}, + [274] = {.lex_state = 1, .external_lex_state = 2}, + [275] = {.lex_state = 1, .external_lex_state = 2}, + [276] = {.lex_state = 1, .external_lex_state = 2}, + [277] = {.lex_state = 1, .external_lex_state = 2}, + [278] = {.lex_state = 2, .external_lex_state = 2}, + [279] = {.lex_state = 2, .external_lex_state = 2}, + [280] = {.lex_state = 2, .external_lex_state = 1}, + [281] = {.lex_state = 0, .external_lex_state = 2}, + [282] = {.lex_state = 0, .external_lex_state = 2}, + [283] = {.lex_state = 0, .external_lex_state = 2}, + [284] = {.lex_state = 0, .external_lex_state = 2}, + [285] = {.lex_state = 0, .external_lex_state = 2}, + [286] = {.lex_state = 0, .external_lex_state = 2}, + [287] = {.lex_state = 0, .external_lex_state = 2}, + [288] = {.lex_state = 1, .external_lex_state = 2}, + [289] = {.lex_state = 0, .external_lex_state = 2}, + [290] = {.lex_state = 0, .external_lex_state = 2}, + [291] = {.lex_state = 0, .external_lex_state = 1}, + [292] = {.lex_state = 0, .external_lex_state = 2}, + [293] = {.lex_state = 0, .external_lex_state = 2}, + [294] = {.lex_state = 0, .external_lex_state = 2}, + [295] = {.lex_state = 0, .external_lex_state = 
2}, + [296] = {.lex_state = 0, .external_lex_state = 2}, + [297] = {.lex_state = 0, .external_lex_state = 2}, + [298] = {.lex_state = 0, .external_lex_state = 2}, + [299] = {.lex_state = 0, .external_lex_state = 2}, + [300] = {.lex_state = 0, .external_lex_state = 2}, + [301] = {.lex_state = 0, .external_lex_state = 2}, + [302] = {.lex_state = 0, .external_lex_state = 2}, + [303] = {.lex_state = 1, .external_lex_state = 2}, + [304] = {.lex_state = 0, .external_lex_state = 2}, + [305] = {.lex_state = 0, .external_lex_state = 2}, + [306] = {.lex_state = 0, .external_lex_state = 1}, + [307] = {.lex_state = 0, .external_lex_state = 2}, + [308] = {.lex_state = 0, .external_lex_state = 2}, + [309] = {.lex_state = 0, .external_lex_state = 2}, + [310] = {.lex_state = 0, .external_lex_state = 1}, + [311] = {.lex_state = 1, .external_lex_state = 2}, + [312] = {.lex_state = 1, .external_lex_state = 2}, + [313] = {.lex_state = 0, .external_lex_state = 2}, + [314] = {.lex_state = 0, .external_lex_state = 1}, + [315] = {.lex_state = 0, .external_lex_state = 2}, + [316] = {.lex_state = 0, .external_lex_state = 2}, + [317] = {.lex_state = 0, .external_lex_state = 2}, + [318] = {.lex_state = 0, .external_lex_state = 2}, + [319] = {.lex_state = 0, .external_lex_state = 2}, + [320] = {.lex_state = 0, .external_lex_state = 2}, + [321] = {.lex_state = 0, .external_lex_state = 2}, + [322] = {.lex_state = 0, .external_lex_state = 2}, + [323] = {.lex_state = 0, .external_lex_state = 2}, + [324] = {.lex_state = 0, .external_lex_state = 2}, + [325] = {.lex_state = 0, .external_lex_state = 2}, + [326] = {.lex_state = 0, .external_lex_state = 2}, + [327] = {.lex_state = 0, .external_lex_state = 1}, + [328] = {.lex_state = 0, .external_lex_state = 1}, + [329] = {.lex_state = 0, .external_lex_state = 1}, + [330] = {.lex_state = 0, .external_lex_state = 1}, + [331] = {.lex_state = 0, .external_lex_state = 1}, + [332] = {.lex_state = 0, .external_lex_state = 1}, + [333] = {.lex_state = 0, 
.external_lex_state = 1}, + [334] = {.lex_state = 0, .external_lex_state = 1}, + [335] = {.lex_state = 0, .external_lex_state = 1}, + [336] = {.lex_state = 0, .external_lex_state = 1}, + [337] = {.lex_state = 0, .external_lex_state = 2}, + [338] = {.lex_state = 0, .external_lex_state = 1}, + [339] = {.lex_state = 0, .external_lex_state = 1}, + [340] = {.lex_state = 0, .external_lex_state = 1}, + [341] = {.lex_state = 0, .external_lex_state = 1}, + [342] = {.lex_state = 0, .external_lex_state = 1}, + [343] = {.lex_state = 0, .external_lex_state = 1}, + [344] = {.lex_state = 0, .external_lex_state = 1}, + [345] = {.lex_state = 0, .external_lex_state = 1}, + [346] = {.lex_state = 0, .external_lex_state = 1}, + [347] = {.lex_state = 0, .external_lex_state = 1}, + [348] = {.lex_state = 0, .external_lex_state = 1}, + [349] = {.lex_state = 0, .external_lex_state = 1}, + [350] = {.lex_state = 0, .external_lex_state = 1}, + [351] = {.lex_state = 0, .external_lex_state = 1}, + [352] = {.lex_state = 0, .external_lex_state = 1}, + [353] = {.lex_state = 0, .external_lex_state = 1}, + [354] = {.lex_state = 2, .external_lex_state = 2}, + [355] = {.lex_state = 2, .external_lex_state = 2}, + [356] = {.lex_state = 2, .external_lex_state = 2}, + [357] = {.lex_state = 0, .external_lex_state = 2}, + [358] = {.lex_state = 2, .external_lex_state = 2}, + [359] = {.lex_state = 2, .external_lex_state = 2}, + [360] = {.lex_state = 2, .external_lex_state = 2}, + [361] = {.lex_state = 2, .external_lex_state = 2}, + [362] = {.lex_state = 2, .external_lex_state = 2}, + [363] = {.lex_state = 3, .external_lex_state = 2}, + [364] = {.lex_state = 3, .external_lex_state = 2}, + [365] = {.lex_state = 3, .external_lex_state = 2}, + [366] = {.lex_state = 2, .external_lex_state = 2}, + [367] = {.lex_state = 3, .external_lex_state = 2}, + [368] = {.lex_state = 0, .external_lex_state = 2}, + [369] = {.lex_state = 3, .external_lex_state = 2}, + [370] = {.lex_state = 0, .external_lex_state = 2}, + [371] = 
{.lex_state = 0, .external_lex_state = 2}, + [372] = {.lex_state = 0, .external_lex_state = 2}, + [373] = {.lex_state = 0, .external_lex_state = 2}, + [374] = {.lex_state = 0, .external_lex_state = 2}, + [375] = {.lex_state = 0, .external_lex_state = 2}, + [376] = {.lex_state = 2, .external_lex_state = 2}, + [377] = {.lex_state = 0, .external_lex_state = 2}, + [378] = {.lex_state = 0, .external_lex_state = 2}, + [379] = {.lex_state = 2, .external_lex_state = 2}, + [380] = {.lex_state = 0, .external_lex_state = 2}, + [381] = {.lex_state = 0, .external_lex_state = 2}, + [382] = {.lex_state = 0, .external_lex_state = 2}, + [383] = {.lex_state = 0, .external_lex_state = 2}, + [384] = {.lex_state = 2, .external_lex_state = 2}, + [385] = {.lex_state = 0, .external_lex_state = 2}, + [386] = {.lex_state = 0, .external_lex_state = 2}, + [387] = {.lex_state = 0, .external_lex_state = 2}, + [388] = {.lex_state = 0, .external_lex_state = 2}, + [389] = {.lex_state = 0, .external_lex_state = 2}, + [390] = {.lex_state = 2, .external_lex_state = 2}, + [391] = {.lex_state = 0, .external_lex_state = 2}, + [392] = {.lex_state = 0, .external_lex_state = 2}, + [393] = {.lex_state = 0, .external_lex_state = 2}, + [394] = {.lex_state = 0, .external_lex_state = 1}, + [395] = {.lex_state = 0, .external_lex_state = 2}, + [396] = {.lex_state = 0, .external_lex_state = 1}, + [397] = {.lex_state = 0, .external_lex_state = 1}, + [398] = {.lex_state = 0, .external_lex_state = 2}, + [399] = {.lex_state = 0, .external_lex_state = 2}, + [400] = {.lex_state = 0, .external_lex_state = 1}, + [401] = {.lex_state = 0, .external_lex_state = 1}, + [402] = {.lex_state = 0, .external_lex_state = 1}, + [403] = {.lex_state = 0, .external_lex_state = 1}, + [404] = {.lex_state = 0, .external_lex_state = 1}, + [405] = {.lex_state = 0, .external_lex_state = 1}, + [406] = {.lex_state = 0, .external_lex_state = 1}, + [407] = {.lex_state = 0, .external_lex_state = 2}, + [408] = {.lex_state = 0, .external_lex_state = 
1}, + [409] = {.lex_state = 0, .external_lex_state = 1}, + [410] = {.lex_state = 1, .external_lex_state = 2}, + [411] = {.lex_state = 0, .external_lex_state = 1}, + [412] = {.lex_state = 0, .external_lex_state = 2}, + [413] = {.lex_state = 0, .external_lex_state = 1}, + [414] = {.lex_state = 0, .external_lex_state = 2}, + [415] = {.lex_state = 0, .external_lex_state = 1}, + [416] = {.lex_state = 0, .external_lex_state = 2}, + [417] = {.lex_state = 1, .external_lex_state = 2}, + [418] = {.lex_state = 0, .external_lex_state = 2}, + [419] = {.lex_state = 0, .external_lex_state = 2}, + [420] = {.lex_state = 0, .external_lex_state = 2}, + [421] = {.lex_state = 0, .external_lex_state = 2}, + [422] = {.lex_state = 0, .external_lex_state = 2}, + [423] = {.lex_state = 0, .external_lex_state = 2}, + [424] = {.lex_state = 0, .external_lex_state = 2}, + [425] = {.lex_state = 0, .external_lex_state = 2}, + [426] = {.lex_state = 0, .external_lex_state = 2}, + [427] = {.lex_state = 0, .external_lex_state = 2}, + [428] = {.lex_state = 0, .external_lex_state = 2}, + [429] = {.lex_state = 0, .external_lex_state = 2}, + [430] = {.lex_state = 0, .external_lex_state = 2}, + [431] = {.lex_state = 0, .external_lex_state = 2}, + [432] = {.lex_state = 0, .external_lex_state = 2}, + [433] = {.lex_state = 0, .external_lex_state = 2}, + [434] = {.lex_state = 0, .external_lex_state = 2}, + [435] = {.lex_state = 0, .external_lex_state = 2}, + [436] = {.lex_state = 0, .external_lex_state = 2}, + [437] = {.lex_state = 0, .external_lex_state = 2}, + [438] = {.lex_state = 0, .external_lex_state = 2}, + [439] = {.lex_state = 0, .external_lex_state = 2}, + [440] = {.lex_state = 0, .external_lex_state = 2}, + [441] = {.lex_state = 0, .external_lex_state = 2}, + [442] = {.lex_state = 0, .external_lex_state = 2}, + [443] = {.lex_state = 0, .external_lex_state = 1}, + [444] = {.lex_state = 0, .external_lex_state = 2}, + [445] = {.lex_state = 0, .external_lex_state = 2}, + [446] = {.lex_state = 0, 
.external_lex_state = 2}, + [447] = {.lex_state = 0, .external_lex_state = 2}, + [448] = {.lex_state = 0, .external_lex_state = 2}, + [449] = {.lex_state = 0, .external_lex_state = 2}, + [450] = {.lex_state = 0, .external_lex_state = 2}, + [451] = {.lex_state = 0, .external_lex_state = 2}, + [452] = {.lex_state = 0, .external_lex_state = 2}, + [453] = {.lex_state = 0, .external_lex_state = 2}, + [454] = {.lex_state = 0, .external_lex_state = 2}, + [455] = {.lex_state = 0, .external_lex_state = 2}, + [456] = {.lex_state = 0, .external_lex_state = 2}, + [457] = {.lex_state = 0, .external_lex_state = 2}, + [458] = {.lex_state = 0, .external_lex_state = 2}, + [459] = {.lex_state = 0, .external_lex_state = 2}, + [460] = {.lex_state = 0, .external_lex_state = 2}, + [461] = {.lex_state = 0, .external_lex_state = 2}, + [462] = {.lex_state = 0, .external_lex_state = 2}, + [463] = {.lex_state = 0, .external_lex_state = 2}, + [464] = {.lex_state = 0, .external_lex_state = 2}, + [465] = {.lex_state = 0, .external_lex_state = 2}, + [466] = {.lex_state = 0, .external_lex_state = 2}, + [467] = {.lex_state = 2, .external_lex_state = 2}, + [468] = {.lex_state = 0, .external_lex_state = 2}, + [469] = {.lex_state = 0, .external_lex_state = 2}, + [470] = {.lex_state = 0, .external_lex_state = 2}, + [471] = {.lex_state = 0, .external_lex_state = 2}, + [472] = {.lex_state = 0, .external_lex_state = 2}, + [473] = {.lex_state = 0, .external_lex_state = 2}, + [474] = {.lex_state = 0, .external_lex_state = 2}, + [475] = {.lex_state = 0, .external_lex_state = 2}, + [476] = {.lex_state = 0, .external_lex_state = 2}, + [477] = {.lex_state = 0, .external_lex_state = 2}, + [478] = {.lex_state = 0, .external_lex_state = 2}, + [479] = {.lex_state = 0, .external_lex_state = 2}, + [480] = {.lex_state = 0, .external_lex_state = 2}, + [481] = {.lex_state = 0, .external_lex_state = 2}, + [482] = {.lex_state = 0, .external_lex_state = 2}, + [483] = {.lex_state = 0, .external_lex_state = 2}, + [484] = 
{.lex_state = 0, .external_lex_state = 2}, +}; + +static const uint16_t ts_parse_table[LARGE_STATE_COUNT][SYMBOL_COUNT] = { + [0] = { + [ts_builtin_sym_end] = ACTIONS(1), + [sym_identifier] = ACTIONS(1), + [anon_sym_EQ] = ACTIONS(1), + [anon_sym_output] = ACTIONS(1), + [sym_metadata_access] = ACTIONS(1), + [anon_sym_DOT] = ACTIONS(1), + [anon_sym_LPAREN] = ACTIONS(1), + [anon_sym_RPAREN] = ACTIONS(1), + [anon_sym_LBRACK] = ACTIONS(1), + [anon_sym_RBRACK] = ACTIONS(1), + [anon_sym_map] = ACTIONS(1), + [anon_sym_LBRACE] = ACTIONS(1), + [anon_sym_RBRACE] = ACTIONS(1), + [anon_sym_COMMA] = ACTIONS(1), + [anon_sym__] = ACTIONS(1), + [anon_sym_import] = ACTIONS(1), + [anon_sym_as] = ACTIONS(1), + [anon_sym_input] = ACTIONS(1), + [anon_sym_QMARK_DOT] = ACTIONS(1), + [anon_sym_QMARK_LBRACK] = ACTIONS(1), + [anon_sym_if] = ACTIONS(1), + [anon_sym_else] = ACTIONS(1), + [anon_sym_match] = ACTIONS(1), + [anon_sym_true] = ACTIONS(1), + [anon_sym_false] = ACTIONS(1), + [anon_sym_null] = ACTIONS(1), + [anon_sym_deleted] = ACTIONS(1), + [anon_sym_throw] = ACTIONS(1), + [anon_sym_void] = ACTIONS(1), + [anon_sym_COLON_COLON] = ACTIONS(1), + [anon_sym_COLON] = ACTIONS(1), + [anon_sym_BANG] = ACTIONS(1), + [anon_sym_DASH] = ACTIONS(1), + [anon_sym_PIPE_PIPE] = ACTIONS(1), + [anon_sym_AMP_AMP] = ACTIONS(1), + [anon_sym_PLUS] = ACTIONS(1), + [anon_sym_STAR] = ACTIONS(1), + [anon_sym_SLASH] = ACTIONS(1), + [anon_sym_PERCENT] = ACTIONS(1), + [anon_sym_EQ_EQ] = ACTIONS(1), + [anon_sym_BANG_EQ] = ACTIONS(1), + [anon_sym_GT] = ACTIONS(1), + [anon_sym_GT_EQ] = ACTIONS(1), + [anon_sym_LT] = ACTIONS(1), + [anon_sym_LT_EQ] = ACTIONS(1), + [anon_sym_EQ_GT] = ACTIONS(1), + [anon_sym_DASH_GT] = ACTIONS(1), + [sym_integer] = ACTIONS(1), + [sym_float] = ACTIONS(1), + [anon_sym_DQUOTE] = ACTIONS(1), + [sym_escape_sequence] = ACTIONS(1), + [sym_raw_string] = ACTIONS(1), + [sym_variable] = ACTIONS(1), + [sym_comment] = ACTIONS(3), + [sym__newline] = ACTIONS(1), + [sym__nl_skip] = ACTIONS(3), + }, + [1] 
= { + [sym_source_file] = STATE(435), + [sym__source_item] = STATE(328), + [sym__top_level_statement] = STATE(328), + [sym_assignment] = STATE(328), + [sym_assign_target] = STATE(467), + [sym_map_declaration] = STATE(328), + [sym_import_statement] = STATE(328), + [sym_if_statement] = STATE(328), + [sym_match_statement] = STATE(328), + [aux_sym_source_file_repeat1] = STATE(328), + [ts_builtin_sym_end] = ACTIONS(5), + [anon_sym_output] = ACTIONS(7), + [anon_sym_map] = ACTIONS(9), + [anon_sym_import] = ACTIONS(11), + [anon_sym_if] = ACTIONS(13), + [anon_sym_match] = ACTIONS(15), + [sym_variable] = ACTIONS(17), + [sym_comment] = ACTIONS(3), + [sym__newline] = ACTIONS(19), + [sym__nl_skip] = ACTIONS(3), + }, +}; + +static const uint16_t ts_small_parse_table[] = { + [0] = 24, + ACTIONS(21), 1, + sym_identifier, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(27), 1, + anon_sym_RPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(49), 1, + sym_integer, + ACTIONS(53), 1, + anon_sym_DQUOTE, + STATE(169), 1, + sym_qualified_name, + STATE(374), 1, + sym_named_argument, + STATE(433), 1, + sym_argument_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + STATE(484), 2, + sym_positional_arguments, + sym_named_arguments, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(51), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(283), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + 
sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [102] = 25, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(57), 1, + anon_sym_RBRACE, + ACTIONS(59), 1, + anon_sym__, + ACTIONS(61), 1, + sym_integer, + STATE(33), 1, + aux_sym_match_statement_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(264), 1, + sym_match_statement_case, + STATE(370), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + STATE(482), 1, + sym_match_statement_cases, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(63), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(288), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [206] = 25, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + 
anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(65), 1, + anon_sym_RBRACE, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(69), 1, + sym_integer, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(370), 1, + sym_object_entry, + STATE(455), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(71), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(303), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [310] = 24, + ACTIONS(21), 1, + sym_identifier, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(49), 1, + sym_integer, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(73), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(374), 1, + sym_named_argument, + STATE(448), 1, + sym_argument_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + STATE(484), 2, + sym_positional_arguments, 
+ sym_named_arguments, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(51), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(283), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [412] = 24, + ACTIONS(21), 1, + sym_identifier, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(49), 1, + sym_integer, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(75), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(374), 1, + sym_named_argument, + STATE(427), 1, + sym_argument_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + STATE(484), 2, + sym_positional_arguments, + sym_named_arguments, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(51), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(283), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + 
sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [514] = 24, + ACTIONS(21), 1, + sym_identifier, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(49), 1, + sym_integer, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(77), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(374), 1, + sym_named_argument, + STATE(451), 1, + sym_argument_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + STATE(484), 2, + sym_positional_arguments, + sym_named_arguments, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(51), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(283), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [616] = 26, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(79), 1, + anon_sym_RBRACE, + ACTIONS(81), 1, + sym_integer, + 
ACTIONS(85), 1, + sym_variable, + STATE(43), 1, + aux_sym_expr_body_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(370), 1, + sym_object_entry, + STATE(443), 1, + sym_var_assignment, + STATE(465), 1, + sym_expr_body, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(83), 2, + sym_float, + sym_raw_string, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(304), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [722] = 26, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(79), 1, + anon_sym_RBRACE, + ACTIONS(81), 1, + sym_integer, + ACTIONS(85), 1, + sym_variable, + STATE(43), 1, + aux_sym_expr_body_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(370), 1, + sym_object_entry, + STATE(443), 1, + sym_var_assignment, + STATE(469), 1, + sym_expr_body, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(83), 2, + sym_float, + sym_raw_string, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + 
anon_sym_void, + STATE(304), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [828] = 24, + ACTIONS(21), 1, + sym_identifier, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(49), 1, + sym_integer, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(87), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(374), 1, + sym_named_argument, + STATE(426), 1, + sym_argument_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + STATE(484), 2, + sym_positional_arguments, + sym_named_arguments, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(51), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(283), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [930] = 24, + ACTIONS(21), 1, + sym_identifier, + ACTIONS(23), 1, + 
anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(49), 1, + sym_integer, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(89), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(374), 1, + sym_named_argument, + STATE(478), 1, + sym_argument_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + STATE(484), 2, + sym_positional_arguments, + sym_named_arguments, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(51), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(283), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1032] = 25, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(69), 1, + sym_integer, + ACTIONS(91), 1, + anon_sym_RBRACE, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(370), 1, + 
sym_object_entry, + STATE(455), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(71), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(303), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1136] = 26, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(81), 1, + sym_integer, + ACTIONS(85), 1, + sym_variable, + ACTIONS(93), 1, + anon_sym_RBRACE, + STATE(43), 1, + aux_sym_expr_body_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(378), 1, + sym_object_entry, + STATE(443), 1, + sym_var_assignment, + STATE(460), 1, + sym_expr_body, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(83), 2, + sym_float, + sym_raw_string, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(304), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + 
sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1242] = 25, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(69), 1, + sym_integer, + ACTIONS(95), 1, + anon_sym_RBRACE, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(370), 1, + sym_object_entry, + STATE(455), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(71), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(303), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1346] = 25, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, 
+ anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(69), 1, + sym_integer, + ACTIONS(97), 1, + anon_sym_RBRACE, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(370), 1, + sym_object_entry, + STATE(422), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(71), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(303), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1450] = 24, + ACTIONS(21), 1, + sym_identifier, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(49), 1, + sym_integer, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(99), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(374), 1, + sym_named_argument, + STATE(476), 1, + sym_argument_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + 
anon_sym_BANG, + anon_sym_DASH, + STATE(484), 2, + sym_positional_arguments, + sym_named_arguments, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(51), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(283), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1552] = 24, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(101), 1, + anon_sym_RBRACE, + ACTIONS(103), 1, + sym_integer, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(439), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(105), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(311), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + 
sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1653] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(141), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(394), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1756] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + 
ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(143), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(406), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1859] = 24, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(59), 1, + anon_sym__, + ACTIONS(145), 1, + anon_sym_RBRACE, + ACTIONS(147), 1, + sym_integer, + STATE(33), 1, + aux_sym_match_statement_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(264), 1, + sym_match_statement_case, + STATE(432), 1, + sym_match_statement_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(149), 3, + 
sym_float, + sym_raw_string, + sym_variable, + STATE(312), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [1960] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(151), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(404), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2063] = 24, + ACTIONS(23), 1, + 
anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(59), 1, + anon_sym__, + ACTIONS(147), 1, + sym_integer, + ACTIONS(153), 1, + anon_sym_RBRACE, + STATE(33), 1, + aux_sym_match_statement_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(264), 1, + sym_match_statement_case, + STATE(436), 1, + sym_match_statement_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(149), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(312), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2164] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(155), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + 
STATE(202), 1, + sym_qualified_name, + STATE(409), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2267] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(157), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(396), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + 
sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2370] = 24, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(103), 1, + sym_integer, + ACTIONS(159), 1, + anon_sym_RBRACE, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(472), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(105), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(311), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2471] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 
1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(161), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(397), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2574] = 24, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(103), 1, + sym_integer, + ACTIONS(163), 1, + anon_sym_RBRACE, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(445), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + 
anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(105), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(311), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2675] = 24, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(103), 1, + sym_integer, + ACTIONS(165), 1, + anon_sym_RBRACE, + STATE(41), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(421), 1, + sym_match_cases, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(105), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(311), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + 
sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2776] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(167), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(400), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2879] = 25, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + 
anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(169), 1, + sym__newline, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(402), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [2982] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(401), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, 
+ sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3082] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(405), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3182] = 23, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 
1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(59), 1, + anon_sym__, + ACTIONS(147), 1, + sym_integer, + ACTIONS(171), 1, + anon_sym_RBRACE, + STATE(35), 1, + aux_sym_match_statement_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(264), 1, + sym_match_statement_case, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(149), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(312), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3280] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(411), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, 
+ ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3380] = 23, + ACTIONS(173), 1, + sym_identifier, + ACTIONS(176), 1, + anon_sym_output, + ACTIONS(179), 1, + anon_sym_LPAREN, + ACTIONS(182), 1, + anon_sym_LBRACK, + ACTIONS(185), 1, + anon_sym_LBRACE, + ACTIONS(188), 1, + anon_sym_RBRACE, + ACTIONS(190), 1, + anon_sym__, + ACTIONS(193), 1, + anon_sym_input, + ACTIONS(196), 1, + anon_sym_if, + ACTIONS(199), 1, + anon_sym_match, + ACTIONS(205), 1, + anon_sym_null, + ACTIONS(214), 1, + sym_integer, + ACTIONS(220), 1, + anon_sym_DQUOTE, + STATE(35), 1, + aux_sym_match_statement_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(264), 1, + sym_match_statement_case, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(202), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(211), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(208), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(217), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(312), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + 
sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3478] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(408), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3578] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + 
ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(415), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3678] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(413), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + 
sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3778] = 23, + ACTIONS(223), 1, + sym_identifier, + ACTIONS(226), 1, + anon_sym_output, + ACTIONS(229), 1, + anon_sym_LPAREN, + ACTIONS(232), 1, + anon_sym_LBRACK, + ACTIONS(235), 1, + anon_sym_LBRACE, + ACTIONS(238), 1, + anon_sym_RBRACE, + ACTIONS(240), 1, + anon_sym__, + ACTIONS(243), 1, + anon_sym_input, + ACTIONS(246), 1, + anon_sym_if, + ACTIONS(249), 1, + anon_sym_match, + ACTIONS(255), 1, + anon_sym_null, + ACTIONS(264), 1, + sym_integer, + ACTIONS(270), 1, + anon_sym_DQUOTE, + STATE(39), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(252), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(261), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(258), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(267), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(311), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3876] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + 
ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(409), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [3976] = 23, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(67), 1, + anon_sym__, + ACTIONS(103), 1, + sym_integer, + ACTIONS(273), 1, + anon_sym_RBRACE, + STATE(39), 1, + aux_sym_match_cases_repeat1, + STATE(169), 1, + sym_qualified_name, + STATE(268), 1, + sym_match_case, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + 
anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(105), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(311), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [4074] = 24, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(133), 1, + sym_integer, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + STATE(53), 1, + aux_sym_expr_body_repeat1, + STATE(202), 1, + sym_qualified_name, + STATE(403), 1, + sym_expr_body, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(135), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(291), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + 
sym_null, + sym_array, + sym_object, + [4174] = 23, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(85), 1, + sym_variable, + ACTIONS(275), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(267), 1, + aux_sym_expr_body_repeat1, + STATE(443), 1, + sym_var_assignment, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(277), 2, + sym_float, + sym_raw_string, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(318), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [4271] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(179), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(56), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(279), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(281), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + 
anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [4334] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(93), 1, + anon_sym_RBRACE, + ACTIONS(285), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(378), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(287), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(313), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [4429] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(139), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(65), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + 
ACTIONS(289), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(291), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [4492] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(293), 1, + sym_identifier, + ACTIONS(295), 1, + anon_sym__, + ACTIONS(297), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(373), 1, + sym_parameter, + STATE(464), 1, + sym_parameter_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(299), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(317), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, 
+ sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [4587] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(79), 1, + anon_sym_RBRACE, + ACTIONS(285), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(370), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(287), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(313), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [4682] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(145), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(59), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(301), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + 
sym_integer, + sym_identifier, + ACTIONS(303), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [4745] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(136), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(54), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(305), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(307), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [4808] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + 
ACTIONS(285), 1, + sym_integer, + ACTIONS(309), 1, + anon_sym_RBRACE, + STATE(169), 1, + sym_qualified_name, + STATE(416), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(287), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(313), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [4903] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(285), 1, + sym_integer, + ACTIONS(311), 1, + anon_sym_RBRACE, + STATE(169), 1, + sym_qualified_name, + STATE(416), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(287), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(313), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + 
sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [4998] = 23, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(139), 1, + sym_variable, + ACTIONS(313), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(267), 1, + aux_sym_expr_body_repeat1, + STATE(443), 1, + sym_var_assignment, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(315), 2, + sym_float, + sym_raw_string, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + STATE(306), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [5095] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(138), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(65), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(317), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + 
anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(319), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [5158] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(285), 1, + sym_integer, + ACTIONS(321), 1, + anon_sym_RBRACE, + STATE(169), 1, + sym_qualified_name, + STATE(416), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(287), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(313), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + 
sym_boolean, + sym_null, + sym_array, + sym_object, + [5253] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(136), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(65), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(305), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(307), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [5316] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(285), 1, + sym_integer, + ACTIONS(323), 1, + anon_sym_RBRACE, + STATE(169), 1, + sym_qualified_name, + STATE(416), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(287), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(313), 22, + 
sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [5411] = 22, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(293), 1, + sym_identifier, + ACTIONS(295), 1, + anon_sym__, + ACTIONS(325), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(373), 1, + sym_parameter, + STATE(464), 1, + sym_parameter_list, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(327), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(316), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [5506] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(137), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(65), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(329), 
17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(331), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [5569] = 6, + ACTIONS(283), 1, + anon_sym_else, + STATE(137), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(46), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(329), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(331), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [5632] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + 
anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(333), 1, + anon_sym_RBRACK, + ACTIONS(335), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(337), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(285), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [5724] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(339), 1, + anon_sym_RBRACK, + ACTIONS(341), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(343), 3, + sym_float, + sym_raw_string, + 
sym_variable, + STATE(284), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [5816] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(345), 1, + anon_sym_RBRACK, + ACTIONS(347), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(349), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(282), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [5908] = 21, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + 
anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(351), 1, + anon_sym_LBRACE, + ACTIONS(353), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(213), 1, + sym_lambda_block, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(355), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(225), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6000] = 5, + ACTIONS(361), 1, + anon_sym_else, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(65), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(357), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(359), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + 
anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [6060] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(335), 1, + sym_integer, + ACTIONS(364), 1, + anon_sym_RBRACK, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(337), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(285), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6152] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(366), 1, + anon_sym_LBRACE, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(374), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(266), 1, + sym_match_block, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + 
sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(376), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(185), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6244] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(335), 1, + sym_integer, + ACTIONS(378), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(337), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(285), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + 
sym_array, + sym_object, + [6336] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(380), 1, + anon_sym_LBRACE, + ACTIONS(382), 1, + sym_integer, + STATE(149), 1, + sym_lambda_block, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(384), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(182), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6428] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(380), 1, + anon_sym_LBRACE, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(392), 1, + sym_integer, + STATE(149), 1, + sym_lambda_block, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + 
ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(394), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(302), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6520] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(335), 1, + sym_integer, + ACTIONS(396), 1, + anon_sym_RBRACK, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(337), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(285), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6612] = 21, + ACTIONS(23), 1, + 
anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(335), 1, + sym_integer, + ACTIONS(398), 1, + anon_sym_RBRACK, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(337), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(285), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6704] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(285), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(416), 1, + sym_object_entry, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, 
+ anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(287), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(313), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6796] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(335), 1, + sym_integer, + ACTIONS(400), 1, + anon_sym_RPAREN, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(337), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(285), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6888] = 21, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + 
anon_sym_LBRACK, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(380), 1, + anon_sym_LBRACE, + ACTIONS(402), 1, + sym_integer, + STATE(149), 1, + sym_lambda_block, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(404), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(270), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [6980] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(406), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(408), 3, + sym_float, + sym_raw_string, + 
sym_variable, + STATE(155), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7069] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(410), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(412), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(326), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7158] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + 
anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(414), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(416), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(324), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7247] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(418), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(420), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(292), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + 
sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7336] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(422), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(424), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(320), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7425] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(426), 1, + anon_sym_LBRACE, + 
ACTIONS(428), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(430), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(293), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7514] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(432), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(434), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(290), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + 
sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7603] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(436), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(438), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(309), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7692] = 6, + ACTIONS(444), 1, + anon_sym_LPAREN, + ACTIONS(446), 1, + anon_sym_COLON_COLON, + ACTIONS(448), 1, + anon_sym_DASH_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 19, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_COLON, + anon_sym_BANG, + anon_sym_DASH, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(442), 24, + anon_sym_DOT, + anon_sym_RPAREN, + 
anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [7753] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(450), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(452), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(289), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7842] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, 
+ anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(335), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(337), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(285), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [7931] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(454), 1, + anon_sym_LBRACE, + ACTIONS(456), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(458), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(296), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + 
sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8020] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(460), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(462), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(314), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8109] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(464), 1, + sym_integer, + STATE(202), 1, + 
sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(466), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(258), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8198] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(468), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(470), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(310), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + 
sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8287] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(472), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(474), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(315), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8376] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(476), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + 
anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(478), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(257), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8465] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(480), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(482), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(219), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8554] = 20, + ACTIONS(107), 1, + sym_identifier, 
+ ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(484), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(486), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(220), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8643] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(488), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + 
ACTIONS(490), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(221), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8732] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(492), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(494), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(222), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8821] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + 
anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(496), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(498), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(223), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8910] = 20, + ACTIONS(107), 1, + sym_identifier, + ACTIONS(109), 1, + anon_sym_output, + ACTIONS(111), 1, + anon_sym_LPAREN, + ACTIONS(113), 1, + anon_sym_LBRACK, + ACTIONS(115), 1, + anon_sym_LBRACE, + ACTIONS(117), 1, + anon_sym__, + ACTIONS(119), 1, + anon_sym_input, + ACTIONS(121), 1, + anon_sym_if, + ACTIONS(123), 1, + anon_sym_match, + ACTIONS(127), 1, + anon_sym_null, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(500), 1, + sym_integer, + STATE(202), 1, + sym_qualified_name, + STATE(446), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(125), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(131), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(129), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(502), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(224), 22, + sym__expression, + sym__primary, + sym_input, + 
sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [8999] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(504), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(506), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(325), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9088] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + 
ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(508), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(510), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(321), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9177] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(428), 1, + sym_integer, + ACTIONS(512), 1, + anon_sym_LBRACE, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(430), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(293), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + 
sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9266] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(406), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(408), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(155), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9355] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(514), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + 
sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(516), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(319), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9444] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(518), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(520), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(323), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + 
sym_null, + sym_array, + sym_object, + [9533] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(522), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(524), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(297), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9622] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(526), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 
3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(528), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(298), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9711] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(530), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(532), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(299), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9800] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + 
anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(534), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(536), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(148), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9889] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(538), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(540), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(300), 22, + sym__expression, + sym__primary, + 
sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [9978] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(542), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(544), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(301), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10067] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + 
sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(546), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(548), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(184), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10156] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(550), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(552), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(186), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, 
+ sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10245] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(554), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(556), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(187), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10334] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(534), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + 
sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(536), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(148), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10423] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(558), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(560), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(181), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + 
sym_boolean, + sym_null, + sym_array, + sym_object, + [10512] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(562), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(564), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(183), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10601] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(428), 1, + sym_integer, + ACTIONS(566), 1, + anon_sym_LBRACE, + STATE(169), 1, + sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + 
anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(430), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(293), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10690] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(406), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(408), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(155), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10779] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + 
anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(568), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(570), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(274), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10868] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(572), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(574), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(275), 22, + 
sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [10957] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(576), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(578), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(276), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [11046] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + 
anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(534), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(536), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(148), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [11135] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(580), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(582), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(269), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + 
sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [11224] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(33), 1, + anon_sym__, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(39), 1, + anon_sym_match, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(584), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(480), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(47), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(586), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(277), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [11313] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(386), 1, + anon_sym__, + ACTIONS(388), 1, + anon_sym_match, + ACTIONS(588), 1, + anon_sym_LBRACE, + ACTIONS(590), 1, + sym_integer, + STATE(169), 1, + 
sym_qualified_name, + STATE(475), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(390), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(592), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(305), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [11402] = 20, + ACTIONS(23), 1, + anon_sym_output, + ACTIONS(25), 1, + anon_sym_LPAREN, + ACTIONS(29), 1, + anon_sym_LBRACK, + ACTIONS(31), 1, + anon_sym_LBRACE, + ACTIONS(35), 1, + anon_sym_input, + ACTIONS(37), 1, + anon_sym_if, + ACTIONS(43), 1, + anon_sym_null, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(55), 1, + sym_identifier, + ACTIONS(368), 1, + anon_sym__, + ACTIONS(370), 1, + anon_sym_match, + ACTIONS(594), 1, + sym_integer, + STATE(169), 1, + sym_qualified_name, + STATE(459), 1, + sym__lambda_params, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(41), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(372), 2, + anon_sym_BANG, + anon_sym_DASH, + ACTIONS(45), 3, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + ACTIONS(596), 3, + sym_float, + sym_raw_string, + sym_variable, + STATE(322), 22, + sym__expression, + sym__primary, + sym_input, + sym_output, + sym_field_access, + sym_null_safe_field_access, + sym_method_call, + sym_null_safe_method_call, + sym_index, + sym_null_safe_index, + sym_call_expression, + sym_unary_expression, + sym_binary_expression, + sym_if_expression, + sym_match_expression, + sym_lambda_expression, + 
sym_parenthesized_expression, + sym_string, + sym_boolean, + sym_null, + sym_array, + sym_object, + [11491] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(598), 18, + anon_sym_EQ, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(600), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11545] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(602), 18, + anon_sym_EQ, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(604), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11599] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(606), 18, + 
anon_sym_EQ, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(608), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11653] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(610), 18, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(612), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11707] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(614), 18, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, 
+ anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(616), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11761] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(618), 18, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(620), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11815] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(622), 18, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(624), 27, + anon_sym_DOT, + anon_sym_LPAREN, + 
anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11869] = 4, + ACTIONS(628), 1, + sym_metadata_access, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(626), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(630), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11925] = 4, + ACTIONS(634), 1, + sym_metadata_access, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(632), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(636), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, 
+ anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [11981] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(317), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(319), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12034] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(289), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(291), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + 
anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12087] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(638), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(640), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12140] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(642), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(644), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12193] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(646), 17, + 
anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(648), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12246] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(650), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(652), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12299] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(654), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + 
anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(656), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12352] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(658), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(660), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12405] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(662), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(664), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + 
anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12458] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(329), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(331), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12511] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(666), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(668), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + 
anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12564] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(670), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(672), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12617] = 7, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(678), 23, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, 
+ sym_raw_string, + sym_variable, + [12678] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(686), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(688), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12731] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(690), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(692), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12784] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(694), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + 
anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(696), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12837] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(698), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(700), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12890] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(702), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(704), 27, + 
anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12943] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(706), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(708), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [12996] = 7, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(710), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(712), 23, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_RBRACK, + 
anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13057] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(714), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(716), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13110] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(718), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(720), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + 
anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13163] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(722), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(724), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13216] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(726), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(728), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13269] = 3, + ACTIONS(3), 2, + sym__nl_skip, + 
sym_comment, + ACTIONS(730), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(732), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13322] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(734), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(736), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13375] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(738), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + 
anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(740), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13428] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(742), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(744), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13481] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(746), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(748), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + 
anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13534] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(750), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(752), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13587] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(754), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(756), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + 
anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13640] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(758), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(760), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13693] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(762), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(764), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + 
[13746] = 4, + ACTIONS(444), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(442), 26, + anon_sym_DOT, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13801] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(766), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(768), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13854] = 4, + ACTIONS(774), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(770), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + 
anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(772), 26, + anon_sym_DOT, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13909] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(776), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(778), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [13962] = 4, + ACTIONS(784), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(780), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + 
sym_integer, + sym_identifier, + ACTIONS(782), 26, + anon_sym_DOT, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14017] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(786), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(788), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14070] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(790), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(792), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + 
anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14123] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(794), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(796), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14176] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(798), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(800), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + 
anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14229] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(802), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(804), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14282] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(305), 17, + anon_sym_output, + anon_sym__, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(307), 27, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_RBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14335] = 6, + ACTIONS(702), 1, + anon_sym_as, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + 
ACTIONS(813), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(811), 7, + anon_sym_LPAREN, + anon_sym_RBRACE, + anon_sym_COMMA, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(806), 14, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + sym_integer, + sym_identifier, + ACTIONS(808), 16, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [14390] = 11, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 12, + anon_sym_LPAREN, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(674), 14, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + sym_integer, + sym_identifier, + [14454] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(824), 1, + anon_sym_PIPE_PIPE, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 
2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(688), 8, + anon_sym_LPAREN, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(686), 14, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + sym_integer, + sym_identifier, + [14524] = 9, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 14, + anon_sym_LPAREN, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(674), 16, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + [14584] = 8, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(674), 16, + anon_sym_output, + 
anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + anon_sym_GT, + anon_sym_LT, + sym_integer, + sym_identifier, + ACTIONS(678), 16, + anon_sym_LPAREN, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + [14642] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(824), 1, + anon_sym_PIPE_PIPE, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(832), 8, + anon_sym_LPAREN, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(830), 14, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + sym_integer, + sym_identifier, + [14712] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + 
ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 9, + anon_sym_LPAREN, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_PIPE_PIPE, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(674), 14, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + sym_integer, + sym_identifier, + [14780] = 12, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 10, + anon_sym_LPAREN, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(674), 14, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + anon_sym_BANG, + sym_integer, + sym_identifier, + [14846] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(248), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(289), 2, + anon_sym_GT, + anon_sym_LT, + STATE(197), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(291), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + 
anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [14891] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(244), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(305), 2, + anon_sym_GT, + anon_sym_LT, + STATE(195), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(307), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [14936] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(239), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(279), 2, + anon_sym_GT, + anon_sym_LT, + STATE(192), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(281), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [14981] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(243), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(301), 2, + anon_sym_GT, + anon_sym_LT, + STATE(194), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(303), 24, 
+ sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15026] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(244), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(305), 2, + anon_sym_GT, + anon_sym_LT, + STATE(197), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(307), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15071] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(246), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(329), 2, + anon_sym_GT, + anon_sym_LT, + STATE(188), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(331), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15116] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(246), 1, + sym_else_clause, + ACTIONS(3), 2, + 
sym__nl_skip, + sym_comment, + ACTIONS(329), 2, + anon_sym_GT, + anon_sym_LT, + STATE(197), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(331), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15161] = 6, + ACTIONS(834), 1, + anon_sym_else, + STATE(247), 1, + sym_else_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(317), 2, + anon_sym_GT, + anon_sym_LT, + STATE(197), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(319), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15206] = 6, + ACTIONS(448), 1, + anon_sym_DASH_GT, + ACTIONS(836), 1, + anon_sym_LPAREN, + ACTIONS(838), 1, + anon_sym_COLON_COLON, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 3, + anon_sym_DASH, + anon_sym_GT, + anon_sym_LT, + ACTIONS(442), 23, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + 
anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15250] = 5, + ACTIONS(840), 1, + anon_sym_else, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(357), 2, + anon_sym_GT, + anon_sym_LT, + STATE(197), 2, + sym_else_if_clause, + aux_sym_if_expression_repeat1, + ACTIONS(359), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15292] = 4, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(702), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(843), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + ACTIONS(704), 17, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_as, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [15331] = 5, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(704), 2, + anon_sym_LBRACE, + anon_sym_as, + ACTIONS(813), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(811), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + ACTIONS(808), 15, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + 
anon_sym_GT_EQ, + anon_sym_LT_EQ, + [15372] = 4, + ACTIONS(845), 1, + sym_metadata_access, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(626), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(630), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15410] = 4, + ACTIONS(847), 1, + sym_metadata_access, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(632), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(636), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15448] = 4, + ACTIONS(836), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(442), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15486] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(598), 2, + anon_sym_GT, + 
anon_sym_LT, + ACTIONS(600), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15522] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(602), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(604), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15558] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(606), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(608), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15594] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(762), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(764), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + 
anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15630] = 4, + ACTIONS(849), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(770), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(772), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15668] = 4, + ACTIONS(851), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(780), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(782), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15706] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(610), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(612), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + 
anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15742] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(614), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(616), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15778] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(618), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(620), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15814] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(622), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(624), 25, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15850] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + 
ACTIONS(686), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(688), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15885] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(718), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(720), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15920] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(722), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(724), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15955] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(742), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(744), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + 
anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [15990] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(754), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(756), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16025] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(726), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(728), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16060] = 8, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 17, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + 
anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16105] = 12, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(865), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 10, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + anon_sym_PIPE_PIPE, + sym_variable, + [16158] = 11, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 11, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + sym_variable, + [16209] = 7, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(678), 20, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + 
anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16252] = 10, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 13, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + sym_variable, + [16301] = 9, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 15, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16348] = 13, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(865), 1, + anon_sym_AMP_AMP, + ACTIONS(873), 1, + anon_sym_PIPE_PIPE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + 
ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(688), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [16403] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(690), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(692), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16438] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(694), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(696), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16473] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(730), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(732), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + 
anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16508] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(738), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(740), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16543] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(746), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(748), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16578] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(750), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(752), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16613] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(758), 2, + anon_sym_GT, + 
anon_sym_LT, + ACTIONS(760), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16648] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(734), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(736), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16683] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(776), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(778), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16718] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(786), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(788), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + 
anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16753] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(790), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(792), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16788] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(794), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(796), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16823] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(798), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(800), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16858] = 3, + ACTIONS(3), 2, + 
sym__nl_skip, + sym_comment, + ACTIONS(305), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(307), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16893] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(666), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(668), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16928] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(670), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(672), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16963] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(662), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(664), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + 
anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [16998] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(329), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(331), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17033] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(317), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(319), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17068] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(802), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(804), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + 
anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17103] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(289), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(291), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17138] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(638), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(640), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17173] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(642), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(644), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17208] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(646), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(648), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + 
anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17243] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(650), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(652), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17278] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(654), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(656), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17313] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(658), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(660), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, 
+ anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17348] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(714), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(716), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17383] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(698), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(700), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17418] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(702), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(704), 24, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17453] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(706), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(708), 24, + sym__newline, + 
ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17488] = 7, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(710), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(712), 20, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17531] = 13, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(865), 1, + anon_sym_AMP_AMP, + ACTIONS(873), 1, + anon_sym_PIPE_PIPE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(875), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [17586] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(766), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(768), 24, + 
sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_if, + anon_sym_match, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + sym_variable, + [17621] = 5, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(704), 2, + anon_sym_LBRACE, + anon_sym_as, + ACTIONS(813), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(811), 6, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_EQ_GT, + ACTIONS(808), 15, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [17659] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(879), 11, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(877), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17692] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(883), 11, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(881), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + 
anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17725] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(887), 11, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(885), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17758] = 4, + ACTIONS(893), 1, + anon_sym_COMMA, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(891), 10, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(889), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17793] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(897), 11, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(895), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17826] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(832), 11, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(830), 13, + 
anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17859] = 6, + ACTIONS(903), 1, + sym_variable, + STATE(267), 1, + aux_sym_expr_body_repeat1, + STATE(443), 1, + sym_var_assignment, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(901), 8, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + ACTIONS(899), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17898] = 4, + ACTIONS(910), 1, + anon_sym_COMMA, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(908), 10, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(906), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [17933] = 10, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 10, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + 
anon_sym_EQ_GT, + [17979] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(688), 6, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_EQ_GT, + [18031] = 7, + ACTIONS(444), 1, + anon_sym_LPAREN, + ACTIONS(446), 1, + anon_sym_COLON_COLON, + ACTIONS(448), 1, + anon_sym_DASH_GT, + ACTIONS(926), 1, + anon_sym_COLON, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 3, + anon_sym_DASH, + anon_sym_GT, + anon_sym_LT, + ACTIONS(442), 16, + anon_sym_DOT, + anon_sym_RPAREN, + anon_sym_LBRACK, + anon_sym_COMMA, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [18071] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(188), 10, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_RBRACE, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(928), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [18103] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(238), 10, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + 
anon_sym_RBRACE, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(930), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [18135] = 8, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 14, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + [18177] = 12, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 7, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_PIPE_PIPE, + anon_sym_EQ_GT, + [18227] = 11, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + 
anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 8, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_GT, + [18275] = 9, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 12, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_RBRACE, + anon_sym_COMMA, + anon_sym_COLON, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + anon_sym_EQ_GT, + [18319] = 9, + ACTIONS(444), 1, + anon_sym_LPAREN, + ACTIONS(446), 1, + anon_sym_COLON_COLON, + ACTIONS(448), 1, + anon_sym_DASH_GT, + ACTIONS(932), 1, + anon_sym_EQ, + ACTIONS(934), 1, + anon_sym_RPAREN, + ACTIONS(937), 1, + anon_sym_COMMA, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 3, + anon_sym_DASH, + anon_sym_GT, + anon_sym_LT, + ACTIONS(442), 14, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [18363] = 7, + ACTIONS(939), 1, + anon_sym_EQ, + ACTIONS(941), 1, + anon_sym_DOT, + ACTIONS(944), 1, + anon_sym_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 2, + anon_sym_GT, + anon_sym_LT, + STATE(356), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + ACTIONS(442), 15, + anon_sym_RBRACE, + 
anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_COLON, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [18402] = 7, + ACTIONS(939), 1, + anon_sym_EQ, + ACTIONS(941), 1, + anon_sym_DOT, + ACTIONS(944), 1, + anon_sym_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(440), 2, + anon_sym_GT, + anon_sym_LT, + STATE(356), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + ACTIONS(442), 15, + sym__newline, + anon_sym_RBRACE, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [18441] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(901), 9, + anon_sym_LPAREN, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_BANG, + anon_sym_DASH, + sym_float, + anon_sym_DQUOTE, + sym_raw_string, + sym_variable, + ACTIONS(899), 13, + anon_sym_output, + anon_sym__, + anon_sym_input, + anon_sym_if, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + sym_integer, + sym_identifier, + [18472] = 15, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(947), 1, + anon_sym_RBRACK, + ACTIONS(949), 1, + anon_sym_COMMA, + STATE(381), 1, + aux_sym_positional_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + 
anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18525] = 15, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(951), 1, + anon_sym_RPAREN, + ACTIONS(953), 1, + anon_sym_COMMA, + STATE(387), 1, + aux_sym_positional_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18578] = 15, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(955), 1, + anon_sym_RBRACK, + ACTIONS(957), 1, + anon_sym_COMMA, + STATE(371), 1, + aux_sym_positional_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18631] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + 
ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(959), 3, + anon_sym_RPAREN, + anon_sym_RBRACK, + anon_sym_COMMA, + [18680] = 5, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(961), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(366), 3, + sym__field_name, + sym__word, + sym_string, + ACTIONS(963), 14, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + [18712] = 3, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(813), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(808), 17, + anon_sym_DOT, + anon_sym_LBRACK, + anon_sym_LBRACE, + anon_sym_as, + anon_sym_QMARK_DOT, + anon_sym_QMARK_LBRACK, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [18740] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(965), 1, + anon_sym_COLON, + ACTIONS(967), 1, + anon_sym_EQ_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18790] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(824), 1, + anon_sym_PIPE_PIPE, 
+ ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(969), 1, + anon_sym_LBRACE, + STATE(339), 1, + sym_statement_block, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18840] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(971), 2, + anon_sym_RPAREN, + anon_sym_COMMA, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18888] = 13, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(865), 1, + anon_sym_AMP_AMP, + ACTIONS(873), 1, + anon_sym_PIPE_PIPE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(973), 2, + sym__newline, + anon_sym_RBRACE, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18936] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + 
ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(975), 2, + anon_sym_RBRACE, + anon_sym_COMMA, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [18984] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(977), 1, + anon_sym_LBRACE, + ACTIONS(979), 1, + anon_sym_as, + ACTIONS(983), 1, + anon_sym_PIPE_PIPE, + ACTIONS(985), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(989), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(991), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(993), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19034] = 5, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(995), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(171), 3, + sym__field_name, + sym__word, + sym_string, + ACTIONS(963), 14, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + [19066] = 5, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(997), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(173), 3, + sym__field_name, + sym__word, + sym_string, + ACTIONS(963), 14, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + [19098] = 14, + 
ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(983), 1, + anon_sym_PIPE_PIPE, + ACTIONS(985), 1, + anon_sym_AMP_AMP, + ACTIONS(999), 1, + anon_sym_LBRACE, + ACTIONS(1001), 1, + anon_sym_as, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(989), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(991), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(993), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19148] = 8, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 10, + anon_sym_LBRACE, + anon_sym_as, + anon_sym_DASH, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_PLUS, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [19186] = 12, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(985), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(989), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(991), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(993), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(678), 3, + anon_sym_LBRACE, + anon_sym_as, + anon_sym_PIPE_PIPE, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19232] = 11, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + 
ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(989), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(991), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(993), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 4, + anon_sym_LBRACE, + anon_sym_as, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + [19276] = 10, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(991), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(993), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 6, + anon_sym_LBRACE, + anon_sym_as, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + [19318] = 9, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(674), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + ACTIONS(678), 8, + anon_sym_LBRACE, + anon_sym_as, + anon_sym_PIPE_PIPE, + anon_sym_AMP_AMP, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + [19358] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(983), 1, + anon_sym_PIPE_PIPE, + ACTIONS(985), 1, + anon_sym_AMP_AMP, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(688), 2, + anon_sym_LBRACE, + anon_sym_as, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + 
ACTIONS(989), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(991), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(993), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19406] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(965), 1, + anon_sym_COLON, + ACTIONS(1003), 1, + anon_sym_EQ_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19456] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(965), 1, + anon_sym_COLON, + ACTIONS(973), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19506] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(983), 1, + anon_sym_PIPE_PIPE, + ACTIONS(985), 1, + anon_sym_AMP_AMP, + ACTIONS(1005), 1, + anon_sym_LBRACE, + ACTIONS(1007), 1, + anon_sym_as, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(981), 2, + anon_sym_DASH, + anon_sym_PLUS, + 
ACTIONS(989), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(991), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(993), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(987), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19556] = 13, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(865), 1, + anon_sym_AMP_AMP, + ACTIONS(873), 1, + anon_sym_PIPE_PIPE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(1009), 2, + sym__newline, + anon_sym_RBRACE, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19604] = 5, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(1011), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(207), 3, + sym__field_name, + sym__word, + sym_string, + ACTIONS(1013), 14, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + [19636] = 5, + ACTIONS(137), 1, + anon_sym_DQUOTE, + ACTIONS(1015), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(208), 3, + sym__field_name, + sym__word, + sym_string, + ACTIONS(1013), 14, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_as, + anon_sym_input, + anon_sym_if, + anon_sym_else, + anon_sym_match, + anon_sym_true, + anon_sym_false, + anon_sym_null, + anon_sym_deleted, + anon_sym_throw, + anon_sym_void, + [19668] = 14, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + 
ACTIONS(824), 1, + anon_sym_PIPE_PIPE, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(969), 1, + anon_sym_LBRACE, + STATE(330), 1, + sym_statement_block, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19718] = 13, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(865), 1, + anon_sym_AMP_AMP, + ACTIONS(873), 1, + anon_sym_PIPE_PIPE, + ACTIONS(1017), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19765] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1003), 1, + anon_sym_EQ_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19812] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, 
+ ACTIONS(967), 1, + anon_sym_EQ_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19859] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(965), 1, + anon_sym_COLON, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19906] = 13, + ACTIONS(853), 1, + anon_sym_DOT, + ACTIONS(855), 1, + anon_sym_LBRACK, + ACTIONS(857), 1, + anon_sym_QMARK_DOT, + ACTIONS(859), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(865), 1, + anon_sym_AMP_AMP, + ACTIONS(873), 1, + anon_sym_PIPE_PIPE, + ACTIONS(1019), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(863), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(867), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(869), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(871), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(861), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [19953] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(824), 1, + anon_sym_PIPE_PIPE, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(1021), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + 
anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20000] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1023), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20047] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1025), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20094] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1009), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + 
anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20141] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1027), 1, + anon_sym_RBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20188] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(824), 1, + anon_sym_PIPE_PIPE, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(1029), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20235] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1031), 1, + anon_sym_RBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + 
anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20282] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(824), 1, + anon_sym_PIPE_PIPE, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(1033), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20329] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1035), 1, + anon_sym_RBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20376] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1037), 1, + anon_sym_RBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20423] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + 
ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(920), 1, + anon_sym_PIPE_PIPE, + ACTIONS(922), 1, + anon_sym_AMP_AMP, + ACTIONS(1039), 1, + anon_sym_RBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(912), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(916), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(918), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(924), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(914), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20470] = 13, + ACTIONS(676), 1, + anon_sym_DOT, + ACTIONS(680), 1, + anon_sym_LBRACK, + ACTIONS(682), 1, + anon_sym_QMARK_DOT, + ACTIONS(684), 1, + anon_sym_QMARK_LBRACK, + ACTIONS(824), 1, + anon_sym_PIPE_PIPE, + ACTIONS(826), 1, + anon_sym_AMP_AMP, + ACTIONS(1041), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(816), 2, + anon_sym_DASH, + anon_sym_PLUS, + ACTIONS(820), 2, + anon_sym_GT, + anon_sym_LT, + ACTIONS(822), 2, + anon_sym_GT_EQ, + anon_sym_LT_EQ, + ACTIONS(828), 2, + anon_sym_EQ_EQ, + anon_sym_BANG_EQ, + ACTIONS(818), 3, + anon_sym_STAR, + anon_sym_SLASH, + anon_sym_PERCENT, + [20517] = 11, + ACTIONS(1043), 1, + ts_builtin_sym_end, + ACTIONS(1045), 1, + anon_sym_output, + ACTIONS(1048), 1, + anon_sym_map, + ACTIONS(1051), 1, + anon_sym_import, + ACTIONS(1054), 1, + anon_sym_if, + ACTIONS(1057), 1, + anon_sym_match, + ACTIONS(1060), 1, + sym_variable, + ACTIONS(1063), 1, + sym__newline, + STATE(467), 1, + sym_assign_target, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(327), 8, + sym__source_item, + sym__top_level_statement, + sym_assignment, + sym_map_declaration, + sym_import_statement, + sym_if_statement, + sym_match_statement, + aux_sym_source_file_repeat1, + [20559] = 11, + ACTIONS(7), 1, + anon_sym_output, + ACTIONS(9), 1, + anon_sym_map, + ACTIONS(11), 1, + anon_sym_import, + ACTIONS(13), 1, + anon_sym_if, + ACTIONS(15), 1, + 
anon_sym_match, + ACTIONS(17), 1, + sym_variable, + ACTIONS(1066), 1, + ts_builtin_sym_end, + ACTIONS(1068), 1, + sym__newline, + STATE(467), 1, + sym_assign_target, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(327), 8, + sym__source_item, + sym__top_level_statement, + sym_assignment, + sym_map_declaration, + sym_import_statement, + sym_if_statement, + sym_match_statement, + aux_sym_source_file_repeat1, + [20601] = 5, + ACTIONS(1072), 1, + anon_sym_else, + STATE(343), 1, + sym_else_statement_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(333), 2, + sym_else_if_statement_clause, + aux_sym_if_statement_repeat1, + ACTIONS(1070), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [20627] = 5, + ACTIONS(1072), 1, + anon_sym_else, + STATE(347), 1, + sym_else_statement_clause, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(329), 2, + sym_else_if_statement_clause, + aux_sym_if_statement_repeat1, + ACTIONS(1074), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [20653] = 9, + ACTIONS(7), 1, + anon_sym_output, + ACTIONS(13), 1, + anon_sym_if, + ACTIONS(15), 1, + anon_sym_match, + ACTIONS(17), 1, + sym_variable, + ACTIONS(1076), 1, + anon_sym_RBRACE, + ACTIONS(1078), 1, + sym__newline, + STATE(467), 1, + sym_assign_target, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(332), 5, + sym_assignment, + sym_if_statement, + sym__statement, + sym_match_statement, + aux_sym_statement_block_repeat1, + [20686] = 9, + ACTIONS(7), 1, + anon_sym_output, + ACTIONS(13), 1, + anon_sym_if, + ACTIONS(15), 1, + anon_sym_match, + ACTIONS(17), 1, + sym_variable, + ACTIONS(1080), 1, + anon_sym_RBRACE, + ACTIONS(1082), 1, + sym__newline, + STATE(467), 1, + sym_assign_target, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + 
STATE(334), 5, + sym_assignment, + sym_if_statement, + sym__statement, + sym_match_statement, + aux_sym_statement_block_repeat1, + [20719] = 4, + ACTIONS(1086), 1, + anon_sym_else, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(333), 2, + sym_else_if_statement_clause, + aux_sym_if_statement_repeat1, + ACTIONS(1084), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [20742] = 9, + ACTIONS(1089), 1, + anon_sym_output, + ACTIONS(1092), 1, + anon_sym_RBRACE, + ACTIONS(1094), 1, + anon_sym_if, + ACTIONS(1097), 1, + anon_sym_match, + ACTIONS(1100), 1, + sym_variable, + ACTIONS(1103), 1, + sym__newline, + STATE(467), 1, + sym_assign_target, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(334), 5, + sym_assignment, + sym_if_statement, + sym__statement, + sym_match_statement, + aux_sym_statement_block_repeat1, + [20775] = 9, + ACTIONS(7), 1, + anon_sym_output, + ACTIONS(13), 1, + anon_sym_if, + ACTIONS(15), 1, + anon_sym_match, + ACTIONS(17), 1, + sym_variable, + ACTIONS(1082), 1, + sym__newline, + ACTIONS(1106), 1, + anon_sym_RBRACE, + STATE(467), 1, + sym_assign_target, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(334), 5, + sym_assignment, + sym_if_statement, + sym__statement, + sym_match_statement, + aux_sym_statement_block_repeat1, + [20808] = 9, + ACTIONS(7), 1, + anon_sym_output, + ACTIONS(13), 1, + anon_sym_if, + ACTIONS(15), 1, + anon_sym_match, + ACTIONS(17), 1, + sym_variable, + ACTIONS(1108), 1, + anon_sym_RBRACE, + ACTIONS(1110), 1, + sym__newline, + STATE(467), 1, + sym_assign_target, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(335), 5, + sym_assignment, + sym_if_statement, + sym__statement, + sym_match_statement, + aux_sym_statement_block_repeat1, + [20841] = 7, + ACTIONS(53), 1, + anon_sym_DQUOTE, + ACTIONS(1114), 1, + anon_sym_null, + ACTIONS(1116), 1, + sym_integer, + ACTIONS(3), 2, + sym__nl_skip, + 
sym_comment, + ACTIONS(1112), 2, + anon_sym_true, + anon_sym_false, + ACTIONS(1118), 2, + sym_float, + sym_raw_string, + STATE(398), 4, + sym__literal, + sym_string, + sym_boolean, + sym_null, + [20869] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(879), 10, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_else, + anon_sym_match, + sym_variable, + [20886] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1120), 10, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_else, + anon_sym_match, + sym_variable, + [20903] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(897), 10, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_else, + anon_sym_match, + sym_variable, + [20920] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1122), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [20936] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1124), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [20952] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1126), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [20968] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1128), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [20984] = 2, + ACTIONS(3), 2, + sym__nl_skip, + 
sym_comment, + ACTIONS(1130), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21000] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1132), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21016] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1070), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21032] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1134), 9, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_RBRACE, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21048] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1136), 8, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21063] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1138), 8, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21078] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1140), 8, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21093] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1142), 8, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_import, + anon_sym_if, + anon_sym_match, + sym_variable, + [21108] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1144), 8, + sym__newline, + ts_builtin_sym_end, + anon_sym_output, + anon_sym_map, + anon_sym_import, + 
anon_sym_if, + anon_sym_match, + sym_variable, + [21123] = 6, + ACTIONS(1146), 1, + anon_sym_EQ, + ACTIONS(1148), 1, + sym_metadata_access, + ACTIONS(1150), 1, + anon_sym_DOT, + ACTIONS(1152), 1, + anon_sym_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(360), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21144] = 5, + ACTIONS(1146), 1, + anon_sym_EQ, + ACTIONS(1150), 1, + anon_sym_DOT, + ACTIONS(1152), 1, + anon_sym_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(360), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21162] = 5, + ACTIONS(1150), 1, + anon_sym_DOT, + ACTIONS(1152), 1, + anon_sym_LBRACK, + ACTIONS(1154), 1, + anon_sym_EQ, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(361), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21180] = 6, + ACTIONS(1156), 1, + sym_identifier, + ACTIONS(1158), 1, + anon_sym_RPAREN, + ACTIONS(1160), 1, + anon_sym__, + STATE(373), 1, + sym_parameter, + STATE(470), 1, + sym_parameter_list, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21200] = 5, + ACTIONS(1150), 1, + anon_sym_DOT, + ACTIONS(1152), 1, + anon_sym_LBRACK, + ACTIONS(1162), 1, + anon_sym_EQ, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(359), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21218] = 5, + ACTIONS(1150), 1, + anon_sym_DOT, + ACTIONS(1152), 1, + anon_sym_LBRACK, + ACTIONS(1164), 1, + anon_sym_EQ, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(361), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21236] = 5, + ACTIONS(1150), 1, + anon_sym_DOT, + ACTIONS(1152), 1, + anon_sym_LBRACK, + ACTIONS(1162), 1, + anon_sym_EQ, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(361), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21254] = 5, + ACTIONS(1166), 1, + anon_sym_EQ, + ACTIONS(1168), 1, + anon_sym_DOT, + ACTIONS(1171), 1, + anon_sym_LBRACK, + ACTIONS(3), 2, + sym__nl_skip, + 
sym_comment, + STATE(361), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21272] = 5, + ACTIONS(1150), 1, + anon_sym_DOT, + ACTIONS(1152), 1, + anon_sym_LBRACK, + ACTIONS(1174), 1, + anon_sym_EQ, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + STATE(356), 2, + sym_target_path_segment, + aux_sym_assign_target_repeat1, + [21290] = 5, + ACTIONS(3), 1, + sym__nl_skip, + ACTIONS(1176), 1, + anon_sym_DQUOTE, + ACTIONS(1180), 1, + sym_comment, + STATE(367), 1, + aux_sym_string_repeat1, + ACTIONS(1178), 2, + sym_string_content, + sym_escape_sequence, + [21307] = 5, + ACTIONS(3), 1, + sym__nl_skip, + ACTIONS(1180), 1, + sym_comment, + ACTIONS(1182), 1, + anon_sym_DQUOTE, + STATE(365), 1, + aux_sym_string_repeat1, + ACTIONS(1184), 2, + sym_string_content, + sym_escape_sequence, + [21324] = 5, + ACTIONS(3), 1, + sym__nl_skip, + ACTIONS(1180), 1, + sym_comment, + ACTIONS(1186), 1, + anon_sym_DQUOTE, + STATE(369), 1, + aux_sym_string_repeat1, + ACTIONS(1188), 2, + sym_string_content, + sym_escape_sequence, + [21341] = 3, + ACTIONS(1192), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1190), 3, + anon_sym_EQ, + anon_sym_DOT, + anon_sym_LBRACK, + [21354] = 5, + ACTIONS(3), 1, + sym__nl_skip, + ACTIONS(1180), 1, + sym_comment, + ACTIONS(1194), 1, + anon_sym_DQUOTE, + STATE(369), 1, + aux_sym_string_repeat1, + ACTIONS(1188), 2, + sym_string_content, + sym_escape_sequence, + [21371] = 4, + ACTIONS(1196), 1, + anon_sym_COMMA, + STATE(368), 1, + aux_sym_positional_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(959), 2, + anon_sym_RPAREN, + anon_sym_RBRACK, + [21386] = 5, + ACTIONS(3), 1, + sym__nl_skip, + ACTIONS(1180), 1, + sym_comment, + ACTIONS(1199), 1, + anon_sym_DQUOTE, + STATE(369), 1, + aux_sym_string_repeat1, + ACTIONS(1201), 2, + sym_string_content, + sym_escape_sequence, + [21403] = 4, + ACTIONS(1204), 1, + anon_sym_RBRACE, + ACTIONS(1206), 1, + anon_sym_COMMA, + STATE(386), 1, + 
aux_sym_object_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21417] = 4, + ACTIONS(396), 1, + anon_sym_RBRACK, + ACTIONS(1208), 1, + anon_sym_COMMA, + STATE(368), 1, + aux_sym_positional_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21431] = 4, + ACTIONS(1210), 1, + anon_sym_RPAREN, + ACTIONS(1212), 1, + anon_sym_COMMA, + STATE(372), 1, + aux_sym_parameter_list_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21445] = 4, + ACTIONS(1215), 1, + anon_sym_RPAREN, + ACTIONS(1217), 1, + anon_sym_COMMA, + STATE(391), 1, + aux_sym_parameter_list_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21459] = 4, + ACTIONS(1219), 1, + anon_sym_RPAREN, + ACTIONS(1221), 1, + anon_sym_COMMA, + STATE(389), 1, + aux_sym_named_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21473] = 4, + ACTIONS(969), 1, + anon_sym_LBRACE, + ACTIONS(1223), 1, + anon_sym_if, + STATE(341), 1, + sym_statement_block, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21487] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1225), 3, + anon_sym_EQ, + anon_sym_DOT, + anon_sym_LBRACK, + [21497] = 4, + ACTIONS(1156), 1, + sym_identifier, + ACTIONS(1160), 1, + anon_sym__, + STATE(393), 1, + sym_parameter, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21511] = 4, + ACTIONS(1227), 1, + anon_sym_RBRACE, + ACTIONS(1229), 1, + anon_sym_COMMA, + STATE(383), 1, + aux_sym_object_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21525] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1231), 3, + anon_sym_EQ, + anon_sym_DOT, + anon_sym_LBRACK, + [21535] = 4, + ACTIONS(1233), 1, + sym_identifier, + ACTIONS(1235), 1, + anon_sym_RPAREN, + STATE(395), 1, + sym_named_argument, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21549] = 4, + ACTIONS(364), 1, + anon_sym_RBRACK, + ACTIONS(1237), 1, + anon_sym_COMMA, + STATE(368), 1, + aux_sym_positional_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + 
sym_comment, + [21563] = 4, + ACTIONS(1239), 1, + anon_sym_RPAREN, + ACTIONS(1241), 1, + anon_sym_COMMA, + STATE(382), 1, + aux_sym_named_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21577] = 4, + ACTIONS(311), 1, + anon_sym_RBRACE, + ACTIONS(1244), 1, + anon_sym_COMMA, + STATE(392), 1, + aux_sym_object_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21591] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1246), 3, + anon_sym_EQ, + anon_sym_DOT, + anon_sym_LBRACK, + [21601] = 3, + ACTIONS(448), 1, + anon_sym_DASH_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(937), 2, + anon_sym_RPAREN, + anon_sym_COMMA, + [21613] = 4, + ACTIONS(323), 1, + anon_sym_RBRACE, + ACTIONS(1248), 1, + anon_sym_COMMA, + STATE(392), 1, + aux_sym_object_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21627] = 4, + ACTIONS(378), 1, + anon_sym_RPAREN, + ACTIONS(1250), 1, + anon_sym_COMMA, + STATE(368), 1, + aux_sym_positional_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21641] = 4, + ACTIONS(1233), 1, + sym_identifier, + ACTIONS(1252), 1, + anon_sym_RPAREN, + STATE(395), 1, + sym_named_argument, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21655] = 4, + ACTIONS(1252), 1, + anon_sym_RPAREN, + ACTIONS(1254), 1, + anon_sym_COMMA, + STATE(382), 1, + aux_sym_named_arguments_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21669] = 3, + ACTIONS(1256), 1, + anon_sym_EQ, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(937), 2, + anon_sym_RPAREN, + anon_sym_COMMA, + [21681] = 4, + ACTIONS(1217), 1, + anon_sym_COMMA, + ACTIONS(1258), 1, + anon_sym_RPAREN, + STATE(372), 1, + aux_sym_parameter_list_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21695] = 4, + ACTIONS(1260), 1, + anon_sym_RBRACE, + ACTIONS(1262), 1, + anon_sym_COMMA, + STATE(392), 1, + aux_sym_object_repeat1, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21709] = 2, + ACTIONS(3), 2, + 
sym__nl_skip, + sym_comment, + ACTIONS(1210), 2, + anon_sym_RPAREN, + anon_sym_COMMA, + [21718] = 3, + ACTIONS(1265), 1, + anon_sym_RBRACE, + ACTIONS(1267), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21729] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1239), 2, + anon_sym_RPAREN, + anon_sym_COMMA, + [21738] = 3, + ACTIONS(1269), 1, + anon_sym_RBRACE, + ACTIONS(1271), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21749] = 3, + ACTIONS(1273), 1, + anon_sym_RBRACE, + ACTIONS(1275), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21760] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1277), 2, + anon_sym_RPAREN, + anon_sym_COMMA, + [21769] = 3, + ACTIONS(53), 1, + anon_sym_DQUOTE, + STATE(461), 1, + sym_string, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21780] = 3, + ACTIONS(1279), 1, + anon_sym_RBRACE, + ACTIONS(1281), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21791] = 3, + ACTIONS(1283), 1, + anon_sym_RBRACE, + ACTIONS(1285), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21802] = 3, + ACTIONS(1287), 1, + anon_sym_RBRACE, + ACTIONS(1289), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21813] = 3, + ACTIONS(1291), 1, + anon_sym_RBRACE, + ACTIONS(1293), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21824] = 3, + ACTIONS(1295), 1, + anon_sym_RBRACE, + ACTIONS(1297), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21835] = 3, + ACTIONS(1299), 1, + anon_sym_RBRACE, + ACTIONS(1301), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21846] = 3, + ACTIONS(1303), 1, + anon_sym_RBRACE, + ACTIONS(1305), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21857] = 3, + ACTIONS(1307), 1, + anon_sym_LBRACE, + STATE(263), 1, + sym_statement_block, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21868] = 3, + 
ACTIONS(1309), 1, + anon_sym_RBRACE, + ACTIONS(1311), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21879] = 3, + ACTIONS(1313), 1, + anon_sym_RBRACE, + ACTIONS(1315), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21890] = 3, + ACTIONS(448), 1, + anon_sym_DASH_GT, + ACTIONS(1003), 1, + anon_sym_EQ_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21901] = 3, + ACTIONS(1317), 1, + anon_sym_RBRACE, + ACTIONS(1319), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21912] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(937), 2, + anon_sym_RPAREN, + anon_sym_COMMA, + [21921] = 3, + ACTIONS(1321), 1, + anon_sym_RBRACE, + ACTIONS(1323), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21932] = 3, + ACTIONS(1325), 1, + anon_sym_LBRACE, + ACTIONS(1327), 1, + anon_sym_if, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21943] = 3, + ACTIONS(1329), 1, + anon_sym_RBRACE, + ACTIONS(1331), 1, + sym__newline, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21954] = 2, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + ACTIONS(1260), 2, + anon_sym_RBRACE, + anon_sym_COMMA, + [21963] = 3, + ACTIONS(448), 1, + anon_sym_DASH_GT, + ACTIONS(967), 1, + anon_sym_EQ_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21974] = 3, + ACTIONS(1233), 1, + sym_identifier, + STATE(395), 1, + sym_named_argument, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21985] = 3, + ACTIONS(1333), 1, + anon_sym_LBRACE, + ACTIONS(1335), 1, + anon_sym_if, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [21996] = 2, + ACTIONS(1337), 1, + anon_sym_DASH_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22004] = 2, + ACTIONS(1339), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22012] = 2, + ACTIONS(1341), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22020] = 2, + ACTIONS(1343), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, 
+ sym_comment, + [22028] = 2, + ACTIONS(1345), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22036] = 2, + ACTIONS(1347), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22044] = 2, + ACTIONS(1349), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22052] = 2, + ACTIONS(1351), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22060] = 2, + ACTIONS(1353), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22068] = 2, + ACTIONS(1355), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22076] = 2, + ACTIONS(1223), 1, + anon_sym_if, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22084] = 2, + ACTIONS(1357), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22092] = 2, + ACTIONS(1359), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22100] = 2, + ACTIONS(1361), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22108] = 2, + ACTIONS(836), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22116] = 2, + ACTIONS(1363), 1, + ts_builtin_sym_end, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22124] = 2, + ACTIONS(1365), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22132] = 2, + ACTIONS(1367), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22140] = 2, + ACTIONS(1369), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22148] = 2, + ACTIONS(1371), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22156] = 2, + ACTIONS(1373), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22164] = 2, + ACTIONS(1375), 1, + anon_sym_COLON, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22172] = 2, + ACTIONS(444), 1, + anon_sym_LPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22180] = 2, + ACTIONS(1377), 1, + sym__newline, 
+ ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22188] = 2, + ACTIONS(1317), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22196] = 2, + ACTIONS(1379), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22204] = 2, + ACTIONS(1381), 1, + anon_sym_DASH_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22212] = 2, + ACTIONS(1313), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22220] = 2, + ACTIONS(1383), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22228] = 2, + ACTIONS(1385), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22236] = 2, + ACTIONS(1387), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22244] = 2, + ACTIONS(1389), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22252] = 2, + ACTIONS(1391), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22260] = 2, + ACTIONS(1393), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22268] = 2, + ACTIONS(1395), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22276] = 2, + ACTIONS(1397), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22284] = 2, + ACTIONS(1399), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22292] = 2, + ACTIONS(1401), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22300] = 2, + ACTIONS(1403), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22308] = 2, + ACTIONS(1405), 1, + anon_sym_DASH_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22316] = 2, + ACTIONS(1407), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22324] = 2, + ACTIONS(1409), 1, + anon_sym_as, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22332] = 2, + ACTIONS(1411), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22340] = 2, + 
ACTIONS(1413), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22348] = 2, + ACTIONS(1415), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22356] = 2, + ACTIONS(1417), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22364] = 2, + ACTIONS(1419), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22372] = 2, + ACTIONS(1421), 1, + anon_sym_EQ, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22380] = 2, + ACTIONS(1423), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22388] = 2, + ACTIONS(1425), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22396] = 2, + ACTIONS(1427), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22404] = 2, + ACTIONS(1429), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22412] = 2, + ACTIONS(1431), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22420] = 2, + ACTIONS(1433), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22428] = 2, + ACTIONS(1435), 1, + anon_sym_LBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22436] = 2, + ACTIONS(1437), 1, + anon_sym_DASH_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22444] = 2, + ACTIONS(1439), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22452] = 2, + ACTIONS(1441), 1, + sym_identifier, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22460] = 2, + ACTIONS(1443), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22468] = 2, + ACTIONS(1327), 1, + anon_sym_if, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22476] = 2, + ACTIONS(1445), 1, + anon_sym_DASH_GT, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22484] = 2, + ACTIONS(1447), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22492] = 2, + ACTIONS(1449), 1, + anon_sym_RBRACE, + ACTIONS(3), 2, + sym__nl_skip, 
+ sym_comment, + [22500] = 2, + ACTIONS(1335), 1, + anon_sym_if, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, + [22508] = 2, + ACTIONS(1451), 1, + anon_sym_RPAREN, + ACTIONS(3), 2, + sym__nl_skip, + sym_comment, +}; + +static const uint32_t ts_small_parse_table_map[] = { + [SMALL_STATE(2)] = 0, + [SMALL_STATE(3)] = 102, + [SMALL_STATE(4)] = 206, + [SMALL_STATE(5)] = 310, + [SMALL_STATE(6)] = 412, + [SMALL_STATE(7)] = 514, + [SMALL_STATE(8)] = 616, + [SMALL_STATE(9)] = 722, + [SMALL_STATE(10)] = 828, + [SMALL_STATE(11)] = 930, + [SMALL_STATE(12)] = 1032, + [SMALL_STATE(13)] = 1136, + [SMALL_STATE(14)] = 1242, + [SMALL_STATE(15)] = 1346, + [SMALL_STATE(16)] = 1450, + [SMALL_STATE(17)] = 1552, + [SMALL_STATE(18)] = 1653, + [SMALL_STATE(19)] = 1756, + [SMALL_STATE(20)] = 1859, + [SMALL_STATE(21)] = 1960, + [SMALL_STATE(22)] = 2063, + [SMALL_STATE(23)] = 2164, + [SMALL_STATE(24)] = 2267, + [SMALL_STATE(25)] = 2370, + [SMALL_STATE(26)] = 2471, + [SMALL_STATE(27)] = 2574, + [SMALL_STATE(28)] = 2675, + [SMALL_STATE(29)] = 2776, + [SMALL_STATE(30)] = 2879, + [SMALL_STATE(31)] = 2982, + [SMALL_STATE(32)] = 3082, + [SMALL_STATE(33)] = 3182, + [SMALL_STATE(34)] = 3280, + [SMALL_STATE(35)] = 3380, + [SMALL_STATE(36)] = 3478, + [SMALL_STATE(37)] = 3578, + [SMALL_STATE(38)] = 3678, + [SMALL_STATE(39)] = 3778, + [SMALL_STATE(40)] = 3876, + [SMALL_STATE(41)] = 3976, + [SMALL_STATE(42)] = 4074, + [SMALL_STATE(43)] = 4174, + [SMALL_STATE(44)] = 4271, + [SMALL_STATE(45)] = 4334, + [SMALL_STATE(46)] = 4429, + [SMALL_STATE(47)] = 4492, + [SMALL_STATE(48)] = 4587, + [SMALL_STATE(49)] = 4682, + [SMALL_STATE(50)] = 4745, + [SMALL_STATE(51)] = 4808, + [SMALL_STATE(52)] = 4903, + [SMALL_STATE(53)] = 4998, + [SMALL_STATE(54)] = 5095, + [SMALL_STATE(55)] = 5158, + [SMALL_STATE(56)] = 5253, + [SMALL_STATE(57)] = 5316, + [SMALL_STATE(58)] = 5411, + [SMALL_STATE(59)] = 5506, + [SMALL_STATE(60)] = 5569, + [SMALL_STATE(61)] = 5632, + [SMALL_STATE(62)] = 5724, + [SMALL_STATE(63)] = 5816, + 
[SMALL_STATE(64)] = 5908, + [SMALL_STATE(65)] = 6000, + [SMALL_STATE(66)] = 6060, + [SMALL_STATE(67)] = 6152, + [SMALL_STATE(68)] = 6244, + [SMALL_STATE(69)] = 6336, + [SMALL_STATE(70)] = 6428, + [SMALL_STATE(71)] = 6520, + [SMALL_STATE(72)] = 6612, + [SMALL_STATE(73)] = 6704, + [SMALL_STATE(74)] = 6796, + [SMALL_STATE(75)] = 6888, + [SMALL_STATE(76)] = 6980, + [SMALL_STATE(77)] = 7069, + [SMALL_STATE(78)] = 7158, + [SMALL_STATE(79)] = 7247, + [SMALL_STATE(80)] = 7336, + [SMALL_STATE(81)] = 7425, + [SMALL_STATE(82)] = 7514, + [SMALL_STATE(83)] = 7603, + [SMALL_STATE(84)] = 7692, + [SMALL_STATE(85)] = 7753, + [SMALL_STATE(86)] = 7842, + [SMALL_STATE(87)] = 7931, + [SMALL_STATE(88)] = 8020, + [SMALL_STATE(89)] = 8109, + [SMALL_STATE(90)] = 8198, + [SMALL_STATE(91)] = 8287, + [SMALL_STATE(92)] = 8376, + [SMALL_STATE(93)] = 8465, + [SMALL_STATE(94)] = 8554, + [SMALL_STATE(95)] = 8643, + [SMALL_STATE(96)] = 8732, + [SMALL_STATE(97)] = 8821, + [SMALL_STATE(98)] = 8910, + [SMALL_STATE(99)] = 8999, + [SMALL_STATE(100)] = 9088, + [SMALL_STATE(101)] = 9177, + [SMALL_STATE(102)] = 9266, + [SMALL_STATE(103)] = 9355, + [SMALL_STATE(104)] = 9444, + [SMALL_STATE(105)] = 9533, + [SMALL_STATE(106)] = 9622, + [SMALL_STATE(107)] = 9711, + [SMALL_STATE(108)] = 9800, + [SMALL_STATE(109)] = 9889, + [SMALL_STATE(110)] = 9978, + [SMALL_STATE(111)] = 10067, + [SMALL_STATE(112)] = 10156, + [SMALL_STATE(113)] = 10245, + [SMALL_STATE(114)] = 10334, + [SMALL_STATE(115)] = 10423, + [SMALL_STATE(116)] = 10512, + [SMALL_STATE(117)] = 10601, + [SMALL_STATE(118)] = 10690, + [SMALL_STATE(119)] = 10779, + [SMALL_STATE(120)] = 10868, + [SMALL_STATE(121)] = 10957, + [SMALL_STATE(122)] = 11046, + [SMALL_STATE(123)] = 11135, + [SMALL_STATE(124)] = 11224, + [SMALL_STATE(125)] = 11313, + [SMALL_STATE(126)] = 11402, + [SMALL_STATE(127)] = 11491, + [SMALL_STATE(128)] = 11545, + [SMALL_STATE(129)] = 11599, + [SMALL_STATE(130)] = 11653, + [SMALL_STATE(131)] = 11707, + [SMALL_STATE(132)] = 11761, + 
[SMALL_STATE(133)] = 11815, + [SMALL_STATE(134)] = 11869, + [SMALL_STATE(135)] = 11925, + [SMALL_STATE(136)] = 11981, + [SMALL_STATE(137)] = 12034, + [SMALL_STATE(138)] = 12087, + [SMALL_STATE(139)] = 12140, + [SMALL_STATE(140)] = 12193, + [SMALL_STATE(141)] = 12246, + [SMALL_STATE(142)] = 12299, + [SMALL_STATE(143)] = 12352, + [SMALL_STATE(144)] = 12405, + [SMALL_STATE(145)] = 12458, + [SMALL_STATE(146)] = 12511, + [SMALL_STATE(147)] = 12564, + [SMALL_STATE(148)] = 12617, + [SMALL_STATE(149)] = 12678, + [SMALL_STATE(150)] = 12731, + [SMALL_STATE(151)] = 12784, + [SMALL_STATE(152)] = 12837, + [SMALL_STATE(153)] = 12890, + [SMALL_STATE(154)] = 12943, + [SMALL_STATE(155)] = 12996, + [SMALL_STATE(156)] = 13057, + [SMALL_STATE(157)] = 13110, + [SMALL_STATE(158)] = 13163, + [SMALL_STATE(159)] = 13216, + [SMALL_STATE(160)] = 13269, + [SMALL_STATE(161)] = 13322, + [SMALL_STATE(162)] = 13375, + [SMALL_STATE(163)] = 13428, + [SMALL_STATE(164)] = 13481, + [SMALL_STATE(165)] = 13534, + [SMALL_STATE(166)] = 13587, + [SMALL_STATE(167)] = 13640, + [SMALL_STATE(168)] = 13693, + [SMALL_STATE(169)] = 13746, + [SMALL_STATE(170)] = 13801, + [SMALL_STATE(171)] = 13854, + [SMALL_STATE(172)] = 13909, + [SMALL_STATE(173)] = 13962, + [SMALL_STATE(174)] = 14017, + [SMALL_STATE(175)] = 14070, + [SMALL_STATE(176)] = 14123, + [SMALL_STATE(177)] = 14176, + [SMALL_STATE(178)] = 14229, + [SMALL_STATE(179)] = 14282, + [SMALL_STATE(180)] = 14335, + [SMALL_STATE(181)] = 14390, + [SMALL_STATE(182)] = 14454, + [SMALL_STATE(183)] = 14524, + [SMALL_STATE(184)] = 14584, + [SMALL_STATE(185)] = 14642, + [SMALL_STATE(186)] = 14712, + [SMALL_STATE(187)] = 14780, + [SMALL_STATE(188)] = 14846, + [SMALL_STATE(189)] = 14891, + [SMALL_STATE(190)] = 14936, + [SMALL_STATE(191)] = 14981, + [SMALL_STATE(192)] = 15026, + [SMALL_STATE(193)] = 15071, + [SMALL_STATE(194)] = 15116, + [SMALL_STATE(195)] = 15161, + [SMALL_STATE(196)] = 15206, + [SMALL_STATE(197)] = 15250, + [SMALL_STATE(198)] = 15292, + [SMALL_STATE(199)] 
= 15331, + [SMALL_STATE(200)] = 15372, + [SMALL_STATE(201)] = 15410, + [SMALL_STATE(202)] = 15448, + [SMALL_STATE(203)] = 15486, + [SMALL_STATE(204)] = 15522, + [SMALL_STATE(205)] = 15558, + [SMALL_STATE(206)] = 15594, + [SMALL_STATE(207)] = 15630, + [SMALL_STATE(208)] = 15668, + [SMALL_STATE(209)] = 15706, + [SMALL_STATE(210)] = 15742, + [SMALL_STATE(211)] = 15778, + [SMALL_STATE(212)] = 15814, + [SMALL_STATE(213)] = 15850, + [SMALL_STATE(214)] = 15885, + [SMALL_STATE(215)] = 15920, + [SMALL_STATE(216)] = 15955, + [SMALL_STATE(217)] = 15990, + [SMALL_STATE(218)] = 16025, + [SMALL_STATE(219)] = 16060, + [SMALL_STATE(220)] = 16105, + [SMALL_STATE(221)] = 16158, + [SMALL_STATE(222)] = 16209, + [SMALL_STATE(223)] = 16252, + [SMALL_STATE(224)] = 16301, + [SMALL_STATE(225)] = 16348, + [SMALL_STATE(226)] = 16403, + [SMALL_STATE(227)] = 16438, + [SMALL_STATE(228)] = 16473, + [SMALL_STATE(229)] = 16508, + [SMALL_STATE(230)] = 16543, + [SMALL_STATE(231)] = 16578, + [SMALL_STATE(232)] = 16613, + [SMALL_STATE(233)] = 16648, + [SMALL_STATE(234)] = 16683, + [SMALL_STATE(235)] = 16718, + [SMALL_STATE(236)] = 16753, + [SMALL_STATE(237)] = 16788, + [SMALL_STATE(238)] = 16823, + [SMALL_STATE(239)] = 16858, + [SMALL_STATE(240)] = 16893, + [SMALL_STATE(241)] = 16928, + [SMALL_STATE(242)] = 16963, + [SMALL_STATE(243)] = 16998, + [SMALL_STATE(244)] = 17033, + [SMALL_STATE(245)] = 17068, + [SMALL_STATE(246)] = 17103, + [SMALL_STATE(247)] = 17138, + [SMALL_STATE(248)] = 17173, + [SMALL_STATE(249)] = 17208, + [SMALL_STATE(250)] = 17243, + [SMALL_STATE(251)] = 17278, + [SMALL_STATE(252)] = 17313, + [SMALL_STATE(253)] = 17348, + [SMALL_STATE(254)] = 17383, + [SMALL_STATE(255)] = 17418, + [SMALL_STATE(256)] = 17453, + [SMALL_STATE(257)] = 17488, + [SMALL_STATE(258)] = 17531, + [SMALL_STATE(259)] = 17586, + [SMALL_STATE(260)] = 17621, + [SMALL_STATE(261)] = 17659, + [SMALL_STATE(262)] = 17692, + [SMALL_STATE(263)] = 17725, + [SMALL_STATE(264)] = 17758, + [SMALL_STATE(265)] = 17793, + 
[SMALL_STATE(266)] = 17826, + [SMALL_STATE(267)] = 17859, + [SMALL_STATE(268)] = 17898, + [SMALL_STATE(269)] = 17933, + [SMALL_STATE(270)] = 17979, + [SMALL_STATE(271)] = 18031, + [SMALL_STATE(272)] = 18071, + [SMALL_STATE(273)] = 18103, + [SMALL_STATE(274)] = 18135, + [SMALL_STATE(275)] = 18177, + [SMALL_STATE(276)] = 18227, + [SMALL_STATE(277)] = 18275, + [SMALL_STATE(278)] = 18319, + [SMALL_STATE(279)] = 18363, + [SMALL_STATE(280)] = 18402, + [SMALL_STATE(281)] = 18441, + [SMALL_STATE(282)] = 18472, + [SMALL_STATE(283)] = 18525, + [SMALL_STATE(284)] = 18578, + [SMALL_STATE(285)] = 18631, + [SMALL_STATE(286)] = 18680, + [SMALL_STATE(287)] = 18712, + [SMALL_STATE(288)] = 18740, + [SMALL_STATE(289)] = 18790, + [SMALL_STATE(290)] = 18840, + [SMALL_STATE(291)] = 18888, + [SMALL_STATE(292)] = 18936, + [SMALL_STATE(293)] = 18984, + [SMALL_STATE(294)] = 19034, + [SMALL_STATE(295)] = 19066, + [SMALL_STATE(296)] = 19098, + [SMALL_STATE(297)] = 19148, + [SMALL_STATE(298)] = 19186, + [SMALL_STATE(299)] = 19232, + [SMALL_STATE(300)] = 19276, + [SMALL_STATE(301)] = 19318, + [SMALL_STATE(302)] = 19358, + [SMALL_STATE(303)] = 19406, + [SMALL_STATE(304)] = 19456, + [SMALL_STATE(305)] = 19506, + [SMALL_STATE(306)] = 19556, + [SMALL_STATE(307)] = 19604, + [SMALL_STATE(308)] = 19636, + [SMALL_STATE(309)] = 19668, + [SMALL_STATE(310)] = 19718, + [SMALL_STATE(311)] = 19765, + [SMALL_STATE(312)] = 19812, + [SMALL_STATE(313)] = 19859, + [SMALL_STATE(314)] = 19906, + [SMALL_STATE(315)] = 19953, + [SMALL_STATE(316)] = 20000, + [SMALL_STATE(317)] = 20047, + [SMALL_STATE(318)] = 20094, + [SMALL_STATE(319)] = 20141, + [SMALL_STATE(320)] = 20188, + [SMALL_STATE(321)] = 20235, + [SMALL_STATE(322)] = 20282, + [SMALL_STATE(323)] = 20329, + [SMALL_STATE(324)] = 20376, + [SMALL_STATE(325)] = 20423, + [SMALL_STATE(326)] = 20470, + [SMALL_STATE(327)] = 20517, + [SMALL_STATE(328)] = 20559, + [SMALL_STATE(329)] = 20601, + [SMALL_STATE(330)] = 20627, + [SMALL_STATE(331)] = 20653, + [SMALL_STATE(332)] 
= 20686, + [SMALL_STATE(333)] = 20719, + [SMALL_STATE(334)] = 20742, + [SMALL_STATE(335)] = 20775, + [SMALL_STATE(336)] = 20808, + [SMALL_STATE(337)] = 20841, + [SMALL_STATE(338)] = 20869, + [SMALL_STATE(339)] = 20886, + [SMALL_STATE(340)] = 20903, + [SMALL_STATE(341)] = 20920, + [SMALL_STATE(342)] = 20936, + [SMALL_STATE(343)] = 20952, + [SMALL_STATE(344)] = 20968, + [SMALL_STATE(345)] = 20984, + [SMALL_STATE(346)] = 21000, + [SMALL_STATE(347)] = 21016, + [SMALL_STATE(348)] = 21032, + [SMALL_STATE(349)] = 21048, + [SMALL_STATE(350)] = 21063, + [SMALL_STATE(351)] = 21078, + [SMALL_STATE(352)] = 21093, + [SMALL_STATE(353)] = 21108, + [SMALL_STATE(354)] = 21123, + [SMALL_STATE(355)] = 21144, + [SMALL_STATE(356)] = 21162, + [SMALL_STATE(357)] = 21180, + [SMALL_STATE(358)] = 21200, + [SMALL_STATE(359)] = 21218, + [SMALL_STATE(360)] = 21236, + [SMALL_STATE(361)] = 21254, + [SMALL_STATE(362)] = 21272, + [SMALL_STATE(363)] = 21290, + [SMALL_STATE(364)] = 21307, + [SMALL_STATE(365)] = 21324, + [SMALL_STATE(366)] = 21341, + [SMALL_STATE(367)] = 21354, + [SMALL_STATE(368)] = 21371, + [SMALL_STATE(369)] = 21386, + [SMALL_STATE(370)] = 21403, + [SMALL_STATE(371)] = 21417, + [SMALL_STATE(372)] = 21431, + [SMALL_STATE(373)] = 21445, + [SMALL_STATE(374)] = 21459, + [SMALL_STATE(375)] = 21473, + [SMALL_STATE(376)] = 21487, + [SMALL_STATE(377)] = 21497, + [SMALL_STATE(378)] = 21511, + [SMALL_STATE(379)] = 21525, + [SMALL_STATE(380)] = 21535, + [SMALL_STATE(381)] = 21549, + [SMALL_STATE(382)] = 21563, + [SMALL_STATE(383)] = 21577, + [SMALL_STATE(384)] = 21591, + [SMALL_STATE(385)] = 21601, + [SMALL_STATE(386)] = 21613, + [SMALL_STATE(387)] = 21627, + [SMALL_STATE(388)] = 21641, + [SMALL_STATE(389)] = 21655, + [SMALL_STATE(390)] = 21669, + [SMALL_STATE(391)] = 21681, + [SMALL_STATE(392)] = 21695, + [SMALL_STATE(393)] = 21709, + [SMALL_STATE(394)] = 21718, + [SMALL_STATE(395)] = 21729, + [SMALL_STATE(396)] = 21738, + [SMALL_STATE(397)] = 21749, + [SMALL_STATE(398)] = 21760, + 
[SMALL_STATE(399)] = 21769, + [SMALL_STATE(400)] = 21780, + [SMALL_STATE(401)] = 21791, + [SMALL_STATE(402)] = 21802, + [SMALL_STATE(403)] = 21813, + [SMALL_STATE(404)] = 21824, + [SMALL_STATE(405)] = 21835, + [SMALL_STATE(406)] = 21846, + [SMALL_STATE(407)] = 21857, + [SMALL_STATE(408)] = 21868, + [SMALL_STATE(409)] = 21879, + [SMALL_STATE(410)] = 21890, + [SMALL_STATE(411)] = 21901, + [SMALL_STATE(412)] = 21912, + [SMALL_STATE(413)] = 21921, + [SMALL_STATE(414)] = 21932, + [SMALL_STATE(415)] = 21943, + [SMALL_STATE(416)] = 21954, + [SMALL_STATE(417)] = 21963, + [SMALL_STATE(418)] = 21974, + [SMALL_STATE(419)] = 21985, + [SMALL_STATE(420)] = 21996, + [SMALL_STATE(421)] = 22004, + [SMALL_STATE(422)] = 22012, + [SMALL_STATE(423)] = 22020, + [SMALL_STATE(424)] = 22028, + [SMALL_STATE(425)] = 22036, + [SMALL_STATE(426)] = 22044, + [SMALL_STATE(427)] = 22052, + [SMALL_STATE(428)] = 22060, + [SMALL_STATE(429)] = 22068, + [SMALL_STATE(430)] = 22076, + [SMALL_STATE(431)] = 22084, + [SMALL_STATE(432)] = 22092, + [SMALL_STATE(433)] = 22100, + [SMALL_STATE(434)] = 22108, + [SMALL_STATE(435)] = 22116, + [SMALL_STATE(436)] = 22124, + [SMALL_STATE(437)] = 22132, + [SMALL_STATE(438)] = 22140, + [SMALL_STATE(439)] = 22148, + [SMALL_STATE(440)] = 22156, + [SMALL_STATE(441)] = 22164, + [SMALL_STATE(442)] = 22172, + [SMALL_STATE(443)] = 22180, + [SMALL_STATE(444)] = 22188, + [SMALL_STATE(445)] = 22196, + [SMALL_STATE(446)] = 22204, + [SMALL_STATE(447)] = 22212, + [SMALL_STATE(448)] = 22220, + [SMALL_STATE(449)] = 22228, + [SMALL_STATE(450)] = 22236, + [SMALL_STATE(451)] = 22244, + [SMALL_STATE(452)] = 22252, + [SMALL_STATE(453)] = 22260, + [SMALL_STATE(454)] = 22268, + [SMALL_STATE(455)] = 22276, + [SMALL_STATE(456)] = 22284, + [SMALL_STATE(457)] = 22292, + [SMALL_STATE(458)] = 22300, + [SMALL_STATE(459)] = 22308, + [SMALL_STATE(460)] = 22316, + [SMALL_STATE(461)] = 22324, + [SMALL_STATE(462)] = 22332, + [SMALL_STATE(463)] = 22340, + [SMALL_STATE(464)] = 22348, + [SMALL_STATE(465)] 
= 22356, + [SMALL_STATE(466)] = 22364, + [SMALL_STATE(467)] = 22372, + [SMALL_STATE(468)] = 22380, + [SMALL_STATE(469)] = 22388, + [SMALL_STATE(470)] = 22396, + [SMALL_STATE(471)] = 22404, + [SMALL_STATE(472)] = 22412, + [SMALL_STATE(473)] = 22420, + [SMALL_STATE(474)] = 22428, + [SMALL_STATE(475)] = 22436, + [SMALL_STATE(476)] = 22444, + [SMALL_STATE(477)] = 22452, + [SMALL_STATE(478)] = 22460, + [SMALL_STATE(479)] = 22468, + [SMALL_STATE(480)] = 22476, + [SMALL_STATE(481)] = 22484, + [SMALL_STATE(482)] = 22492, + [SMALL_STATE(483)] = 22500, + [SMALL_STATE(484)] = 22508, +}; + +static const TSParseActionEntry ts_parse_actions[] = { + [0] = {.entry = {.count = 0, .reusable = false}}, + [1] = {.entry = {.count = 1, .reusable = false}}, RECOVER(), + [3] = {.entry = {.count = 1, .reusable = true}}, SHIFT_EXTRA(), + [5] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_source_file, 0, 0, 0), + [7] = {.entry = {.count = 1, .reusable = true}}, SHIFT(354), + [9] = {.entry = {.count = 1, .reusable = true}}, SHIFT(452), + [11] = {.entry = {.count = 1, .reusable = true}}, SHIFT(399), + [13] = {.entry = {.count = 1, .reusable = true}}, SHIFT(83), + [15] = {.entry = {.count = 1, .reusable = true}}, SHIFT(87), + [17] = {.entry = {.count = 1, .reusable = true}}, SHIFT(355), + [19] = {.entry = {.count = 1, .reusable = true}}, SHIFT(328), + [21] = {.entry = {.count = 1, .reusable = false}}, SHIFT(271), + [23] = {.entry = {.count = 1, .reusable = false}}, SHIFT(134), + [25] = {.entry = {.count = 1, .reusable = true}}, SHIFT(47), + [27] = {.entry = {.count = 1, .reusable = true}}, SHIFT(176), + [29] = {.entry = {.count = 1, .reusable = true}}, SHIFT(62), + [31] = {.entry = {.count = 1, .reusable = true}}, SHIFT(48), + [33] = {.entry = {.count = 1, .reusable = false}}, SHIFT(480), + [35] = {.entry = {.count = 1, .reusable = false}}, SHIFT(135), + [37] = {.entry = {.count = 1, .reusable = false}}, SHIFT(80), + [39] = {.entry = {.count = 1, .reusable = false}}, SHIFT(117), + [41] 
= {.entry = {.count = 1, .reusable = false}}, SHIFT(159), + [43] = {.entry = {.count = 1, .reusable = false}}, SHIFT(161), + [45] = {.entry = {.count = 1, .reusable = false}}, SHIFT(442), + [47] = {.entry = {.count = 1, .reusable = true}}, SHIFT(118), + [49] = {.entry = {.count = 1, .reusable = false}}, SHIFT(283), + [51] = {.entry = {.count = 1, .reusable = true}}, SHIFT(283), + [53] = {.entry = {.count = 1, .reusable = true}}, SHIFT(363), + [55] = {.entry = {.count = 1, .reusable = false}}, SHIFT(84), + [57] = {.entry = {.count = 1, .reusable = true}}, SHIFT(198), + [59] = {.entry = {.count = 1, .reusable = false}}, SHIFT(417), + [61] = {.entry = {.count = 1, .reusable = false}}, SHIFT(288), + [63] = {.entry = {.count = 1, .reusable = true}}, SHIFT(288), + [65] = {.entry = {.count = 1, .reusable = true}}, SHIFT(180), + [67] = {.entry = {.count = 1, .reusable = false}}, SHIFT(410), + [69] = {.entry = {.count = 1, .reusable = false}}, SHIFT(303), + [71] = {.entry = {.count = 1, .reusable = true}}, SHIFT(303), + [73] = {.entry = {.count = 1, .reusable = true}}, SHIFT(166), + [75] = {.entry = {.count = 1, .reusable = true}}, SHIFT(175), + [77] = {.entry = {.count = 1, .reusable = true}}, SHIFT(384), + [79] = {.entry = {.count = 1, .reusable = true}}, SHIFT(153), + [81] = {.entry = {.count = 1, .reusable = false}}, SHIFT(304), + [83] = {.entry = {.count = 1, .reusable = true}}, SHIFT(304), + [85] = {.entry = {.count = 1, .reusable = true}}, SHIFT(279), + [87] = {.entry = {.count = 1, .reusable = true}}, SHIFT(217), + [89] = {.entry = {.count = 1, .reusable = true}}, SHIFT(237), + [91] = {.entry = {.count = 1, .reusable = true}}, SHIFT(260), + [93] = {.entry = {.count = 1, .reusable = true}}, SHIFT(255), + [95] = {.entry = {.count = 1, .reusable = true}}, SHIFT(287), + [97] = {.entry = {.count = 1, .reusable = true}}, SHIFT(199), + [99] = {.entry = {.count = 1, .reusable = true}}, SHIFT(236), + [101] = {.entry = {.count = 1, .reusable = true}}, SHIFT(162), + [103] = 
{.entry = {.count = 1, .reusable = false}}, SHIFT(311), + [105] = {.entry = {.count = 1, .reusable = true}}, SHIFT(311), + [107] = {.entry = {.count = 1, .reusable = false}}, SHIFT(196), + [109] = {.entry = {.count = 1, .reusable = false}}, SHIFT(200), + [111] = {.entry = {.count = 1, .reusable = true}}, SHIFT(58), + [113] = {.entry = {.count = 1, .reusable = true}}, SHIFT(63), + [115] = {.entry = {.count = 1, .reusable = true}}, SHIFT(45), + [117] = {.entry = {.count = 1, .reusable = false}}, SHIFT(446), + [119] = {.entry = {.count = 1, .reusable = false}}, SHIFT(201), + [121] = {.entry = {.count = 1, .reusable = false}}, SHIFT(77), + [123] = {.entry = {.count = 1, .reusable = false}}, SHIFT(125), + [125] = {.entry = {.count = 1, .reusable = false}}, SHIFT(218), + [127] = {.entry = {.count = 1, .reusable = false}}, SHIFT(233), + [129] = {.entry = {.count = 1, .reusable = false}}, SHIFT(434), + [131] = {.entry = {.count = 1, .reusable = true}}, SHIFT(92), + [133] = {.entry = {.count = 1, .reusable = false}}, SHIFT(291), + [135] = {.entry = {.count = 1, .reusable = true}}, SHIFT(291), + [137] = {.entry = {.count = 1, .reusable = true}}, SHIFT(364), + [139] = {.entry = {.count = 1, .reusable = true}}, SHIFT(280), + [141] = {.entry = {.count = 1, .reusable = true}}, SHIFT(38), + [143] = {.entry = {.count = 1, .reusable = true}}, SHIFT(40), + [145] = {.entry = {.count = 1, .reusable = true}}, SHIFT(345), + [147] = {.entry = {.count = 1, .reusable = false}}, SHIFT(312), + [149] = {.entry = {.count = 1, .reusable = true}}, SHIFT(312), + [151] = {.entry = {.count = 1, .reusable = true}}, SHIFT(32), + [153] = {.entry = {.count = 1, .reusable = true}}, SHIFT(342), + [155] = {.entry = {.count = 1, .reusable = true}}, SHIFT(34), + [157] = {.entry = {.count = 1, .reusable = true}}, SHIFT(37), + [159] = {.entry = {.count = 1, .reusable = true}}, SHIFT(146), + [161] = {.entry = {.count = 1, .reusable = true}}, SHIFT(36), + [163] = {.entry = {.count = 1, .reusable = true}}, 
SHIFT(229), + [165] = {.entry = {.count = 1, .reusable = true}}, SHIFT(240), + [167] = {.entry = {.count = 1, .reusable = true}}, SHIFT(31), + [169] = {.entry = {.count = 1, .reusable = true}}, SHIFT(42), + [171] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement_cases, 1, 0, 0), + [173] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(84), + [176] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(134), + [179] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(47), + [182] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(62), + [185] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(48), + [188] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), + [190] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(417), + [193] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(135), + [196] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(80), + [199] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(117), + [202] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(159), + [205] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(161), + [208] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(442), + [211] = {.entry = {.count = 2, .reusable = 
true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(118), + [214] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(312), + [217] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(312), + [220] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(363), + [223] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(84), + [226] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(134), + [229] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(47), + [232] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(62), + [235] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(48), + [238] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), + [240] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(410), + [243] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(135), + [246] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(80), + [249] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(117), + [252] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(159), + [255] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(161), + [258] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(442), + [261] = 
{.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(118), + [264] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(311), + [267] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(311), + [270] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), SHIFT_REPEAT(363), + [273] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_cases, 1, 0, 0), + [275] = {.entry = {.count = 1, .reusable = false}}, SHIFT(318), + [277] = {.entry = {.count = 1, .reusable = true}}, SHIFT(318), + [279] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 5, 0, 13), + [281] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 5, 0, 13), + [283] = {.entry = {.count = 1, .reusable = false}}, SHIFT(414), + [285] = {.entry = {.count = 1, .reusable = false}}, SHIFT(313), + [287] = {.entry = {.count = 1, .reusable = true}}, SHIFT(313), + [289] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 8, 0, 18), + [291] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 8, 0, 18), + [293] = {.entry = {.count = 1, .reusable = false}}, SHIFT(278), + [295] = {.entry = {.count = 1, .reusable = false}}, SHIFT(385), + [297] = {.entry = {.count = 1, .reusable = false}}, SHIFT(317), + [299] = {.entry = {.count = 1, .reusable = true}}, SHIFT(317), + [301] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 6, 0, 18), + [303] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 6, 0, 18), + [305] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 6, 0, 13), + [307] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 6, 0, 13), + [309] = {.entry = {.count = 1, .reusable = true}}, SHIFT(172), + [311] = {.entry = {.count = 1, .reusable = true}}, SHIFT(227), + 
[313] = {.entry = {.count = 1, .reusable = false}}, SHIFT(306), + [315] = {.entry = {.count = 1, .reusable = true}}, SHIFT(306), + [317] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 7, 0, 13), + [319] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 7, 0, 13), + [321] = {.entry = {.count = 1, .reusable = true}}, SHIFT(234), + [323] = {.entry = {.count = 1, .reusable = true}}, SHIFT(151), + [325] = {.entry = {.count = 1, .reusable = false}}, SHIFT(316), + [327] = {.entry = {.count = 1, .reusable = true}}, SHIFT(316), + [329] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 7, 0, 18), + [331] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 7, 0, 18), + [333] = {.entry = {.count = 1, .reusable = true}}, SHIFT(170), + [335] = {.entry = {.count = 1, .reusable = false}}, SHIFT(285), + [337] = {.entry = {.count = 1, .reusable = true}}, SHIFT(285), + [339] = {.entry = {.count = 1, .reusable = true}}, SHIFT(152), + [341] = {.entry = {.count = 1, .reusable = false}}, SHIFT(284), + [343] = {.entry = {.count = 1, .reusable = true}}, SHIFT(284), + [345] = {.entry = {.count = 1, .reusable = true}}, SHIFT(254), + [347] = {.entry = {.count = 1, .reusable = false}}, SHIFT(282), + [349] = {.entry = {.count = 1, .reusable = true}}, SHIFT(282), + [351] = {.entry = {.count = 1, .reusable = true}}, SHIFT(13), + [353] = {.entry = {.count = 1, .reusable = false}}, SHIFT(225), + [355] = {.entry = {.count = 1, .reusable = true}}, SHIFT(225), + [357] = {.entry = {.count = 1, .reusable = false}}, REDUCE(aux_sym_if_expression_repeat1, 2, 0, 0), + [359] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_if_expression_repeat1, 2, 0, 0), + [361] = {.entry = {.count = 2, .reusable = false}}, REDUCE(aux_sym_if_expression_repeat1, 2, 0, 0), SHIFT_REPEAT(479), + [364] = {.entry = {.count = 1, .reusable = true}}, SHIFT(226), + [366] = {.entry = {.count = 1, .reusable = true}}, SHIFT(9), + [368] = 
{.entry = {.count = 1, .reusable = false}}, SHIFT(459), + [370] = {.entry = {.count = 1, .reusable = false}}, SHIFT(81), + [372] = {.entry = {.count = 1, .reusable = true}}, SHIFT(76), + [374] = {.entry = {.count = 1, .reusable = false}}, SHIFT(185), + [376] = {.entry = {.count = 1, .reusable = true}}, SHIFT(185), + [378] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_positional_arguments, 2, 0, 0), + [380] = {.entry = {.count = 1, .reusable = true}}, SHIFT(8), + [382] = {.entry = {.count = 1, .reusable = false}}, SHIFT(182), + [384] = {.entry = {.count = 1, .reusable = true}}, SHIFT(182), + [386] = {.entry = {.count = 1, .reusable = false}}, SHIFT(475), + [388] = {.entry = {.count = 1, .reusable = false}}, SHIFT(101), + [390] = {.entry = {.count = 1, .reusable = true}}, SHIFT(102), + [392] = {.entry = {.count = 1, .reusable = false}}, SHIFT(302), + [394] = {.entry = {.count = 1, .reusable = true}}, SHIFT(302), + [396] = {.entry = {.count = 1, .reusable = true}}, SHIFT(150), + [398] = {.entry = {.count = 1, .reusable = true}}, SHIFT(259), + [400] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_positional_arguments, 3, 0, 0), + [402] = {.entry = {.count = 1, .reusable = false}}, SHIFT(270), + [404] = {.entry = {.count = 1, .reusable = true}}, SHIFT(270), + [406] = {.entry = {.count = 1, .reusable = false}}, SHIFT(155), + [408] = {.entry = {.count = 1, .reusable = true}}, SHIFT(155), + [410] = {.entry = {.count = 1, .reusable = false}}, SHIFT(326), + [412] = {.entry = {.count = 1, .reusable = true}}, SHIFT(326), + [414] = {.entry = {.count = 1, .reusable = false}}, SHIFT(324), + [416] = {.entry = {.count = 1, .reusable = true}}, SHIFT(324), + [418] = {.entry = {.count = 1, .reusable = false}}, SHIFT(292), + [420] = {.entry = {.count = 1, .reusable = true}}, SHIFT(292), + [422] = {.entry = {.count = 1, .reusable = false}}, SHIFT(320), + [424] = {.entry = {.count = 1, .reusable = true}}, SHIFT(320), + [426] = {.entry = {.count = 1, .reusable = true}}, 
SHIFT(4), + [428] = {.entry = {.count = 1, .reusable = false}}, SHIFT(293), + [430] = {.entry = {.count = 1, .reusable = true}}, SHIFT(293), + [432] = {.entry = {.count = 1, .reusable = false}}, SHIFT(290), + [434] = {.entry = {.count = 1, .reusable = true}}, SHIFT(290), + [436] = {.entry = {.count = 1, .reusable = false}}, SHIFT(309), + [438] = {.entry = {.count = 1, .reusable = true}}, SHIFT(309), + [440] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym__primary, 1, 0, 0), + [442] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym__primary, 1, 0, 0), + [444] = {.entry = {.count = 1, .reusable = true}}, SHIFT(5), + [446] = {.entry = {.count = 1, .reusable = true}}, SHIFT(449), + [448] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym__lambda_params, 1, 0, 0), + [450] = {.entry = {.count = 1, .reusable = false}}, SHIFT(289), + [452] = {.entry = {.count = 1, .reusable = true}}, SHIFT(289), + [454] = {.entry = {.count = 1, .reusable = true}}, SHIFT(3), + [456] = {.entry = {.count = 1, .reusable = false}}, SHIFT(296), + [458] = {.entry = {.count = 1, .reusable = true}}, SHIFT(296), + [460] = {.entry = {.count = 1, .reusable = false}}, SHIFT(314), + [462] = {.entry = {.count = 1, .reusable = true}}, SHIFT(314), + [464] = {.entry = {.count = 1, .reusable = false}}, SHIFT(258), + [466] = {.entry = {.count = 1, .reusable = true}}, SHIFT(258), + [468] = {.entry = {.count = 1, .reusable = false}}, SHIFT(310), + [470] = {.entry = {.count = 1, .reusable = true}}, SHIFT(310), + [472] = {.entry = {.count = 1, .reusable = false}}, SHIFT(315), + [474] = {.entry = {.count = 1, .reusable = true}}, SHIFT(315), + [476] = {.entry = {.count = 1, .reusable = false}}, SHIFT(257), + [478] = {.entry = {.count = 1, .reusable = true}}, SHIFT(257), + [480] = {.entry = {.count = 1, .reusable = false}}, SHIFT(219), + [482] = {.entry = {.count = 1, .reusable = true}}, SHIFT(219), + [484] = {.entry = {.count = 1, .reusable = false}}, SHIFT(220), + [486] = {.entry = {.count = 1, 
.reusable = true}}, SHIFT(220), + [488] = {.entry = {.count = 1, .reusable = false}}, SHIFT(221), + [490] = {.entry = {.count = 1, .reusable = true}}, SHIFT(221), + [492] = {.entry = {.count = 1, .reusable = false}}, SHIFT(222), + [494] = {.entry = {.count = 1, .reusable = true}}, SHIFT(222), + [496] = {.entry = {.count = 1, .reusable = false}}, SHIFT(223), + [498] = {.entry = {.count = 1, .reusable = true}}, SHIFT(223), + [500] = {.entry = {.count = 1, .reusable = false}}, SHIFT(224), + [502] = {.entry = {.count = 1, .reusable = true}}, SHIFT(224), + [504] = {.entry = {.count = 1, .reusable = false}}, SHIFT(325), + [506] = {.entry = {.count = 1, .reusable = true}}, SHIFT(325), + [508] = {.entry = {.count = 1, .reusable = false}}, SHIFT(321), + [510] = {.entry = {.count = 1, .reusable = true}}, SHIFT(321), + [512] = {.entry = {.count = 1, .reusable = true}}, SHIFT(14), + [514] = {.entry = {.count = 1, .reusable = false}}, SHIFT(319), + [516] = {.entry = {.count = 1, .reusable = true}}, SHIFT(319), + [518] = {.entry = {.count = 1, .reusable = false}}, SHIFT(323), + [520] = {.entry = {.count = 1, .reusable = true}}, SHIFT(323), + [522] = {.entry = {.count = 1, .reusable = false}}, SHIFT(297), + [524] = {.entry = {.count = 1, .reusable = true}}, SHIFT(297), + [526] = {.entry = {.count = 1, .reusable = false}}, SHIFT(298), + [528] = {.entry = {.count = 1, .reusable = true}}, SHIFT(298), + [530] = {.entry = {.count = 1, .reusable = false}}, SHIFT(299), + [532] = {.entry = {.count = 1, .reusable = true}}, SHIFT(299), + [534] = {.entry = {.count = 1, .reusable = false}}, SHIFT(148), + [536] = {.entry = {.count = 1, .reusable = true}}, SHIFT(148), + [538] = {.entry = {.count = 1, .reusable = false}}, SHIFT(300), + [540] = {.entry = {.count = 1, .reusable = true}}, SHIFT(300), + [542] = {.entry = {.count = 1, .reusable = false}}, SHIFT(301), + [544] = {.entry = {.count = 1, .reusable = true}}, SHIFT(301), + [546] = {.entry = {.count = 1, .reusable = false}}, SHIFT(184), + 
[548] = {.entry = {.count = 1, .reusable = true}}, SHIFT(184), + [550] = {.entry = {.count = 1, .reusable = false}}, SHIFT(186), + [552] = {.entry = {.count = 1, .reusable = true}}, SHIFT(186), + [554] = {.entry = {.count = 1, .reusable = false}}, SHIFT(187), + [556] = {.entry = {.count = 1, .reusable = true}}, SHIFT(187), + [558] = {.entry = {.count = 1, .reusable = false}}, SHIFT(181), + [560] = {.entry = {.count = 1, .reusable = true}}, SHIFT(181), + [562] = {.entry = {.count = 1, .reusable = false}}, SHIFT(183), + [564] = {.entry = {.count = 1, .reusable = true}}, SHIFT(183), + [566] = {.entry = {.count = 1, .reusable = true}}, SHIFT(12), + [568] = {.entry = {.count = 1, .reusable = false}}, SHIFT(274), + [570] = {.entry = {.count = 1, .reusable = true}}, SHIFT(274), + [572] = {.entry = {.count = 1, .reusable = false}}, SHIFT(275), + [574] = {.entry = {.count = 1, .reusable = true}}, SHIFT(275), + [576] = {.entry = {.count = 1, .reusable = false}}, SHIFT(276), + [578] = {.entry = {.count = 1, .reusable = true}}, SHIFT(276), + [580] = {.entry = {.count = 1, .reusable = false}}, SHIFT(269), + [582] = {.entry = {.count = 1, .reusable = true}}, SHIFT(269), + [584] = {.entry = {.count = 1, .reusable = false}}, SHIFT(277), + [586] = {.entry = {.count = 1, .reusable = true}}, SHIFT(277), + [588] = {.entry = {.count = 1, .reusable = true}}, SHIFT(15), + [590] = {.entry = {.count = 1, .reusable = false}}, SHIFT(305), + [592] = {.entry = {.count = 1, .reusable = true}}, SHIFT(305), + [594] = {.entry = {.count = 1, .reusable = false}}, SHIFT(322), + [596] = {.entry = {.count = 1, .reusable = true}}, SHIFT(322), + [598] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym__word, 1, 0, 1), + [600] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym__word, 1, 0, 1), + [602] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_string, 2, 0, 0), + [604] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_string, 2, 0, 0), + [606] = {.entry = {.count = 1, 
.reusable = false}}, REDUCE(sym_string, 3, 0, 0), + [608] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_string, 3, 0, 0), + [610] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_if_clause, 6, 0, 22), + [612] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_if_clause, 6, 0, 22), + [614] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_if_clause, 7, 0, 23), + [616] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_if_clause, 7, 0, 23), + [618] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_if_clause, 7, 0, 22), + [620] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_if_clause, 7, 0, 22), + [622] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_if_clause, 8, 0, 23), + [624] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_if_clause, 8, 0, 23), + [626] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_output, 1, 0, 0), + [628] = {.entry = {.count = 1, .reusable = true}}, SHIFT(156), + [630] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_output, 1, 0, 0), + [632] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_input, 1, 0, 0), + [634] = {.entry = {.count = 1, .reusable = true}}, SHIFT(154), + [636] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_input, 1, 0, 0), + [638] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 8, 0, 13), + [640] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 8, 0, 13), + [642] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_if_expression, 9, 0, 18), + [644] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_expression, 9, 0, 18), + [646] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_clause, 4, 0, 20), + [648] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_clause, 4, 0, 20), + [650] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_clause, 5, 0, 21), + [652] = {.entry = {.count = 1, .reusable = 
true}}, REDUCE(sym_else_clause, 5, 0, 21), + [654] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_clause, 5, 0, 20), + [656] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_clause, 5, 0, 20), + [658] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_else_clause, 6, 0, 21), + [660] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_clause, 6, 0, 21), + [662] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_null_safe_method_call, 6, 0, 15), + [664] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_null_safe_method_call, 6, 0, 15), + [666] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_expression, 6, 0, 16), + [668] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_expression, 6, 0, 16), + [670] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_method_call, 6, 0, 15), + [672] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_method_call, 6, 0, 15), + [674] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_binary_expression, 3, 0, 7), + [676] = {.entry = {.count = 1, .reusable = true}}, SHIFT(294), + [678] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_binary_expression, 3, 0, 7), + [680] = {.entry = {.count = 1, .reusable = true}}, SHIFT(99), + [682] = {.entry = {.count = 1, .reusable = true}}, SHIFT(295), + [684] = {.entry = {.count = 1, .reusable = true}}, SHIFT(100), + [686] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_lambda_expression, 3, 0, 8), + [688] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_lambda_expression, 3, 0, 8), + [690] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_array, 4, 0, 0), + [692] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_array, 4, 0, 0), + [694] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_object, 4, 0, 0), + [696] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_object, 4, 0, 0), + [698] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_array, 2, 0, 
0), + [700] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_array, 2, 0, 0), + [702] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_object, 2, 0, 0), + [704] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_object, 2, 0, 0), + [706] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_input, 2, 0, 0), + [708] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_input, 2, 0, 0), + [710] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_unary_expression, 2, 0, 2), + [712] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_unary_expression, 2, 0, 2), + [714] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_output, 2, 0, 0), + [716] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_output, 2, 0, 0), + [718] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_parenthesized_expression, 3, 0, 0), + [720] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_parenthesized_expression, 3, 0, 0), + [722] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_array, 3, 0, 0), + [724] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_array, 3, 0, 0), + [726] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_boolean, 1, 0, 0), + [728] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_boolean, 1, 0, 0), + [730] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_expression, 4, 0, 0), + [732] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_expression, 4, 0, 0), + [734] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_null, 1, 0, 0), + [736] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_null, 1, 0, 0), + [738] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_expression, 4, 0, 9), + [740] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_expression, 4, 0, 9), + [742] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_object, 3, 0, 0), + [744] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_object, 3, 0, 0), + [746] = 
{.entry = {.count = 1, .reusable = false}}, REDUCE(sym_call_expression, 4, 0, 4), + [748] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_call_expression, 4, 0, 4), + [750] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_index, 4, 0, 11), + [752] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_index, 4, 0, 11), + [754] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_call_expression, 3, 0, 4), + [756] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_call_expression, 3, 0, 4), + [758] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_null_safe_index, 4, 0, 11), + [760] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_null_safe_index, 4, 0, 11), + [762] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_qualified_name, 3, 0, 5), + [764] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_qualified_name, 3, 0, 5), + [766] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_array, 5, 0, 0), + [768] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_array, 5, 0, 0), + [770] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_field_access, 3, 0, 6), + [772] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_field_access, 3, 0, 6), + [774] = {.entry = {.count = 1, .reusable = true}}, SHIFT(6), + [776] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_object, 5, 0, 0), + [778] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_object, 5, 0, 0), + [780] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_null_safe_field_access, 3, 0, 6), + [782] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_null_safe_field_access, 3, 0, 6), + [784] = {.entry = {.count = 1, .reusable = true}}, SHIFT(2), + [786] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_expression, 5, 0, 9), + [788] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_expression, 5, 0, 9), + [790] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_method_call, 5, 0, 15), + [792] = 
{.entry = {.count = 1, .reusable = true}}, REDUCE(sym_method_call, 5, 0, 15), + [794] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_null_safe_method_call, 5, 0, 15), + [796] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_null_safe_method_call, 5, 0, 15), + [798] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_lambda_block, 3, 0, 0), + [800] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_lambda_block, 3, 0, 0), + [802] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_expression, 7, 0, 16), + [804] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_expression, 7, 0, 16), + [806] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_expression, 3, 0, 0), + [808] = {.entry = {.count = 2, .reusable = true}}, REDUCE(sym_object, 2, 0, 0), REDUCE(sym_match_expression, 3, 0, 0), + [811] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_expression, 3, 0, 0), + [813] = {.entry = {.count = 2, .reusable = false}}, REDUCE(sym_object, 2, 0, 0), REDUCE(sym_match_expression, 3, 0, 0), + [816] = {.entry = {.count = 1, .reusable = true}}, SHIFT(111), + [818] = {.entry = {.count = 1, .reusable = true}}, SHIFT(114), + [820] = {.entry = {.count = 1, .reusable = false}}, SHIFT(116), + [822] = {.entry = {.count = 1, .reusable = true}}, SHIFT(116), + [824] = {.entry = {.count = 1, .reusable = true}}, SHIFT(112), + [826] = {.entry = {.count = 1, .reusable = true}}, SHIFT(113), + [828] = {.entry = {.count = 1, .reusable = true}}, SHIFT(115), + [830] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_case, 3, 0, 12), + [832] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_case, 3, 0, 12), + [834] = {.entry = {.count = 1, .reusable = true}}, SHIFT(419), + [836] = {.entry = {.count = 1, .reusable = true}}, SHIFT(10), + [838] = {.entry = {.count = 1, .reusable = true}}, SHIFT(462), + [840] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_if_expression_repeat1, 2, 0, 0), 
SHIFT_REPEAT(483), + [843] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement, 3, 0, 0), + [845] = {.entry = {.count = 1, .reusable = true}}, SHIFT(253), + [847] = {.entry = {.count = 1, .reusable = true}}, SHIFT(256), + [849] = {.entry = {.count = 1, .reusable = true}}, SHIFT(16), + [851] = {.entry = {.count = 1, .reusable = true}}, SHIFT(11), + [853] = {.entry = {.count = 1, .reusable = true}}, SHIFT(307), + [855] = {.entry = {.count = 1, .reusable = true}}, SHIFT(103), + [857] = {.entry = {.count = 1, .reusable = true}}, SHIFT(308), + [859] = {.entry = {.count = 1, .reusable = true}}, SHIFT(104), + [861] = {.entry = {.count = 1, .reusable = true}}, SHIFT(96), + [863] = {.entry = {.count = 1, .reusable = true}}, SHIFT(93), + [865] = {.entry = {.count = 1, .reusable = true}}, SHIFT(95), + [867] = {.entry = {.count = 1, .reusable = true}}, SHIFT(97), + [869] = {.entry = {.count = 1, .reusable = false}}, SHIFT(98), + [871] = {.entry = {.count = 1, .reusable = true}}, SHIFT(98), + [873] = {.entry = {.count = 1, .reusable = true}}, SHIFT(94), + [875] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_assignment, 3, 0, 0), + [877] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_statement_block, 3, 0, 0), + [879] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_statement_block, 3, 0, 0), + [881] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_block, 3, 0, 0), + [883] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_block, 3, 0, 0), + [885] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_match_statement_case, 3, 0, 12), + [887] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement_case, 3, 0, 12), + [889] = {.entry = {.count = 1, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 1, 0, 0), + [891] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_match_statement_cases_repeat1, 1, 0, 0), + [893] = {.entry = {.count = 1, .reusable = true}}, SHIFT(272), 
+ [895] = {.entry = {.count = 1, .reusable = false}}, REDUCE(sym_statement_block, 2, 0, 0), + [897] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_statement_block, 2, 0, 0), + [899] = {.entry = {.count = 1, .reusable = false}}, REDUCE(aux_sym_expr_body_repeat1, 2, 0, 0), + [901] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_expr_body_repeat1, 2, 0, 0), + [903] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_expr_body_repeat1, 2, 0, 0), SHIFT_REPEAT(362), + [906] = {.entry = {.count = 1, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 1, 0, 0), + [908] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_match_cases_repeat1, 1, 0, 0), + [910] = {.entry = {.count = 1, .reusable = true}}, SHIFT(273), + [912] = {.entry = {.count = 1, .reusable = true}}, SHIFT(119), + [914] = {.entry = {.count = 1, .reusable = true}}, SHIFT(122), + [916] = {.entry = {.count = 1, .reusable = false}}, SHIFT(124), + [918] = {.entry = {.count = 1, .reusable = true}}, SHIFT(124), + [920] = {.entry = {.count = 1, .reusable = true}}, SHIFT(120), + [922] = {.entry = {.count = 1, .reusable = true}}, SHIFT(121), + [924] = {.entry = {.count = 1, .reusable = true}}, SHIFT(123), + [926] = {.entry = {.count = 1, .reusable = false}}, SHIFT(82), + [928] = {.entry = {.count = 1, .reusable = false}}, REDUCE(aux_sym_match_statement_cases_repeat1, 2, 0, 0), + [930] = {.entry = {.count = 1, .reusable = false}}, REDUCE(aux_sym_match_cases_repeat1, 2, 0, 0), + [932] = {.entry = {.count = 1, .reusable = false}}, SHIFT(337), + [934] = {.entry = {.count = 2, .reusable = true}}, REDUCE(sym_parameter, 1, 0, 0), REDUCE(sym__primary, 1, 0, 0), + [937] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_parameter, 1, 0, 0), + [939] = {.entry = {.count = 1, .reusable = false}}, SHIFT(88), + [941] = {.entry = {.count = 2, .reusable = true}}, REDUCE(sym__primary, 1, 0, 0), SHIFT(286), + [944] = {.entry = {.count = 2, .reusable = true}}, REDUCE(sym__primary, 1, 0, 
0), SHIFT(78), + [947] = {.entry = {.count = 1, .reusable = true}}, SHIFT(215), + [949] = {.entry = {.count = 1, .reusable = true}}, SHIFT(66), + [951] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_positional_arguments, 1, 0, 0), + [953] = {.entry = {.count = 1, .reusable = true}}, SHIFT(68), + [955] = {.entry = {.count = 1, .reusable = true}}, SHIFT(158), + [957] = {.entry = {.count = 1, .reusable = true}}, SHIFT(71), + [959] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_positional_arguments_repeat1, 2, 0, 0), + [961] = {.entry = {.count = 1, .reusable = false}}, SHIFT(366), + [963] = {.entry = {.count = 1, .reusable = false}}, SHIFT(127), + [965] = {.entry = {.count = 1, .reusable = true}}, SHIFT(79), + [967] = {.entry = {.count = 1, .reusable = true}}, SHIFT(407), + [969] = {.entry = {.count = 1, .reusable = true}}, SHIFT(336), + [971] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_named_argument, 3, 0, 14), + [973] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_expr_body, 1, 0, 0), + [975] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_object_entry, 3, 0, 10), + [977] = {.entry = {.count = 1, .reusable = true}}, SHIFT(17), + [979] = {.entry = {.count = 1, .reusable = true}}, SHIFT(473), + [981] = {.entry = {.count = 1, .reusable = true}}, SHIFT(105), + [983] = {.entry = {.count = 1, .reusable = true}}, SHIFT(106), + [985] = {.entry = {.count = 1, .reusable = true}}, SHIFT(107), + [987] = {.entry = {.count = 1, .reusable = true}}, SHIFT(108), + [989] = {.entry = {.count = 1, .reusable = true}}, SHIFT(109), + [991] = {.entry = {.count = 1, .reusable = false}}, SHIFT(110), + [993] = {.entry = {.count = 1, .reusable = true}}, SHIFT(110), + [995] = {.entry = {.count = 1, .reusable = false}}, SHIFT(171), + [997] = {.entry = {.count = 1, .reusable = false}}, SHIFT(173), + [999] = {.entry = {.count = 1, .reusable = true}}, SHIFT(20), + [1001] = {.entry = {.count = 1, .reusable = true}}, SHIFT(471), + [1003] = {.entry = 
{.count = 1, .reusable = true}}, SHIFT(67), + [1005] = {.entry = {.count = 1, .reusable = true}}, SHIFT(27), + [1007] = {.entry = {.count = 1, .reusable = true}}, SHIFT(477), + [1009] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_expr_body, 2, 0, 0), + [1011] = {.entry = {.count = 1, .reusable = false}}, SHIFT(207), + [1013] = {.entry = {.count = 1, .reusable = false}}, SHIFT(203), + [1015] = {.entry = {.count = 1, .reusable = false}}, SHIFT(208), + [1017] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_var_assignment, 4, 0, 0), + [1019] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_var_assignment, 3, 0, 0), + [1021] = {.entry = {.count = 1, .reusable = true}}, SHIFT(24), + [1023] = {.entry = {.count = 1, .reusable = true}}, SHIFT(214), + [1025] = {.entry = {.count = 1, .reusable = true}}, SHIFT(157), + [1027] = {.entry = {.count = 1, .reusable = true}}, SHIFT(231), + [1029] = {.entry = {.count = 1, .reusable = true}}, SHIFT(18), + [1031] = {.entry = {.count = 1, .reusable = true}}, SHIFT(167), + [1033] = {.entry = {.count = 1, .reusable = true}}, SHIFT(21), + [1035] = {.entry = {.count = 1, .reusable = true}}, SHIFT(232), + [1037] = {.entry = {.count = 1, .reusable = true}}, SHIFT(376), + [1039] = {.entry = {.count = 1, .reusable = true}}, SHIFT(165), + [1041] = {.entry = {.count = 1, .reusable = true}}, SHIFT(29), + [1043] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), + [1045] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), SHIFT_REPEAT(354), + [1048] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), SHIFT_REPEAT(452), + [1051] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), SHIFT_REPEAT(399), + [1054] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), SHIFT_REPEAT(83), + [1057] = {.entry = {.count = 2, .reusable = true}}, 
REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), SHIFT_REPEAT(87), + [1060] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), SHIFT_REPEAT(355), + [1063] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_source_file_repeat1, 2, 0, 0), SHIFT_REPEAT(327), + [1066] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_source_file, 1, 0, 0), + [1068] = {.entry = {.count = 1, .reusable = true}}, SHIFT(327), + [1070] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_statement, 4, 0, 3), + [1072] = {.entry = {.count = 1, .reusable = true}}, SHIFT(375), + [1074] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_statement, 3, 0, 3), + [1076] = {.entry = {.count = 1, .reusable = true}}, SHIFT(265), + [1078] = {.entry = {.count = 1, .reusable = true}}, SHIFT(332), + [1080] = {.entry = {.count = 1, .reusable = true}}, SHIFT(261), + [1082] = {.entry = {.count = 1, .reusable = true}}, SHIFT(334), + [1084] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_if_statement_repeat1, 2, 0, 0), + [1086] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_if_statement_repeat1, 2, 0, 0), SHIFT_REPEAT(430), + [1089] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_statement_block_repeat1, 2, 0, 0), SHIFT_REPEAT(354), + [1092] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_statement_block_repeat1, 2, 0, 0), + [1094] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_statement_block_repeat1, 2, 0, 0), SHIFT_REPEAT(83), + [1097] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_statement_block_repeat1, 2, 0, 0), SHIFT_REPEAT(87), + [1100] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_statement_block_repeat1, 2, 0, 0), SHIFT_REPEAT(355), + [1103] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_statement_block_repeat1, 2, 0, 0), SHIFT_REPEAT(334), + [1106] = {.entry = {.count = 1, .reusable = true}}, SHIFT(338), + [1108] = {.entry = {.count = 
1, .reusable = true}}, SHIFT(340), + [1110] = {.entry = {.count = 1, .reusable = true}}, SHIFT(335), + [1112] = {.entry = {.count = 1, .reusable = true}}, SHIFT(159), + [1114] = {.entry = {.count = 1, .reusable = true}}, SHIFT(161), + [1116] = {.entry = {.count = 1, .reusable = false}}, SHIFT(398), + [1118] = {.entry = {.count = 1, .reusable = true}}, SHIFT(398), + [1120] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_if_statement_clause, 4, 0, 19), + [1122] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_else_statement_clause, 2, 0, 0), + [1124] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement, 6, 0, 16), + [1126] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_if_statement, 5, 0, 3), + [1128] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement, 7, 0, 16), + [1130] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement, 4, 0, 9), + [1132] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement, 4, 0, 0), + [1134] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_match_statement, 5, 0, 9), + [1136] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_map_declaration, 9, 0, 17), + [1138] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_map_declaration, 8, 0, 17), + [1140] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_map_declaration, 10, 0, 17), + [1142] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_import_statement, 4, 0, 0), + [1144] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_map_declaration, 7, 0, 17), + [1146] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_assign_target, 1, 0, 0), + [1148] = {.entry = {.count = 1, .reusable = true}}, SHIFT(358), + [1150] = {.entry = {.count = 1, .reusable = true}}, SHIFT(286), + [1152] = {.entry = {.count = 1, .reusable = true}}, SHIFT(78), + [1154] = {.entry = {.count = 1, .reusable = true}}, SHIFT(90), + [1156] = {.entry = {.count = 1, .reusable = false}}, 
SHIFT(390), + [1158] = {.entry = {.count = 1, .reusable = true}}, SHIFT(437), + [1160] = {.entry = {.count = 1, .reusable = false}}, SHIFT(412), + [1162] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_assign_target, 2, 0, 0), + [1164] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_assign_target, 3, 0, 0), + [1166] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_assign_target_repeat1, 2, 0, 0), + [1168] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_assign_target_repeat1, 2, 0, 0), SHIFT_REPEAT(286), + [1171] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_assign_target_repeat1, 2, 0, 0), SHIFT_REPEAT(78), + [1174] = {.entry = {.count = 1, .reusable = true}}, SHIFT(88), + [1176] = {.entry = {.count = 1, .reusable = false}}, SHIFT(128), + [1178] = {.entry = {.count = 1, .reusable = true}}, SHIFT(367), + [1180] = {.entry = {.count = 1, .reusable = false}}, SHIFT_EXTRA(), + [1182] = {.entry = {.count = 1, .reusable = false}}, SHIFT(204), + [1184] = {.entry = {.count = 1, .reusable = true}}, SHIFT(365), + [1186] = {.entry = {.count = 1, .reusable = false}}, SHIFT(205), + [1188] = {.entry = {.count = 1, .reusable = true}}, SHIFT(369), + [1190] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_target_path_segment, 2, 0, 0), + [1192] = {.entry = {.count = 1, .reusable = true}}, SHIFT(7), + [1194] = {.entry = {.count = 1, .reusable = false}}, SHIFT(129), + [1196] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_positional_arguments_repeat1, 2, 0, 0), SHIFT_REPEAT(86), + [1199] = {.entry = {.count = 1, .reusable = false}}, REDUCE(aux_sym_string_repeat1, 2, 0, 0), + [1201] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_string_repeat1, 2, 0, 0), SHIFT_REPEAT(369), + [1204] = {.entry = {.count = 1, .reusable = true}}, SHIFT(163), + [1206] = {.entry = {.count = 1, .reusable = true}}, SHIFT(57), + [1208] = {.entry = {.count = 1, .reusable = true}}, SHIFT(61), + [1210] = {.entry = {.count = 1, 
.reusable = true}}, REDUCE(aux_sym_parameter_list_repeat1, 2, 0, 0), + [1212] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_parameter_list_repeat1, 2, 0, 0), SHIFT_REPEAT(377), + [1215] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_parameter_list, 1, 0, 0), + [1217] = {.entry = {.count = 1, .reusable = true}}, SHIFT(377), + [1219] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_named_arguments, 1, 0, 0), + [1221] = {.entry = {.count = 1, .reusable = true}}, SHIFT(388), + [1223] = {.entry = {.count = 1, .reusable = true}}, SHIFT(85), + [1225] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_target_path_segment, 3, 0, 0), + [1227] = {.entry = {.count = 1, .reusable = true}}, SHIFT(216), + [1229] = {.entry = {.count = 1, .reusable = true}}, SHIFT(52), + [1231] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_target_path_segment, 5, 0, 0), + [1233] = {.entry = {.count = 1, .reusable = true}}, SHIFT(441), + [1235] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_named_arguments, 3, 0, 0), + [1237] = {.entry = {.count = 1, .reusable = true}}, SHIFT(72), + [1239] = {.entry = {.count = 1, .reusable = true}}, REDUCE(aux_sym_named_arguments_repeat1, 2, 0, 0), + [1241] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_named_arguments_repeat1, 2, 0, 0), SHIFT_REPEAT(418), + [1244] = {.entry = {.count = 1, .reusable = true}}, SHIFT(55), + [1246] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_target_path_segment, 4, 0, 0), + [1248] = {.entry = {.count = 1, .reusable = true}}, SHIFT(51), + [1250] = {.entry = {.count = 1, .reusable = true}}, SHIFT(74), + [1252] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_named_arguments, 2, 0, 0), + [1254] = {.entry = {.count = 1, .reusable = true}}, SHIFT(380), + [1256] = {.entry = {.count = 1, .reusable = true}}, SHIFT(337), + [1258] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_parameter_list, 2, 0, 0), + [1260] = {.entry = {.count = 1, .reusable = 
true}}, REDUCE(aux_sym_object_repeat1, 2, 0, 0), + [1262] = {.entry = {.count = 2, .reusable = true}}, REDUCE(aux_sym_object_repeat1, 2, 0, 0), SHIFT_REPEAT(73), + [1265] = {.entry = {.count = 1, .reusable = true}}, SHIFT(44), + [1267] = {.entry = {.count = 1, .reusable = true}}, SHIFT(458), + [1269] = {.entry = {.count = 1, .reusable = true}}, SHIFT(130), + [1271] = {.entry = {.count = 1, .reusable = true}}, SHIFT(463), + [1273] = {.entry = {.count = 1, .reusable = true}}, SHIFT(140), + [1275] = {.entry = {.count = 1, .reusable = true}}, SHIFT(457), + [1277] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_parameter, 3, 0, 0), + [1279] = {.entry = {.count = 1, .reusable = true}}, SHIFT(190), + [1281] = {.entry = {.count = 1, .reusable = true}}, SHIFT(468), + [1283] = {.entry = {.count = 1, .reusable = true}}, SHIFT(191), + [1285] = {.entry = {.count = 1, .reusable = true}}, SHIFT(456), + [1287] = {.entry = {.count = 1, .reusable = true}}, SHIFT(249), + [1289] = {.entry = {.count = 1, .reusable = true}}, SHIFT(424), + [1291] = {.entry = {.count = 1, .reusable = true}}, SHIFT(250), + [1293] = {.entry = {.count = 1, .reusable = true}}, SHIFT(425), + [1295] = {.entry = {.count = 1, .reusable = true}}, SHIFT(209), + [1297] = {.entry = {.count = 1, .reusable = true}}, SHIFT(428), + [1299] = {.entry = {.count = 1, .reusable = true}}, SHIFT(210), + [1301] = {.entry = {.count = 1, .reusable = true}}, SHIFT(429), + [1303] = {.entry = {.count = 1, .reusable = true}}, SHIFT(353), + [1305] = {.entry = {.count = 1, .reusable = true}}, SHIFT(447), + [1307] = {.entry = {.count = 1, .reusable = true}}, SHIFT(331), + [1309] = {.entry = {.count = 1, .reusable = true}}, SHIFT(141), + [1311] = {.entry = {.count = 1, .reusable = true}}, SHIFT(466), + [1313] = {.entry = {.count = 1, .reusable = true}}, SHIFT(350), + [1315] = {.entry = {.count = 1, .reusable = true}}, SHIFT(444), + [1317] = {.entry = {.count = 1, .reusable = true}}, SHIFT(349), + [1319] = {.entry = {.count = 1, 
.reusable = true}}, SHIFT(481), + [1321] = {.entry = {.count = 1, .reusable = true}}, SHIFT(49), + [1323] = {.entry = {.count = 1, .reusable = true}}, SHIFT(450), + [1325] = {.entry = {.count = 1, .reusable = true}}, SHIFT(26), + [1327] = {.entry = {.count = 1, .reusable = true}}, SHIFT(91), + [1329] = {.entry = {.count = 1, .reusable = true}}, SHIFT(131), + [1331] = {.entry = {.count = 1, .reusable = true}}, SHIFT(423), + [1333] = {.entry = {.count = 1, .reusable = true}}, SHIFT(30), + [1335] = {.entry = {.count = 1, .reusable = true}}, SHIFT(126), + [1337] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym__lambda_params, 3, 0, 0), + [1339] = {.entry = {.count = 1, .reusable = true}}, SHIFT(245), + [1341] = {.entry = {.count = 1, .reusable = true}}, SHIFT(228), + [1343] = {.entry = {.count = 1, .reusable = true}}, SHIFT(133), + [1345] = {.entry = {.count = 1, .reusable = true}}, SHIFT(251), + [1347] = {.entry = {.count = 1, .reusable = true}}, SHIFT(252), + [1349] = {.entry = {.count = 1, .reusable = true}}, SHIFT(230), + [1351] = {.entry = {.count = 1, .reusable = true}}, SHIFT(147), + [1353] = {.entry = {.count = 1, .reusable = true}}, SHIFT(211), + [1355] = {.entry = {.count = 1, .reusable = true}}, SHIFT(212), + [1357] = {.entry = {.count = 1, .reusable = true}}, SHIFT(352), + [1359] = {.entry = {.count = 1, .reusable = true}}, SHIFT(348), + [1361] = {.entry = {.count = 1, .reusable = true}}, SHIFT(144), + [1363] = {.entry = {.count = 1, .reusable = true}}, ACCEPT_INPUT(), + [1365] = {.entry = {.count = 1, .reusable = true}}, SHIFT(344), + [1367] = {.entry = {.count = 1, .reusable = true}}, SHIFT(19), + [1369] = {.entry = {.count = 1, .reusable = true}}, SHIFT(357), + [1371] = {.entry = {.count = 1, .reusable = true}}, SHIFT(174), + [1373] = {.entry = {.count = 1, .reusable = true}}, SHIFT(25), + [1375] = {.entry = {.count = 1, .reusable = true}}, SHIFT(82), + [1377] = {.entry = {.count = 1, .reusable = true}}, SHIFT(281), + [1379] = {.entry = {.count = 
1, .reusable = true}}, SHIFT(235), + [1381] = {.entry = {.count = 1, .reusable = true}}, SHIFT(64), + [1383] = {.entry = {.count = 1, .reusable = true}}, SHIFT(164), + [1385] = {.entry = {.count = 1, .reusable = true}}, SHIFT(168), + [1387] = {.entry = {.count = 1, .reusable = true}}, SHIFT(60), + [1389] = {.entry = {.count = 1, .reusable = true}}, SHIFT(379), + [1391] = {.entry = {.count = 1, .reusable = true}}, SHIFT(438), + [1393] = {.entry = {.count = 1, .reusable = true}}, SHIFT(22), + [1395] = {.entry = {.count = 1, .reusable = true}}, SHIFT(28), + [1397] = {.entry = {.count = 1, .reusable = true}}, SHIFT(160), + [1399] = {.entry = {.count = 1, .reusable = true}}, SHIFT(193), + [1401] = {.entry = {.count = 1, .reusable = true}}, SHIFT(142), + [1403] = {.entry = {.count = 1, .reusable = true}}, SHIFT(50), + [1405] = {.entry = {.count = 1, .reusable = true}}, SHIFT(69), + [1407] = {.entry = {.count = 1, .reusable = true}}, SHIFT(238), + [1409] = {.entry = {.count = 1, .reusable = true}}, SHIFT(431), + [1411] = {.entry = {.count = 1, .reusable = true}}, SHIFT(206), + [1413] = {.entry = {.count = 1, .reusable = true}}, SHIFT(132), + [1415] = {.entry = {.count = 1, .reusable = true}}, SHIFT(420), + [1417] = {.entry = {.count = 1, .reusable = true}}, SHIFT(177), + [1419] = {.entry = {.count = 1, .reusable = true}}, SHIFT(143), + [1421] = {.entry = {.count = 1, .reusable = true}}, SHIFT(89), + [1423] = {.entry = {.count = 1, .reusable = true}}, SHIFT(189), + [1425] = {.entry = {.count = 1, .reusable = true}}, SHIFT(262), + [1427] = {.entry = {.count = 1, .reusable = true}}, SHIFT(474), + [1429] = {.entry = {.count = 1, .reusable = true}}, SHIFT(453), + [1431] = {.entry = {.count = 1, .reusable = true}}, SHIFT(178), + [1433] = {.entry = {.count = 1, .reusable = true}}, SHIFT(440), + [1435] = {.entry = {.count = 1, .reusable = true}}, SHIFT(23), + [1437] = {.entry = {.count = 1, .reusable = true}}, SHIFT(70), + [1439] = {.entry = {.count = 1, .reusable = true}}, 
SHIFT(241), + [1441] = {.entry = {.count = 1, .reusable = true}}, SHIFT(454), + [1443] = {.entry = {.count = 1, .reusable = true}}, SHIFT(242), + [1445] = {.entry = {.count = 1, .reusable = true}}, SHIFT(75), + [1447] = {.entry = {.count = 1, .reusable = true}}, SHIFT(351), + [1449] = {.entry = {.count = 1, .reusable = true}}, SHIFT(346), + [1451] = {.entry = {.count = 1, .reusable = true}}, REDUCE(sym_argument_list, 1, 0, 0), +}; + +enum ts_external_scanner_symbol_identifiers { + ts_external_token__newline = 0, + ts_external_token__nl_skip = 1, +}; + +static const TSSymbol ts_external_scanner_symbol_map[EXTERNAL_TOKEN_COUNT] = { + [ts_external_token__newline] = sym__newline, + [ts_external_token__nl_skip] = sym__nl_skip, +}; + +static const bool ts_external_scanner_states[3][EXTERNAL_TOKEN_COUNT] = { + [1] = { + [ts_external_token__newline] = true, + [ts_external_token__nl_skip] = true, + }, + [2] = { + [ts_external_token__nl_skip] = true, + }, +}; + +#ifdef __cplusplus +extern "C" { +#endif +void *tree_sitter_bloblang2_external_scanner_create(void); +void tree_sitter_bloblang2_external_scanner_destroy(void *); +bool tree_sitter_bloblang2_external_scanner_scan(void *, TSLexer *, const bool *); +unsigned tree_sitter_bloblang2_external_scanner_serialize(void *, char *); +void tree_sitter_bloblang2_external_scanner_deserialize(void *, const char *, unsigned); + +#ifdef TREE_SITTER_HIDE_SYMBOLS +#define TS_PUBLIC +#elif defined(_WIN32) +#define TS_PUBLIC __declspec(dllexport) +#else +#define TS_PUBLIC __attribute__((visibility("default"))) +#endif + +TS_PUBLIC const TSLanguage *tree_sitter_bloblang2(void) { + static const TSLanguage language = { + .version = LANGUAGE_VERSION, + .symbol_count = SYMBOL_COUNT, + .alias_count = ALIAS_COUNT, + .token_count = TOKEN_COUNT, + .external_token_count = EXTERNAL_TOKEN_COUNT, + .state_count = STATE_COUNT, + .large_state_count = LARGE_STATE_COUNT, + .production_id_count = PRODUCTION_ID_COUNT, + .field_count = FIELD_COUNT, + 
.max_alias_sequence_length = MAX_ALIAS_SEQUENCE_LENGTH, + .parse_table = &ts_parse_table[0][0], + .small_parse_table = ts_small_parse_table, + .small_parse_table_map = ts_small_parse_table_map, + .parse_actions = ts_parse_actions, + .symbol_names = ts_symbol_names, + .field_names = ts_field_names, + .field_map_slices = ts_field_map_slices, + .field_map_entries = ts_field_map_entries, + .symbol_metadata = ts_symbol_metadata, + .public_symbol_map = ts_symbol_map, + .alias_map = ts_non_terminal_alias_map, + .alias_sequences = &ts_alias_sequences[0][0], + .lex_modes = ts_lex_modes, + .lex_fn = ts_lex, + .keyword_lex_fn = ts_lex_keywords, + .keyword_capture_token = sym_identifier, + .external_scanner = { + &ts_external_scanner_states[0][0], + ts_external_scanner_symbol_map, + tree_sitter_bloblang2_external_scanner_create, + tree_sitter_bloblang2_external_scanner_destroy, + tree_sitter_bloblang2_external_scanner_scan, + tree_sitter_bloblang2_external_scanner_serialize, + tree_sitter_bloblang2_external_scanner_deserialize, + }, + .primary_state_ids = ts_primary_state_ids, + }; + return &language; +} +#ifdef __cplusplus +} +#endif diff --git a/internal/bloblang2/tree-sitter/src/scanner.c b/internal/bloblang2/tree-sitter/src/scanner.c new file mode 100644 index 000000000..5248c8567 --- /dev/null +++ b/internal/bloblang2/tree-sitter/src/scanner.c @@ -0,0 +1,145 @@ +// External scanner for Bloblang V2 tree-sitter grammar. +// +// Handles newline significance: Bloblang uses newlines as statement separators, +// but suppresses them in certain contexts: +// +// 1. Inside () and [] — handled by the grammar (these rules don't include +// _newline, so valid_symbols[NEWLINE] is false → scanner emits NL_SKIP). +// 2. After operators that can't end an expression (+, -, =, =>, ->, etc.) — +// handled by the grammar (parser expects an expression, not _newline). +// 3. When the next line starts with a postfix token (., ?., [, ?[, else) — +// handled HERE in the scanner via lookahead. 
+// 4. Consecutive newlines collapsed — handled HERE (emit one, skip the rest). +// +// Two external tokens: +// NEWLINE — significant newline (statement separator) +// NL_SKIP — newline consumed as whitespace (in extras) + +#include "tree_sitter/parser.h" + +#include + +enum { + NEWLINE = 0, + NL_SKIP = 1, +}; + +// No persistent state needed — all decisions are based on valid_symbols +// and character lookahead. +void *tree_sitter_bloblang2_external_scanner_create(void) { + return NULL; +} + +void tree_sitter_bloblang2_external_scanner_destroy(void *payload) { + (void)payload; +} + +unsigned tree_sitter_bloblang2_external_scanner_serialize(void *payload, + char *buffer) { + (void)payload; + (void)buffer; + return 0; +} + +void tree_sitter_bloblang2_external_scanner_deserialize(void *payload, + const char *buffer, + unsigned length) { + (void)payload; + (void)buffer; + (void)length; +} + +// Peek ahead past whitespace and newlines to check for postfix continuation. +// Returns true if the next substantive token is '.', '?.', '[', '?[', or 'else'. +static bool is_postfix_continuation(TSLexer *lexer) { + // We've already consumed the first \n. Now peek ahead. + for (;;) { + int32_t c = lexer->lookahead; + if (c == ' ' || c == '\t' || c == '\r' || c == '\n') { + lexer->advance(lexer, false); + continue; + } + // Skip comment lines — a line starting with # is not a postfix continuation, + // but we need to look past it to check the next real line. + if (c == '#') { + while (lexer->lookahead != '\n' && lexer->lookahead != 0) { + lexer->advance(lexer, false); + } + continue; + } + break; + } + + int32_t c = lexer->lookahead; + + if (c == '.') return true; + if (c == '[') return true; + if (c == '?') return true; // ?. or ?[ + + // Check for 'else' keyword. 
+ if (c == 'e') { + lexer->advance(lexer, false); + if (lexer->lookahead != 'l') return false; + lexer->advance(lexer, false); + if (lexer->lookahead != 's') return false; + lexer->advance(lexer, false); + if (lexer->lookahead != 'e') return false; + lexer->advance(lexer, false); + // Must be word boundary. + int32_t after = lexer->lookahead; + if ((after >= 'a' && after <= 'z') || + (after >= 'A' && after <= 'Z') || + (after >= '0' && after <= '9') || + after == '_') { + return false; + } + return true; + } + + return false; +} + +bool tree_sitter_bloblang2_external_scanner_scan(void *payload, + TSLexer *lexer, + const bool *valid_symbols) { + (void)payload; + + // Skip spaces and tabs (but not newlines — those are what we're looking for). + while (lexer->lookahead == ' ' || lexer->lookahead == '\t' || + lexer->lookahead == '\r') { + lexer->advance(lexer, true); + } + + // Not a newline — nothing for us to do. + if (lexer->lookahead != '\n') { + return false; + } + + // We found a newline. Consume it. + lexer->advance(lexer, false); + + // If the parser wants a significant newline (statement separator)... + if (valid_symbols[NEWLINE]) { + // Mark the end of the token here — everything after this is lookahead. + lexer->mark_end(lexer); + + // Check for postfix continuation on the next line. + if (is_postfix_continuation(lexer)) { + // Suppress: emit as whitespace skip instead. + lexer->result_symbol = NL_SKIP; + return true; + } + + // Emit significant newline. + lexer->result_symbol = NEWLINE; + return true; + } + + // Parser doesn't want a newline here — consume as whitespace. 
+ if (valid_symbols[NL_SKIP]) { + lexer->result_symbol = NL_SKIP; + return true; + } + + return false; +} diff --git a/internal/bloblang2/tree-sitter/src/tree_sitter/alloc.h b/internal/bloblang2/tree-sitter/src/tree_sitter/alloc.h new file mode 100644 index 000000000..1abdd1201 --- /dev/null +++ b/internal/bloblang2/tree-sitter/src/tree_sitter/alloc.h @@ -0,0 +1,54 @@ +#ifndef TREE_SITTER_ALLOC_H_ +#define TREE_SITTER_ALLOC_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include <stdbool.h> +#include <stdio.h> +#include <stdlib.h> + +// Allow clients to override allocation functions +#ifdef TREE_SITTER_REUSE_ALLOCATOR + +extern void *(*ts_current_malloc)(size_t size); +extern void *(*ts_current_calloc)(size_t count, size_t size); +extern void *(*ts_current_realloc)(void *ptr, size_t size); +extern void (*ts_current_free)(void *ptr); + +#ifndef ts_malloc +#define ts_malloc ts_current_malloc +#endif +#ifndef ts_calloc +#define ts_calloc ts_current_calloc +#endif +#ifndef ts_realloc +#define ts_realloc ts_current_realloc +#endif +#ifndef ts_free +#define ts_free ts_current_free +#endif + +#else + +#ifndef ts_malloc +#define ts_malloc malloc +#endif +#ifndef ts_calloc +#define ts_calloc calloc +#endif +#ifndef ts_realloc +#define ts_realloc realloc +#endif +#ifndef ts_free +#define ts_free free +#endif + +#endif + +#ifdef __cplusplus +} +#endif + +#endif // TREE_SITTER_ALLOC_H_ diff --git a/internal/bloblang2/tree-sitter/src/tree_sitter/array.h b/internal/bloblang2/tree-sitter/src/tree_sitter/array.h new file mode 100644 index 000000000..a17a574f0 --- /dev/null +++ b/internal/bloblang2/tree-sitter/src/tree_sitter/array.h @@ -0,0 +1,291 @@ +#ifndef TREE_SITTER_ARRAY_H_ +#define TREE_SITTER_ARRAY_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include "./alloc.h" + +#include <assert.h> +#include <stdbool.h> +#include <stdint.h> +#include <stdlib.h> +#include <string.h> + +#ifdef _MSC_VER +#pragma warning(push) +#pragma warning(disable : 4101) +#elif defined(__GNUC__) || defined(__clang__) +#pragma GCC diagnostic push +#pragma GCC diagnostic ignored
"-Wunused-variable" +#endif + +#define Array(T) \ + struct { \ + T *contents; \ + uint32_t size; \ + uint32_t capacity; \ + } + +/// Initialize an array. +#define array_init(self) \ + ((self)->size = 0, (self)->capacity = 0, (self)->contents = NULL) + +/// Create an empty array. +#define array_new() \ + { NULL, 0, 0 } + +/// Get a pointer to the element at a given `index` in the array. +#define array_get(self, _index) \ + (assert((uint32_t)(_index) < (self)->size), &(self)->contents[_index]) + +/// Get a pointer to the first element in the array. +#define array_front(self) array_get(self, 0) + +/// Get a pointer to the last element in the array. +#define array_back(self) array_get(self, (self)->size - 1) + +/// Clear the array, setting its size to zero. Note that this does not free any +/// memory allocated for the array's contents. +#define array_clear(self) ((self)->size = 0) + +/// Reserve `new_capacity` elements of space in the array. If `new_capacity` is +/// less than the array's current capacity, this function has no effect. +#define array_reserve(self, new_capacity) \ + _array__reserve((Array *)(self), array_elem_size(self), new_capacity) + +/// Free any memory allocated for this array. Note that this does not free any +/// memory allocated for the array's contents. +#define array_delete(self) _array__delete((Array *)(self)) + +/// Push a new `element` onto the end of the array. +#define array_push(self, element) \ + (_array__grow((Array *)(self), 1, array_elem_size(self)), \ + (self)->contents[(self)->size++] = (element)) + +/// Increase the array's size by `count` elements. +/// New elements are zero-initialized. +#define array_grow_by(self, count) \ + do { \ + if ((count) == 0) break; \ + _array__grow((Array *)(self), count, array_elem_size(self)); \ + memset((self)->contents + (self)->size, 0, (count) * array_elem_size(self)); \ + (self)->size += (count); \ + } while (0) + +/// Append all elements from one array to the end of another. 
+#define array_push_all(self, other) \ + array_extend((self), (other)->size, (other)->contents) + +/// Append `count` elements to the end of the array, reading their values from the +/// `contents` pointer. +#define array_extend(self, count, contents) \ + _array__splice( \ + (Array *)(self), array_elem_size(self), (self)->size, \ + 0, count, contents \ + ) + +/// Remove `old_count` elements from the array starting at the given `index`. At +/// the same index, insert `new_count` new elements, reading their values from the +/// `new_contents` pointer. +#define array_splice(self, _index, old_count, new_count, new_contents) \ + _array__splice( \ + (Array *)(self), array_elem_size(self), _index, \ + old_count, new_count, new_contents \ + ) + +/// Insert one `element` into the array at the given `index`. +#define array_insert(self, _index, element) \ + _array__splice((Array *)(self), array_elem_size(self), _index, 0, 1, &(element)) + +/// Remove one element from the array at the given `index`. +#define array_erase(self, _index) \ + _array__erase((Array *)(self), array_elem_size(self), _index) + +/// Pop the last element off the array, returning the element by value. +#define array_pop(self) ((self)->contents[--(self)->size]) + +/// Assign the contents of one array to another, reallocating if necessary. +#define array_assign(self, other) \ + _array__assign((Array *)(self), (const Array *)(other), array_elem_size(self)) + +/// Swap one array with another +#define array_swap(self, other) \ + _array__swap((Array *)(self), (Array *)(other)) + +/// Get the size of the array contents +#define array_elem_size(self) (sizeof *(self)->contents) + +/// Search a sorted array for a given `needle` value, using the given `compare` +/// callback to determine the order. +/// +/// If an existing element is found to be equal to `needle`, then the `index` +/// out-parameter is set to the existing value's index, and the `exists` +/// out-parameter is set to true. 
Otherwise, `index` is set to an index where +/// `needle` should be inserted in order to preserve the sorting, and `exists` +/// is set to false. +#define array_search_sorted_with(self, compare, needle, _index, _exists) \ + _array__search_sorted(self, 0, compare, , needle, _index, _exists) + +/// Search a sorted array for a given `needle` value, using integer comparisons +/// of a given struct field (specified with a leading dot) to determine the order. +/// +/// See also `array_search_sorted_with`. +#define array_search_sorted_by(self, field, needle, _index, _exists) \ + _array__search_sorted(self, 0, _compare_int, field, needle, _index, _exists) + +/// Insert a given `value` into a sorted array, using the given `compare` +/// callback to determine the order. +#define array_insert_sorted_with(self, compare, value) \ + do { \ + unsigned _index, _exists; \ + array_search_sorted_with(self, compare, &(value), &_index, &_exists); \ + if (!_exists) array_insert(self, _index, value); \ + } while (0) + +/// Insert a given `value` into a sorted array, using integer comparisons of +/// a given struct field (specified with a leading dot) to determine the order. +/// +/// See also `array_search_sorted_by`. +#define array_insert_sorted_by(self, field, value) \ + do { \ + unsigned _index, _exists; \ + array_search_sorted_by(self, field, (value) field, &_index, &_exists); \ + if (!_exists) array_insert(self, _index, value); \ + } while (0) + +// Private + +typedef Array(void) Array; + +/// This is not what you're looking for, see `array_delete`. +static inline void _array__delete(Array *self) { + if (self->contents) { + ts_free(self->contents); + self->contents = NULL; + self->size = 0; + self->capacity = 0; + } +} + +/// This is not what you're looking for, see `array_erase`. 
+static inline void _array__erase(Array *self, size_t element_size, + uint32_t index) { + assert(index < self->size); + char *contents = (char *)self->contents; + memmove(contents + index * element_size, contents + (index + 1) * element_size, + (self->size - index - 1) * element_size); + self->size--; +} + +/// This is not what you're looking for, see `array_reserve`. +static inline void _array__reserve(Array *self, size_t element_size, uint32_t new_capacity) { + if (new_capacity > self->capacity) { + if (self->contents) { + self->contents = ts_realloc(self->contents, new_capacity * element_size); + } else { + self->contents = ts_malloc(new_capacity * element_size); + } + self->capacity = new_capacity; + } +} + +/// This is not what you're looking for, see `array_assign`. +static inline void _array__assign(Array *self, const Array *other, size_t element_size) { + _array__reserve(self, element_size, other->size); + self->size = other->size; + memcpy(self->contents, other->contents, self->size * element_size); +} + +/// This is not what you're looking for, see `array_swap`. +static inline void _array__swap(Array *self, Array *other) { + Array swap = *other; + *other = *self; + *self = swap; +} + +/// This is not what you're looking for, see `array_push` or `array_grow_by`. +static inline void _array__grow(Array *self, uint32_t count, size_t element_size) { + uint32_t new_size = self->size + count; + if (new_size > self->capacity) { + uint32_t new_capacity = self->capacity * 2; + if (new_capacity < 8) new_capacity = 8; + if (new_capacity < new_size) new_capacity = new_size; + _array__reserve(self, element_size, new_capacity); + } +} + +/// This is not what you're looking for, see `array_splice`. 
+static inline void _array__splice(Array *self, size_t element_size, + uint32_t index, uint32_t old_count, + uint32_t new_count, const void *elements) { + uint32_t new_size = self->size + new_count - old_count; + uint32_t old_end = index + old_count; + uint32_t new_end = index + new_count; + assert(old_end <= self->size); + + _array__reserve(self, element_size, new_size); + + char *contents = (char *)self->contents; + if (self->size > old_end) { + memmove( + contents + new_end * element_size, + contents + old_end * element_size, + (self->size - old_end) * element_size + ); + } + if (new_count > 0) { + if (elements) { + memcpy( + (contents + index * element_size), + elements, + new_count * element_size + ); + } else { + memset( + (contents + index * element_size), + 0, + new_count * element_size + ); + } + } + self->size += new_count - old_count; +} + +/// A binary search routine, based on Rust's `std::slice::binary_search_by`. +/// This is not what you're looking for, see `array_search_sorted_with` or `array_search_sorted_by`. +#define _array__search_sorted(self, start, compare, suffix, needle, _index, _exists) \ + do { \ + *(_index) = start; \ + *(_exists) = false; \ + uint32_t size = (self)->size - *(_index); \ + if (size == 0) break; \ + int comparison; \ + while (size > 1) { \ + uint32_t half_size = size / 2; \ + uint32_t mid_index = *(_index) + half_size; \ + comparison = compare(&((self)->contents[mid_index] suffix), (needle)); \ + if (comparison <= 0) *(_index) = mid_index; \ + size -= half_size; \ + } \ + comparison = compare(&((self)->contents[*(_index)] suffix), (needle)); \ + if (comparison == 0) *(_exists) = true; \ + else if (comparison < 0) *(_index) += 1; \ + } while (0) + +/// Helper macro for the `_sorted_by` routines below. This takes the left (existing) +/// parameter by reference in order to work with the generic sorting function above. 
+#define _compare_int(a, b) ((int)*(a) - (int)(b)) + +#ifdef _MSC_VER +#pragma warning(pop) +#elif defined(__GNUC__) || defined(__clang__) +#pragma GCC diagnostic pop +#endif + +#ifdef __cplusplus +} +#endif + +#endif // TREE_SITTER_ARRAY_H_ diff --git a/internal/bloblang2/tree-sitter/src/tree_sitter/parser.h b/internal/bloblang2/tree-sitter/src/tree_sitter/parser.h new file mode 100644 index 000000000..799f599bd --- /dev/null +++ b/internal/bloblang2/tree-sitter/src/tree_sitter/parser.h @@ -0,0 +1,266 @@ +#ifndef TREE_SITTER_PARSER_H_ +#define TREE_SITTER_PARSER_H_ + +#ifdef __cplusplus +extern "C" { +#endif + +#include <stdbool.h> +#include <stdint.h> +#include <stdlib.h> + +#define ts_builtin_sym_error ((TSSymbol)-1) +#define ts_builtin_sym_end 0 +#define TREE_SITTER_SERIALIZATION_BUFFER_SIZE 1024 + +#ifndef TREE_SITTER_API_H_ +typedef uint16_t TSStateId; +typedef uint16_t TSSymbol; +typedef uint16_t TSFieldId; +typedef struct TSLanguage TSLanguage; +#endif + +typedef struct { + TSFieldId field_id; + uint8_t child_index; + bool inherited; +} TSFieldMapEntry; + +typedef struct { + uint16_t index; + uint16_t length; +} TSFieldMapSlice; + +typedef struct { + bool visible; + bool named; + bool supertype; +} TSSymbolMetadata; + +typedef struct TSLexer TSLexer; + +struct TSLexer { + int32_t lookahead; + TSSymbol result_symbol; + void (*advance)(TSLexer *, bool); + void (*mark_end)(TSLexer *); + uint32_t (*get_column)(TSLexer *); + bool (*is_at_included_range_start)(const TSLexer *); + bool (*eof)(const TSLexer *); + void (*log)(const TSLexer *, const char *, ...); +}; + +typedef enum { + TSParseActionTypeShift, + TSParseActionTypeReduce, + TSParseActionTypeAccept, + TSParseActionTypeRecover, +} TSParseActionType; + +typedef union { + struct { + uint8_t type; + TSStateId state; + bool extra; + bool repetition; + } shift; + struct { + uint8_t type; + uint8_t child_count; + TSSymbol symbol; + int16_t dynamic_precedence; + uint16_t production_id; + } reduce; + uint8_t type; +} TSParseAction; + +typedef
struct { + uint16_t lex_state; + uint16_t external_lex_state; +} TSLexMode; + +typedef union { + TSParseAction action; + struct { + uint8_t count; + bool reusable; + } entry; +} TSParseActionEntry; + +typedef struct { + int32_t start; + int32_t end; +} TSCharacterRange; + +struct TSLanguage { + uint32_t version; + uint32_t symbol_count; + uint32_t alias_count; + uint32_t token_count; + uint32_t external_token_count; + uint32_t state_count; + uint32_t large_state_count; + uint32_t production_id_count; + uint32_t field_count; + uint16_t max_alias_sequence_length; + const uint16_t *parse_table; + const uint16_t *small_parse_table; + const uint32_t *small_parse_table_map; + const TSParseActionEntry *parse_actions; + const char * const *symbol_names; + const char * const *field_names; + const TSFieldMapSlice *field_map_slices; + const TSFieldMapEntry *field_map_entries; + const TSSymbolMetadata *symbol_metadata; + const TSSymbol *public_symbol_map; + const uint16_t *alias_map; + const TSSymbol *alias_sequences; + const TSLexMode *lex_modes; + bool (*lex_fn)(TSLexer *, TSStateId); + bool (*keyword_lex_fn)(TSLexer *, TSStateId); + TSSymbol keyword_capture_token; + struct { + const bool *states; + const TSSymbol *symbol_map; + void *(*create)(void); + void (*destroy)(void *); + bool (*scan)(void *, TSLexer *, const bool *symbol_whitelist); + unsigned (*serialize)(void *, char *); + void (*deserialize)(void *, const char *, unsigned); + } external_scanner; + const TSStateId *primary_state_ids; +}; + +static inline bool set_contains(TSCharacterRange *ranges, uint32_t len, int32_t lookahead) { + uint32_t index = 0; + uint32_t size = len - index; + while (size > 1) { + uint32_t half_size = size / 2; + uint32_t mid_index = index + half_size; + TSCharacterRange *range = &ranges[mid_index]; + if (lookahead >= range->start && lookahead <= range->end) { + return true; + } else if (lookahead > range->end) { + index = mid_index; + } + size -= half_size; + } + TSCharacterRange *range 
= &ranges[index]; + return (lookahead >= range->start && lookahead <= range->end); +} + +/* + * Lexer Macros + */ + +#ifdef _MSC_VER +#define UNUSED __pragma(warning(suppress : 4101)) +#else +#define UNUSED __attribute__((unused)) +#endif + +#define START_LEXER() \ + bool result = false; \ + bool skip = false; \ + UNUSED \ + bool eof = false; \ + int32_t lookahead; \ + goto start; \ + next_state: \ + lexer->advance(lexer, skip); \ + start: \ + skip = false; \ + lookahead = lexer->lookahead; + +#define ADVANCE(state_value) \ + { \ + state = state_value; \ + goto next_state; \ + } + +#define ADVANCE_MAP(...) \ + { \ + static const uint16_t map[] = { __VA_ARGS__ }; \ + for (uint32_t i = 0; i < sizeof(map) / sizeof(map[0]); i += 2) { \ + if (map[i] == lookahead) { \ + state = map[i + 1]; \ + goto next_state; \ + } \ + } \ + } + +#define SKIP(state_value) \ + { \ + skip = true; \ + state = state_value; \ + goto next_state; \ + } + +#define ACCEPT_TOKEN(symbol_value) \ + result = true; \ + lexer->result_symbol = symbol_value; \ + lexer->mark_end(lexer); + +#define END_STATE() return result; + +/* + * Parse Table Macros + */ + +#define SMALL_STATE(id) ((id) - LARGE_STATE_COUNT) + +#define STATE(id) id + +#define ACTIONS(id) id + +#define SHIFT(state_value) \ + {{ \ + .shift = { \ + .type = TSParseActionTypeShift, \ + .state = (state_value) \ + } \ + }} + +#define SHIFT_REPEAT(state_value) \ + {{ \ + .shift = { \ + .type = TSParseActionTypeShift, \ + .state = (state_value), \ + .repetition = true \ + } \ + }} + +#define SHIFT_EXTRA() \ + {{ \ + .shift = { \ + .type = TSParseActionTypeShift, \ + .extra = true \ + } \ + }} + +#define REDUCE(symbol_name, children, precedence, prod_id) \ + {{ \ + .reduce = { \ + .type = TSParseActionTypeReduce, \ + .symbol = symbol_name, \ + .child_count = children, \ + .dynamic_precedence = precedence, \ + .production_id = prod_id \ + }, \ + }} + +#define RECOVER() \ + {{ \ + .type = TSParseActionTypeRecover \ + }} + +#define ACCEPT_INPUT() \ 
+ {{ \ + .type = TSParseActionTypeAccept \ + }} + +#ifdef __cplusplus +} +#endif + +#endif // TREE_SITTER_PARSER_H_ diff --git a/internal/bloblang2/tree-sitter/test/corpus/assignments.txt b/internal/bloblang2/tree-sitter/test/corpus/assignments.txt new file mode 100644 index 000000000..0d1819a1e --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/assignments.txt @@ -0,0 +1,126 @@ +================== +Simple output assignment +================== + +output = input + +--- + +(source_file + (assignment + (assign_target) + (input))) + +================== +Output path assignment +================== + +output.user.name = input.name + +--- + +(source_file + (assignment + (assign_target + (target_path_segment + (identifier)) + (target_path_segment + (identifier))) + (field_access + receiver: (input) + field: (identifier)))) + +================== +Output metadata assignment +================== + +output@ = input + +--- + +(source_file + (assignment + (assign_target + (metadata_access)) + (input))) + +================== +Output metadata path assignment +================== + +output@.key = "value" + +--- + +(source_file + (assignment + (assign_target + (metadata_access) + (target_path_segment + (identifier))) + (string + (string_content)))) + +================== +Variable assignment +================== + +$temp = input.value + +--- + +(source_file + (assignment + (assign_target + (variable)) + (field_access + receiver: (input) + field: (identifier)))) + +================== +Variable path assignment +================== + +$data.items[0].name = "test" + +--- + +(source_file + (assignment + (assign_target + (variable) + (target_path_segment + (identifier)) + (target_path_segment + (integer)) + (target_path_segment + (identifier))) + (string + (string_content)))) + +================== +Multiple assignments +================== + +output.a = 1 +output.b = 2 +output.c = 3 + +--- + +(source_file + (assignment + (assign_target + (target_path_segment + (identifier))) + 
(integer)) + (assignment + (assign_target + (target_path_segment + (identifier))) + (integer)) + (assignment + (assign_target + (target_path_segment + (identifier))) + (integer))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/comments.txt b/internal/bloblang2/tree-sitter/test/corpus/comments.txt new file mode 100644 index 000000000..fee2afd70 --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/comments.txt @@ -0,0 +1,28 @@ +================== +Line comment +================== + +# this is a comment +output = input + +--- + +(source_file + (comment) + (assignment + (assign_target) + (input))) + +================== +Inline comment +================== + +output = input # grab input + +--- + +(source_file + (assignment + (assign_target) + (input)) + (comment)) diff --git a/internal/bloblang2/tree-sitter/test/corpus/control_flow.txt b/internal/bloblang2/tree-sitter/test/corpus/control_flow.txt new file mode 100644 index 000000000..ff1e6ba03 --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/control_flow.txt @@ -0,0 +1,182 @@ +================== +If expression +================== + +output = if input.x > 0 { "positive" } else { "non-positive" } + +--- + +(source_file + (assignment + (assign_target) + (if_expression + condition: (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (integer)) + consequence: (expr_body + (string + (string_content))) + (else_clause + alternative: (expr_body + (string + (string_content))))))) + +================== +If else if expression +================== + +output = if input.x > 10 { "big" } else if input.x > 0 { "small" } else { "none" } + +--- + +(source_file + (assignment + (assign_target) + (if_expression + condition: (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (integer)) + consequence: (expr_body + (string + (string_content))) + (else_if_clause + condition: (binary_expression + left: (field_access + receiver: 
(input) + field: (identifier)) + right: (integer)) + consequence: (expr_body + (string + (string_content)))) + (else_clause + alternative: (expr_body + (string + (string_content))))))) + +================== +If statement +================== + +if input.active { + output.status = "on" +} + +--- + +(source_file + (if_statement + condition: (field_access + receiver: (input) + field: (identifier)) + (statement_block + (assignment + (assign_target + (target_path_segment + (identifier))) + (string + (string_content)))))) + +================== +Match expression with subject +================== + +output = match input.type { + "a" => 1, + "b" => 2, + _ => 0, +} + +--- + +(source_file + (assignment + (assign_target) + (match_expression + subject: (field_access + receiver: (input) + field: (identifier)) + (match_cases + (match_case + pattern: (string + (string_content)) + body: (integer)) + (match_case + pattern: (string + (string_content)) + body: (integer)) + (match_case + body: (integer)))))) + +================== +Match with as binding +================== + +output = match input as v { + v.x > 0 => "pos", + _ => "other", +} + +--- + +(source_file + (assignment + (assign_target) + (match_expression + subject: (input) + binding: (identifier) + (match_cases + (match_case + pattern: (binary_expression + left: (field_access + receiver: (identifier) + field: (identifier)) + right: (integer)) + body: (string + (string_content))) + (match_case + body: (string + (string_content))))))) + +================== +Match statement +================== + +match input.type { + "admin" => { + output.role = "admin" + }, + _ => { + output.role = "user" + }, +} + +--- + +(source_file + (match_statement + subject: (field_access + receiver: (input) + field: (identifier)) + (match_statement_cases + (match_statement_case + pattern: (string + (string_content)) + body: (statement_block + (assignment + (assign_target + (target_path_segment + (identifier))) + (string + (string_content))))) + 
(match_statement_case + body: (statement_block + (assignment + (assign_target + (target_path_segment + (identifier))) + (string + (string_content)))))))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/imports.txt b/internal/bloblang2/tree-sitter/test/corpus/imports.txt new file mode 100644 index 000000000..e0b48e3be --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/imports.txt @@ -0,0 +1,39 @@ +================== +Import statement +================== + +import "utils.blobl2" as utils + +--- + +(source_file + (import_statement + (string + (string_content)) + (identifier))) + +================== +Qualified name call +================== + +import "math.blobl2" as math +output = math::double(input.x) + +--- + +(source_file + (import_statement + (string + (string_content)) + (identifier)) + (assignment + (assign_target) + (call_expression + name: (qualified_name + namespace: (identifier) + name: (identifier)) + (argument_list + (positional_arguments + (field_access + receiver: (input) + field: (identifier))))))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/lambdas.txt b/internal/bloblang2/tree-sitter/test/corpus/lambdas.txt new file mode 100644 index 000000000..73abf440e --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/lambdas.txt @@ -0,0 +1,107 @@ +================== +Single parameter lambda +================== + +output = input.items.map(x -> x * 2) + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier) + (argument_list + (positional_arguments + (lambda_expression + parameters: (identifier) + body: (binary_expression + left: (identifier) + right: (integer)))))))) + +================== +Multi-parameter lambda +================== + +output = input.items.fold((acc, x) -> acc + x) + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + 
method: (identifier) + (argument_list + (positional_arguments + (lambda_expression + parameters: (parameter_list + (parameter + (identifier)) + (parameter + (identifier))) + body: (binary_expression + left: (identifier) + right: (identifier)))))))) + +================== +Discard lambda +================== + +output = input.items.map(_ -> "x") + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier) + (argument_list + (positional_arguments + (lambda_expression + body: (string + (string_content)))))))) + +================== +Lambda with block body +================== + +output = input.items.map(x -> { + $doubled = x * 2 + $doubled + 1 +}) + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier) + (argument_list + (positional_arguments + (lambda_expression + parameters: (identifier) + body: (lambda_block + (expr_body + (var_assignment + (variable) + (binary_expression + left: (identifier) + right: (integer))) + (binary_expression + left: (variable) + right: (integer)))))))))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/literals.txt b/internal/bloblang2/tree-sitter/test/corpus/literals.txt new file mode 100644 index 000000000..341b65036 --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/literals.txt @@ -0,0 +1,212 @@ +================== +Integer literal +================== + +output = 42 + +--- + +(source_file + (assignment + (assign_target) + (integer))) + +================== +Float literal +================== + +output = 3.14 + +--- + +(source_file + (assignment + (assign_target) + (float))) + +================== +String literal +================== + +output = "hello world" + +--- + +(source_file + (assignment + (assign_target) + (string + (string_content)))) + +================== +String with escape sequences +================== + 
+output = "line\none\ttwo\\" + +--- + +(source_file + (assignment + (assign_target) + (string + (string_content) + (escape_sequence) + (string_content) + (escape_sequence) + (string_content) + (escape_sequence)))) + +================== +String with unicode escape +================== + +output = "\u0041\u{1F600}" + +--- + +(source_file + (assignment + (assign_target) + (string + (escape_sequence) + (escape_sequence)))) + +================== +Raw string literal +================== + +output = `raw\nstring` + +--- + +(source_file + (assignment + (assign_target) + (raw_string))) + +================== +Boolean literals +================== + +output = true +output = false + +--- + +(source_file + (assignment + (assign_target) + (boolean)) + (assignment + (assign_target) + (boolean))) + +================== +Null literal +================== + +output = null + +--- + +(source_file + (assignment + (assign_target) + (null))) + +================== +Array literal +================== + +output = [1, 2, 3] + +--- + +(source_file + (assignment + (assign_target) + (array + (integer) + (integer) + (integer)))) + +================== +Empty array +================== + +output = [] + +--- + +(source_file + (assignment + (assign_target) + (array))) + +================== +Object literal +================== + +output = {"key": "value", "num": 42} + +--- + +(source_file + (assignment + (assign_target) + (object + (object_entry + key: (string + (string_content)) + value: (string + (string_content))) + (object_entry + key: (string + (string_content)) + value: (integer))))) + +================== +Empty object +================== + +output = {} + +--- + +(source_file + (assignment + (assign_target) + (object))) + +================== +Trailing comma in array +================== + +output = [1, 2,] + +--- + +(source_file + (assignment + (assign_target) + (array + (integer) + (integer)))) + +================== +Trailing comma in object +================== + +output = {"a": 1,} + +--- + 
+(source_file + (assignment + (assign_target) + (object + (object_entry + key: (string + (string_content)) + value: (integer))))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/maps.txt b/internal/bloblang2/tree-sitter/test/corpus/maps.txt new file mode 100644 index 000000000..33afec7e6 --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/maps.txt @@ -0,0 +1,116 @@ +================== +Simple map declaration +================== + +map double(x) { + x * 2 +} + +--- + +(source_file + (map_declaration + name: (identifier) + (parameter_list + (parameter + (identifier))) + (expr_body + (binary_expression + left: (identifier) + right: (integer))))) + +================== +Map with default parameter +================== + +map greet(name, greeting = "hello") { + greeting + " " + name +} + +--- + +(source_file + (map_declaration + name: (identifier) + (parameter_list + (parameter + (identifier)) + (parameter + (identifier) + (string + (string_content)))) + (expr_body + (binary_expression + left: (binary_expression + left: (identifier) + right: (string + (string_content))) + right: (identifier))))) + +================== +Map with discard parameter +================== + +map first(x, _) { + x +} + +--- + +(source_file + (map_declaration + name: (identifier) + (parameter_list + (parameter + (identifier)) + (parameter)) + (expr_body + (identifier)))) + +================== +Map with variable assignment +================== + +map transform(data) { + $temp = data.value * 2 + $temp + 1 +} + +--- + +(source_file + (map_declaration + name: (identifier) + (parameter_list + (parameter + (identifier))) + (expr_body + (var_assignment + (variable) + (binary_expression + left: (field_access + receiver: (identifier) + field: (identifier)) + right: (integer))) + (binary_expression + left: (variable) + right: (integer))))) + +================== +Map call +================== + +output = double(input.x) + +--- + +(source_file + (assignment + (assign_target) + (call_expression 
+ name: (identifier) + (argument_list + (positional_arguments + (field_access + receiver: (input) + field: (identifier))))))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/methods.txt b/internal/bloblang2/tree-sitter/test/corpus/methods.txt new file mode 100644 index 000000000..7d3b7b31e --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/methods.txt @@ -0,0 +1,196 @@ +================== +Method call +================== + +output = input.name.uppercase() + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier)))) + +================== +Chained method calls +================== + +output = input.name.trim().lowercase().replace_all(" ", "-") + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (method_call + receiver: (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier)) + method: (identifier)) + method: (identifier) + (argument_list + (positional_arguments + (string + (string_content)) + (string + (string_content))))))) + +================== +Null-safe field access +================== + +output = input.user?.name + +--- + +(source_file + (assignment + (assign_target) + (null_safe_field_access + receiver: (field_access + receiver: (input) + field: (identifier)) + field: (identifier)))) + +================== +Null-safe method call +================== + +output = input.name?.trim() + +--- + +(source_file + (assignment + (assign_target) + (null_safe_method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier)))) + +================== +Null-safe index +================== + +output = input.items?[0] + +--- + +(source_file + (assignment + (assign_target) + (null_safe_index + receiver: (field_access + receiver: (input) + field: (identifier)) + index: (integer)))) + +================== +Index access +================== + +output = 
input.items[0] + +--- + +(source_file + (assignment + (assign_target) + (index + receiver: (field_access + receiver: (input) + field: (identifier)) + index: (integer)))) + +================== +Named arguments +================== + +output = input.name.replace_all(old: " ", new: "-") + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier) + (argument_list + (named_arguments + (named_argument + name: (identifier) + value: (string + (string_content))) + (named_argument + name: (identifier) + value: (string + (string_content)))))))) + +================== +Field access with keyword name +================== + +output = input.map + +--- + +(source_file + (assignment + (assign_target) + (field_access + receiver: (input) + field: (identifier)))) + +================== +Deleted function +================== + +output = deleted() + +--- + +(source_file + (assignment + (assign_target) + (call_expression))) + +================== +Void function +================== + +output = void() + +--- + +(source_file + (assignment + (assign_target) + (call_expression))) + +================== +Throw function +================== + +output = throw("error message") + +--- + +(source_file + (assignment + (assign_target) + (call_expression + (argument_list + (positional_arguments + (string + (string_content))))))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/multiline.txt b/internal/bloblang2/tree-sitter/test/corpus/multiline.txt new file mode 100644 index 000000000..945f57659 --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/multiline.txt @@ -0,0 +1,341 @@ +================== +Multiline method chain (postfix continuation) +================== + +output = input.name + .trim() + .lowercase() + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: 
(identifier)) + method: (identifier)))) + +================== +Multiline binary expression (operator continuation) +================== + +output = input.price + + input.tax + + input.shipping + +--- + +(source_file + (assignment + (assign_target) + (binary_expression + left: (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (field_access + receiver: (input) + field: (identifier))) + right: (field_access + receiver: (input) + field: (identifier))))) + +================== +Multiline array literal +================== + +output = [ + 1, + 2, + 3 +] + +--- + +(source_file + (assignment + (assign_target) + (array + (integer) + (integer) + (integer)))) + +================== +Multiline function arguments +================== + +output = input.replace_all( + "hello", + "world" +) + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (input) + method: (identifier) + (argument_list + (positional_arguments + (string + (string_content)) + (string + (string_content))))))) + +================== +Multiline if else chain +================== + +output = if input.x > 0 { + "positive" +} else { + "non-positive" +} + +--- + +(source_file + (assignment + (assign_target) + (if_expression + condition: (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (integer)) + consequence: (expr_body + (string + (string_content))) + (else_clause + alternative: (expr_body + (string + (string_content))))))) + +================== +Multiline match expression +================== + +output = match input.type { + "a" => 1, + "b" => 2, + _ => 0, +} + +--- + +(source_file + (assignment + (assign_target) + (match_expression + subject: (field_access + receiver: (input) + field: (identifier)) + (match_cases + (match_case + pattern: (string + (string_content)) + body: (integer)) + (match_case + pattern: (string + (string_content)) + body: (integer)) + (match_case + body: (integer)))))) + 
+================== +Multiline map declaration +================== + +map transform(data) { + $temp = data.value * 2 + $temp + 1 +} + +--- + +(source_file + (map_declaration + name: (identifier) + (parameter_list + (parameter + (identifier))) + (expr_body + (var_assignment + (variable) + (binary_expression + left: (field_access + receiver: (identifier) + field: (identifier)) + right: (integer))) + (binary_expression + left: (variable) + right: (integer))))) + +================== +Null-safe postfix continuation +================== + +output = input.user + ?.name + ?.trim() + +--- + +(source_file + (assignment + (assign_target) + (null_safe_method_call + receiver: (null_safe_field_access + receiver: (field_access + receiver: (input) + field: (identifier)) + field: (identifier)) + method: (identifier)))) + +================== +Index postfix continuation +================== + +output = input.items + [0] + +--- + +(source_file + (assignment + (assign_target) + (index + receiver: (field_access + receiver: (input) + field: (identifier)) + index: (integer)))) + +================== +Else continuation across lines +================== + +if input.active { + output = "yes" +} +else { + output = "no" +} + +--- + +(source_file + (if_statement + condition: (field_access + receiver: (input) + field: (identifier)) + (statement_block + (assignment + (assign_target) + (string + (string_content)))) + (else_statement_clause + (statement_block + (assignment + (assign_target) + (string + (string_content))))))) + +================== +Multiline lambda block +================== + +output = input.items.map(x -> { + $doubled = x * 2 + $doubled + 1 +}) + +--- + +(source_file + (assignment + (assign_target) + (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier) + (argument_list + (positional_arguments + (lambda_expression + parameters: (identifier) + body: (lambda_block + (expr_body + (var_assignment + (variable) + (binary_expression + left: 
(identifier) + right: (integer))) + (binary_expression + left: (variable) + right: (integer)))))))))) + +================== +Blank lines between statements +================== + +output.a = 1 + +output.b = 2 + +--- + +(source_file + (assignment + (assign_target + (target_path_segment + (identifier))) + (integer)) + (assignment + (assign_target + (target_path_segment + (identifier))) + (integer))) + +================== +Assignment continuation +================== + +output.result = + input.items + .filter(x -> x > 0) + .map(x -> x * 2) + +--- + +(source_file + (assignment + (assign_target + (target_path_segment + (identifier))) + (method_call + receiver: (method_call + receiver: (field_access + receiver: (input) + field: (identifier)) + method: (identifier) + (argument_list + (positional_arguments + (lambda_expression + parameters: (identifier) + body: (binary_expression + left: (identifier) + right: (integer)))))) + method: (identifier) + (argument_list + (positional_arguments + (lambda_expression + parameters: (identifier) + body: (binary_expression + left: (identifier) + right: (integer)))))))) diff --git a/internal/bloblang2/tree-sitter/test/corpus/operators.txt b/internal/bloblang2/tree-sitter/test/corpus/operators.txt new file mode 100644 index 000000000..989359f7e --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/corpus/operators.txt @@ -0,0 +1,139 @@ +================== +Arithmetic operators +================== + +output = 1 + 2 * 3 + +--- + +(source_file + (assignment + (assign_target) + (binary_expression + left: (integer) + right: (binary_expression + left: (integer) + right: (integer))))) + +================== +Comparison operator +================== + +output = input.age > 18 + +--- + +(source_file + (assignment + (assign_target) + (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (integer)))) + +================== +Logical operators +================== + +output = input.a > 0 && input.b < 10 + +--- + 
+(source_file + (assignment + (assign_target) + (binary_expression + left: (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (integer)) + right: (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (integer))))) + +================== +Unary not +================== + +output = !input.active + +--- + +(source_file + (assignment + (assign_target) + (unary_expression + operand: (field_access + receiver: (input) + field: (identifier))))) + +================== +Unary minus +================== + +output = -input.value + +--- + +(source_file + (assignment + (assign_target) + (unary_expression + operand: (field_access + receiver: (input) + field: (identifier))))) + +================== +Parenthesized precedence +================== + +output = (1 + 2) * 3 + +--- + +(source_file + (assignment + (assign_target) + (binary_expression + left: (parenthesized_expression + (binary_expression + left: (integer) + right: (integer))) + right: (integer)))) + +================== +Equality +================== + +output = input.x == "test" + +--- + +(source_file + (assignment + (assign_target) + (binary_expression + left: (field_access + receiver: (input) + field: (identifier)) + right: (string + (string_content))))) + +================== +Modulo +================== + +output = 10 % 3 + +--- + +(source_file + (assignment + (assign_target) + (binary_expression + left: (integer) + right: (integer)))) diff --git a/internal/bloblang2/tree-sitter/test/speccompat/speccompat_test.go b/internal/bloblang2/tree-sitter/test/speccompat/speccompat_test.go new file mode 100644 index 000000000..249d3e404 --- /dev/null +++ b/internal/bloblang2/tree-sitter/test/speccompat/speccompat_test.go @@ -0,0 +1,186 @@ +// Package speccompat verifies that the tree-sitter-bloblang2 grammar can parse +// every mapping from the Bloblang V2 spec test suite without producing ERROR +// nodes. 
This catches grammar regressions against the reference spec. +// +// Tests with compile_error are included — some are parse errors (which +// tree-sitter should also flag as ERROR), others are semantic errors (which +// should parse cleanly). We skip compile_error tests from the error check +// since the grammar can't distinguish parse vs semantic errors. +// +// Run with: +// +// go test -tags treesitter ./internal/bloblang2/tree-sitter/test/speccompat/ +// +//go:build treesitter + +package speccompat + +import ( + "fmt" + "os" + "os/exec" + "path/filepath" + "strings" + "testing" + + "gopkg.in/yaml.v3" +) + +type specFile struct { + Description string `yaml:"description"` + Files map[string]string `yaml:"files"` + Tests []specTest `yaml:"tests"` +} + +type specTest struct { + Name string `yaml:"name"` + Mapping string `yaml:"mapping"` + CompileError string `yaml:"compile_error"` +} + +func TestSpecMappingsParse(t *testing.T) { + // Find tree-sitter CLI. + tsPath, err := exec.LookPath("tree-sitter") + if err != nil { + // Try npx. + tsPath = "npx" + } + + // Find the grammar root (two levels up from test/speccompat/). + grammarDir, err := filepath.Abs(filepath.Join("..", "..")) + if err != nil { + t.Fatal(err) + } + + // Ensure the parser is generated. + parserC := filepath.Join(grammarDir, "src", "parser.c") + if _, err := os.Stat(parserC); os.IsNotExist(err) { + t.Skipf("parser not generated — run 'npx tree-sitter generate' in %s first", grammarDir) + } + + // Find spec test directory. + specDir := filepath.Join(grammarDir, "..", "spec", "tests") + if _, err := os.Stat(specDir); os.IsNotExist(err) { + t.Skipf("spec tests not found at %s", specDir) + } + + // Walk all YAML files. 
+ var files []string + err = filepath.Walk(specDir, func(path string, info os.FileInfo, err error) error { + if err != nil { + return err + } + if !info.IsDir() && strings.HasSuffix(path, ".yaml") { + files = append(files, path) + } + return nil + }) + if err != nil { + t.Fatal(err) + } + + if len(files) == 0 { + t.Fatal("no spec test files found") + } + + var total, passed, skipped, failed int + + for _, file := range files { + relPath, _ := filepath.Rel(specDir, file) + + data, err := os.ReadFile(file) + if err != nil { + t.Errorf("failed to read %s: %v", relPath, err) + continue + } + + var sf specFile + if err := yaml.Unmarshal(data, &sf); err != nil { + t.Errorf("failed to parse %s: %v", relPath, err) + continue + } + + for _, tc := range sf.Tests { + total++ + testName := fmt.Sprintf("%s/%s", relPath, tc.Name) + + if tc.Mapping == "" { + skipped++ + continue + } + + // Skip compile_error tests — they may have intentional parse errors + // that tree-sitter would flag as ERROR. The grammar can't distinguish + // parse errors from semantic errors. + if tc.CompileError != "" { + skipped++ + continue + } + + // Also parse any imported files to verify they parse cleanly. + mappings := map[string]string{"main": tc.Mapping} + for name, content := range sf.Files { + mappings[name] = content + } + + for label, mapping := range mappings { + // Write mapping to temp file. + tmpFile, err := os.CreateTemp("", "blobl2-*.blobl2") + if err != nil { + t.Fatal(err) + } + _, _ = tmpFile.WriteString(mapping) + tmpFile.Close() + + // Run tree-sitter parse. 
+ var cmd *exec.Cmd + if tsPath == "npx" { + cmd = exec.Command("npx", "tree-sitter", "parse", tmpFile.Name()) + } else { + cmd = exec.Command(tsPath, "parse", tmpFile.Name()) + } + cmd.Dir = grammarDir + output, err := cmd.CombinedOutput() + os.Remove(tmpFile.Name()) + + outStr := string(output) + + if strings.Contains(outStr, "(ERROR") || strings.Contains(outStr, "(MISSING") { + failed++ + suffix := "" + if label != "main" { + suffix = fmt.Sprintf(" (file: %s)", label) + } + // Show just the first few lines of the parse tree for context. + lines := strings.Split(outStr, "\n") + preview := outStr + if len(lines) > 20 { + preview = strings.Join(lines[:20], "\n") + "\n..." + } + t.Errorf("ERROR in parse tree for %s%s:\n mapping:\n %s\n parse tree:\n%s", + testName, suffix, + strings.ReplaceAll(strings.TrimSpace(mapping), "\n", "\n "), + preview) + break + } else if err != nil && !strings.Contains(outStr, "(source_file") { + // tree-sitter parse failed entirely (not just ERROR nodes). + failed++ + t.Errorf("tree-sitter parse failed for %s: %v\n%s", testName, err, outStr) + break + } else { + // Only count as passed if all mappings (main + files) parsed. + if label == "main" && len(mappings) == 1 { + passed++ + } else if label != "main" { + // Will be counted when main passes. 
+ } else { + passed++ + } + } + } + } + } + + t.Logf("Spec compatibility: %d total, %d passed, %d skipped (no mapping or compile_error), %d failed", + total, passed, skipped, failed) +} diff --git a/internal/bloblang2/tree-sitter/tree-sitter.json b/internal/bloblang2/tree-sitter/tree-sitter.json new file mode 100644 index 000000000..b47a4be15 --- /dev/null +++ b/internal/bloblang2/tree-sitter/tree-sitter.json @@ -0,0 +1,17 @@ +{ + "metadata": { + "version": "0.1.0", + "links": { + "repository": "https://github.com/redpanda-data/benthos" + } + }, + "grammars": [ + { + "name": "bloblang2", + "path": ".", + "scope": "source.bloblang2", + "file-types": ["blobl2"], + "injection-regex": "bloblang2" + } + ] +} From c410731c1a7ee93ec8457582aa0942840110601e Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Fri, 24 Apr 2026 10:40:12 +0100 Subject: [PATCH 09/20] bloblang(v2): Add TypeScript runtime Adds internal/bloblang2/ts/, a full TypeScript port of the V2 runtime: scanner, parser, resolver, optimizer, tree-walking interpreter, value system, and standard library. Bundles to a browser-loadable script that powers the demo's client-side engine. The TypeScript runtime passes the same spec conformance corpus (internal/bloblang2/spec/tests/) as the Go runtime, exercised via a small spectest harness layered over the same YAML schema.
--- internal/bloblang2/ts/.gitignore | 2 + internal/bloblang2/ts/Taskfile.yml | 39 + internal/bloblang2/ts/bundle.mjs | 19 + internal/bloblang2/ts/package-lock.json | 1656 ++++++++++++++++ internal/bloblang2/ts/package.json | 19 + internal/bloblang2/ts/src/arithmetic.ts | 714 +++++++ internal/bloblang2/ts/src/ast.ts | 328 ++++ internal/bloblang2/ts/src/index.ts | 32 + internal/bloblang2/ts/src/interpreter.ts | 1723 +++++++++++++++++ internal/bloblang2/ts/src/optimizer.ts | 464 +++++ internal/bloblang2/ts/src/parser.ts | 1106 +++++++++++ internal/bloblang2/ts/src/resolver.ts | 694 +++++++ internal/bloblang2/ts/src/scanner.ts | 597 ++++++ internal/bloblang2/ts/src/scope.ts | 49 + .../bloblang2/ts/src/stdlib/array_methods.ts | 423 ++++ internal/bloblang2/ts/src/stdlib/encoding.ts | 424 ++++ internal/bloblang2/ts/src/stdlib/functions.ts | 344 ++++ internal/bloblang2/ts/src/stdlib/index.ts | 112 ++ .../bloblang2/ts/src/stdlib/lambda_methods.ts | 669 +++++++ .../ts/src/stdlib/numeric_methods.ts | 123 ++ .../bloblang2/ts/src/stdlib/object_methods.ts | 134 ++ .../bloblang2/ts/src/stdlib/string_methods.ts | 362 ++++ internal/bloblang2/ts/src/stdlib/timestamp.ts | 500 +++++ .../ts/src/stdlib/type_conversion.ts | 441 +++++ internal/bloblang2/ts/src/token.ts | 167 ++ internal/bloblang2/ts/src/value.ts | 593 ++++++ internal/bloblang2/ts/test/argfold.test.ts | 111 ++ internal/bloblang2/ts/test/exec.test.ts | 584 ++++++ internal/bloblang2/ts/test/parse.test.ts | 72 + internal/bloblang2/ts/tsconfig.json | 15 + internal/bloblang2/ts/vitest.config.ts | 13 + 31 files changed, 12529 insertions(+) create mode 100644 internal/bloblang2/ts/.gitignore create mode 100644 internal/bloblang2/ts/Taskfile.yml create mode 100644 internal/bloblang2/ts/bundle.mjs create mode 100644 internal/bloblang2/ts/package-lock.json create mode 100644 internal/bloblang2/ts/package.json create mode 100644 internal/bloblang2/ts/src/arithmetic.ts create mode 100644 internal/bloblang2/ts/src/ast.ts create mode 
100644 internal/bloblang2/ts/src/index.ts create mode 100644 internal/bloblang2/ts/src/interpreter.ts create mode 100644 internal/bloblang2/ts/src/optimizer.ts create mode 100644 internal/bloblang2/ts/src/parser.ts create mode 100644 internal/bloblang2/ts/src/resolver.ts create mode 100644 internal/bloblang2/ts/src/scanner.ts create mode 100644 internal/bloblang2/ts/src/scope.ts create mode 100644 internal/bloblang2/ts/src/stdlib/array_methods.ts create mode 100644 internal/bloblang2/ts/src/stdlib/encoding.ts create mode 100644 internal/bloblang2/ts/src/stdlib/functions.ts create mode 100644 internal/bloblang2/ts/src/stdlib/index.ts create mode 100644 internal/bloblang2/ts/src/stdlib/lambda_methods.ts create mode 100644 internal/bloblang2/ts/src/stdlib/numeric_methods.ts create mode 100644 internal/bloblang2/ts/src/stdlib/object_methods.ts create mode 100644 internal/bloblang2/ts/src/stdlib/string_methods.ts create mode 100644 internal/bloblang2/ts/src/stdlib/timestamp.ts create mode 100644 internal/bloblang2/ts/src/stdlib/type_conversion.ts create mode 100644 internal/bloblang2/ts/src/token.ts create mode 100644 internal/bloblang2/ts/src/value.ts create mode 100644 internal/bloblang2/ts/test/argfold.test.ts create mode 100644 internal/bloblang2/ts/test/exec.test.ts create mode 100644 internal/bloblang2/ts/test/parse.test.ts create mode 100644 internal/bloblang2/ts/tsconfig.json create mode 100644 internal/bloblang2/ts/vitest.config.ts diff --git a/internal/bloblang2/ts/.gitignore b/internal/bloblang2/ts/.gitignore new file mode 100644 index 000000000..b94707787 --- /dev/null +++ b/internal/bloblang2/ts/.gitignore @@ -0,0 +1,2 @@ +node_modules/ +dist/ diff --git a/internal/bloblang2/ts/Taskfile.yml b/internal/bloblang2/ts/Taskfile.yml new file mode 100644 index 000000000..3fd98d7d9 --- /dev/null +++ b/internal/bloblang2/ts/Taskfile.yml @@ -0,0 +1,39 @@ +version: "3" + +tasks: + install: + desc: Install npm dependencies + cmds: + - npm install + sources: + - 
package.json + generates: + - node_modules/.package-lock.json + + build: + desc: Compile TypeScript to dist/ + deps: [install] + cmds: + - npx tsc + sources: + - src/**/*.ts + - tsconfig.json + generates: + - dist/**/*.js + + bundle: + desc: Bundle as a single ES module and copy into demo/ + deps: [build] + cmds: + - node bundle.mjs + + test: + desc: Run spec conformance and unit tests + deps: [install] + cmds: + - npx vitest run + + clean: + desc: Remove build artifacts + cmds: + - rm -rf dist diff --git a/internal/bloblang2/ts/bundle.mjs b/internal/bloblang2/ts/bundle.mjs new file mode 100644 index 000000000..cd795158b --- /dev/null +++ b/internal/bloblang2/ts/bundle.mjs @@ -0,0 +1,19 @@ +// Bundle the Bloblang V2 interpreter as a single ES module for browser use. +import { build } from "esbuild"; +import { copyFileSync } from "fs"; + +await build({ + entryPoints: ["src/index.ts"], + bundle: true, + format: "esm", + outfile: "dist/bloblang2.mjs", + target: "es2022", + minify: false, + sourcemap: true, +}); + +// Copy to the demo directory for the Go server to embed. 
+copyFileSync("dist/bloblang2.mjs", "../demo/bloblang2.mjs"); +copyFileSync("dist/bloblang2.mjs.map", "../demo/bloblang2.mjs.map"); + +console.log("Bundled to dist/bloblang2.mjs and copied to demo/"); diff --git a/internal/bloblang2/ts/package-lock.json b/internal/bloblang2/ts/package-lock.json new file mode 100644 index 000000000..0e23587d7 --- /dev/null +++ b/internal/bloblang2/ts/package-lock.json @@ -0,0 +1,1656 @@ +{ + "name": "@bloblang/v2", + "version": "0.0.1", + "lockfileVersion": 3, + "requires": true, + "packages": { + "": { + "name": "@bloblang/v2", + "version": "0.0.1", + "devDependencies": { + "typescript": "^5.7.0", + "vitest": "^3.0.0", + "yaml": "^2.7.0" + } + }, + "node_modules/@esbuild/aix-ppc64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.27.4.tgz", + "integrity": "sha512-cQPwL2mp2nSmHHJlCyoXgHGhbEPMrEEU5xhkcy3Hs/O7nGZqEpZ2sUtLaL9MORLtDfRvVl2/3PAuEkYZH0Ty8Q==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "aix" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-arm": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm/-/android-arm-0.27.4.tgz", + "integrity": "sha512-X9bUgvxiC8CHAGKYufLIHGXPJWnr0OCdR0anD2e21vdvgCI8lIfqFbnoeOz7lBjdrAGUhqLZLcQo6MLhTO2DKQ==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/android-arm64/-/android-arm64-0.27.4.tgz", + "integrity": "sha512-gdLscB7v75wRfu7QSm/zg6Rx29VLdy9eTr2t44sfTW7CxwAtQghZ4ZnqHk3/ogz7xao0QAgrkradbBzcqFPasw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/android-x64": { + "version": "0.27.4", + 
"resolved": "https://registry.npmjs.org/@esbuild/android-x64/-/android-x64-0.27.4.tgz", + "integrity": "sha512-PzPFnBNVF292sfpfhiyiXCGSn9HZg5BcAz+ivBuSsl6Rk4ga1oEXAamhOXRFyMcjwr2DVtm40G65N3GLeH1Lvw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/darwin-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-arm64/-/darwin-arm64-0.27.4.tgz", + "integrity": "sha512-b7xaGIwdJlht8ZFCvMkpDN6uiSmnxxK56N2GDTMYPr2/gzvfdQN8rTfBsvVKmIVY/X7EM+/hJKEIbbHs9oA4tQ==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/darwin-x64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/darwin-x64/-/darwin-x64-0.27.4.tgz", + "integrity": "sha512-sR+OiKLwd15nmCdqpXMnuJ9W2kpy0KigzqScqHI3Hqwr7IXxBp3Yva+yJwoqh7rE8V77tdoheRYataNKL4QrPw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/freebsd-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-arm64/-/freebsd-arm64-0.27.4.tgz", + "integrity": "sha512-jnfpKe+p79tCnm4GVav68A7tUFeKQwQyLgESwEAUzyxk/TJr4QdGog9sqWNcUbr/bZt/O/HXouspuQDd9JxFSw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/freebsd-x64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/freebsd-x64/-/freebsd-x64-0.27.4.tgz", + "integrity": "sha512-2kb4ceA/CpfUrIcTUl1wrP/9ad9Atrp5J94Lq69w7UwOMolPIGrfLSvAKJp0RTvkPPyn6CIWrNy13kyLikZRZQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ], + "engines": { + 
"node": ">=18" + } + }, + "node_modules/@esbuild/linux-arm": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm/-/linux-arm-0.27.4.tgz", + "integrity": "sha512-aBYgcIxX/wd5n2ys0yESGeYMGF+pv6g0DhZr3G1ZG4jMfruU9Tl1i2Z+Wnj9/KjGz1lTLCcorqE2viePZqj4Eg==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-arm64/-/linux-arm64-0.27.4.tgz", + "integrity": "sha512-7nQOttdzVGth1iz57kxg9uCz57dxQLHWxopL6mYuYthohPKEK0vU0C3O21CcBK6KDlkYVcnDXY099HcCDXd9dA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-ia32": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ia32/-/linux-ia32-0.27.4.tgz", + "integrity": "sha512-oPtixtAIzgvzYcKBQM/qZ3R+9TEUd1aNJQu0HhGyqtx6oS7qTpvjheIWBbes4+qu1bNlo2V4cbkISr8q6gRBFA==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-loong64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-loong64/-/linux-loong64-0.27.4.tgz", + "integrity": "sha512-8mL/vh8qeCoRcFH2nM8wm5uJP+ZcVYGGayMavi8GmRJjuI3g1v6Z7Ni0JJKAJW+m0EtUuARb6Lmp4hMjzCBWzA==", + "cpu": [ + "loong64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-mips64el": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-mips64el/-/linux-mips64el-0.27.4.tgz", + "integrity": "sha512-1RdrWFFiiLIW7LQq9Q2NES+HiD4NyT8Itj9AUeCl0IVCA459WnPhREKgwrpaIfTOe+/2rdntisegiPWn/r/aAw==", + "cpu": [ + "mips64el" + ], + "dev": 
true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-ppc64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-ppc64/-/linux-ppc64-0.27.4.tgz", + "integrity": "sha512-tLCwNG47l3sd9lpfyx9LAGEGItCUeRCWeAx6x2Jmbav65nAwoPXfewtAdtbtit/pJFLUWOhpv0FpS6GQAmPrHA==", + "cpu": [ + "ppc64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-riscv64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-riscv64/-/linux-riscv64-0.27.4.tgz", + "integrity": "sha512-BnASypppbUWyqjd1KIpU4AUBiIhVr6YlHx/cnPgqEkNoVOhHg+YiSVxM1RLfiy4t9cAulbRGTNCKOcqHrEQLIw==", + "cpu": [ + "riscv64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-s390x": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-s390x/-/linux-s390x-0.27.4.tgz", + "integrity": "sha512-+eUqgb/Z7vxVLezG8bVB9SfBie89gMueS+I0xYh2tJdw3vqA/0ImZJ2ROeWwVJN59ihBeZ7Tu92dF/5dy5FttA==", + "cpu": [ + "s390x" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/linux-x64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/linux-x64/-/linux-x64-0.27.4.tgz", + "integrity": "sha512-S5qOXrKV8BQEzJPVxAwnryi2+Iq5pB40gTEIT69BQONqR7JH1EPIcQ/Uiv9mCnn05jff9umq/5nqzxlqTOg9NA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "linux" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/netbsd-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/netbsd-arm64/-/netbsd-arm64-0.27.4.tgz", + "integrity": 
"sha512-xHT8X4sb0GS8qTqiwzHqpY00C95DPAq7nAwX35Ie/s+LO9830hrMd3oX0ZMKLvy7vsonee73x0lmcdOVXFzd6Q==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/netbsd-x64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/netbsd-x64/-/netbsd-x64-0.27.4.tgz", + "integrity": "sha512-RugOvOdXfdyi5Tyv40kgQnI0byv66BFgAqjdgtAKqHoZTbTF2QqfQrFwa7cHEORJf6X2ht+l9ABLMP0dnKYsgg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "netbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openbsd-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-arm64/-/openbsd-arm64-0.27.4.tgz", + "integrity": "sha512-2MyL3IAaTX+1/qP0O1SwskwcwCoOI4kV2IBX1xYnDDqthmq5ArrW94qSIKCAuRraMgPOmG0RDTA74mzYNQA9ow==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openbsd-x64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/openbsd-x64/-/openbsd-x64-0.27.4.tgz", + "integrity": "sha512-u8fg/jQ5aQDfsnIV6+KwLOf1CmJnfu1ShpwqdwC0uA7ZPwFws55Ngc12vBdeUdnuWoQYx/SOQLGDcdlfXhYmXQ==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/openharmony-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/openharmony-arm64/-/openharmony-arm64-0.27.4.tgz", + "integrity": "sha512-JkTZrl6VbyO8lDQO3yv26nNr2RM2yZzNrNHEsj9bm6dOwwu9OYN28CjzZkH57bh4w0I2F7IodpQvUAEd1mbWXg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openharmony" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/sunos-x64": { + "version": "0.27.4", + 
"resolved": "https://registry.npmjs.org/@esbuild/sunos-x64/-/sunos-x64-0.27.4.tgz", + "integrity": "sha512-/gOzgaewZJfeJTlsWhvUEmUG4tWEY2Spp5M20INYRg2ZKl9QPO3QEEgPeRtLjEWSW8FilRNacPOg8R1uaYkA6g==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "sunos" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-arm64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/win32-arm64/-/win32-arm64-0.27.4.tgz", + "integrity": "sha512-Z9SExBg2y32smoDQdf1HRwHRt6vAHLXcxD2uGgO/v2jK7Y718Ix4ndsbNMU/+1Qiem9OiOdaqitioZwxivhXYg==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-ia32": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/win32-ia32/-/win32-ia32-0.27.4.tgz", + "integrity": "sha512-DAyGLS0Jz5G5iixEbMHi5KdiApqHBWMGzTtMiJ72ZOLhbu/bzxgAe8Ue8CTS3n3HbIUHQz/L51yMdGMeoxXNJw==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@esbuild/win32-x64": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/@esbuild/win32-x64/-/win32-x64-0.27.4.tgz", + "integrity": "sha512-+knoa0BDoeXgkNvvV1vvbZX4+hizelrkwmGJBdT17t8FNPwG2lKemmuMZlmaNQ3ws3DKKCxpb4zRZEIp3UxFCg==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ], + "engines": { + "node": ">=18" + } + }, + "node_modules/@jridgewell/sourcemap-codec": { + "version": "1.5.5", + "resolved": "https://registry.npmjs.org/@jridgewell/sourcemap-codec/-/sourcemap-codec-1.5.5.tgz", + "integrity": "sha512-cYQ9310grqxueWbl+WuIUIaiUaDcj7WOq5fVhEljNVgRfOUhY9fy2zTvfoqWsnebh8Sl70VScFbICvJnLKB0Og==", + "dev": true, + "license": "MIT" + }, + "node_modules/@rollup/rollup-android-arm-eabi": { + "version": "4.60.0", + "resolved": 
"https://registry.npmjs.org/@rollup/rollup-android-arm-eabi/-/rollup-android-arm-eabi-4.60.0.tgz", + "integrity": "sha512-WOhNW9K8bR3kf4zLxbfg6Pxu2ybOUbB2AjMDHSQx86LIF4rH4Ft7vmMwNt0loO0eonglSNy4cpD3MKXXKQu0/A==", + "cpu": [ + "arm" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ] + }, + "node_modules/@rollup/rollup-android-arm64": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-android-arm64/-/rollup-android-arm64-4.60.0.tgz", + "integrity": "sha512-u6JHLll5QKRvjciE78bQXDmqRqNs5M/3GVqZeMwvmjaNODJih/WIrJlFVEihvV0MiYFmd+ZyPr9wxOVbPAG2Iw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "android" + ] + }, + "node_modules/@rollup/rollup-darwin-arm64": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-arm64/-/rollup-darwin-arm64-4.60.0.tgz", + "integrity": "sha512-qEF7CsKKzSRc20Ciu2Zw1wRrBz4g56F7r/vRwY430UPp/nt1x21Q/fpJ9N5l47WWvJlkNCPJz3QRVw008fi7yA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ] + }, + "node_modules/@rollup/rollup-darwin-x64": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-darwin-x64/-/rollup-darwin-x64-4.60.0.tgz", + "integrity": "sha512-WADYozJ4QCnXCH4wPB+3FuGmDPoFseVCUrANmA5LWwGmC6FL14BWC7pcq+FstOZv3baGX65tZ378uT6WG8ynTw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ] + }, + "node_modules/@rollup/rollup-freebsd-arm64": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-arm64/-/rollup-freebsd-arm64-4.60.0.tgz", + "integrity": "sha512-6b8wGHJlDrGeSE3aH5mGNHBjA0TTkxdoNHik5EkvPHCt351XnigA4pS7Wsj/Eo9Y8RBU6f35cjN9SYmCFBtzxw==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ] + }, + "node_modules/@rollup/rollup-freebsd-x64": { + "version": "4.60.0", + 
"resolved": "https://registry.npmjs.org/@rollup/rollup-freebsd-x64/-/rollup-freebsd-x64-4.60.0.tgz", + "integrity": "sha512-h25Ga0t4jaylMB8M/JKAyrvvfxGRjnPQIR8lnCayyzEjEOx2EJIlIiMbhpWxDRKGKF8jbNH01NnN663dH638mA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "freebsd" + ] + }, + "node_modules/@rollup/rollup-linux-arm-gnueabihf": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-gnueabihf/-/rollup-linux-arm-gnueabihf-4.60.0.tgz", + "integrity": "sha512-RzeBwv0B3qtVBWtcuABtSuCzToo2IEAIQrcyB/b2zMvBWVbjo8bZDjACUpnaafaxhTw2W+imQbP2BD1usasK4g==", + "cpu": [ + "arm" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm-musleabihf": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm-musleabihf/-/rollup-linux-arm-musleabihf-4.60.0.tgz", + "integrity": "sha512-Sf7zusNI2CIU1HLzuu9Tc5YGAHEZs5Lu7N1ssJG4Tkw6e0MEsN7NdjUDDfGNHy2IU+ENyWT+L2obgWiguWibWQ==", + "cpu": [ + "arm" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm64-gnu": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-gnu/-/rollup-linux-arm64-gnu-4.60.0.tgz", + "integrity": "sha512-DX2x7CMcrJzsE91q7/O02IJQ5/aLkVtYFryqCjduJhUfGKG6yJV8hxaw8pZa93lLEpPTP/ohdN4wFz7yp/ry9A==", + "cpu": [ + "arm64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-arm64-musl": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-arm64-musl/-/rollup-linux-arm64-musl-4.60.0.tgz", + "integrity": "sha512-09EL+yFVbJZlhcQfShpswwRZ0Rg+z/CsSELFCnPt3iK+iqwGsI4zht3secj5vLEs957QvFFXnzAT0FFPIxSrkQ==", + "cpu": [ + "arm64" + ], + "dev": true, + 
"libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-loong64-gnu": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loong64-gnu/-/rollup-linux-loong64-gnu-4.60.0.tgz", + "integrity": "sha512-i9IcCMPr3EXm8EQg5jnja0Zyc1iFxJjZWlb4wr7U2Wx/GrddOuEafxRdMPRYVaXjgbhvqalp6np07hN1w9kAKw==", + "cpu": [ + "loong64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-loong64-musl": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-loong64-musl/-/rollup-linux-loong64-musl-4.60.0.tgz", + "integrity": "sha512-DGzdJK9kyJ+B78MCkWeGnpXJ91tK/iKA6HwHxF4TAlPIY7GXEvMe8hBFRgdrR9Ly4qebR/7gfUs9y2IoaVEyog==", + "cpu": [ + "loong64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-ppc64-gnu": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-ppc64-gnu/-/rollup-linux-ppc64-gnu-4.60.0.tgz", + "integrity": "sha512-RwpnLsqC8qbS8z1H1AxBA1H6qknR4YpPR9w2XX0vo2Sz10miu57PkNcnHVaZkbqyw/kUWfKMI73jhmfi9BRMUQ==", + "cpu": [ + "ppc64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-ppc64-musl": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-ppc64-musl/-/rollup-linux-ppc64-musl-4.60.0.tgz", + "integrity": "sha512-Z8pPf54Ly3aqtdWC3G4rFigZgNvd+qJlOE52fmko3KST9SoGfAdSRCwyoyG05q1HrrAblLbk1/PSIV+80/pxLg==", + "cpu": [ + "ppc64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-riscv64-gnu": { + "version": "4.60.0", + "resolved": 
"https://registry.npmjs.org/@rollup/rollup-linux-riscv64-gnu/-/rollup-linux-riscv64-gnu-4.60.0.tgz", + "integrity": "sha512-3a3qQustp3COCGvnP4SvrMHnPQ9d1vzCakQVRTliaz8cIp/wULGjiGpbcqrkv0WrHTEp8bQD/B3HBjzujVWLOA==", + "cpu": [ + "riscv64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-riscv64-musl": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-riscv64-musl/-/rollup-linux-riscv64-musl-4.60.0.tgz", + "integrity": "sha512-pjZDsVH/1VsghMJ2/kAaxt6dL0psT6ZexQVrijczOf+PeP2BUqTHYejk3l6TlPRydggINOeNRhvpLa0AYpCWSQ==", + "cpu": [ + "riscv64" + ], + "dev": true, + "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-s390x-gnu": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-s390x-gnu/-/rollup-linux-s390x-gnu-4.60.0.tgz", + "integrity": "sha512-3ObQs0BhvPgiUVZrN7gqCSvmFuMWvWvsjG5ayJ3Lraqv+2KhOsp+pUbigqbeWqueGIsnn+09HBw27rJ+gYK4VQ==", + "cpu": [ + "s390x" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-x64-gnu": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-gnu/-/rollup-linux-x64-gnu-4.60.0.tgz", + "integrity": "sha512-EtylprDtQPdS5rXvAayrNDYoJhIz1/vzN2fEubo3yLE7tfAw+948dO0g4M0vkTVFhKojnF+n6C8bDNe+gDRdTg==", + "cpu": [ + "x64" + ], + "dev": true, + "libc": [ + "glibc" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-linux-x64-musl": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-linux-x64-musl/-/rollup-linux-x64-musl-4.60.0.tgz", + "integrity": "sha512-k09oiRCi/bHU9UVFqD17r3eJR9bn03TyKraCrlz5ULFJGdJGi7VOmm9jl44vOJvRJ6P7WuBi/s2A97LxxHGIdw==", + "cpu": [ + "x64" + ], + "dev": true, 
+ "libc": [ + "musl" + ], + "license": "MIT", + "optional": true, + "os": [ + "linux" + ] + }, + "node_modules/@rollup/rollup-openbsd-x64": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-openbsd-x64/-/rollup-openbsd-x64-4.60.0.tgz", + "integrity": "sha512-1o/0/pIhozoSaDJoDcec+IVLbnRtQmHwPV730+AOD29lHEEo4F5BEUB24H0OBdhbBBDwIOSuf7vgg0Ywxdfiiw==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openbsd" + ] + }, + "node_modules/@rollup/rollup-openharmony-arm64": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-openharmony-arm64/-/rollup-openharmony-arm64-4.60.0.tgz", + "integrity": "sha512-pESDkos/PDzYwtyzB5p/UoNU/8fJo68vcXM9ZW2V0kjYayj1KaaUfi1NmTUTUpMn4UhU4gTuK8gIaFO4UGuMbA==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "openharmony" + ] + }, + "node_modules/@rollup/rollup-win32-arm64-msvc": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-arm64-msvc/-/rollup-win32-arm64-msvc-4.60.0.tgz", + "integrity": "sha512-hj1wFStD7B1YBeYmvY+lWXZ7ey73YGPcViMShYikqKT1GtstIKQAtfUI6yrzPjAy/O7pO0VLXGmUVWXQMaYgTQ==", + "cpu": [ + "arm64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-ia32-msvc": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-ia32-msvc/-/rollup-win32-ia32-msvc-4.60.0.tgz", + "integrity": "sha512-SyaIPFoxmUPlNDq5EHkTbiKzmSEmq/gOYFI/3HHJ8iS/v1mbugVa7dXUzcJGQfoytp9DJFLhHH4U3/eTy2Bq4w==", + "cpu": [ + "ia32" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-x64-gnu": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-gnu/-/rollup-win32-x64-gnu-4.60.0.tgz", + "integrity": 
"sha512-RdcryEfzZr+lAr5kRm2ucN9aVlCCa2QNq4hXelZxb8GG0NJSazq44Z3PCCc8wISRuCVnGs0lQJVX5Vp6fKA+IA==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@rollup/rollup-win32-x64-msvc": { + "version": "4.60.0", + "resolved": "https://registry.npmjs.org/@rollup/rollup-win32-x64-msvc/-/rollup-win32-x64-msvc-4.60.0.tgz", + "integrity": "sha512-PrsWNQ8BuE00O3Xsx3ALh2Df8fAj9+cvvX9AIA6o4KpATR98c9mud4XtDWVvsEuyia5U4tVSTKygawyJkjm60w==", + "cpu": [ + "x64" + ], + "dev": true, + "license": "MIT", + "optional": true, + "os": [ + "win32" + ] + }, + "node_modules/@types/chai": { + "version": "5.2.3", + "resolved": "https://registry.npmjs.org/@types/chai/-/chai-5.2.3.tgz", + "integrity": "sha512-Mw558oeA9fFbv65/y4mHtXDs9bPnFMZAL/jxdPFUpOHHIXX91mcgEHbS5Lahr+pwZFR8A7GQleRWeI6cGFC2UA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/deep-eql": "*", + "assertion-error": "^2.0.1" + } + }, + "node_modules/@types/deep-eql": { + "version": "4.0.2", + "resolved": "https://registry.npmjs.org/@types/deep-eql/-/deep-eql-4.0.2.tgz", + "integrity": "sha512-c9h9dVVMigMPc4bwTvC5dxqtqJZwQPePsWjPlpSOnojbor6pGqdk541lfA7AqFQr5pB1BRdq0juY9db81BwyFw==", + "dev": true, + "license": "MIT" + }, + "node_modules/@types/estree": { + "version": "1.0.8", + "resolved": "https://registry.npmjs.org/@types/estree/-/estree-1.0.8.tgz", + "integrity": "sha512-dWHzHa2WqEXI/O1E9OjrocMTKJl2mSrEolh1Iomrv6U+JuNwaHXsXx9bLu5gG7BUWFIN0skIQJQ/L1rIex4X6w==", + "dev": true, + "license": "MIT" + }, + "node_modules/@vitest/expect": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/@vitest/expect/-/expect-3.2.4.tgz", + "integrity": "sha512-Io0yyORnB6sikFlt8QW5K7slY4OjqNX9jmJQ02QDda8lyM6B5oNgVWoSoKPac8/kgnCUzuHQKrSLtu/uOqqrig==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/chai": "^5.2.2", + "@vitest/spy": "3.2.4", + "@vitest/utils": "3.2.4", + "chai": "^5.2.0", + "tinyrainbow": "^2.0.0" + }, + "funding": 
{ + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/mocker": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/@vitest/mocker/-/mocker-3.2.4.tgz", + "integrity": "sha512-46ryTE9RZO/rfDd7pEqFl7etuyzekzEhUbTW3BvmeO/BcCMEgq59BKhek3dXDWgAj4oMK6OZi+vRr1wPW6qjEQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@vitest/spy": "3.2.4", + "estree-walker": "^3.0.3", + "magic-string": "^0.30.17" + }, + "funding": { + "url": "https://opencollective.com/vitest" + }, + "peerDependencies": { + "msw": "^2.4.9", + "vite": "^5.0.0 || ^6.0.0 || ^7.0.0-0" + }, + "peerDependenciesMeta": { + "msw": { + "optional": true + }, + "vite": { + "optional": true + } + } + }, + "node_modules/@vitest/pretty-format": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/@vitest/pretty-format/-/pretty-format-3.2.4.tgz", + "integrity": "sha512-IVNZik8IVRJRTr9fxlitMKeJeXFFFN0JaB9PHPGQ8NKQbGpfjlTx9zO4RefN8gp7eqjNy8nyK3NZmBzOPeIxtA==", + "dev": true, + "license": "MIT", + "dependencies": { + "tinyrainbow": "^2.0.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/runner": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/@vitest/runner/-/runner-3.2.4.tgz", + "integrity": "sha512-oukfKT9Mk41LreEW09vt45f8wx7DordoWUZMYdY/cyAk7w5TWkTRCNZYF7sX7n2wB7jyGAl74OxgwhPgKaqDMQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@vitest/utils": "3.2.4", + "pathe": "^2.0.3", + "strip-literal": "^3.0.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/snapshot": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/@vitest/snapshot/-/snapshot-3.2.4.tgz", + "integrity": "sha512-dEYtS7qQP2CjU27QBC5oUOxLE/v5eLkGqPE0ZKEIDGMs4vKWe7IjgLOeauHsR0D5YuuycGRO5oSRXnwnmA78fQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@vitest/pretty-format": "3.2.4", + "magic-string": "^0.30.17", + "pathe": "^2.0.3" + }, + 
"funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/spy": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/@vitest/spy/-/spy-3.2.4.tgz", + "integrity": "sha512-vAfasCOe6AIK70iP5UD11Ac4siNUNJ9i/9PZ3NKx07sG6sUxeag1LWdNrMWeKKYBLlzuK+Gn65Yd5nyL6ds+nw==", + "dev": true, + "license": "MIT", + "dependencies": { + "tinyspy": "^4.0.3" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/@vitest/utils": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/@vitest/utils/-/utils-3.2.4.tgz", + "integrity": "sha512-fB2V0JFrQSMsCo9HiSq3Ezpdv4iYaXRG1Sx8edX3MwxfyNn83mKiGzOcH+Fkxt4MHxr3y42fQi1oeAInqgX2QA==", + "dev": true, + "license": "MIT", + "dependencies": { + "@vitest/pretty-format": "3.2.4", + "loupe": "^3.1.4", + "tinyrainbow": "^2.0.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/assertion-error": { + "version": "2.0.1", + "resolved": "https://registry.npmjs.org/assertion-error/-/assertion-error-2.0.1.tgz", + "integrity": "sha512-Izi8RQcffqCeNVgFigKli1ssklIbpHnCYc6AknXGYoB6grJqyeby7jv12JUQgmTAnIDnbck1uxksT4dzN3PWBA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12" + } + }, + "node_modules/cac": { + "version": "6.7.14", + "resolved": "https://registry.npmjs.org/cac/-/cac-6.7.14.tgz", + "integrity": "sha512-b6Ilus+c3RrdDk+JhLKUAQfzzgLEPy6wcXqS7f/xe1EETvsDP6GORG7SFuOs6cID5YkqchW/LXZbX5bc8j7ZcQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=8" + } + }, + "node_modules/chai": { + "version": "5.3.3", + "resolved": "https://registry.npmjs.org/chai/-/chai-5.3.3.tgz", + "integrity": "sha512-4zNhdJD/iOjSH0A05ea+Ke6MU5mmpQcbQsSOkgdaUMJ9zTlDTD/GYlwohmIE2u0gaxHYiVHEn1Fw9mZ/ktJWgw==", + "dev": true, + "license": "MIT", + "dependencies": { + "assertion-error": "^2.0.1", + "check-error": "^2.1.1", + "deep-eql": "^5.0.1", + "loupe": "^3.1.0", + "pathval": "^2.0.0" + }, + "engines": { + "node": ">=18" + 
} + }, + "node_modules/check-error": { + "version": "2.1.3", + "resolved": "https://registry.npmjs.org/check-error/-/check-error-2.1.3.tgz", + "integrity": "sha512-PAJdDJusoxnwm1VwW07VWwUN1sl7smmC3OKggvndJFadxxDRyFJBX/ggnu/KE4kQAB7a3Dp8f/YXC1FlUprWmA==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 16" + } + }, + "node_modules/debug": { + "version": "4.4.3", + "resolved": "https://registry.npmjs.org/debug/-/debug-4.4.3.tgz", + "integrity": "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA==", + "dev": true, + "license": "MIT", + "dependencies": { + "ms": "^2.1.3" + }, + "engines": { + "node": ">=6.0" + }, + "peerDependenciesMeta": { + "supports-color": { + "optional": true + } + } + }, + "node_modules/deep-eql": { + "version": "5.0.2", + "resolved": "https://registry.npmjs.org/deep-eql/-/deep-eql-5.0.2.tgz", + "integrity": "sha512-h5k/5U50IJJFpzfL6nO9jaaumfjO/f2NjK/oYB2Djzm4p9L+3T9qWpZqZ2hAbLPuuYq9wrU08WQyBTL5GbPk5Q==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=6" + } + }, + "node_modules/es-module-lexer": { + "version": "1.7.0", + "resolved": "https://registry.npmjs.org/es-module-lexer/-/es-module-lexer-1.7.0.tgz", + "integrity": "sha512-jEQoCwk8hyb2AZziIOLhDqpm5+2ww5uIE6lkO/6jcOCusfk6LhMHpXXfBLXTZ7Ydyt0j4VoUQv6uGNYbdW+kBA==", + "dev": true, + "license": "MIT" + }, + "node_modules/esbuild": { + "version": "0.27.4", + "resolved": "https://registry.npmjs.org/esbuild/-/esbuild-0.27.4.tgz", + "integrity": "sha512-Rq4vbHnYkK5fws5NF7MYTU68FPRE1ajX7heQ/8QXXWqNgqqJ/GkmmyxIzUnf2Sr/bakf8l54716CcMGHYhMrrQ==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "bin": { + "esbuild": "bin/esbuild" + }, + "engines": { + "node": ">=18" + }, + "optionalDependencies": { + "@esbuild/aix-ppc64": "0.27.4", + "@esbuild/android-arm": "0.27.4", + "@esbuild/android-arm64": "0.27.4", + "@esbuild/android-x64": "0.27.4", + "@esbuild/darwin-arm64": "0.27.4", + "@esbuild/darwin-x64": "0.27.4", + 
"@esbuild/freebsd-arm64": "0.27.4", + "@esbuild/freebsd-x64": "0.27.4", + "@esbuild/linux-arm": "0.27.4", + "@esbuild/linux-arm64": "0.27.4", + "@esbuild/linux-ia32": "0.27.4", + "@esbuild/linux-loong64": "0.27.4", + "@esbuild/linux-mips64el": "0.27.4", + "@esbuild/linux-ppc64": "0.27.4", + "@esbuild/linux-riscv64": "0.27.4", + "@esbuild/linux-s390x": "0.27.4", + "@esbuild/linux-x64": "0.27.4", + "@esbuild/netbsd-arm64": "0.27.4", + "@esbuild/netbsd-x64": "0.27.4", + "@esbuild/openbsd-arm64": "0.27.4", + "@esbuild/openbsd-x64": "0.27.4", + "@esbuild/openharmony-arm64": "0.27.4", + "@esbuild/sunos-x64": "0.27.4", + "@esbuild/win32-arm64": "0.27.4", + "@esbuild/win32-ia32": "0.27.4", + "@esbuild/win32-x64": "0.27.4" + } + }, + "node_modules/estree-walker": { + "version": "3.0.3", + "resolved": "https://registry.npmjs.org/estree-walker/-/estree-walker-3.0.3.tgz", + "integrity": "sha512-7RUKfXgSMMkzt6ZuXmqapOurLGPPfgj6l9uRZ7lRGolvk0y2yocc35LdcxKC5PQZdn2DMqioAQ2NoWcrTKmm6g==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/estree": "^1.0.0" + } + }, + "node_modules/expect-type": { + "version": "1.3.0", + "resolved": "https://registry.npmjs.org/expect-type/-/expect-type-1.3.0.tgz", + "integrity": "sha512-knvyeauYhqjOYvQ66MznSMs83wmHrCycNEN6Ao+2AeYEfxUIkuiVxdEa1qlGEPK+We3n0THiDciYSsCcgW/DoA==", + "dev": true, + "license": "Apache-2.0", + "engines": { + "node": ">=12.0.0" + } + }, + "node_modules/fdir": { + "version": "6.5.0", + "resolved": "https://registry.npmjs.org/fdir/-/fdir-6.5.0.tgz", + "integrity": "sha512-tIbYtZbucOs0BRGqPJkshJUYdL+SDH7dVM8gjy+ERp3WAUjLEFJE+02kanyHtwjWOnwrKYBiwAmM0p4kLJAnXg==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12.0.0" + }, + "peerDependencies": { + "picomatch": "^3 || ^4" + }, + "peerDependenciesMeta": { + "picomatch": { + "optional": true + } + } + }, + "node_modules/fsevents": { + "version": "2.3.3", + "resolved": "https://registry.npmjs.org/fsevents/-/fsevents-2.3.3.tgz", + "integrity": 
"sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw==", + "dev": true, + "hasInstallScript": true, + "license": "MIT", + "optional": true, + "os": [ + "darwin" + ], + "engines": { + "node": "^8.16.0 || ^10.6.0 || >=11.0.0" + } + }, + "node_modules/js-tokens": { + "version": "9.0.1", + "resolved": "https://registry.npmjs.org/js-tokens/-/js-tokens-9.0.1.tgz", + "integrity": "sha512-mxa9E9ITFOt0ban3j6L5MpjwegGz6lBQmM1IJkWeBZGcMxto50+eWdjC/52xDbS2vy0k7vIMK0Fe2wfL9OQSpQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/loupe": { + "version": "3.2.1", + "resolved": "https://registry.npmjs.org/loupe/-/loupe-3.2.1.tgz", + "integrity": "sha512-CdzqowRJCeLU72bHvWqwRBBlLcMEtIvGrlvef74kMnV2AolS9Y8xUv1I0U/MNAWMhBlKIoyuEgoJ0t/bbwHbLQ==", + "dev": true, + "license": "MIT" + }, + "node_modules/magic-string": { + "version": "0.30.21", + "resolved": "https://registry.npmjs.org/magic-string/-/magic-string-0.30.21.tgz", + "integrity": "sha512-vd2F4YUyEXKGcLHoq+TEyCjxueSeHnFxyyjNp80yg0XV4vUhnDer/lvvlqM/arB5bXQN5K2/3oinyCRyx8T2CQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@jridgewell/sourcemap-codec": "^1.5.5" + } + }, + "node_modules/ms": { + "version": "2.1.3", + "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz", + "integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA==", + "dev": true, + "license": "MIT" + }, + "node_modules/nanoid": { + "version": "3.3.11", + "resolved": "https://registry.npmjs.org/nanoid/-/nanoid-3.3.11.tgz", + "integrity": "sha512-N8SpfPUnUp1bK+PMYW8qSWdl9U+wwNWI4QKxOYDy9JAro3WMX7p2OeVRF9v+347pnakNevPmiHhNmZ2HbFA76w==", + "dev": true, + "funding": [ + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "bin": { + "nanoid": "bin/nanoid.cjs" + }, + "engines": { + "node": "^10 || ^12 || ^13.7 || ^14 || >=15.0.1" + } + }, + "node_modules/pathe": { + "version": "2.0.3", + "resolved": 
"https://registry.npmjs.org/pathe/-/pathe-2.0.3.tgz", + "integrity": "sha512-WUjGcAqP1gQacoQe+OBJsFA7Ld4DyXuUIjZ5cc75cLHvJ7dtNsTugphxIADwspS+AraAUePCKrSVtPLFj/F88w==", + "dev": true, + "license": "MIT" + }, + "node_modules/pathval": { + "version": "2.0.1", + "resolved": "https://registry.npmjs.org/pathval/-/pathval-2.0.1.tgz", + "integrity": "sha512-//nshmD55c46FuFw26xV/xFAaB5HF9Xdap7HJBBnrKdAd6/GxDBaNA1870O79+9ueg61cZLSVc+OaFlfmObYVQ==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">= 14.16" + } + }, + "node_modules/picocolors": { + "version": "1.1.1", + "resolved": "https://registry.npmjs.org/picocolors/-/picocolors-1.1.1.tgz", + "integrity": "sha512-xceH2snhtb5M9liqDsmEw56le376mTZkEX/jEb/RxNFyegNul7eNslCXP9FDj/Lcu0X8KEyMceP2ntpaHrDEVA==", + "dev": true, + "license": "ISC" + }, + "node_modules/picomatch": { + "version": "4.0.4", + "resolved": "https://registry.npmjs.org/picomatch/-/picomatch-4.0.4.tgz", + "integrity": "sha512-QP88BAKvMam/3NxH6vj2o21R6MjxZUAd6nlwAS/pnGvN9IVLocLHxGYIzFhg6fUQ+5th6P4dv4eW9jX3DSIj7A==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=12" + }, + "funding": { + "url": "https://github.com/sponsors/jonschlinkert" + } + }, + "node_modules/postcss": { + "version": "8.5.8", + "resolved": "https://registry.npmjs.org/postcss/-/postcss-8.5.8.tgz", + "integrity": "sha512-OW/rX8O/jXnm82Ey1k44pObPtdblfiuWnrd8X7GJ7emImCOstunGbXUpp7HdBrFQX6rJzn3sPT397Wp5aCwCHg==", + "dev": true, + "funding": [ + { + "type": "opencollective", + "url": "https://opencollective.com/postcss/" + }, + { + "type": "tidelift", + "url": "https://tidelift.com/funding/github/npm/postcss" + }, + { + "type": "github", + "url": "https://github.com/sponsors/ai" + } + ], + "license": "MIT", + "dependencies": { + "nanoid": "^3.3.11", + "picocolors": "^1.1.1", + "source-map-js": "^1.2.1" + }, + "engines": { + "node": "^10 || ^12 || >=14" + } + }, + "node_modules/rollup": { + "version": "4.60.0", + "resolved": 
"https://registry.npmjs.org/rollup/-/rollup-4.60.0.tgz", + "integrity": "sha512-yqjxruMGBQJ2gG4HtjZtAfXArHomazDHoFwFFmZZl0r7Pdo7qCIXKqKHZc8yeoMgzJJ+pO6pEEHa+V7uzWlrAQ==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/estree": "1.0.8" + }, + "bin": { + "rollup": "dist/bin/rollup" + }, + "engines": { + "node": ">=18.0.0", + "npm": ">=8.0.0" + }, + "optionalDependencies": { + "@rollup/rollup-android-arm-eabi": "4.60.0", + "@rollup/rollup-android-arm64": "4.60.0", + "@rollup/rollup-darwin-arm64": "4.60.0", + "@rollup/rollup-darwin-x64": "4.60.0", + "@rollup/rollup-freebsd-arm64": "4.60.0", + "@rollup/rollup-freebsd-x64": "4.60.0", + "@rollup/rollup-linux-arm-gnueabihf": "4.60.0", + "@rollup/rollup-linux-arm-musleabihf": "4.60.0", + "@rollup/rollup-linux-arm64-gnu": "4.60.0", + "@rollup/rollup-linux-arm64-musl": "4.60.0", + "@rollup/rollup-linux-loong64-gnu": "4.60.0", + "@rollup/rollup-linux-loong64-musl": "4.60.0", + "@rollup/rollup-linux-ppc64-gnu": "4.60.0", + "@rollup/rollup-linux-ppc64-musl": "4.60.0", + "@rollup/rollup-linux-riscv64-gnu": "4.60.0", + "@rollup/rollup-linux-riscv64-musl": "4.60.0", + "@rollup/rollup-linux-s390x-gnu": "4.60.0", + "@rollup/rollup-linux-x64-gnu": "4.60.0", + "@rollup/rollup-linux-x64-musl": "4.60.0", + "@rollup/rollup-openbsd-x64": "4.60.0", + "@rollup/rollup-openharmony-arm64": "4.60.0", + "@rollup/rollup-win32-arm64-msvc": "4.60.0", + "@rollup/rollup-win32-ia32-msvc": "4.60.0", + "@rollup/rollup-win32-x64-gnu": "4.60.0", + "@rollup/rollup-win32-x64-msvc": "4.60.0", + "fsevents": "~2.3.2" + } + }, + "node_modules/siginfo": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/siginfo/-/siginfo-2.0.0.tgz", + "integrity": "sha512-ybx0WO1/8bSBLEWXZvEd7gMW3Sn3JFlW3TvX1nREbDLRNQNaeNN8WK0meBwPdAaOI7TtRRRJn/Es1zhrrCHu7g==", + "dev": true, + "license": "ISC" + }, + "node_modules/source-map-js": { + "version": "1.2.1", + "resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.1.tgz", + 
"integrity": "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA==", + "dev": true, + "license": "BSD-3-Clause", + "engines": { + "node": ">=0.10.0" + } + }, + "node_modules/stackback": { + "version": "0.0.2", + "resolved": "https://registry.npmjs.org/stackback/-/stackback-0.0.2.tgz", + "integrity": "sha512-1XMJE5fQo1jGH6Y/7ebnwPOBEkIEnT4QF32d5R1+VXdXveM0IBMJt8zfaxX1P3QhVwrYe+576+jkANtSS2mBbw==", + "dev": true, + "license": "MIT" + }, + "node_modules/std-env": { + "version": "3.10.0", + "resolved": "https://registry.npmjs.org/std-env/-/std-env-3.10.0.tgz", + "integrity": "sha512-5GS12FdOZNliM5mAOxFRg7Ir0pWz8MdpYm6AY6VPkGpbA7ZzmbzNcBJQ0GPvvyWgcY7QAhCgf9Uy89I03faLkg==", + "dev": true, + "license": "MIT" + }, + "node_modules/strip-literal": { + "version": "3.1.0", + "resolved": "https://registry.npmjs.org/strip-literal/-/strip-literal-3.1.0.tgz", + "integrity": "sha512-8r3mkIM/2+PpjHoOtiAW8Rg3jJLHaV7xPwG+YRGrv6FP0wwk/toTpATxWYOW0BKdWwl82VT2tFYi5DlROa0Mxg==", + "dev": true, + "license": "MIT", + "dependencies": { + "js-tokens": "^9.0.1" + }, + "funding": { + "url": "https://github.com/sponsors/antfu" + } + }, + "node_modules/tinybench": { + "version": "2.9.0", + "resolved": "https://registry.npmjs.org/tinybench/-/tinybench-2.9.0.tgz", + "integrity": "sha512-0+DUvqWMValLmha6lr4kD8iAMK1HzV0/aKnCtWb9v9641TnP/MFb7Pc2bxoxQjTXAErryXVgUOfv2YqNllqGeg==", + "dev": true, + "license": "MIT" + }, + "node_modules/tinyexec": { + "version": "0.3.2", + "resolved": "https://registry.npmjs.org/tinyexec/-/tinyexec-0.3.2.tgz", + "integrity": "sha512-KQQR9yN7R5+OSwaK0XQoj22pwHoTlgYqmUscPYoknOoWCWfj/5/ABTMRi69FrKU5ffPVh5QcFikpWJI/P1ocHA==", + "dev": true, + "license": "MIT" + }, + "node_modules/tinyglobby": { + "version": "0.2.15", + "resolved": "https://registry.npmjs.org/tinyglobby/-/tinyglobby-0.2.15.tgz", + "integrity": "sha512-j2Zq4NyQYG5XMST4cbs02Ak8iJUdxRM0XI5QyxXuZOzKOINmWurp3smXu3y5wDcJrptwpSjgXHzIQxR0omXljQ==", + "dev": true, + "license": 
"MIT", + "dependencies": { + "fdir": "^6.5.0", + "picomatch": "^4.0.3" + }, + "engines": { + "node": ">=12.0.0" + }, + "funding": { + "url": "https://github.com/sponsors/SuperchupuDev" + } + }, + "node_modules/tinypool": { + "version": "1.1.1", + "resolved": "https://registry.npmjs.org/tinypool/-/tinypool-1.1.1.tgz", + "integrity": "sha512-Zba82s87IFq9A9XmjiX5uZA/ARWDrB03OHlq+Vw1fSdt0I+4/Kutwy8BP4Y/y/aORMo61FQ0vIb5j44vSo5Pkg==", + "dev": true, + "license": "MIT", + "engines": { + "node": "^18.0.0 || >=20.0.0" + } + }, + "node_modules/tinyrainbow": { + "version": "2.0.0", + "resolved": "https://registry.npmjs.org/tinyrainbow/-/tinyrainbow-2.0.0.tgz", + "integrity": "sha512-op4nsTR47R6p0vMUUoYl/a+ljLFVtlfaXkLQmqfLR1qHma1h/ysYk4hEXZ880bf2CYgTskvTa/e196Vd5dDQXw==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14.0.0" + } + }, + "node_modules/tinyspy": { + "version": "4.0.4", + "resolved": "https://registry.npmjs.org/tinyspy/-/tinyspy-4.0.4.tgz", + "integrity": "sha512-azl+t0z7pw/z958Gy9svOTuzqIk6xq+NSheJzn5MMWtWTFywIacg2wUlzKFGtt3cthx0r2SxMK0yzJOR0IES7Q==", + "dev": true, + "license": "MIT", + "engines": { + "node": ">=14.0.0" + } + }, + "node_modules/typescript": { + "version": "5.9.3", + "resolved": "https://registry.npmjs.org/typescript/-/typescript-5.9.3.tgz", + "integrity": "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw==", + "dev": true, + "license": "Apache-2.0", + "bin": { + "tsc": "bin/tsc", + "tsserver": "bin/tsserver" + }, + "engines": { + "node": ">=14.17" + } + }, + "node_modules/vite": { + "version": "7.3.1", + "resolved": "https://registry.npmjs.org/vite/-/vite-7.3.1.tgz", + "integrity": "sha512-w+N7Hifpc3gRjZ63vYBXA56dvvRlNWRczTdmCBBa+CotUzAPf5b7YMdMR/8CQoeYE5LX3W4wj6RYTgonm1b9DA==", + "dev": true, + "license": "MIT", + "dependencies": { + "esbuild": "^0.27.0", + "fdir": "^6.5.0", + "picomatch": "^4.0.3", + "postcss": "^8.5.6", + "rollup": "^4.43.0", + "tinyglobby": "^0.2.15" + }, + 
"bin": { + "vite": "bin/vite.js" + }, + "engines": { + "node": "^20.19.0 || >=22.12.0" + }, + "funding": { + "url": "https://github.com/vitejs/vite?sponsor=1" + }, + "optionalDependencies": { + "fsevents": "~2.3.3" + }, + "peerDependencies": { + "@types/node": "^20.19.0 || >=22.12.0", + "jiti": ">=1.21.0", + "less": "^4.0.0", + "lightningcss": "^1.21.0", + "sass": "^1.70.0", + "sass-embedded": "^1.70.0", + "stylus": ">=0.54.8", + "sugarss": "^5.0.0", + "terser": "^5.16.0", + "tsx": "^4.8.1", + "yaml": "^2.4.2" + }, + "peerDependenciesMeta": { + "@types/node": { + "optional": true + }, + "jiti": { + "optional": true + }, + "less": { + "optional": true + }, + "lightningcss": { + "optional": true + }, + "sass": { + "optional": true + }, + "sass-embedded": { + "optional": true + }, + "stylus": { + "optional": true + }, + "sugarss": { + "optional": true + }, + "terser": { + "optional": true + }, + "tsx": { + "optional": true + }, + "yaml": { + "optional": true + } + } + }, + "node_modules/vite-node": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/vite-node/-/vite-node-3.2.4.tgz", + "integrity": "sha512-EbKSKh+bh1E1IFxeO0pg1n4dvoOTt0UDiXMd/qn++r98+jPO1xtJilvXldeuQ8giIB5IkpjCgMleHMNEsGH6pg==", + "dev": true, + "license": "MIT", + "dependencies": { + "cac": "^6.7.14", + "debug": "^4.4.1", + "es-module-lexer": "^1.7.0", + "pathe": "^2.0.3", + "vite": "^5.0.0 || ^6.0.0 || ^7.0.0-0" + }, + "bin": { + "vite-node": "vite-node.mjs" + }, + "engines": { + "node": "^18.0.0 || ^20.0.0 || >=22.0.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + } + }, + "node_modules/vitest": { + "version": "3.2.4", + "resolved": "https://registry.npmjs.org/vitest/-/vitest-3.2.4.tgz", + "integrity": "sha512-LUCP5ev3GURDysTWiP47wRRUpLKMOfPh+yKTx3kVIEiu5KOMeqzpnYNsKyOoVrULivR8tLcks4+lga33Whn90A==", + "dev": true, + "license": "MIT", + "dependencies": { + "@types/chai": "^5.2.2", + "@vitest/expect": "3.2.4", + "@vitest/mocker": "3.2.4", + "@vitest/pretty-format": 
"^3.2.4", + "@vitest/runner": "3.2.4", + "@vitest/snapshot": "3.2.4", + "@vitest/spy": "3.2.4", + "@vitest/utils": "3.2.4", + "chai": "^5.2.0", + "debug": "^4.4.1", + "expect-type": "^1.2.1", + "magic-string": "^0.30.17", + "pathe": "^2.0.3", + "picomatch": "^4.0.2", + "std-env": "^3.9.0", + "tinybench": "^2.9.0", + "tinyexec": "^0.3.2", + "tinyglobby": "^0.2.14", + "tinypool": "^1.1.1", + "tinyrainbow": "^2.0.0", + "vite": "^5.0.0 || ^6.0.0 || ^7.0.0-0", + "vite-node": "3.2.4", + "why-is-node-running": "^2.3.0" + }, + "bin": { + "vitest": "vitest.mjs" + }, + "engines": { + "node": "^18.0.0 || ^20.0.0 || >=22.0.0" + }, + "funding": { + "url": "https://opencollective.com/vitest" + }, + "peerDependencies": { + "@edge-runtime/vm": "*", + "@types/debug": "^4.1.12", + "@types/node": "^18.0.0 || ^20.0.0 || >=22.0.0", + "@vitest/browser": "3.2.4", + "@vitest/ui": "3.2.4", + "happy-dom": "*", + "jsdom": "*" + }, + "peerDependenciesMeta": { + "@edge-runtime/vm": { + "optional": true + }, + "@types/debug": { + "optional": true + }, + "@types/node": { + "optional": true + }, + "@vitest/browser": { + "optional": true + }, + "@vitest/ui": { + "optional": true + }, + "happy-dom": { + "optional": true + }, + "jsdom": { + "optional": true + } + } + }, + "node_modules/why-is-node-running": { + "version": "2.3.0", + "resolved": "https://registry.npmjs.org/why-is-node-running/-/why-is-node-running-2.3.0.tgz", + "integrity": "sha512-hUrmaWBdVDcxvYqnyh09zunKzROWjbZTiNy8dBEjkS7ehEDQibXJ7XvlmtbwuTclUiIyN+CyXQD4Vmko8fNm8w==", + "dev": true, + "license": "MIT", + "dependencies": { + "siginfo": "^2.0.0", + "stackback": "0.0.2" + }, + "bin": { + "why-is-node-running": "cli.js" + }, + "engines": { + "node": ">=8" + } + }, + "node_modules/yaml": { + "version": "2.8.3", + "resolved": "https://registry.npmjs.org/yaml/-/yaml-2.8.3.tgz", + "integrity": "sha512-AvbaCLOO2Otw/lW5bmh9d/WEdcDFdQp2Z2ZUH3pX9U2ihyUY0nvLv7J6TrWowklRGPYbB/IuIMfYgxaCPg5Bpg==", + "dev": true, + "license": "ISC", + "bin": { + 
"yaml": "bin.mjs" + }, + "engines": { + "node": ">= 14.6" + }, + "funding": { + "url": "https://github.com/sponsors/eemeli" + } + } + } +} diff --git a/internal/bloblang2/ts/package.json b/internal/bloblang2/ts/package.json new file mode 100644 index 000000000..0dfcab524 --- /dev/null +++ b/internal/bloblang2/ts/package.json @@ -0,0 +1,19 @@ +{ + "name": "@bloblang/v2", + "version": "0.0.1", + "private": true, + "type": "module", + "main": "dist/index.js", + "types": "dist/index.d.ts", + "scripts": { + "build": "tsc", + "bundle": "node bundle.mjs", + "test": "vitest run", + "test:watch": "vitest" + }, + "devDependencies": { + "typescript": "^5.7.0", + "vitest": "^3.0.0", + "yaml": "^2.7.0" + } +} diff --git a/internal/bloblang2/ts/src/arithmetic.ts b/internal/bloblang2/ts/src/arithmetic.ts new file mode 100644 index 000000000..61afc2be2 --- /dev/null +++ b/internal/bloblang2/ts/src/arithmetic.ts @@ -0,0 +1,714 @@ +// Binary arithmetic and comparison operations for the Bloblang V2 interpreter. + +import { TokenType } from "./token.js"; +import { + type Value, + mkInt64, + mkFloat64, + mkString, + mkBool, + mkError, + mkBytes, + mkInt32, + mkUint32, + mkUint64, + mkFloat32, + NULL, + TRUE, + FALSE, + isError, + isNull, + isString, + isBytes, + isNumeric, + isFloat32, + isFloat64, + isInt32, + isInt64, + isUint32, + isUint64, + isTimestamp, + typeName, + valuesEqual, + promoteWithError, + promoteChecked, + MAX_INT64, + MIN_INT64, + MAX_INT32, + MIN_INT32, + MAX_UINT32, + MAX_UINT64, + MAX_SAFE_FLOAT64, +} from "./value.js"; + +// --------------------------------------------------------------------------- +// Public API +// --------------------------------------------------------------------------- + +export function evalBinaryOp(op: TokenType, left: Value, right: Value): Value { + // Timestamp subtraction: ts - ts = int64 nanoseconds. 
+ if (op === TokenType.MINUS) { + if (isTimestamp(left) && isTimestamp(right)) { + const diff = left.value - right.value; + if (diff > MAX_INT64 || diff < MIN_INT64) { + return mkError( + "timestamp subtraction overflow: difference exceeds int64 nanosecond range (~292 years)", + ); + } + return mkInt64(diff); + } + if (isTimestamp(left)) { + return mkError("cannot subtract timestamp and " + typeName(right)); + } + } + + // Timestamp comparison. + if (isTimestamp(left)) { + if (isTimestamp(right)) { + switch (op) { + case TokenType.GT: + return mkBool(left.value > right.value); + case TokenType.GE: + return mkBool(left.value >= right.value); + case TokenType.LT: + return mkBool(left.value < right.value); + case TokenType.LE: + return mkBool(left.value <= right.value); + case TokenType.EQ: + return mkBool(left.value === right.value); + case TokenType.NE: + return mkBool(left.value !== right.value); + default: + return mkError( + "cannot " + + opVerb(op) + + " timestamp and timestamp", + ); + } + } + if (op === TokenType.EQ || op === TokenType.NE) { + return mkBool(op === TokenType.NE); // cross-family + } + return mkError( + "cannot " + opVerb(op) + " timestamp and " + typeName(right), + ); + } + if (isTimestamp(right)) { + if (op === TokenType.EQ || op === TokenType.NE) { + return mkBool(op === TokenType.NE); // cross-family + } + return mkError( + "cannot " + opVerb(op) + " " + typeName(left) + " and timestamp", + ); + } + + switch (op) { + case TokenType.PLUS: + return evalAdd(left, right); + case TokenType.MINUS: + return evalArith(left, right, "-"); + case TokenType.STAR: + return evalArith(left, right, "*"); + case TokenType.SLASH: + return evalDivide(left, right); + case TokenType.PERCENT: + return evalModulo(left, right); + case TokenType.EQ: + return evalEquality(left, right, false); + case TokenType.NE: + return evalEquality(left, right, true); + case TokenType.GT: + return evalCompare(left, right, ">"); + case TokenType.GE: + return evalCompare(left, 
right, ">="); + case TokenType.LT: + return evalCompare(left, right, "<"); + case TokenType.LE: + return evalCompare(left, right, "<="); + default: + return mkError(`unknown binary operator ${op}`); + } +} + +export function numericNegate(v: Value): Value { + switch (v.tag) { + case "int64": + if (v.value === MIN_INT64) return mkError("int64 overflow"); + return mkInt64(-v.value); + case "int32": + if (v.value === MIN_INT32) return mkError("int32 overflow"); + return mkInt32(-v.value); + case "float64": + return mkFloat64(-v.value); + case "float32": + return mkFloat32(-v.value); + case "uint32": + return mkInt64(-BigInt(v.value)); + case "uint64": + if (v.value > MAX_INT64) { + return mkError( + "cannot negate uint64 value exceeding int64 range", + ); + } + return mkInt64(-v.value); + default: + return mkError(`cannot negate ${typeName(v)}`); + } +} + +// --------------------------------------------------------------------------- +// Internal helpers +// --------------------------------------------------------------------------- + +/** + * Equality with checked numeric promotion. Returns an error Value when + * cross-type numeric promotion fails (e.g., int64 > 2^53 vs float). + */ +function evalEquality(a: Value, b: Value, negate: boolean): Value { + if (isNull(a) && isNull(b)) return mkBool(!negate); + if (isNull(a) || isNull(b)) return mkBool(negate); + + // Numeric equality with checked promotion. + if (isNumeric(a) && isNumeric(b)) { + // Same tag: direct compare (no promotion needed). 
+ if (a.tag === b.tag) { + let eq: boolean; + switch (a.tag) { + case "float64": + eq = !isNaN(a.value) && !isNaN((b as { tag: "float64"; value: number }).value) && a.value === (b as { tag: "float64"; value: number }).value; + break; + case "float32": + eq = !isNaN(a.value) && !isNaN((b as { tag: "float32"; value: number }).value) && a.value === (b as { tag: "float32"; value: number }).value; + break; + default: + eq = (a as { value: unknown }).value === (b as { value: unknown }).value; + } + return mkBool(negate ? !eq : eq); + } + + // Different numeric types: checked promotion (can error). + const result = promoteWithError(a, b); + if ("error" in result) return mkError(result.error); + const [pa, pb, kind] = result.promoted; + + let eq: boolean; + switch (kind) { + case "int64": + eq = (pa as { value: bigint }).value === (pb as { value: bigint }).value; + break; + case "int32": + case "uint32": + eq = (pa as { value: number }).value === (pb as { value: number }).value; + break; + case "uint64": + eq = (pa as { value: bigint }).value === (pb as { value: bigint }).value; + break; + case "float64": + case "float32": { + const af = (pa as { value: number }).value; + const bf = (pb as { value: number }).value; + eq = !isNaN(af) && !isNaN(bf) && af === bf; + break; + } + default: + eq = false; + } + return mkBool(negate ? !eq : eq); + } + + // Non-numeric: use existing valuesEqual (no promotion errors possible). + const eq = valuesEqual(a, b); + return mkBool(negate ? !eq : eq); +} + +function evalAdd(left: Value, right: Value): Value { + // String concatenation. + if (isString(left)) { + if (!isString(right)) { + return mkError( + "cannot add string and " + typeName(right) + ": not numeric", + ); + } + return mkString(left.value + right.value); + } + if (isString(right)) { + return mkError( + "cannot add " + typeName(left) + " and string: not numeric", + ); + } + // Bytes concatenation. 
+ if (isBytes(left)) { + if (!isBytes(right)) { + return mkError("cannot add bytes and " + typeName(right)); + } + const result = new Uint8Array(left.value.length + right.value.length); + result.set(left.value); + result.set(right.value, left.value.length); + return mkBytes(result); + } + // Numeric addition. + return evalArith(left, right, "+"); +} + +function evalArith(left: Value, right: Value, op: string): Value { + if (!isNumeric(left) || !isNumeric(right)) { + return arithError(left, right, op); + } + + const result = promoteWithError(left, right); + if ("error" in result) return mkError(result.error); + const [pl, pr, kind] = result.promoted; + + switch (kind) { + case "int64": + return checkedInt64Arith( + (pl as { tag: "int64"; value: bigint }).value, + (pr as { tag: "int64"; value: bigint }).value, + op, + ); + case "int32": + return checkedInt32Arith( + (pl as { tag: "int32"; value: number }).value, + (pr as { tag: "int32"; value: number }).value, + op, + ); + case "uint32": + return checkedUint32Arith( + (pl as { tag: "uint32"; value: number }).value, + (pr as { tag: "uint32"; value: number }).value, + op, + ); + case "uint64": + return checkedUint64Arith( + (pl as { tag: "uint64"; value: bigint }).value, + (pr as { tag: "uint64"; value: bigint }).value, + op, + ); + case "float64": + return floatArith( + (pl as { tag: "float64"; value: number }).value, + (pr as { tag: "float64"; value: number }).value, + op, + ); + case "float32": + return float32Arith( + (pl as { tag: "float32"; value: number }).value, + (pr as { tag: "float32"; value: number }).value, + op, + ); + default: + return mkError("unexpected promotion result"); + } +} + +function evalDivide(left: Value, right: Value): Value { + if (!isNumeric(left) || !isNumeric(right)) { + return mkError( + `cannot divide ${typeName(left)} by ${typeName(right)}`, + ); + } + + // Division always produces float. + // float32 / float32 → float32, all else → float64. 
+ if (isFloat32(left) && isFloat32(right)) { + if (right.value === 0) return mkError("division by zero"); + return mkFloat32(left.value / right.value); + } + + const af = checkedToFloat64(left); + const bf = checkedToFloat64(right); + if (af === null || bf === null) { + return mkError( + "integer exceeds float64 exact range (magnitude > 2^53)", + ); + } + if (bf === 0) return mkError("division by zero"); + return mkFloat64(af / bf); +} + +function evalModulo(left: Value, right: Value): Value { + if (!isNumeric(left) || !isNumeric(right)) { + return mkError( + `cannot modulo ${typeName(left)} by ${typeName(right)}`, + ); + } + + const result = promoteWithError(left, right); + if ("error" in result) return mkError(result.error); + const [pl, pr, kind] = result.promoted; + + switch (kind) { + case "int64": { + const a = (pl as { tag: "int64"; value: bigint }).value; + const b = (pr as { tag: "int64"; value: bigint }).value; + if (b === 0n) return mkError("modulo by zero"); + if (a === MIN_INT64 && b === -1n) return mkError("int64 overflow"); + return mkInt64(a % b); + } + case "int32": { + const a = (pl as { tag: "int32"; value: number }).value; + const b = (pr as { tag: "int32"; value: number }).value; + if (b === 0) return mkError("modulo by zero"); + if (a === MIN_INT32 && b === -1) return mkError("int32 overflow"); + return mkInt32(a % b); + } + case "uint32": { + const a = (pl as { tag: "uint32"; value: number }).value; + const b = (pr as { tag: "uint32"; value: number }).value; + if (b === 0) return mkError("modulo by zero"); + return mkUint32(a % b); + } + case "uint64": { + const a = (pl as { tag: "uint64"; value: bigint }).value; + const b = (pr as { tag: "uint64"; value: bigint }).value; + if (b === 0n) return mkError("modulo by zero"); + return mkUint64(a % b); + } + case "float64": { + const a = (pl as { tag: "float64"; value: number }).value; + const b = (pr as { tag: "float64"; value: number }).value; + if (b === 0) return mkError("modulo by zero"); + 
return mkFloat64(a % b); + } + case "float32": { + const a = (pl as { tag: "float32"; value: number }).value; + const b = (pr as { tag: "float32"; value: number }).value; + if (b === 0) return mkError("modulo by zero"); + return mkFloat32(Math.fround(a % b)); + } + default: + return mkError("unexpected promotion result"); + } +} + +function evalCompare(left: Value, right: Value, op: string): Value { + if (!isNumeric(left) && !isNumeric(right)) { + // String comparison. + if (isString(left)) { + if (!isString(right)) { + return mkError( + `cannot compare string and ${typeName(right)}: not comparable`, + ); + } + return stringCompare(left.value, right.value, op); + } + // Bytes comparison (lexicographic). + if (isBytes(left)) { + if (!isBytes(right)) { + return mkError( + `cannot compare bytes and ${typeName(right)}: not comparable`, + ); + } + return bytesCompare(left.value, right.value, op); + } + return mkError( + `cannot compare ${typeName(left)} and ${typeName(right)}: not comparable types`, + ); + } + if (!isNumeric(left) || !isNumeric(right)) { + return mkError( + `cannot compare ${typeName(left)} and ${typeName(right)}: not comparable types`, + ); + } + + // Promote using checked rules. 
+ const result = promoteWithError(left, right); + if ("error" in result) return mkError(result.error); + const [pl, pr, kind] = result.promoted; + + switch (kind) { + case "int64": + return compareOrdered( + (pl as { tag: "int64"; value: bigint }).value, + (pr as { tag: "int64"; value: bigint }).value, + op, + ); + case "int32": + return compareOrdered( + BigInt((pl as { tag: "int32"; value: number }).value), + BigInt((pr as { tag: "int32"; value: number }).value), + op, + ); + case "uint32": + return compareOrdered( + BigInt((pl as { tag: "uint32"; value: number }).value), + BigInt((pr as { tag: "uint32"; value: number }).value), + op, + ); + case "uint64": + return compareOrdered( + (pl as { tag: "uint64"; value: bigint }).value, + (pr as { tag: "uint64"; value: bigint }).value, + op, + ); + case "float64": + return compareFloat( + (pl as { tag: "float64"; value: number }).value, + (pr as { tag: "float64"; value: number }).value, + op, + ); + case "float32": + return compareFloat( + (pl as { tag: "float32"; value: number }).value, + (pr as { tag: "float32"; value: number }).value, + op, + ); + default: + return mkError("unexpected promotion result"); + } +} + +// --------------------------------------------------------------------------- +// Checked integer arithmetic +// --------------------------------------------------------------------------- + + +function checkedInt64Arith(a: bigint, b: bigint, op: string): Value { + switch (op) { + case "+": { + if ( + (b > 0n && a > MAX_INT64 - b) || + (b < 0n && a < MIN_INT64 - b) + ) { + return mkError("int64 overflow"); + } + return mkInt64(a + b); + } + case "-": { + if ( + (b < 0n && a > MAX_INT64 + b) || + (b > 0n && a < MIN_INT64 + b) + ) { + return mkError("int64 overflow"); + } + return mkInt64(a - b); + } + case "*": { + const result = a * b; + if (result > MAX_INT64 || result < MIN_INT64) { + return mkError("int64 overflow"); + } + return mkInt64(result); + } + default: + return mkError("unsupported int64 
operation " + op); + } +} + +function checkedInt32Arith(a: number, b: number, op: string): Value { + // Promote to int64, check, then narrow. + const result = checkedInt64Arith(BigInt(a), BigInt(b), op); + if (isError(result)) return result; + const r = (result as { tag: "int64"; value: bigint }).value; + if (r > BigInt(MAX_INT32) || r < BigInt(MIN_INT32)) { + return mkError("int32 overflow"); + } + return mkInt32(Number(r)); +} + +function checkedUint32Arith(a: number, b: number, op: string): Value { + switch (op) { + case "+": { + if (a > MAX_UINT32 - b) return mkError("uint32 overflow"); + return mkUint32(a + b); + } + case "-": { + if (a < b) return mkError("uint32 overflow"); + return mkUint32(a - b); + } + case "*": { + if (a !== 0 && b !== 0) { + const result = a * b; + if (result > MAX_UINT32) { + return mkError("uint32 overflow"); + } + return mkUint32(result); + } + return mkUint32(a * b); + } + default: + return mkError("unsupported uint32 operation " + op); + } +} + +function checkedUint64Arith(a: bigint, b: bigint, op: string): Value { + switch (op) { + case "+": { + if (a > MAX_UINT64 - b) return mkError("uint64 overflow"); + return mkUint64(a + b); + } + case "-": { + if (a < b) return mkError("uint64 overflow"); + return mkUint64(a - b); + } + case "*": { + const result = a * b; + if (result > MAX_UINT64) return mkError("uint64 overflow"); + return mkUint64(result); + } + default: + return mkError("unsupported uint64 operation " + op); + } +} + +function floatArith(a: number, b: number, op: string): Value { + switch (op) { + case "+": + return mkFloat64(a + b); + case "-": + return mkFloat64(a - b); + case "*": + return mkFloat64(a * b); + default: + return mkError("unsupported float64 operation " + op); + } +} + +function float32Arith(a: number, b: number, op: string): Value { + switch (op) { + case "+": + return mkFloat32(a + b); + case "-": + return mkFloat32(a - b); + case "*": + return mkFloat32(a * b); + default: + return mkError("unsupported 
float32 operation " + op); + } +} + +// --------------------------------------------------------------------------- +// Comparison helpers +// --------------------------------------------------------------------------- + +function compareOrdered(a: bigint, b: bigint, op: string): Value { + switch (op) { + case ">": + return mkBool(a > b); + case ">=": + return mkBool(a >= b); + case "<": + return mkBool(a < b); + case "<=": + return mkBool(a <= b); + default: + return FALSE; + } +} + +function compareFloat(a: number, b: number, op: string): Value { + switch (op) { + case ">": + return mkBool(a > b); + case ">=": + return mkBool(a >= b); + case "<": + return mkBool(a < b); + case "<=": + return mkBool(a <= b); + default: + return FALSE; + } +} + +function stringCompare(a: string, b: string, op: string): Value { + switch (op) { + case ">": + return mkBool(a > b); + case ">=": + return mkBool(a >= b); + case "<": + return mkBool(a < b); + case "<=": + return mkBool(a <= b); + default: + return FALSE; + } +} + +function bytesCompare(a: Uint8Array, b: Uint8Array, op: string): Value { + let cmp = 0; + const len = Math.min(a.length, b.length); + for (let i = 0; i < len; i++) { + if (a[i]! < b[i]!) { + cmp = -1; + break; + } + if (a[i]! > b[i]!) 
{ + cmp = 1; + break; + } + } + if (cmp === 0) { + if (a.length < b.length) cmp = -1; + else if (a.length > b.length) cmp = 1; + } + switch (op) { + case ">": + return mkBool(cmp > 0); + case ">=": + return mkBool(cmp >= 0); + case "<": + return mkBool(cmp < 0); + case "<=": + return mkBool(cmp <= 0); + default: + return FALSE; + } +} + +// --------------------------------------------------------------------------- +// Utility +// --------------------------------------------------------------------------- + +function opVerb(op: TokenType | string): string { + switch (op) { + case TokenType.PLUS: + case "+": + return "add"; + case TokenType.MINUS: + case "-": + return "subtract"; + case TokenType.STAR: + case "*": + return "multiply"; + case TokenType.GT: + case TokenType.GE: + case TokenType.LT: + case TokenType.LE: + case ">": + case ">=": + case "<": + case "<=": + return "compare"; + default: + return "perform arithmetic on"; + } +} + +function arithError(left: Value, right: Value, op: string): Value { + return mkError( + `cannot ${opVerb(op)} ${typeName(left)} and ${typeName(right)}: arithmetic requires numeric types`, + ); +} + +function checkedToFloat64(v: Value): number | null { + switch (v.tag) { + case "int64": + if (v.value > MAX_SAFE_FLOAT64 || v.value < -MAX_SAFE_FLOAT64) return null; + return Number(v.value); + case "int32": + return v.value; + case "uint32": + return v.value; + case "uint64": + if (v.value > MAX_SAFE_FLOAT64) return null; + return Number(v.value); + case "float64": + return v.value; + case "float32": + return v.value; + default: + return null; + } +} diff --git a/internal/bloblang2/ts/src/ast.ts b/internal/bloblang2/ts/src/ast.ts new file mode 100644 index 000000000..4be93770e --- /dev/null +++ b/internal/bloblang2/ts/src/ast.ts @@ -0,0 +1,328 @@ +// AST node types for Bloblang V2. 
+ +import type { Pos, TokenType } from "./token.js"; + +// --- Top-level --- + +export interface Program { + stmts: Stmt[]; + maps: MapDecl[]; + imports: ImportStmt[]; + namespaces: Map; +} + +export interface MapDecl { + pos: Pos; + name: string; + params: Param[]; + body: ExprBody; + namespaces: Map; +} + +export interface Param { + name: string; // empty for discard (_) + default_: Expr | null; + discard: boolean; + pos: Pos; +} + +export interface ImportStmt { + pos: Pos; + path: string; + namespace: string; +} + +// --- Expression body --- + +export interface ExprBody { + assignments: VarAssign[]; + result: Expr; +} + +export interface VarAssign { + pos: Pos; + name: string; + path: PathSegment[]; + value: Expr; +} + +// --- Statements --- + +export type Stmt = Assignment | IfStmt | MatchStmt; + +export interface Assignment { + kind: "assignment"; + pos: Pos; + target: AssignTarget; + value: Expr; +} + +export type AssignTargetRoot = "output" | "var"; + +export interface AssignTarget { + pos: Pos; + root: AssignTargetRoot; + varName: string; + metaAccess: boolean; + path: PathSegment[]; +} + +export interface IfStmt { + kind: "if_stmt"; + pos: Pos; + branches: IfBranch[]; + else_: Stmt[] | null; +} + +export interface IfBranch { + cond: Expr; + body: Stmt[]; +} + +export interface MatchStmt { + kind: "match_stmt"; + pos: Pos; + subject: Expr | null; + binding: string; + cases: MatchStmtCase[]; +} + +export interface MatchStmtCase { + pattern: Expr | null; // null for wildcard + wildcard: boolean; + body: Stmt[]; +} + +export interface MatchExprCase { + pattern: Expr | null; // null for wildcard + wildcard: boolean; + body: Expr | ExprBody; +} + +// --- Expressions --- + +export type Expr = + | LiteralExpr + | ArrayLiteral + | ObjectLiteral + | InputExpr + | InputMetaExpr + | OutputExpr + | OutputMetaExpr + | VarExpr + | IdentExpr + | BinaryExpr + | UnaryExpr + | CallExpr + | MethodCallExpr + | FieldAccessExpr + | IndexExpr + | LambdaExpr + | IfExpr + | 
MatchExpr + | PathExpr; + +export interface LiteralExpr { + kind: "literal"; + pos: Pos; + tokenType: TokenType; + value: string; +} + +export interface ArrayLiteral { + kind: "array"; + pos: Pos; + elements: Expr[]; +} + +export interface ObjectLiteral { + kind: "object"; + pos: Pos; + entries: ObjectEntry[]; +} + +export interface ObjectEntry { + key: Expr; + value: Expr; +} + +export interface InputExpr { + kind: "input"; + pos: Pos; +} + +export interface InputMetaExpr { + kind: "input_meta"; + pos: Pos; +} + +export interface OutputExpr { + kind: "output"; + pos: Pos; +} + +export interface OutputMetaExpr { + kind: "output_meta"; + pos: Pos; +} + +export interface VarExpr { + kind: "var"; + pos: Pos; + name: string; +} + +export interface IdentExpr { + kind: "ident"; + pos: Pos; + namespace: string; + name: string; +} + +export interface BinaryExpr { + kind: "binary"; + left: Expr; + op: TokenType; + opPos: Pos; + right: Expr; +} + +export interface UnaryExpr { + kind: "unary"; + op: TokenType; + pos: Pos; + operand: Expr; +} + +export interface CallExpr { + kind: "call"; + pos: Pos; + name: string; + namespace: string; + args: CallArg[]; + named: boolean; +} + +export interface CallArg { + name: string; + value: Expr; + // folded is the parse-time-precomputed form of this argument, + // populated by the resolver when the receiving method/function + // exposes an ArgFolder and the argument shape allows folding + // (typically: the value is a string literal). When present, the + // interpreter substitutes folded for the evaluated value verbatim, + // skipping repeat work on every call. Used for e.g. compiling regex + // patterns once at parse time rather than on every invocation. 
+ folded?: unknown; +} + +export interface MethodCallExpr { + kind: "method_call"; + receiver: Expr; + method: string; + methodPos: Pos; + args: CallArg[]; + named: boolean; + nullSafe: boolean; +} + +export interface FieldAccessExpr { + kind: "field_access"; + receiver: Expr; + field: string; + fieldPos: Pos; + nullSafe: boolean; +} + +export interface IndexExpr { + kind: "index"; + receiver: Expr; + index: Expr; + pos: Pos; + nullSafe: boolean; +} + +export interface LambdaExpr { + kind: "lambda"; + pos: Pos; + params: Param[]; + body: ExprBody; +} + +export interface IfExpr { + kind: "if_expr"; + pos: Pos; + branches: IfExprBranch[]; + else_: ExprBody | null; +} + +export interface IfExprBranch { + cond: Expr; + body: ExprBody; +} + +export interface MatchExpr { + kind: "match_expr"; + pos: Pos; + subject: Expr | null; + binding: string; + cases: MatchExprCase[]; +} + +// --- Path expressions (produced by optimizer) --- + +export type PathRoot = + | "input" + | "input_meta" + | "output" + | "output_meta" + | "var"; + +export interface PathExpr { + kind: "path"; + pos: Pos; + root: PathRoot; + varName: string; + segments: PathSegment[]; +} + +export type PathSegmentKind = "field" | "index" | "method"; + +export interface PathSegment { + segKind: PathSegmentKind; + name: string; + index: Expr | null; + args: CallArg[]; + named: boolean; + nullSafe: boolean; + pos: Pos; +} + +// --- Helpers --- + +export function exprPos(e: Expr): Pos { + switch (e.kind) { + case "literal": + case "array": + case "object": + case "input": + case "input_meta": + case "output": + case "output_meta": + case "var": + case "ident": + case "call": + case "lambda": + case "if_expr": + case "match_expr": + case "path": + case "unary": + case "index": + return e.pos; + case "binary": + return exprPos(e.left); + case "method_call": + case "field_access": + return exprPos(e.receiver); + } +} diff --git a/internal/bloblang2/ts/src/index.ts b/internal/bloblang2/ts/src/index.ts new file mode 
100644 index 000000000..704918e84 --- /dev/null +++ b/internal/bloblang2/ts/src/index.ts @@ -0,0 +1,32 @@ +// Bloblang V2 — TypeScript implementation. + +export { parse } from "./parser.js"; +export { optimize } from "./optimizer.js"; +export { resolve } from "./resolver.js"; +export { Interpreter } from "./interpreter.js"; +export type { MethodSpec, FunctionSpec, MethodFunc, LambdaMethodFunc, FunctionFunc } from "./interpreter.js"; +export type { FunctionInfo, MethodInfo } from "./resolver.js"; +export type { Program, Expr, Stmt } from "./ast.js"; +export type { PosError, Pos } from "./token.js"; +export { registerStdlib, stdlibNames } from "./stdlib/index.js"; +export { + type Value, + fromJSON, + toJSON, + mkString, + mkInt64, + mkFloat64, + mkArray, + mkObject, + mkBool, + mkError, + NULL, + VOID, + DELETED, + isError, + isVoid, + isDeleted, + typeName, + deepClone, + valuesEqual, +} from "./value.js"; diff --git a/internal/bloblang2/ts/src/interpreter.ts b/internal/bloblang2/ts/src/interpreter.ts new file mode 100644 index 000000000..ccf427502 --- /dev/null +++ b/internal/bloblang2/ts/src/interpreter.ts @@ -0,0 +1,1723 @@ +// Tree-walking interpreter for Bloblang V2. 
+ +import type { + Program, + Stmt, + Expr, + ExprBody, + VarAssign, + MapDecl, + CallArg, + PathSegment, + Param, + MatchStmtCase, + MatchExprCase, + Assignment, + IfStmt, + MatchStmt, + LiteralExpr, + BinaryExpr, + UnaryExpr, + InputExpr, + InputMetaExpr, + OutputExpr, + OutputMetaExpr, + VarExpr, + IdentExpr, + CallExpr, + FieldAccessExpr, + MethodCallExpr, + IndexExpr, + IfExpr, + MatchExpr, + ArrayLiteral, + ObjectLiteral, + LambdaExpr, + PathExpr, +} from "./ast.js"; +import { TokenType } from "./token.js"; +import { + type Value, + mkInt64, + mkFloat64, + mkString, + mkBool, + mkArray, + mkObject, + mkError, + mkFolded, + NULL, + VOID, + DELETED, + TRUE, + FALSE, + isError, + isVoid, + isDeleted, + isNull, + isBool, + isString, + isArray, + isObject, + isNumeric, + isInt64, + isFloat64, + isInt32, + isUint32, + isUint64, + isFloat32, + isTimestamp, + isBytes, + typeName, + deepClone, + valuesEqual, +} from "./value.js"; +import { evalBinaryOp, numericNegate } from "./arithmetic.js"; +import { Scope, type ScopeMode } from "./scope.js"; + +// --------------------------------------------------------------------------- +// Error handling +// --------------------------------------------------------------------------- + +export class RuntimeError extends Error { + constructor(message: string) { + super(message); + this.name = "RuntimeError"; + } +} + +class RecursionError extends Error { + constructor() { + super("maximum recursion depth exceeded"); + this.name = "RecursionError"; + } +} + +// --------------------------------------------------------------------------- +// Type interfaces +// --------------------------------------------------------------------------- + +export type MethodFunc = ( + interp: Interpreter, + receiver: Value, + args: Value[], +) => Value; + +export type LambdaMethodFunc = ( + interp: Interpreter, + receiver: Value, + args: CallArg[], +) => Value; + +export type FunctionFunc = (args: Value[]) => Value; + +export interface MethodParam { + 
name: string; + default_: Value | null; + hasDefault: boolean; + /** This parameter position accepts a lambda argument. */ + acceptsLambda?: boolean; +} + +export interface MethodSpec { + fn: MethodFunc | null; + lambdaFn: LambdaMethodFunc | null; + intrinsic: boolean; + params: MethodParam[] | null; + acceptsNull: boolean; + /** Method accepts a lambda argument (implicit for lambdaFn-backed methods). */ + acceptsLambda?: boolean; + /** + * argFolder, if set, is surfaced on the MethodInfo so the resolver + * runs parse-time folding on literal arguments (see + * resolver.ArgFolder). + */ + argFolder?: import("./resolver.js").ArgFolder; +} + +export interface FunctionParam { + name: string; + default_: Value | null; + hasDefault: boolean; +} + +export interface FunctionSpec { + fn: FunctionFunc; + params: FunctionParam[]; + /** + * argFolder, if set, is surfaced on the FunctionInfo so the resolver + * runs parse-time folding on literal arguments. + */ + argFolder?: import("./resolver.js").ArgFolder; +} + +// --------------------------------------------------------------------------- +// Maximum recursion depth +// --------------------------------------------------------------------------- + +const MAX_RECURSION_DEPTH = 1024; + +// --------------------------------------------------------------------------- +// Interpreter +// --------------------------------------------------------------------------- + +export class Interpreter { + prog: Program | null; + + // Runtime state. + input: Value = NULL; + inputMeta: Value = mkObject(new Map()); + output: Value = mkObject(new Map()); + outputMeta: Value = mkObject(new Map()); + deleted: boolean = false; + + // Map table: local maps + namespaced imports. + maps: Map<string, MapDecl> = new Map(); + namespaces: Map<string, Map<string, MapDecl>> = new Map(); + + scope: Scope = new Scope(null, "statement"); + depth: number = 0; + + // Methods and functions (pluggable for extensibility). 
+ methods: Map<string, MethodSpec> = new Map(); + functions: Map<string, FunctionSpec> = new Map(); + + constructor(prog: Program | null) { + this.prog = prog; + + if (prog !== null) { + // Hoist map declarations. + for (const m of prog.maps) { + this.maps.set(m.name, m); + } + + // Build namespace tables from imports. + for (const [ns, maps] of prog.namespaces) { + const table = new Map<string, MapDecl>(); + for (const m of maps) { + table.set(m.name, m); + } + this.namespaces.set(ns, table); + } + } + } + + // --- Registration --- + + registerMethod(name: string, spec: MethodSpec): void { + this.methods.set(name, spec); + } + + registerFunction(name: string, spec: FunctionSpec): void { + this.functions.set(name, spec); + } + + // --- Public API --- + + /** + * Exec runs the program against the given input and metadata. + * Throws RuntimeError on failure. + */ + exec( + input: Value, + metadata: Value, + ): { output: Value; outputMeta: Value; deleted: boolean } { + this.input = input; + this.inputMeta = metadata; + this.output = mkObject(new Map()); + this.outputMeta = mkObject(new Map()); + this.deleted = false; + this.scope = new Scope(null, "statement"); + this.depth = 0; + + for (const stmt of this.prog!.stmts) { + this.execStmt(stmt); + if (this.deleted) { + return { output: NULL, outputMeta: mkObject(new Map()), deleted: true }; + } + } + + return { + output: this.output, + outputMeta: this.outputMeta, + deleted: false, + }; + } + + /** + * Run executes the program with error recovery, converting runtime errors + * to error returns. 
+ */ + run( + input: Value, + metadata: Value, + ): { + output: Value; + outputMeta: Value; + deleted: boolean; + error: string | null; + } { + try { + const result = this.exec(input, metadata); + return { ...result, error: null }; + } catch (e) { + if (e instanceof RuntimeError) { + return { + output: NULL, + outputMeta: mkObject(new Map()), + deleted: false, + error: e.message, + }; + } + if (e instanceof RecursionError) { + return { + output: NULL, + outputMeta: mkObject(new Map()), + deleted: false, + error: e.message, + }; + } + if (e instanceof RangeError && e.message.includes("call stack")) { + return { + output: NULL, + outputMeta: mkObject(new Map()), + deleted: false, + error: "maximum recursion depth exceeded", + }; + } + throw e; // re-throw unexpected errors + } + } + + // --- Statement execution --- + + private execStmt(stmt: Stmt): void { + switch (stmt.kind) { + case "assignment": + this.execAssignment(stmt); + break; + case "if_stmt": + this.execIfStmt(stmt); + break; + case "match_stmt": + this.execMatchStmt(stmt); + break; + } + } + + private execAssignment(a: Assignment): void { + const value = this.evalExpr(a.value); + + // Error propagation: if value is an error, it halts the mapping. + if (isError(value)) { + throw new RuntimeError(value.message); + } + + // Void handling. + if (isVoid(value)) { + // For variable targets: declaration with void is an error, + // reassignment with void skips the assignment. + if ( + a.target.root === "var" && + a.target.path.length === 0 + ) { + if (this.scope.get(a.target.varName) === undefined) { + throw new RuntimeError( + "void in variable declaration (use .or() to provide a default)", + ); + } + } + return; + } + + switch (a.target.root) { + case "output": { + if (a.target.metaAccess) { + // Metadata root assignment. 
+ if (a.target.path.length === 0) { + if (isDeleted(value)) { + throw new RuntimeError("cannot delete metadata object"); + } + if (!isObject(value)) { + throw new RuntimeError( + `metadata must be an object, got ${typeName(value)}`, + ); + } + this.outputMeta = deepClone(value); + return; + } + const metaRef: { v: Value } = { v: this.outputMeta }; + this.assignPath(metaRef, a.target.path, value); + this.outputMeta = metaRef.v; + } else { + // Message drop: output = deleted() + if (a.target.path.length === 0 && isDeleted(value)) { + this.deleted = true; + return; + } + const outputRef: { v: Value } = { v: this.output }; + this.assignPath(outputRef, a.target.path, value); + this.output = outputRef.v; + } + break; + } + case "var": { + if (isDeleted(value)) { + if (a.target.path.length === 0) { + throw new RuntimeError( + "cannot assign deleted() to a variable", + ); + } + } + if (a.target.path.length === 0) { + this.scope.set(a.target.varName, deepClone(value)); + } else { + // Path assignment to an undeclared variable declares it and + // auto-creates the root based on the first path segment (spec + // Section 3.7). + const existing = this.scope.get(a.target.varName); + const clone = existing === undefined ? 
NULL : deepClone(existing); + const ref: { v: Value } = { v: clone }; + this.assignPath(ref, a.target.path, value); + this.scope.set(a.target.varName, ref.v); + } + break; + } + } + } + + private execIfStmt(s: IfStmt): void { + for (const branch of s.branches) { + const cond = this.evalExpr(branch.cond); + if (isError(cond)) { + throw new RuntimeError(cond.message); + } + if (!isBool(cond)) { + throw new RuntimeError( + `if condition must be boolean, got ${typeName(cond)}`, + ); + } + if (cond.value) { + const childScope = new Scope(this.scope, "statement"); + const saved = this.scope; + this.scope = childScope; + for (const stmt of branch.body) { + this.execStmt(stmt); + if (this.deleted) { + this.scope = saved; + return; + } + } + this.scope = saved; + return; + } + } + + if (s.else_ !== null) { + const childScope = new Scope(this.scope, "statement"); + const saved = this.scope; + this.scope = childScope; + for (const stmt of s.else_) { + this.execStmt(stmt); + if (this.deleted) { + this.scope = saved; + return; + } + } + this.scope = saved; + } + } + + private execMatchStmt(s: MatchStmt): void { + let subject: Value = NULL; + if (s.subject !== null) { + subject = this.evalExpr(s.subject); + if (isError(subject)) { + throw new RuntimeError(subject.message); + } + } + + for (const c of s.cases) { + const [matched, errVal] = this.matchCaseMatches( + c.pattern, + c.wildcard, + subject, + s.binding, + s.subject !== null, + ); + if (errVal !== null) { + throw new RuntimeError( + isError(errVal) ? 
errVal.message : "match error", + ); + } + if (matched) { + const childScope = new Scope(this.scope, "statement"); + if (s.binding !== "") { + childScope.vars.set(s.binding, subject); + } + const saved = this.scope; + this.scope = childScope; + for (const stmt of c.body) { + this.execStmt(stmt); + if (this.deleted) { + this.scope = saved; + return; + } + } + this.scope = saved; + return; + } + } + } + + // --- Expression evaluation --- + + evalExpr(expr: Expr): Value { + switch (expr.kind) { + case "literal": + return this.evalLiteral(expr); + case "binary": + return this.evalBinary(expr); + case "unary": + return this.evalUnary(expr); + case "input": + return this.input; + case "input_meta": + return this.inputMeta; + case "output": + return deepClone(this.output); + case "output_meta": + return deepClone(this.outputMeta); + case "var": { + const v = this.scope.get(expr.name); + if (v === undefined) { + throw new RuntimeError("undefined variable $" + expr.name); + } + return v; + } + case "ident": + return this.evalIdent(expr); + case "call": + return this.evalCall(expr); + case "field_access": + return this.evalFieldAccess(expr); + case "method_call": + return this.evalMethodCall(expr); + case "index": + return this.evalIndex(expr); + case "if_expr": + return this.evalIfExpr(expr); + case "match_expr": + return this.evalMatchExpr(expr); + case "array": + return this.evalArrayLiteral(expr); + case "object": + return this.evalObjectLiteral(expr); + case "lambda": + throw new RuntimeError( + "lambda expression cannot be used as a value", + ); + case "path": + return this.evalPathExpr(expr); + } + } + + private evalLiteral(e: LiteralExpr): Value { + switch (e.tokenType) { + case TokenType.INT: + return mkInt64(BigInt(e.value)); + case TokenType.FLOAT: + return mkFloat64(parseFloat(e.value)); + case TokenType.STRING: + case TokenType.RAW_STRING: + return mkString(e.value); + case TokenType.TRUE: + return TRUE; + case TokenType.FALSE: + return FALSE; + case 
TokenType.NULL: + return NULL; + default: + return NULL; + } + } + + private evalBinary(e: BinaryExpr): Value { + const left = this.evalExpr(e.left); + if (isError(left)) return left; + if (isVoid(left)) return mkError("void in expression"); + if (isDeleted(left)) return mkError("deleted value in expression"); + + // Short-circuit for logical operators. + if (e.op === TokenType.AND) { + if (!isBool(left)) { + return mkError( + `&& requires boolean operands, got ${typeName(left)}`, + ); + } + if (!left.value) return FALSE; + const right = this.evalExpr(e.right); + if (isError(right)) return right; + if (!isBool(right)) { + return mkError( + `&& requires boolean operands, got ${typeName(right)}`, + ); + } + return right; + } + if (e.op === TokenType.OR) { + if (!isBool(left)) { + return mkError( + `|| requires boolean operands, got ${typeName(left)}`, + ); + } + if (left.value) return TRUE; + const right = this.evalExpr(e.right); + if (isError(right)) return right; + if (!isBool(right)) { + return mkError( + `|| requires boolean operands, got ${typeName(right)}`, + ); + } + return right; + } + + const right = this.evalExpr(e.right); + if (isError(right)) return right; + if (isVoid(right)) return mkError("void in expression"); + if (isDeleted(right)) return mkError("deleted value in expression"); + + return evalBinaryOp(e.op, left, right); + } + + private evalUnary(e: UnaryExpr): Value { + const operand = this.evalExpr(e.operand); + if (isError(operand)) return operand; + if (isVoid(operand)) return mkError("void in expression"); + if (isDeleted(operand)) return mkError("deleted value in expression"); + + switch (e.op) { + case TokenType.MINUS: + return numericNegate(operand); + case TokenType.BANG: { + if (!isBool(operand)) { + return mkError( + `! 
requires boolean operand, got ${typeName(operand)}`, + ); + } + return mkBool(!operand.value); + } + default: + return mkError(`unknown unary operator ${e.op}`); + } + } + + private evalFieldAccess(e: FieldAccessExpr): Value { + const receiver = this.evalExpr(e.receiver); + if (isError(receiver)) return receiver; + if (e.nullSafe && isNull(receiver)) return NULL; + if (isNull(receiver)) { + return mkError(`cannot access field "${e.field}" on null`); + } + if (!isObject(receiver)) { + return mkError( + `cannot access field "${e.field}" on ${typeName(receiver)}`, + ); + } + return receiver.value.get(e.field) ?? NULL; + } + + private evalIndex(e: IndexExpr): Value { + const receiver = this.evalExpr(e.receiver); + if (isError(receiver)) return receiver; + if (e.nullSafe && isNull(receiver)) return NULL; + + const index = this.evalExpr(e.index); + if (isError(index)) return index; + + return this.indexValue(receiver, index); + } + + private evalMethodCall(e: MethodCallExpr): Value { + // Intrinsic: .catch() + if (e.method === "catch") { + return this.evalCatch(e); + } + + // Intrinsic: .or() + if (e.method === "or") { + return this.evalOr(e); + } + + const receiver = this.evalExpr(e.receiver); + + // Error propagation. + if (isError(receiver)) return receiver; + + // Null-safe. + if (e.nullSafe && isNull(receiver)) return NULL; + + // Look up the method. + const spec = this.methods.get(e.method); + if (spec === undefined) { + if (isNull(receiver)) { + return mkError(`.${e.method}() does not support null`); + } + return mkError(`unknown method .${e.method}()`); + } + + // Null check using spec metadata. + if (isNull(receiver) && !e.nullSafe && !spec.acceptsNull) { + return mkError(`.${e.method}() does not support null`); + } + + // Void and deleted in method calls. 
+ if (isVoid(receiver)) { + return mkError("cannot call method on void"); + } + if (isDeleted(receiver)) { + return mkError("cannot call method on deleted value"); + } + + // Lambda methods: receive unevaluated AST args. + if (spec.lambdaFn !== null) { + let args = e.args; + if (e.named && spec.params !== null) { + args = reorderNamedCallArgs(args, spec.params); + } + return spec.lambdaFn(this, receiver, args); + } + + // Evaluate arguments, resolving named args to positional if needed. + let args: Value[]; + if (e.named) { + const resolved = this.resolveNamedMethodArgs(e); + if (isError(resolved)) return resolved; + args = (resolved as { tag: "array"; value: Value[] }).value; + } else { + args = this.evalArgs(e.args); + } + for (const a of args) { + if (isError(a)) return a; + } + + return spec.fn!(this, receiver, args); + } + + private evalCatch(e: MethodCallExpr): Value { + const receiver = this.evalExpr(e.receiver); + + // .catch() passes non-errors through unchanged. + if (!isError(receiver)) return receiver; + + // Error: invoke the catch handler lambda. + if (e.args.length !== 1) { + return mkError(".catch() requires exactly one argument"); + } + const lambdaExpr = e.args[0]!.value; + if (lambdaExpr.kind !== "lambda") { + return mkError(".catch() argument must be a lambda"); + } + + // Build the error object: {"what": "error message"}. + const errObj = mkObject( + new Map([["what", mkString(receiver.message)]]), + ); + + return this.callLambda(lambdaExpr, [errObj]); + } + + private evalOr(e: MethodCallExpr): Value { + const receiver = this.evalExpr(e.receiver); + + // .or() rescues null, void, and deleted. + if (!isNull(receiver) && !isVoid(receiver) && !isDeleted(receiver)) { + return receiver; + } + + // Short-circuit: only evaluate the argument when rescuing. 
+ if (e.args.length !== 1) { + return mkError(".or() requires exactly one argument"); + } + return this.evalExpr(e.args[0]!.value); + } + + /** Public for stdlib lambda methods that need to call lambdas. */ + callLambda(lambda: LambdaExpr, args: Value[]): Value { + const lambdaScope = new Scope(this.scope, "expression"); + for (let i = 0; i < lambda.params.length; i++) { + const p = lambda.params[i]!; + if (p.discard) continue; + if (i < args.length) { + lambdaScope.vars.set(p.name, deepClone(args[i]!)); + } else if (p.default_ !== null) { + lambdaScope.vars.set(p.name, this.evalExpr(p.default_)); + } + } + + const saved = this.scope; + this.scope = lambdaScope; + const result = this.evalExprBody(lambda.body); + this.scope = saved; + + return result; + } + + private evalCall(e: CallExpr): Value { + // Check for namespace-qualified call. + if (e.namespace !== "") { + return this.callNamespaced(e); + } + + // Check for user-defined map. + const mapDecl = this.maps.get(e.name); + if (mapDecl !== undefined) { + return this.callMap(mapDecl, e); + } + + // Check stdlib functions. 
+ const spec = this.functions.get(e.name); + if (spec !== undefined) { + let args: Value[]; + if (e.named) { + const resolved = this.resolveNamedFuncArgs(e, spec); + if (isError(resolved)) return resolved; + args = (resolved as { tag: "array"; value: Value[] }).value; + } else { + args = this.evalArgs(e.args); + } + for (const a of args) { + if (isError(a)) return a; + } + return spec.fn(args); + } + + return mkError(`unknown function ${e.name}()`); + } + + private callNamespaced(e: CallExpr): Value { + const ns = this.namespaces.get(e.namespace); + if (ns === undefined) { + return mkError(`unknown namespace "${e.namespace}"`); + } + const m = ns.get(e.name); + if (m === undefined) { + return mkError(`unknown function ${e.namespace}::${e.name}()`); + } + return this.callMap(m, e); + } + + private callMap(m: MapDecl, e: CallExpr): Value { + this.depth++; + if (this.depth > MAX_RECURSION_DEPTH) { + throw new RecursionError(); + } + + try { + // Evaluate and bind parameters into an isolated scope. + const mapScope = new Scope(null, "expression"); + if (e.named) { + const err = this.bindNamedMapParams(mapScope, m, e); + if (err !== "") return mkError(err); + } else { + const args = this.evalArgs(e.args); + for (const a of args) { + if (isError(a)) return a; + } + const err = this.bindPositionalParams(mapScope, m.params, args); + if (err !== "") return mkError(err); + } + + // Evaluate the map body. If the map has its own namespace context, + // temporarily switch to it. 
+ const savedScope = this.scope; + const savedNamespaces = this.namespaces; + const savedMaps = this.maps; + + this.scope = mapScope; + if (m.namespaces !== undefined && m.namespaces.size > 0) { + const nsTable = new Map<string, Map<string, MapDecl>>(); + for (const [ns, maps] of m.namespaces) { + const table = new Map<string, MapDecl>(); + for (const md of maps) { + table.set(md.name, md); + } + nsTable.set(ns, table); + } + this.namespaces = nsTable; + } + + const result = this.evalExprBody(m.body); + + this.scope = savedScope; + this.namespaces = savedNamespaces; + this.maps = savedMaps; + + return result; + } finally { + this.depth--; + } + } + + private evalIdent(e: IdentExpr): Value { + // Qualified reference — only valid in higher-order method args. + if (e.namespace !== "") { + return mkError( + e.namespace + + "::" + + e.name + + " cannot be used as a value (pass to a higher-order method or call with parentheses)", + ); + } + // Check scope (parameters, variables). + const v = this.scope.get(e.name); + if (v !== undefined) return v; + // Bare map name without call — error per spec. 
+ if (this.maps.has(e.name)) { + return mkError( + "map " + + e.name + + " cannot be used as a value (call it with parentheses)", + ); + } + return mkError("undefined identifier " + e.name); + } + + private evalIfExpr(e: IfExpr): Value { + for (const branch of e.branches) { + const cond = this.evalExpr(branch.cond); + if (isError(cond)) return cond; + if (!isBool(cond)) { + return mkError( + `if condition must be boolean, got ${typeName(cond)}`, + ); + } + if (cond.value) { + const childScope = new Scope(this.scope, "expression"); + const saved = this.scope; + this.scope = childScope; + const result = this.evalExprBody(branch.body); + this.scope = saved; + return result; + } + } + + if (e.else_ !== null) { + const childScope = new Scope(this.scope, "expression"); + const saved = this.scope; + this.scope = childScope; + const result = this.evalExprBody(e.else_); + this.scope = saved; + return result; + } + + return VOID; + } + + private evalMatchExpr(e: MatchExpr): Value { + let subject: Value = NULL; + if (e.subject !== null) { + subject = this.evalExpr(e.subject); + if (isError(subject)) return subject; + } + + for (const c of e.cases) { + const [matched, errVal] = this.matchCaseMatches( + c.pattern, + c.wildcard, + subject, + e.binding, + e.subject !== null, + ); + if (errVal !== null) return errVal; + if (matched) { + const childScope = new Scope(this.scope, "expression"); + if (e.binding !== "") { + childScope.vars.set(e.binding, subject); + } + const saved = this.scope; + this.scope = childScope; + + let result: Value; + const body = c.body; + if ("assignments" in body) { + // ExprBody + result = this.evalExprBody(body as ExprBody); + } else { + // Expr + result = this.evalExpr(body as Expr); + } + + this.scope = saved; + return result; + } + } + + return VOID; + } + + /** + * matchCaseMatches returns [matched, errorValue]. If errorValue is non-null, + * the case expression produced an error that should be propagated. 
+ */ + private matchCaseMatches( + pattern: Expr | null, + wildcard: boolean, + subject: Value, + binding: string, + hasSubject: boolean, + ): [boolean, Value | null] { + if (wildcard) return [true, null]; + + if (hasSubject && binding === "") { + // Equality match: compare pattern against subject. + const patternVal = this.evalExpr(pattern!); + if (isError(patternVal)) return [false, patternVal]; + // Boolean case values are a runtime error in equality match. + if (isBool(patternVal)) { + return [ + false, + mkError( + "boolean case value in equality match (use 'as' for boolean conditions)", + ), + ]; + } + return [valuesEqual(subject, patternVal), null]; + } + + // Boolean match (with or without 'as'): case must evaluate to bool. + if (binding !== "") { + const childScope = new Scope(this.scope, this.scope.mode); + childScope.vars.set(binding, subject); + const saved = this.scope; + this.scope = childScope; + const patternVal = this.evalExpr(pattern!); + this.scope = saved; + if (isError(patternVal)) return [false, patternVal]; + if (!isBool(patternVal)) { + return [ + false, + mkError( + `boolean match case must evaluate to bool, got ${typeName(patternVal)}`, + ), + ]; + } + return [patternVal.value, null]; + } + + const patternVal = this.evalExpr(pattern!); + if (isError(patternVal)) return [false, patternVal]; + if (!isBool(patternVal)) { + return [ + false, + mkError( + `boolean match case must evaluate to bool, got ${typeName(patternVal)}`, + ), + ]; + } + return [patternVal.value, null]; + } + + private evalArrayLiteral(e: ArrayLiteral): Value { + const result: Value[] = []; + for (const elem of e.elements) { + const val = this.evalExpr(elem); + if (isError(val)) return val; + if (isVoid(val)) { + return mkError( + "void in array literal (use deleted() to omit elements, or add an else branch)", + ); + } + if (isDeleted(val)) continue; // deleted elements are removed + result.push(val); + } + return mkArray(result); + } + + private evalObjectLiteral(e: 
ObjectLiteral): Value { + const result = new Map(); + for (const entry of e.entries) { + const key = this.evalExpr(entry.key); + if (isError(key)) return key; + if (!isString(key)) { + return mkError( + `object key must be string, got ${typeName(key)}`, + ); + } + const val = this.evalExpr(entry.value); + if (isError(val)) return val; + if (isVoid(val)) { + return mkError( + "void in object literal (use deleted() to omit fields, or add an else branch)", + ); + } + if (isDeleted(val)) continue; // deleted fields are removed + result.set(key.value, val); + } + return mkObject(result); + } + + evalExprBody(body: ExprBody): Value { + for (const va of body.assignments) { + const val = this.evalExpr(va.value); + if (isError(val)) return val; + if (isVoid(val)) { + // Void in variable declaration is an error. + // Void in reassignment (variable exists in any reachable scope) skips. + if (this.scope.get(va.name) !== undefined) { + continue; + } + return mkError( + "void in variable declaration (use .or() to provide a default)", + ); + } + if (isDeleted(val)) { + if (va.path.length === 0) { + return mkError("cannot assign deleted() to a variable"); + } + } + if (va.path.length === 0) { + this.scope.set(va.name, deepClone(val)); + } else { + // Path assignment to an undeclared variable declares it (Section 3.7). + const existing = this.scope.get(va.name); + const clone = existing === undefined ? 
NULL : deepClone(existing); + const ref: { v: Value } = { v: clone }; + this.assignPath(ref, va.path, val); + this.scope.set(va.name, ref.v); + } + } + return this.evalExpr(body.result); + } + + private evalPathExpr(e: PathExpr): Value { + let root: Value; + switch (e.root) { + case "input": + root = this.input; + break; + case "input_meta": + root = this.inputMeta; + break; + case "output": + root = deepClone(this.output); + break; + case "output_meta": + root = deepClone(this.outputMeta); + break; + case "var": { + const v = this.scope.get(e.varName); + if (v === undefined) { + return mkError("undefined variable $" + e.varName); + } + root = v; + break; + } + } + + let current: Value = root; + for (const seg of e.segments) { + if (isError(current)) return current; + + switch (seg.segKind) { + case "field": { + if (seg.nullSafe && isNull(current)) { + return NULL; + } + if (!isObject(current)) { + return mkError( + `cannot access field "${seg.name}" on ${typeName(current)}`, + ); + } + current = current.value.get(seg.name) ?? NULL; + break; + } + case "index": { + if (seg.nullSafe && isNull(current)) { + return NULL; + } + const idx = this.evalExpr(seg.index!); + if (isError(idx)) return idx; + current = this.indexValue(current, idx); + if (isError(current)) return current; + break; + } + case "method": { + if (seg.nullSafe && isNull(current)) { + return NULL; + } + const spec = this.methods.get(seg.name); + if (spec === undefined) { + return mkError(`unknown method .${seg.name}()`); + } + // Intrinsic methods cannot appear in path expressions. 
+ if (spec.intrinsic) { + return mkError( + `.${seg.name}() cannot be used in path expressions`, + ); + } + if ( + isNull(current) && + !seg.nullSafe && + !spec.acceptsNull + ) { + return mkError(`.${seg.name}() does not support null`); + } + if (isVoid(current)) { + return mkError("cannot call method on void"); + } + if (isDeleted(current)) { + return mkError("cannot call method on deleted value"); + } + if (spec.lambdaFn !== null) { + let lambdaArgs = seg.args; + if (seg.named && spec.params !== null) { + lambdaArgs = reorderNamedCallArgs( + lambdaArgs, + spec.params, + ); + } + current = spec.lambdaFn(this, current, lambdaArgs); + } else { + const args = this.evalArgs(seg.args); + for (const a of args) { + if (isError(a)) return a; + } + current = spec.fn!(this, current, args); + } + break; + } + } + } + return current; + } + + // --- Helpers --- + + private resolveNamedArgs( + callArgs: CallArg[], + params: { name: string; default_: Value | null; hasDefault: boolean }[], + context: string, + ): Value { + if (params.length === 0) { + // No parameter metadata — evaluate named args by name order. + const args: Value[] = []; + for (const arg of callArgs) { + if (arg.folded !== undefined) { + args.push(mkFolded(arg.folded)); + continue; + } + const v = this.evalExpr(arg.value); + if (isError(v)) return v; + args.push(v); + } + return mkArray(args); + } + + // Build named arg map. + const named = new Map(); + for (const arg of callArgs) { + const v: Value = arg.folded !== undefined + ? mkFolded(arg.folded) + : this.evalExpr(arg.value); + if (isError(v)) return v; + named.set(arg.name, v); + } + + // Map to positional based on parameter metadata. + const args: Value[] = new Array(params.length); + for (let i = 0; i < params.length; i++) { + const p = params[i]!; + const v = named.get(p.name); + if (v !== undefined) { + args[i] = v; + } else if (p.hasDefault) { + args[i] = p.default_ ?? 
NULL; + } else { + return mkError( + `${context}: missing required argument "${p.name}"`, + ); + } + } + return mkArray(args); + } + + private resolveNamedMethodArgs(e: MethodCallExpr): Value { + const spec = this.methods.get(e.method); + let params: { + name: string; + default_: Value | null; + hasDefault: boolean; + }[] = []; + if (spec !== undefined && spec.params !== null) { + params = spec.params; + } + return this.resolveNamedArgs(e.args, params, "." + e.method + "()"); + } + + private resolveNamedFuncArgs(e: CallExpr, spec: FunctionSpec): Value { + const params = spec.params.map((p) => ({ + name: p.name, + default_: p.default_, + hasDefault: p.hasDefault, + })); + const resolved = this.resolveNamedArgs(e.args, params, e.name + "()"); + if (isError(resolved)) return resolved; + const args = (resolved as { tag: "array"; value: Value[] }).value; + + // Truncate trailing default-filled args. + const provided = new Set(); + for (const arg of e.args) { + provided.add(arg.name); + } + let lastExplicit = -1; + for (let i = 0; i < spec.params.length; i++) { + if (provided.has(spec.params[i]!.name)) { + lastExplicit = i; + } + } + if (lastExplicit >= 0 && lastExplicit < args.length - 1) { + return mkArray(args.slice(0, lastExplicit + 1)); + } + return mkArray(args); + } + + evalArgs(args: CallArg[]): Value[] { + const result: Value[] = new Array(args.length); + for (let i = 0; i < args.length; i++) { + const a = args[i]!; + // Parse-time-folded values (e.g. precompiled regex patterns + // produced by the resolver's ArgFolder hook) substitute for the + // AST expression verbatim, skipping re-evaluation and the + // void/deleted checks. 
+ if (a.folded !== undefined) { + result[i] = mkFolded(a.folded); + continue; + } + const v = this.evalExpr(a.value); + if (isVoid(v)) { + result[i] = mkError( + "void passed as argument (use .or() to provide a default)", + ); + } else if (isDeleted(v)) { + result[i] = mkError("deleted() passed as argument"); + } else { + result[i] = v; + } + } + return result; + } + + private bindPositionalParams( + s: Scope, + params: Param[], + args: Value[], + ): string { + let argIdx = 0; + for (const p of params) { + if (p.discard) { + if (argIdx < args.length) argIdx++; + continue; + } + if (argIdx < args.length) { + s.vars.set(p.name, deepClone(args[argIdx]!)); + argIdx++; + } else if (p.default_ !== null) { + s.vars.set(p.name, this.evalExpr(p.default_)); + } else { + return `missing argument for parameter "${p.name}"`; + } + } + return ""; + } + + private bindNamedMapParams( + s: Scope, + m: MapDecl, + e: CallExpr, + ): string { + // Build namedArgParam descriptors, evaluating AST defaults. + const params: { + name: string; + default_: Value | null; + hasDefault: boolean; + }[] = []; + for (const p of m.params) { + if (p.discard) continue; + const nap: { + name: string; + default_: Value | null; + hasDefault: boolean; + } = { name: p.name, default_: null, hasDefault: false }; + if (p.default_ !== null) { + nap.hasDefault = true; + nap.default_ = this.evalExpr(p.default_); + } + params.push(nap); + } + + const resolved = this.resolveNamedArgs(e.args, params, e.name + "()"); + if (isError(resolved)) return resolved.message; + const args = (resolved as { tag: "array"; value: Value[] }).value; + + // Bind into scope. 
+ for (let i = 0; i < params.length; i++) { + if (i < args.length) { + s.vars.set(params[i]!.name, deepClone(args[i]!)); + } + } + return ""; + } + + // --- Path assignment --- + + private assignPath( + root: { v: Value }, + path: PathSegment[], + value: Value, + ): void { + if (path.length === 0) { + root.v = deepClone(value); + return; + } + this.assignPathRecursive(root, path, 0, value); + } + + private assignPathRecursive( + current: { v: Value }, + path: PathSegment[], + pathIdx: number, + value: Value, + ): void { + const seg = path[pathIdx]!; + const isLast = pathIdx === path.length - 1; + + switch (seg.segKind) { + case "field": { + // Ensure current is an object. Auto-create only from null. + let obj: Map<string, Value>; + if (isObject(current.v)) { + obj = current.v.value; + } else if (isNull(current.v)) { + obj = new Map(); + current.v = mkObject(obj); + } else { + throw new RuntimeError( + `cannot access field "${seg.name}" on ${typeName(current.v)} (expected object)`, + ); + } + + if (isLast) { + if (isDeleted(value)) { + obj.delete(seg.name); + } else { + obj.set(seg.name, value); + } + return; + } + + let child = obj.get(seg.name); + if (child === undefined) { + child = NULL; + } + const childRef: { v: Value } = { v: child }; + this.assignPathRecursive(childRef, path, pathIdx + 1, value); + obj.set(seg.name, childRef.v); + break; + } + case "index": { + const idx = this.evalExpr(seg.index!); + if (isError(idx)) return; + + // String index → object field. 
+ if (isString(idx)) { + let obj: Map<string, Value>; + if (isObject(current.v)) { + obj = current.v.value; + } else { + obj = new Map(); + current.v = mkObject(obj); + } + if (isLast) { + if (isDeleted(value)) { + obj.delete(idx.value); + } else { + obj.set(idx.value, value); + } + return; + } + let child = obj.get(idx.value); + if (child === undefined) { + child = mkObject(new Map()); + } + const childRef: { v: Value } = { v: child }; + this.assignPathRecursive(childRef, path, pathIdx + 1, value); + obj.set(idx.value, childRef.v); + return; + } + + // Integer index → array element. + const i64 = valueToInt64(idx); + if (i64 === null) return; + + let arr: Value[]; + if (isArray(current.v)) { + arr = current.v.value; + } else if (isNull(current.v)) { + arr = []; + current.v = mkArray(arr); + } else { + throw new RuntimeError( + `cannot index into ${typeName(current.v)} (expected array)`, + ); + } + + let i = Number(i64); + // Handle negative indexing. + if (i < 0) { + i += arr.length; + } + + if (isLast && isDeleted(value)) { + // Delete array element: remove and shift. + if (i < 0 || i >= arr.length) { + throw new RuntimeError( + "array index deletion: index out of bounds", + ); + } + arr.splice(i, 1); + return; + } + + // Grow array with null gaps if needed. + while (arr.length <= i) { + arr.push(NULL); + } + + if (isLast) { + arr[i] = value; + return; + } + + let child = arr[i]!; + if (isNull(child)) { + child = mkObject(new Map()); + } + const childRef: { v: Value } = { v: child }; + this.assignPathRecursive(childRef, path, pathIdx + 1, value); + arr[i] = childRef.v; + break; + } + } + } + + // --- Indexing --- + + indexValue(receiver: Value, index: Value): Value { + if (isObject(receiver)) { + if (!isString(index)) { + return mkError( + `non-string index on object: got ${typeName(index)}`, + ); + } + return receiver.value.get(index.value) ?? 
NULL; + } + + if (isArray(receiver)) { + return indexSequence( + index, + receiver.value.length, + (i) => receiver.value[i]!, + ); + } + + if (isString(receiver)) { + const codepoints = [...receiver.value]; // splits into codepoints + return indexSequence(index, codepoints.length, (i) => + mkInt64(BigInt(codepoints[i]!.codePointAt(0)!)), + ); + } + + if (isBytes(receiver)) { + return indexSequence(index, receiver.value.length, (i) => + mkInt64(BigInt(receiver.value[i]!)), + ); + } + + if (isNull(receiver)) { + return mkError("cannot index null value"); + } + + return mkError(`cannot index ${typeName(receiver)}`); + } + + // --- Lambda / map-ref extraction (for stdlib) --- + + /** + * Extract a lambda expression from the first argument, handling both + * direct lambdas and bare map-name references. + */ + extractLambdaOrMapRef(args: CallArg[]): LambdaExpr | null { + if (args.length === 0) return null; + + const firstValue = args[0]!.value; + + // Direct lambda. + if (firstValue.kind === "lambda") return firstValue; + + // Bare identifier or qualified reference → map name reference. 
+ if (firstValue.kind === "ident") { + if (firstValue.namespace !== "") { + return this.synthesizeNamespacedMapLambda(firstValue); + } + const m = this.maps.get(firstValue.name); + if (m !== undefined) { + return this.synthesizeMapLambda( + firstValue.pos, + firstValue.name, + "", + m, + ); + } + } + + return null; + } + + private synthesizeMapLambda( + pos: { file: string; line: number; column: number }, + name: string, + namespace: string, + m: MapDecl, + ): LambdaExpr | null { + let required = 0; + for (const p of m.params) { + if (p.default_ === null && !p.discard) required++; + } + if (required !== 1) return null; + return { + kind: "lambda", + pos, + params: [{ name: "__arg", default_: null, discard: false, pos }], + body: { + assignments: [], + result: { + kind: "call", + pos, + namespace, + name, + args: [ + { + name: "", + value: { kind: "ident", pos, namespace: "", name: "__arg" }, + }, + ], + named: false, + }, + }, + }; + } + + private synthesizeNamespacedMapLambda( + ident: IdentExpr, + ): LambdaExpr | null { + const ns = this.namespaces.get(ident.namespace); + if (ns === undefined) return null; + const m = ns.get(ident.name); + if (m === undefined) return null; + return this.synthesizeMapLambda( + ident.pos, + ident.name, + ident.namespace, + m, + ); + } +} + +// --------------------------------------------------------------------------- +// Module-level helpers +// --------------------------------------------------------------------------- + +function reorderNamedCallArgs( + args: CallArg[], + params: MethodParam[], +): CallArg[] { + const byName = new Map(); + for (const arg of args) { + byName.set(arg.name, arg); + } + const result: CallArg[] = []; + for (const p of params) { + const arg = byName.get(p.name); + if (arg !== undefined) { + result.push(arg); + } else if (!p.hasDefault) { + // Required param missing — append placeholder. 
+ result.push({ name: "", value: { kind: "literal", pos: { file: "", line: 0, column: 0 }, tokenType: TokenType.NULL, value: "null" } }); + } + // Optional param missing: omit. + } + return result; +} + +/** + * Convert a Value to a bigint index, or return null if not an integer-like value. + */ +function valueToInt64(v: Value): bigint | null { + switch (v.tag) { + case "int64": + return v.value; + case "int32": + return BigInt(v.value); + case "uint32": + return BigInt(v.value); + case "uint64": + if (v.value > BigInt("9223372036854775807")) return null; + return BigInt(v.value); + case "float64": { + if (!Number.isFinite(v.value) || v.value !== Math.trunc(v.value)) + return null; + if ( + v.value > Number.MAX_SAFE_INTEGER || + v.value < Number.MIN_SAFE_INTEGER + ) + return null; + return BigInt(v.value); + } + case "float32": { + const f = v.value; + if (!Number.isFinite(f) || f !== Math.trunc(f)) return null; + if (f > Number.MAX_SAFE_INTEGER || f < Number.MIN_SAFE_INTEGER) + return null; + return BigInt(f); + } + default: + return null; + } +} + +function indexSequence( + index: Value, + length: number, + get: (i: number) => Value, +): Value { + const i64 = valueToInt64(index); + if (i64 === null) { + // Distinguish non-numeric from non-whole-number float. + if (isFloat64(index) || isFloat32(index)) { + const f = index.value; + if (f !== Math.trunc(f)) { + return mkError( + "index must be a whole number, got float with fractional part", + ); + } + } + return mkError(`non-numeric index: got ${typeName(index)}`); + } + let idx = Number(i64); + if (idx < 0) idx += length; + if (idx < 0 || idx >= length) { + return mkError("index out of bounds"); + } + return get(idx); +} diff --git a/internal/bloblang2/ts/src/optimizer.ts b/internal/bloblang2/ts/src/optimizer.ts new file mode 100644 index 000000000..c39839782 --- /dev/null +++ b/internal/bloblang2/ts/src/optimizer.ts @@ -0,0 +1,464 @@ +// Post-parse AST optimizer for Bloblang V2. 
+// - Path collapse: chains of field/index/method access → PathExpr +// - Constant folding: literal-only expressions evaluated at compile time +// - Dead code elimination: unreachable if/match branches pruned + +import { TokenType } from "./token.js"; +import { MAX_INT64, MIN_INT64 } from "./value.js"; +import type { + Program, + Stmt, + Expr, + ExprBody, + LiteralExpr, + BinaryExpr, + UnaryExpr, + IfStmt, + IfExpr, + MatchStmt, + MatchExpr, + PathExpr, + PathRoot, + PathSegment, + IfExprBranch, + IfBranch, +} from "./ast.js"; + +export function optimize(prog: Program): void { + for (let i = 0; i < prog.stmts.length; i++) { + prog.stmts[i] = optimizeStmt(prog.stmts[i]!); + } + for (const m of prog.maps) { + optimizeExprBody(m.body); + } +} + +// --- Statement optimization --- + +function optimizeStmt(stmt: Stmt): Stmt { + switch (stmt.kind) { + case "assignment": + stmt.value = optimizeExpr(stmt.value); + return stmt; + case "if_stmt": + optimizeIfStmt(stmt); + return stmt; + case "match_stmt": + optimizeMatchStmt(stmt); + return stmt; + } +} + +function optimizeIfStmt(s: IfStmt): void { + const kept: IfBranch[] = []; + for (const branch of s.branches) { + branch.cond = optimizeExpr(branch.cond); + if (branch.cond.kind === "literal") { + if (branch.cond.tokenType === TokenType.TRUE) { + for (let i = 0; i < branch.body.length; i++) { + branch.body[i] = optimizeStmt(branch.body[i]!); + } + kept.push(branch); + s.branches = kept; + s.else_ = null; + return; + } + if (branch.cond.tokenType === TokenType.FALSE) continue; + } + for (let i = 0; i < branch.body.length; i++) { + branch.body[i] = optimizeStmt(branch.body[i]!); + } + kept.push(branch); + } + s.branches = kept; + if (s.else_) { + for (let i = 0; i < s.else_.length; i++) { + s.else_[i] = optimizeStmt(s.else_[i]!); + } + } +} + +function optimizeMatchStmt(s: MatchStmt): void { + if (s.subject) s.subject = optimizeExpr(s.subject); + for (const c of s.cases) { + if (c.pattern) c.pattern = optimizeExpr(c.pattern); 
+ for (let i = 0; i < c.body.length; i++) { + c.body[i] = optimizeStmt(c.body[i]!); + } + } +} + +// --- Expression optimization --- + +function optimizeExpr(expr: Expr): Expr { + switch (expr.kind) { + case "binary": { + expr.left = optimizeExpr(expr.left); + expr.right = optimizeExpr(expr.right); + return foldBinary(expr) ?? expr; + } + case "unary": { + expr.operand = optimizeExpr(expr.operand); + return foldUnary(expr) ?? expr; + } + case "field_access": + expr.receiver = optimizeExpr(expr.receiver); + return tryCollapsePath(expr); + case "index": + expr.receiver = optimizeExpr(expr.receiver); + expr.index = optimizeExpr(expr.index); + return tryCollapsePath(expr); + case "method_call": + expr.receiver = optimizeExpr(expr.receiver); + for (const arg of expr.args) arg.value = optimizeExpr(arg.value); + return tryCollapsePath(expr); + case "call": + for (const arg of expr.args) arg.value = optimizeExpr(arg.value); + return expr; + case "array": + for (let i = 0; i < expr.elements.length; i++) { + expr.elements[i] = optimizeExpr(expr.elements[i]!); + } + return expr; + case "object": + for (const entry of expr.entries) { + entry.key = optimizeExpr(entry.key); + entry.value = optimizeExpr(entry.value); + } + return expr; + case "if_expr": + return optimizeIfExpr(expr); + case "match_expr": + return optimizeMatchExpr(expr); + case "lambda": + optimizeExprBody(expr.body); + return expr; + case "path": + for (const seg of expr.segments) { + if (seg.index) seg.index = optimizeExpr(seg.index); + for (const arg of seg.args) arg.value = optimizeExpr(arg.value); + } + return expr; + default: + return expr; + } +} + +function optimizeExprBody(body: ExprBody): void { + for (const va of body.assignments) { + va.value = optimizeExpr(va.value); + } + body.result = optimizeExpr(body.result); +} + +function optimizeIfExpr(e: IfExpr): Expr { + const kept: IfExprBranch[] = []; + for (const branch of e.branches) { + branch.cond = optimizeExpr(branch.cond); + if (branch.cond.kind === 
"literal") { + if (branch.cond.tokenType === TokenType.TRUE) { + optimizeExprBody(branch.body); + kept.push(branch); + e.branches = kept; + e.else_ = null; + return e; + } + if (branch.cond.tokenType === TokenType.FALSE) continue; + } + optimizeExprBody(branch.body); + kept.push(branch); + } + e.branches = kept; + if (e.else_) optimizeExprBody(e.else_); + return e; +} + +function optimizeMatchExpr(e: MatchExpr): Expr { + if (e.subject) e.subject = optimizeExpr(e.subject); + for (const c of e.cases) { + if (c.pattern) c.pattern = optimizeExpr(c.pattern); + if ("kind" in c.body) { + c.body = optimizeExpr(c.body); + } else { + optimizeExprBody(c.body); + } + } + return e; +} + +// --- Path collapse --- + +function tryCollapsePath(expr: Expr): Expr { + const segments: PathSegment[] = []; + let current: Expr = expr; + + for (;;) { + switch (current.kind) { + case "field_access": + segments.push({ + segKind: "field", + name: current.field, + index: null, + args: [], + named: false, + nullSafe: current.nullSafe, + pos: current.fieldPos, + }); + current = current.receiver; + continue; + case "index": + segments.push({ + segKind: "index", + name: "", + index: current.index, + args: [], + named: false, + nullSafe: current.nullSafe, + pos: current.pos, + }); + current = current.receiver; + continue; + case "method_call": + // Intrinsic methods (catch, or) require special dispatch in the + // interpreter (short-circuit evaluation, error interception) and + // cannot be collapsed into PathExpr segments. 
+ if (current.method === "catch" || current.method === "or") { + return expr; + } + segments.push({ + segKind: "method", + name: current.method, + index: null, + args: current.args, + named: current.named, + nullSafe: current.nullSafe, + pos: current.methodPos, + }); + current = current.receiver; + continue; + case "input": + case "input_meta": + case "output": + case "output_meta": + case "var": { + if (segments.length === 0) return expr; + segments.reverse(); + return { + kind: "path", + pos: current.pos, + root: current.kind as PathRoot, + varName: current.kind === "var" ? current.name : "", + segments, + } satisfies PathExpr; + } + default: + return expr; + } + } +} + +// --- Constant folding --- + +function foldBinary(e: BinaryExpr): Expr | null { + if (e.left.kind !== "literal" || e.right.kind !== "literal") return null; + const left = e.left; + const right = e.right; + const pos = left.pos; + + // String concatenation. + if (e.op === TokenType.PLUS && isStringLit(left) && isStringLit(right)) { + return lit(pos, TokenType.STRING, left.value + right.value); + } + + // Integer arithmetic. + if (left.tokenType === TokenType.INT && right.tokenType === TokenType.INT) { + try { + const a = BigInt(left.value); + const b = BigInt(right.value); + const result = foldIntOp(a, b, e.op); + if (result === null) return null; + return lit(pos, TokenType.INT, result.toString()); + } catch { + return null; + } + } + + // Float arithmetic. + if (isNumericLit(left) && isNumericLit(right) && (left.tokenType === TokenType.FLOAT || right.tokenType === TokenType.FLOAT)) { + if (!canSafelyPromoteToFloat(left) || !canSafelyPromoteToFloat(right)) return null; + const a = parseFloat(left.value); + const b = parseFloat(right.value); + if (isNaN(a) || isNaN(b)) return null; + const result = foldFloatOp(a, b, e.op); + if (result === null) return null; + return lit(pos, TokenType.FLOAT, formatFloat(result)); + } + + // Boolean logic. 
+ if (isBoolLit(left) && isBoolLit(right)) { + const a = left.tokenType === TokenType.TRUE; + const b = right.tokenType === TokenType.TRUE; + let result: boolean; + switch (e.op) { + case TokenType.AND: result = a && b; break; + case TokenType.OR: result = a || b; break; + case TokenType.EQ: result = a === b; break; + case TokenType.NE: result = a !== b; break; + default: return null; + } + return boolLit(pos, result); + } + + // Equality of same-type literals. + if (e.op === TokenType.EQ || e.op === TokenType.NE) { + if (left.tokenType === right.tokenType) { + let eq = left.value === right.value; + if (e.op === TokenType.NE) eq = !eq; + return boolLit(pos, eq); + } + if (isLiteralCrossType(left, right)) { + return boolLit(pos, e.op === TokenType.NE); + } + } + + return null; +} + +function foldUnary(e: UnaryExpr): Expr | null { + if (e.operand.kind !== "literal") return null; + const l = e.operand; + const pos = l.pos; + + switch (e.op) { + case TokenType.BANG: + if (l.tokenType === TokenType.TRUE) return boolLit(pos, false); + if (l.tokenType === TokenType.FALSE) return boolLit(pos, true); + break; + case TokenType.MINUS: + if (l.tokenType === TokenType.INT) { + const n = BigInt(l.value); + if (n === -9223372036854775808n) return null; // -MinInt64 overflows + return lit(pos, TokenType.INT, (-n).toString()); + } + if (l.tokenType === TokenType.FLOAT) { + const f = parseFloat(l.value); + if (isNaN(f)) return null; + return lit(pos, TokenType.FLOAT, formatFloat(-f)); + } + break; + } + return null; +} + +// --- Folding helpers --- + +import type { Pos } from "./token.js"; + +function lit(pos: Pos, tokenType: TokenType, value: string): LiteralExpr { + return { kind: "literal", pos, tokenType, value }; +} + +function boolLit(pos: Pos, v: boolean): LiteralExpr { + return v + ? 
{ kind: "literal", pos, tokenType: TokenType.TRUE, value: "true" } + : { kind: "literal", pos, tokenType: TokenType.FALSE, value: "false" }; +} + +function isStringLit(l: LiteralExpr): boolean { + return l.tokenType === TokenType.STRING || l.tokenType === TokenType.RAW_STRING; +} + +function isNumericLit(l: LiteralExpr): boolean { + return l.tokenType === TokenType.INT || l.tokenType === TokenType.FLOAT; +} + +function isBoolLit(l: LiteralExpr): boolean { + return l.tokenType === TokenType.TRUE || l.tokenType === TokenType.FALSE; +} + +function canSafelyPromoteToFloat(l: LiteralExpr): boolean { + if (l.tokenType === TokenType.FLOAT) return true; + if (l.tokenType !== TokenType.INT) return false; + try { + const n = BigInt(l.value); + return n >= -9007199254740992n && n <= 9007199254740992n; + } catch { + return false; + } +} + +function isLiteralCrossType(a: LiteralExpr, b: LiteralExpr): boolean { + const af = literalFamily(a); + const bf = literalFamily(b); + return af !== bf && af !== 0 && bf !== 0; +} + +function literalFamily(l: LiteralExpr): number { + switch (l.tokenType) { + case TokenType.INT: + case TokenType.FLOAT: + return 1; + case TokenType.STRING: + case TokenType.RAW_STRING: + return 2; + case TokenType.TRUE: + case TokenType.FALSE: + return 3; + case TokenType.NULL: + return 4; + default: + return 0; + } +} + + +function foldIntOp(a: bigint, b: bigint, op: TokenType): bigint | null { + let r: bigint; + switch (op) { + case TokenType.PLUS: + r = a + b; + if (r > MAX_INT64 || r < MIN_INT64) return null; + return r; + case TokenType.MINUS: + r = a - b; + if (r > MAX_INT64 || r < MIN_INT64) return null; + return r; + case TokenType.STAR: + r = a * b; + if (r > MAX_INT64 || r < MIN_INT64) return null; + return r; + case TokenType.PERCENT: + if (b === 0n) return null; + return a % b; + default: + return null; + } +} + +function foldFloatOp(a: number, b: number, op: TokenType): number | null { + switch (op) { + case TokenType.PLUS: + return a + b; + case 
TokenType.MINUS: + return a - b; + case TokenType.STAR: + return a * b; + case TokenType.SLASH: + if (b === 0) return null; + return a / b; + case TokenType.PERCENT: + if (b === 0) return null; + return a % b; + default: + return null; + } +} + +function formatFloat(v: number): string { + const s = String(v); + // Ensure it looks like a float (has a dot). + if (!s.includes(".") && !s.includes("e") && !s.includes("E") && isFinite(v)) { + return s + ".0"; + } + return s; +} diff --git a/internal/bloblang2/ts/src/parser.ts b/internal/bloblang2/ts/src/parser.ts new file mode 100644 index 000000000..800b7b3a5 --- /dev/null +++ b/internal/bloblang2/ts/src/parser.ts @@ -0,0 +1,1106 @@ +// Pratt parser for Bloblang V2. + +import { Scanner } from "./scanner.js"; +import { type Token, type Pos, type PosError, TokenType, isKeyword } from "./token.js"; +import type { + Program, + MapDecl, + Param, + ImportStmt, + ExprBody, + VarAssign, + Stmt, + Assignment, + AssignTarget, + IfStmt, + IfBranch, + MatchStmt, + MatchStmtCase, + MatchExprCase, + Expr, + CallArg, + PathSegment, + IfExprBranch, +} from "./ast.js"; + +// Binding powers. +const BP_NONE = 0; +const BP_OR = 10; +const BP_AND = 20; +const BP_EQUALITY = 40; +const BP_COMPARISON = 60; +const BP_ADDITIVE = 80; +const BP_MULTIPLY = 100; +const BP_UNARY = 120; +const BP_POSTFIX = 140; + +export function parse( + src: string, + file: string, + files: Map<string, string> | null, +): { program: Program; errors: PosError[] } { + const p = new Parser(files ?? 
new Map(), new Set([file]), file); + p.init(src, file); + const program = p.parseProgram(); + return { program, errors: p.errors }; +} + +class Parser { + private s!: Scanner; + private tok!: Token; + private files: Map<string, string>; + private parsing: Set<string>; + private currentFile: string; + errors: PosError[] = []; + + constructor(files: Map<string, string>, parsing: Set<string>, currentFile: string) { + this.files = files; + this.parsing = parsing; + this.currentFile = currentFile; + } + + init(src: string, file: string): void { + this.s = new Scanner(src, file); + this.currentFile = file; + this.advance(); + } + + private advance(): void { + this.tok = this.s.next(); + if (this.s.errors.length > 0) { + this.errors.push(...this.s.errors); + this.s.errors.length = 0; + } + } + + private expect(type: TokenType): Token { + const tok = this.tok; + if (tok.type !== type) { + this.error(tok.pos, `expected ${type}, got ${tok.type}`); + return tok; + } + this.advance(); + return tok; + } + + private at(type: TokenType): boolean { + return this.tok.type === type; + } + + private skipNL(): void { + while (this.tok.type === TokenType.NL) { + this.advance(); + } + } + + private error(pos: Pos, msg: string): void { + this.errors.push({ pos, msg }); + } + + private recover(): void { + while (this.tok.type !== TokenType.NL && this.tok.type !== TokenType.EOF) { + this.advance(); + } + } + + // --- Top-level --- + + parseProgram(): Program { + const prog: Program = { + stmts: [], + maps: [], + imports: [], + namespaces: new Map(), + }; + + this.skipNL(); + while (this.tok.type !== TokenType.EOF) { + switch (this.tok.type) { + case TokenType.MAP: { + const m = this.parseMapDecl(); + if (m) prog.maps.push(m); + break; + } + case TokenType.IMPORT: { + const imp = this.parseImport(prog); + if (imp) prog.imports.push(imp); + break; + } + default: { + const stmt = this.parseStatement(); + if (stmt) prog.stmts.push(stmt); + break; + } + } + if (this.at(TokenType.NL)) { + 
this.advance(); + this.skipNL(); + } else if 
(!this.at(TokenType.EOF)) { + this.error(this.tok.pos, `expected newline or end of input, got ${this.tok.type}`); + this.recover(); + this.skipNL(); + } + } + + return prog; + } + + private parseMapDecl(): MapDecl | null { + const pos = this.tok.pos; + this.advance(); // skip 'map' + + const nameTok = this.expect(TokenType.IDENT); + this.expect(TokenType.LPAREN); + const params = this.parseParamList(); + this.expect(TokenType.RPAREN); + this.expect(TokenType.LBRACE); + + const body = this.parseExprBody(); + + this.skipNL(); + this.expect(TokenType.RBRACE); + + return { + pos, + name: nameTok.literal, + params, + body, + namespaces: new Map(), + }; + } + + private parseParamList(): Param[] { + if (this.at(TokenType.RPAREN)) return []; + + const params: Param[] = [this.parseParam()]; + while (this.at(TokenType.COMMA)) { + this.advance(); + params.push(this.parseParam()); + } + return params; + } + + private parseParam(): Param { + const pos = this.tok.pos; + + if (this.at(TokenType.UNDERSCORE)) { + this.advance(); + if (this.at(TokenType.ASSIGN)) { + this.error(pos, "discard parameter _ cannot have a default value"); + this.advance(); + this.parseLiteral(); + } + return { name: "", discard: true, default_: null, pos }; + } + + const nameTok = this.expect(TokenType.IDENT); + const param: Param = { name: nameTok.literal, discard: false, default_: null, pos }; + + if (this.at(TokenType.ASSIGN)) { + this.advance(); + param.default_ = this.parseLiteral(); + if (!this.at(TokenType.COMMA) && !this.at(TokenType.RPAREN)) { + this.error(this.tok.pos, "default parameter values must be literals, not expressions"); + while (!this.at(TokenType.COMMA) && !this.at(TokenType.RPAREN) && !this.at(TokenType.EOF)) { + this.advance(); + } + } + } + + return param; + } + + private parseLiteral(): Expr { + const tok = this.tok; + switch (tok.type) { + case TokenType.INT: + case TokenType.FLOAT: + case TokenType.STRING: + case TokenType.RAW_STRING: + case TokenType.TRUE: + case 
TokenType.FALSE: + case TokenType.NULL: + this.advance(); + return { kind: "literal", pos: tok.pos, tokenType: tok.type, value: tok.literal }; + default: + this.error(tok.pos, `expected literal value, got ${tok.type}`); + return { kind: "literal", pos: tok.pos, tokenType: TokenType.NULL, value: "null" }; + } + } + + private parseImport(prog: Program): ImportStmt | null { + const pos = this.tok.pos; + this.advance(); // skip 'import' + + const pathTok = this.tok; + if (pathTok.type !== TokenType.STRING && pathTok.type !== TokenType.RAW_STRING) { + this.error(pathTok.pos, "expected string literal for import path"); + this.recover(); + return null; + } + this.advance(); + + this.expect(TokenType.AS); + const nsTok = this.expect(TokenType.IDENT); + + const imp: ImportStmt = { pos, path: pathTok.literal, namespace: nsTok.literal }; + this.resolveImport(prog, imp); + return imp; + } + + private resolveImport(prog: Program, imp: ImportStmt): void { + if (prog.namespaces.has(imp.namespace)) { + this.error(imp.pos, `duplicate namespace "${imp.namespace}"`); + return; + } + + const src = this.files.get(imp.path); + if (src === undefined) { + this.error(imp.pos, `import file "${imp.path}" not found`); + return; + } + + if (this.parsing.has(imp.path)) { + this.error(imp.pos, `circular import: "${imp.path}"`); + return; + } + + const sub = new Parser(this.files, this.parsing, imp.path); + this.parsing.add(imp.path); + sub.init(src, imp.path); + const importProg = sub.parseProgram(); + this.parsing.delete(imp.path); + + this.errors.push(...sub.errors); + + if (importProg.stmts.length > 0) { + this.error(imp.pos, `imported file "${imp.path}" contains statements (only map declarations and imports are allowed)`); + } + + for (const m of importProg.maps) { + for (const [ns, maps] of importProg.namespaces) { + m.namespaces.set(ns, maps); + } + } + prog.namespaces.set(imp.namespace, importProg.maps); + } + + // --- Statement parsing --- + + private parseStatement(): Stmt | null { + 
switch (this.tok.type) { + case TokenType.IF: + return this.parseIfStmt(); + case TokenType.MATCH: + return this.parseMatchStmt(); + default: + return this.parseAssignment(); + } + } + + private parseAssignment(): Stmt | null { + const target = this.parseAssignTarget(); + if (!target) { + this.recover(); + return null; + } + + this.expect(TokenType.ASSIGN); + const value = this.parseExpr(BP_NONE); + + return { kind: "assignment", pos: target.pos, target, value } satisfies Assignment; + } + + private parseAssignTarget(): AssignTarget | null { + const pos = this.tok.pos; + + switch (this.tok.type) { + case TokenType.OUTPUT: { + this.advance(); + let metaAccess = false; + if (this.at(TokenType.AT)) { + metaAccess = true; + this.advance(); + } + const path = this.parsePathSegments(); + return { pos, root: "output", varName: "", metaAccess, path }; + } + case TokenType.VAR: { + const varName = this.tok.literal; + this.advance(); + const path = this.parsePathSegments(); + return { pos, root: "var", varName, metaAccess: false, path }; + } + default: + this.error(pos, `unexpected expression in statement context (expected output or $variable assignment, got ${this.tok.type})`); + return null; + } + } + + private parsePathSegments(): PathSegment[] { + const segs: PathSegment[] = []; + for (;;) { + switch (this.tok.type) { + case TokenType.DOT: + case TokenType.QDOT: { + const nullSafe = this.tok.type === TokenType.QDOT; + const pos = this.tok.pos; + this.advance(); + const name = this.expectWord(); + if (this.at(TokenType.LPAREN)) { + this.advance(); + const { args, named } = this.parseArgList(); + this.expect(TokenType.RPAREN); + segs.push({ segKind: "method", name, index: null, args, named, nullSafe, pos }); + } else { + segs.push({ segKind: "field", name, index: null, args: [], named: false, nullSafe, pos }); + } + break; + } + case TokenType.LBRACKET: + case TokenType.QLBRACKET: { + const nullSafe = this.tok.type === TokenType.QLBRACKET; + const pos = this.tok.pos; + 
this.advance(); + const idx = this.parseExpr(BP_NONE); + this.expect(TokenType.RBRACKET); + segs.push({ segKind: "index", name: "", index: idx, args: [], named: false, nullSafe, pos }); + break; + } + default: + return segs; + } + } + } + + private expectWord(): string { + const tok = this.tok; + if (tok.type === TokenType.IDENT || isKeyword(tok.type) || tok.type === TokenType.DELETED || tok.type === TokenType.THROW || tok.type === TokenType.VOID) { + this.advance(); + return tok.literal; + } + if (tok.type === TokenType.STRING) { + this.advance(); + return tok.literal; + } + this.error(tok.pos, `expected field name, got ${tok.type}`); + return ""; + } + + private parseIfStmt(): IfStmt { + const pos = this.tok.pos; + this.advance(); // skip 'if' + + const branches: IfBranch[] = []; + const cond = this.parseExpr(BP_NONE); + this.expect(TokenType.LBRACE); + const body = this.parseStmtBody(); + this.expect(TokenType.RBRACE); + branches.push({ cond, body }); + + let else_: Stmt[] | null = null; + while (this.at(TokenType.ELSE)) { + this.advance(); + if (this.at(TokenType.IF)) { + this.advance(); + const c = this.parseExpr(BP_NONE); + this.expect(TokenType.LBRACE); + const b = this.parseStmtBody(); + this.expect(TokenType.RBRACE); + branches.push({ cond: c, body: b }); + } else { + this.expect(TokenType.LBRACE); + else_ = this.parseStmtBody(); + this.expect(TokenType.RBRACE); + break; + } + } + + return { kind: "if_stmt", pos, branches, else_ }; + } + + private parseMatchStmt(): MatchStmt { + const pos = this.tok.pos; + this.advance(); // skip 'match' + + let subject: Expr | null = null; + let binding = ""; + + if (!this.at(TokenType.LBRACE)) { + subject = this.parseExpr(BP_NONE); + if (this.at(TokenType.AS)) { + this.advance(); + binding = this.expect(TokenType.IDENT).literal; + } + } + + this.expect(TokenType.LBRACE); + this.skipNL(); + + const cases: MatchStmtCase[] = []; + while (!this.at(TokenType.RBRACE) && !this.at(TokenType.EOF)) { + 
cases.push(this.parseMatchCaseStmt()); + if (this.at(TokenType.COMMA)) this.advance(); + this.skipNL(); + } + + this.expect(TokenType.RBRACE); + return { kind: "match_stmt", pos, subject, binding, cases }; + } + + private parseMatchCaseStmt(): MatchStmtCase { + let pattern: Expr | null = null; + let wildcard = false; + + if (this.at(TokenType.UNDERSCORE)) { + wildcard = true; + this.advance(); + } else { + pattern = this.parseExpr(BP_NONE); + } + + this.expect(TokenType.FATARROW); + this.expect(TokenType.LBRACE); + const body = this.parseStmtBody(); + this.expect(TokenType.RBRACE); + + return { pattern, wildcard, body }; + } + + private parseStmtBody(): Stmt[] { + this.skipNL(); + const stmts: Stmt[] = []; + while (!this.at(TokenType.RBRACE) && !this.at(TokenType.EOF)) { + const stmt = this.parseStatement(); + if (stmt) stmts.push(stmt); + if (this.at(TokenType.NL)) { + this.advance(); + this.skipNL(); + } else if (!this.at(TokenType.RBRACE) && !this.at(TokenType.EOF)) { + this.error(this.tok.pos, `expected newline or }, got ${this.tok.type}`); + this.recover(); + this.skipNL(); + } + } + return stmts; + } + + // --- Expression parsing (Pratt) --- + + private parseExpr(minBP: number): Expr { + let left = this.parsePrefix(); + + for (;;) { + const { leftBP, rightBP, nonAssoc } = infixBP(this.tok.type); + if (leftBP === BP_NONE || leftBP < minBP) break; + + switch (this.tok.type) { + case TokenType.DOT: + case TokenType.QDOT: + left = this.parsePostfixDot(left); + break; + case TokenType.LBRACKET: + case TokenType.QLBRACKET: + left = this.parsePostfixIndex(left); + break; + default: { + const op = this.tok; + this.advance(); + const right = this.parseExpr(rightBP); + + if (nonAssoc) { + const next = infixBP(this.tok.type); + if (next.leftBP === leftBP) { + this.error(this.tok.pos, `cannot chain non-associative operator ${this.tok.type}`); + } + } + + left = { kind: "binary", left, op: op.type, opPos: op.pos, right }; + break; + } + } + } + + return left; + } + + // 
--- Prefix / atom parsers --- + + private parsePrefix(): Expr { + const tok = this.tok; + + switch (tok.type) { + case TokenType.INT: + case TokenType.FLOAT: + case TokenType.STRING: + case TokenType.RAW_STRING: + case TokenType.TRUE: + case TokenType.FALSE: + case TokenType.NULL: + this.advance(); + return { kind: "literal", pos: tok.pos, tokenType: tok.type, value: tok.literal }; + + case TokenType.MINUS: + this.advance(); + return { kind: "unary", op: TokenType.MINUS, pos: tok.pos, operand: this.parseExpr(BP_UNARY) }; + + case TokenType.BANG: + this.advance(); + return { kind: "unary", op: TokenType.BANG, pos: tok.pos, operand: this.parseExpr(BP_UNARY) }; + + case TokenType.LPAREN: + return this.parseParenOrLambda(); + + case TokenType.LBRACKET: + return this.parseArrayLiteral(); + + case TokenType.LBRACE: + return this.parseObjectLiteral(); + + case TokenType.IF: + return this.parseIfExpr(); + + case TokenType.MATCH: + return this.parseMatchExpr(); + + case TokenType.INPUT: + this.advance(); + if (this.at(TokenType.AT)) { + this.advance(); + return { kind: "input_meta", pos: tok.pos }; + } + return { kind: "input", pos: tok.pos }; + + case TokenType.OUTPUT: + this.advance(); + if (this.at(TokenType.AT)) { + this.advance(); + return { kind: "output_meta", pos: tok.pos }; + } + return { kind: "output", pos: tok.pos }; + + case TokenType.VAR: + this.advance(); + if (this.at(TokenType.LPAREN)) { + this.error(tok.pos, `$${tok.literal} is a variable, not a callable function (use a named map instead)`); + } + return { kind: "var", pos: tok.pos, name: tok.literal }; + + case TokenType.IDENT: + return this.parseIdentOrCall(); + + case TokenType.DELETED: + case TokenType.THROW: + case TokenType.VOID: + return this.parseReservedCall(); + + case TokenType.UNDERSCORE: + this.advance(); + if (this.at(TokenType.THINARROW)) { + this.advance(); + const body = this.parseLambdaBody(); + return { kind: "lambda", pos: tok.pos, params: [{ name: "", discard: true, default_: null, 
pos: tok.pos }], body }; + } + this.error(tok.pos, "unexpected _ in expression position"); + return { kind: "literal", pos: tok.pos, tokenType: TokenType.NULL, value: "null" }; + + default: + this.error(tok.pos, `expected expression, got ${tok.type}`); + this.advance(); + return { kind: "literal", pos: tok.pos, tokenType: TokenType.NULL, value: "null" }; + } + } + + private parseIdentOrCall(): Expr { + const tok = this.tok; + this.advance(); + + // Qualified: namespace::name or namespace::name(args) + if (this.at(TokenType.DCOLON)) { + this.advance(); + const name = this.expect(TokenType.IDENT); + if (this.at(TokenType.LPAREN)) { + this.advance(); + const { args, named } = this.parseArgList(); + this.expect(TokenType.RPAREN); + return { kind: "call", pos: tok.pos, namespace: tok.literal, name: name.literal, args, named }; + } + return { kind: "ident", pos: tok.pos, namespace: tok.literal, name: name.literal }; + } + + // Function call: name( + if (this.at(TokenType.LPAREN)) { + this.advance(); + const { args, named } = this.parseArgList(); + this.expect(TokenType.RPAREN); + return { kind: "call", pos: tok.pos, namespace: "", name: tok.literal, args, named }; + } + + // Single-param lambda: ident -> + if (this.at(TokenType.THINARROW)) { + this.advance(); + const body = this.parseLambdaBody(); + return { kind: "lambda", pos: tok.pos, params: [{ name: tok.literal, discard: false, default_: null, pos: tok.pos }], body }; + } + + // Bare identifier. 
+ return { kind: "ident", pos: tok.pos, namespace: "", name: tok.literal }; + } + + private parseReservedCall(): Expr { + const tok = this.tok; + this.advance(); + this.expect(TokenType.LPAREN); + const { args, named } = this.parseArgList(); + this.expect(TokenType.RPAREN); + return { kind: "call", pos: tok.pos, namespace: "", name: tok.literal, args, named }; + } + + private parseParenOrLambda(): Expr { + const pos = this.tok.pos; + if (this.isLambdaAhead()) { + return this.parseMultiParamLambda(pos); + } + this.advance(); // skip ( + const expr = this.parseExpr(BP_NONE); + this.expect(TokenType.RPAREN); + return expr; + } + + private isLambdaAhead(): boolean { + const savedTok = this.tok; + const savedS = this.s.saveState(); + + let depth = 0; + this.advance(); // skip ( + depth++; + while (depth > 0 && this.tok.type !== TokenType.EOF) { + if (this.tok.type === TokenType.LPAREN) depth++; + else if (this.tok.type === TokenType.RPAREN) depth--; + if (depth > 0) this.advance(); + } + this.advance(); // skip ) + const isLambda = this.tok.type === TokenType.THINARROW; + + this.s.restoreState(savedS); + this.tok = savedTok; + return isLambda; + } + + private parseMultiParamLambda(pos: Pos): Expr { + this.advance(); // skip ( + const params = this.parseParamList(); + this.expect(TokenType.RPAREN); + this.expect(TokenType.THINARROW); + const body = this.parseLambdaBody(); + return { kind: "lambda", pos, params, body }; + } + + private parseLambdaBody(): ExprBody { + if (this.at(TokenType.LBRACE)) { + if (this.isLambdaBlock()) { + this.advance(); // skip { + const body = this.parseExprBody(); + this.skipNL(); + this.expect(TokenType.RBRACE); + return body; + } + const expr = this.parseExpr(BP_NONE); + return { assignments: [], result: expr }; + } + const expr = this.parseExpr(BP_NONE); + return { assignments: [], result: expr }; + } + + private isLambdaBlock(): boolean { + const savedTok = this.tok; + const savedS = this.s.saveState(); + + this.advance(); // skip { + 
while (this.tok.type === TokenType.NL) this.advance(); + + let isBlock = false; + switch (this.tok.type) { + case TokenType.VAR: + case TokenType.OUTPUT: + isBlock = true; + break; + case TokenType.IDENT: { + const savedInner = this.tok; + const savedInnerS = this.s.saveState(); + this.advance(); + isBlock = this.at(TokenType.ASSIGN); + this.s.restoreState(savedInnerS); + this.tok = savedInner; + break; + } + } + + this.s.restoreState(savedS); + this.tok = savedTok; + return isBlock; + } + + private parseArrayLiteral(): Expr { + const pos = this.tok.pos; + this.advance(); // skip [ + + const elements: Expr[] = []; + while (!this.at(TokenType.RBRACKET) && !this.at(TokenType.EOF)) { + elements.push(this.parseExpr(BP_NONE)); + if (!this.at(TokenType.RBRACKET)) { + this.expect(TokenType.COMMA); + } + } + this.expect(TokenType.RBRACKET); + return { kind: "array", pos, elements }; + } + + private parseObjectLiteral(): Expr { + const pos = this.tok.pos; + this.advance(); // skip { + this.skipNL(); + + const entries: { key: Expr; value: Expr }[] = []; + while (!this.at(TokenType.RBRACE) && !this.at(TokenType.EOF)) { + const key = this.parseExpr(BP_NONE); + this.expect(TokenType.COLON); + const value = this.parseExpr(BP_NONE); + entries.push({ key, value }); + if (!this.at(TokenType.RBRACE)) { + this.expect(TokenType.COMMA); + this.skipNL(); + } + } + this.skipNL(); + this.expect(TokenType.RBRACE); + return { kind: "object", pos, entries }; + } + + // --- If/match expressions --- + + private parseIfExpr(): Expr { + const pos = this.tok.pos; + this.advance(); // skip 'if' + + const branches: IfExprBranch[] = []; + const cond = this.parseExpr(BP_NONE); + this.expect(TokenType.LBRACE); + const body = this.parseExprBody(); + this.skipNL(); + this.expect(TokenType.RBRACE); + branches.push({ cond, body }); + + let else_: ExprBody | null = null; + while (this.at(TokenType.ELSE)) { + this.advance(); + if (this.at(TokenType.IF)) { + this.advance(); + const c = 
this.parseExpr(BP_NONE); + this.expect(TokenType.LBRACE); + const b = this.parseExprBody(); + this.skipNL(); + this.expect(TokenType.RBRACE); + branches.push({ cond: c, body: b }); + } else { + this.expect(TokenType.LBRACE); + else_ = this.parseExprBody(); + this.skipNL(); + this.expect(TokenType.RBRACE); + break; + } + } + + return { kind: "if_expr", pos, branches, else_ }; + } + + private parseMatchExpr(): Expr { + const pos = this.tok.pos; + this.advance(); // skip 'match' + + let subject: Expr | null = null; + let binding = ""; + + if (!this.at(TokenType.LBRACE)) { + subject = this.parseExpr(BP_NONE); + if (this.at(TokenType.AS)) { + this.advance(); + binding = this.expect(TokenType.IDENT).literal; + } + } + + this.expect(TokenType.LBRACE); + this.skipNL(); + + const cases: MatchExprCase[] = []; + while (!this.at(TokenType.RBRACE) && !this.at(TokenType.EOF)) { + cases.push(this.parseMatchCaseExpr()); + if (this.at(TokenType.COMMA)) this.advance(); + this.skipNL(); + } + + this.expect(TokenType.RBRACE); + return { kind: "match_expr", pos, subject, binding, cases }; + } + + private parseMatchCaseExpr(): MatchExprCase { + let pattern: Expr | null = null; + let wildcard = false; + + if (this.at(TokenType.UNDERSCORE)) { + wildcard = true; + this.advance(); + } else { + pattern = this.parseExpr(BP_NONE); + } + + this.expect(TokenType.FATARROW); + + let body: Expr | ExprBody; + if (this.at(TokenType.LBRACE)) { + this.advance(); + body = this.parseExprBody(); + this.skipNL(); + this.expect(TokenType.RBRACE); + } else { + body = this.parseExpr(BP_NONE); + } + + return { pattern, wildcard, body }; + } + + // --- Expression body --- + + private parseExprBody(): ExprBody { + this.skipNL(); + const assignments: VarAssign[] = []; + + for (;;) { + // Output assignment in expression context — error. 
+ if (this.at(TokenType.OUTPUT) && this.isOutputAssignAhead()) { + this.error(this.tok.pos, "cannot assign to output in expression context (only $variable assignments are allowed)"); + this.recover(); + this.skipNL(); + continue; + } + + // Bare ident = ... — parameters are read-only. + if (this.at(TokenType.IDENT)) { + const savedTok = this.tok; + const savedS = this.s.saveState(); + this.advance(); + const isAssign = this.tok.type === TokenType.ASSIGN; + this.s.restoreState(savedS); + this.tok = savedTok; + if (isAssign) { + this.error(this.tok.pos, "cannot assign to identifier (parameters are read-only, use $variable for local assignments)"); + this.recover(); + this.skipNL(); + continue; + } + } + + // Var assignment: $var[.path...] = expr + if (this.at(TokenType.VAR) && this.isVarAssignAhead()) { + const va = this.parseVarAssign(); + assignments.push(va); + if (this.at(TokenType.NL)) { + this.advance(); + this.skipNL(); + } + continue; + } + break; + } + + const result = this.parseExpr(BP_NONE); + return { assignments, result }; + } + + private isOutputAssignAhead(): boolean { + const savedTok = this.tok; + const savedS = this.s.saveState(); + + this.advance(); // skip OUTPUT + if (this.at(TokenType.AT)) this.advance(); + while (this.at(TokenType.DOT) || this.at(TokenType.LBRACKET) || this.at(TokenType.QLBRACKET) || this.at(TokenType.QDOT)) { + if (this.at(TokenType.LBRACKET) || this.at(TokenType.QLBRACKET)) { + let depth = 1; + this.advance(); + while (depth > 0 && !this.at(TokenType.EOF)) { + if (this.at(TokenType.LBRACKET) || this.at(TokenType.QLBRACKET)) depth++; + else if (this.at(TokenType.RBRACKET)) depth--; + this.advance(); + } + } else { + this.advance(); // skip . or ?. 
+ this.advance(); // skip field name + } + } + const isAssign = this.at(TokenType.ASSIGN); + + this.s.restoreState(savedS); + this.tok = savedTok; + return isAssign; + } + + private isVarAssignAhead(): boolean { + const savedTok = this.tok; + const savedS = this.s.saveState(); + + this.advance(); // skip VAR + while (this.at(TokenType.DOT) || this.at(TokenType.LBRACKET) || this.at(TokenType.QLBRACKET) || this.at(TokenType.QDOT)) { + if (this.at(TokenType.LBRACKET) || this.at(TokenType.QLBRACKET)) { + let depth = 1; + this.advance(); + while (depth > 0 && !this.at(TokenType.EOF)) { + if (this.at(TokenType.LBRACKET) || this.at(TokenType.QLBRACKET)) depth++; + else if (this.at(TokenType.RBRACKET)) depth--; + this.advance(); + } + } else { + this.advance(); + this.advance(); + } + } + const isAssign = this.at(TokenType.ASSIGN); + + this.s.restoreState(savedS); + this.tok = savedTok; + return isAssign; + } + + private parseVarAssign(): VarAssign { + const pos = this.tok.pos; + const name = this.tok.literal; + this.advance(); // skip VAR + + const path = this.parsePathSegments(); + this.expect(TokenType.ASSIGN); + const value = this.parseExpr(BP_NONE); + + return { pos, name, path, value }; + } + + // --- Postfix --- + + private parsePostfixDot(receiver: Expr): Expr { + const nullSafe = this.tok.type === TokenType.QDOT; + const dotPos = this.tok.pos; + this.advance(); // skip . or ?. 
+ + const name = this.expectWord(); + + if (this.at(TokenType.LPAREN)) { + this.advance(); + const { args, named } = this.parseArgList(); + this.expect(TokenType.RPAREN); + return { kind: "method_call", receiver, method: name, methodPos: dotPos, args, named, nullSafe }; + } + + return { kind: "field_access", receiver, field: name, fieldPos: dotPos, nullSafe }; + } + + private parsePostfixIndex(receiver: Expr): Expr { + const nullSafe = this.tok.type === TokenType.QLBRACKET; + const pos = this.tok.pos; + this.advance(); // skip [ or ?[ + + const index = this.parseExpr(BP_NONE); + this.expect(TokenType.RBRACKET); + + return { kind: "index", receiver, index, pos, nullSafe }; + } + + // --- Argument lists --- + + private parseArgList(): { args: CallArg[]; named: boolean } { + if (this.at(TokenType.RPAREN)) return { args: [], named: false }; + + const named = this.isNamedArgList(); + const args: CallArg[] = []; + + for (;;) { + if (named) { + if (this.tok.type !== TokenType.IDENT) { + this.error(this.tok.pos, "cannot mix named and positional arguments in the same call"); + while (!this.at(TokenType.RPAREN) && !this.at(TokenType.EOF)) this.advance(); + break; + } + const nameTok = this.expect(TokenType.IDENT); + this.expect(TokenType.COLON); + const value = this.parseExpr(BP_NONE); + args.push({ name: nameTok.literal, value }); + } else { + const value = this.parseExpr(BP_NONE); + if (this.at(TokenType.COLON)) { + this.error(this.tok.pos, "cannot mix positional and named arguments in the same call"); + while (!this.at(TokenType.RPAREN) && !this.at(TokenType.EOF)) this.advance(); + break; + } + args.push({ name: "", value }); + } + if (!this.at(TokenType.COMMA)) break; + this.advance(); + } + + return { args, named }; + } + + private isNamedArgList(): boolean { + if (!this.at(TokenType.IDENT)) return false; + const savedTok = this.tok; + const savedS = this.s.saveState(); + this.advance(); + const isNamed = this.at(TokenType.COLON); + this.s.restoreState(savedS); + 
this.tok = savedTok; + return isNamed; + } +} + +interface InfixInfo { + leftBP: number; + rightBP: number; + nonAssoc: boolean; +} + +const INFIX_NONE: InfixInfo = { leftBP: BP_NONE, rightBP: BP_NONE, nonAssoc: false }; +const INFIX_OR: InfixInfo = { leftBP: BP_OR, rightBP: BP_OR + 1, nonAssoc: false }; +const INFIX_AND: InfixInfo = { leftBP: BP_AND, rightBP: BP_AND + 1, nonAssoc: false }; +const INFIX_EQ: InfixInfo = { leftBP: BP_EQUALITY, rightBP: BP_EQUALITY + 1, nonAssoc: true }; +const INFIX_CMP: InfixInfo = { leftBP: BP_COMPARISON, rightBP: BP_COMPARISON + 1, nonAssoc: true }; +const INFIX_ADD: InfixInfo = { leftBP: BP_ADDITIVE, rightBP: BP_ADDITIVE + 1, nonAssoc: false }; +const INFIX_MUL: InfixInfo = { leftBP: BP_MULTIPLY, rightBP: BP_MULTIPLY + 1, nonAssoc: false }; +const INFIX_POST: InfixInfo = { leftBP: BP_POSTFIX, rightBP: BP_POSTFIX, nonAssoc: false }; + +function infixBP(type: TokenType): InfixInfo { + switch (type) { + case TokenType.OR: + return INFIX_OR; + case TokenType.AND: + return INFIX_AND; + case TokenType.EQ: + case TokenType.NE: + return INFIX_EQ; + case TokenType.GT: + case TokenType.GE: + case TokenType.LT: + case TokenType.LE: + return INFIX_CMP; + case TokenType.PLUS: + case TokenType.MINUS: + return INFIX_ADD; + case TokenType.STAR: + case TokenType.SLASH: + case TokenType.PERCENT: + return INFIX_MUL; + case TokenType.DOT: + case TokenType.QDOT: + case TokenType.LBRACKET: + case TokenType.QLBRACKET: + return INFIX_POST; + default: + return INFIX_NONE; + } +} diff --git a/internal/bloblang2/ts/src/resolver.ts b/internal/bloblang2/ts/src/resolver.ts new file mode 100644 index 000000000..b9691130a --- /dev/null +++ b/internal/bloblang2/ts/src/resolver.ts @@ -0,0 +1,694 @@ +// Semantic analysis for Bloblang V2. +// Checks variable scoping, map isolation, lambda purity, arity, and naming. 
+ +import type { Pos, PosError } from "./token.js"; +import { TokenType } from "./token.js"; +import type { + Program, + Stmt, + Expr, + ExprBody, + MapDecl, + Param, + Assignment, + IfStmt, + MatchStmt, + IfExpr, + MatchExpr, + CallExpr, + CallArg, + IdentExpr, + PathSegment, +} from "./ast.js"; + +/** + * ArgFolder performs parse-time evaluation of a stdlib call's arguments + * so the runtime can skip repeat work. The folder inspects the AST args + * (typically checking for string-literal shapes) and returns a same- + * length array of folded values, using null/undefined for argument + * positions that aren't eligible for folding. On success the resolver + * writes each non-null entry onto the matching CallArg.folded field, + * and the interpreter substitutes the folded value for the arg at + * runtime. + * + * Throwing an error surfaces as a resolver diagnostic anchored at the + * call site — used e.g. to reject an invalid regex pattern at parse + * time rather than on first call. + */ +export type ArgFolder = (args: CallArg[]) => Array; + +export interface FunctionInfo { + required: number; + /** Total params (required + optional). -1 means no arity checking. */ + total: number; + /** + * argFolder, if set, is invoked by the resolver to precompute literal + * arguments (see ArgFolder docs). + */ + argFolder?: ArgFolder; +} + +export interface MethodInfo { + required: number; + /** Total params (required + optional). -1 means no arity checking. */ + total: number; + /** + * Per-parameter metadata, parallel to declared positions. Empty when the + * method doesn't declare params (variadic — e.g. .sort); in that case + * `acceptsLambda` is the method-level fallback. + */ + params?: MethodParamInfo[]; + /** + * Method-level fallback used when `params` is empty. Methods not marked as + * lambda-accepting (e.g. .or()) reject lambdas at compile time + * (spec Section 3.4). 
+ */ + acceptsLambda?: boolean; + /** + * argFolder, if set, is invoked by the resolver to precompute literal + * arguments (see ArgFolder docs). + */ + argFolder?: ArgFolder; +} + +export interface MethodParamInfo { + name: string; + hasDefault: boolean; + acceptsLambda: boolean; +} + +/** Reports whether a lambda is accepted at the given argument position. */ +export function paramAcceptsLambda( + mi: MethodInfo, + position: number, + name: string, +): boolean { + const params = mi.params; + if (!params || params.length === 0) { + return mi.acceptsLambda === true; + } + if (name) { + for (const p of params) { + if (p.name === name) return p.acceptsLambda; + } + return false; + } + if (position < 0 || position >= params.length) return false; + return params[position]!.acceptsLambda; +} + +export function resolve( + prog: Program, + knownMethods: Set<string> | Map<string, MethodInfo>, + knownFunctions: Map<string, FunctionInfo>, +): PosError[] { + const r = new Resolver(prog, knownMethods, knownFunctions); + r.resolve(); + return r.errors; +} + +class ResolveScope { + parent: ResolveScope | null; + vars = new Set<string>(); + params = new Set<string>(); + + constructor(parent: ResolveScope | null) { + this.parent = parent; + } + + isDeclared(name: string): boolean { + for (let cur: ResolveScope | null = this; cur; cur = cur.parent) { + if (cur.vars.has(name) || cur.params.has(name)) return true; + } + return false; + } + + // isParam checks whether a name is declared as a parameter (map param, lambda + // param, match-as binding) without checking variables. Bare identifiers must + // not resolve to $variables. 
+ isParam(name: string): boolean { + for (let cur: ResolveScope | null = this; cur; cur = cur.parent) { + if (cur.params.has(name)) return true; + } + return false; + } +} + +class Resolver { + private prog: Program; + private knownMethods: Set<string> | Map<string, MethodInfo>; + private knownFunctions: Map<string, FunctionInfo>; + errors: PosError[] = []; + private scope!: ResolveScope; + private inMap = false; + private inMethodArg = false; + + constructor( + prog: Program, + knownMethods: Set<string> | Map<string, MethodInfo>, + knownFunctions: Map<string, FunctionInfo>, + ) { + this.prog = prog; + this.knownMethods = knownMethods; + this.knownFunctions = knownFunctions; + } + + // methodInfo returns arity info for a known method, or null if the + // registry is the legacy Set (no arity) form. + private methodInfo(name: string): MethodInfo | null { + const km = this.knownMethods; + if (km instanceof Map) { + return km.get(name) ?? null; + } + return null; + } + + private hasMethod(name: string): boolean { + const km = this.knownMethods; + if (km instanceof Map) return km.has(name); + return km.has(name); + } + + private mapIndex = new Map<string, MapDecl>(); + + resolve(): void { + // Check duplicate map names and build index. 
+ const seen = new Map<string, Pos>(); + for (const m of this.prog.maps) { + const prev = seen.get(m.name); + if (prev) { + this.error(m.pos, `duplicate map name "${m.name}" (previously declared at ${prev.line}:${prev.column})`); + } + seen.set(m.name, m.pos); + this.mapIndex.set(m.name, m); + } + + this.scope = new ResolveScope(null); + + for (const m of this.prog.maps) { + this.resolveMapDecl(m); + } + for (const stmt of this.prog.stmts) { + this.resolveStmt(stmt); + } + } + + private error(pos: Pos, msg: string): void { + this.errors.push({ pos, msg }); + } + + private resolveMapDecl(m: MapDecl): void { + this.validateParams(m.params); + + const saved = this.scope; + const savedInMap = this.inMap; + + this.inMap = true; + const mapScope = new ResolveScope(null); // isolated + for (const p of m.params) { + if (!p.discard) mapScope.params.add(p.name); + } + this.scope = mapScope; + this.resolveExprBody(m.body); + + this.scope = saved; + this.inMap = savedInMap; + } + + private validateParams(params: Param[]): void { + let seenDefault = false; + for (const p of params) { + if (p.discard) { + if (p.default_) this.error(p.pos, "discard parameter _ cannot have a default value"); + continue; + } + if (p.default_) { + seenDefault = true; + } else if (seenDefault) { + this.error(p.pos, "required parameter after default parameter"); + } + } + } + + private resolveStmt(stmt: Stmt): void { + switch (stmt.kind) { + case "assignment": + this.resolveAssignment(stmt); + break; + case "if_stmt": + this.resolveIfStmt(stmt); + break; + case "match_stmt": + this.resolveMatchStmt(stmt); + break; + } + } + + private resolveAssignment(a: Assignment): void { + // Lambdas in non-argument positions are caught by resolveExpr's "lambda" + // case (spec Section 3.4). 
+ if (a.value.kind === "ident" && a.target.root === "var") { + const isFn = this.knownFunctions.has(a.value.name); + if (this.isKnownMap(a.value.name) || isFn) { + this.error(a.pos, `cannot store ${a.value.name} in a variable (it is not a value)`); + } + } + + this.resolveExpr(a.value); + + if (a.target.root === "var" && !this.scope.isDeclared(a.target.varName)) { + this.scope.vars.add(a.target.varName); + } + } + + private resolveIfStmt(s: IfStmt): void { + for (const branch of s.branches) { + this.resolveExpr(branch.cond); + this.withScope(() => { + for (const stmt of branch.body) this.resolveStmt(stmt); + }); + } + if (s.else_) { + this.withScope(() => { + for (const stmt of s.else_!) this.resolveStmt(stmt); + }); + } + } + + private resolveMatchStmt(s: MatchStmt): void { + if (s.subject) this.resolveExpr(s.subject); + for (const c of s.cases) { + this.withScope(() => { + if (s.binding) this.scope.params.add(s.binding); + if (c.pattern && !c.wildcard) this.resolveExpr(c.pattern); + for (const stmt of c.body) this.resolveStmt(stmt); + }); + } + } + + private resolveExprBody(body: ExprBody): void { + for (const va of body.assignments) { + // Lambdas in non-argument positions are caught by resolveExpr's + // "lambda" case (spec Section 3.4). 
+ this.resolveExpr(va.value); + if (!this.scope.isDeclared(va.name)) { + this.scope.vars.add(va.name); + } + } + this.resolveExpr(body.result); + } + + private resolveExpr(expr: Expr): void { + switch (expr.kind) { + case "literal": + break; + case "input": + case "input_meta": + if (this.inMap) this.error(expr.pos, "cannot access input inside a map body"); + break; + case "output": + case "output_meta": + if (this.inMap) this.error(expr.pos, "cannot access output inside a map body"); + break; + case "var": + if (!this.scope.isDeclared(expr.name)) { + this.error(expr.pos, "undeclared variable $" + expr.name); + } + break; + case "ident": + this.resolveIdent(expr); + break; + case "binary": + this.resolveExpr(expr.left); + this.resolveExpr(expr.right); + break; + case "unary": + this.resolveExpr(expr.operand); + break; + case "call": + this.resolveCall(expr); + break; + case "method_call": { + this.resolveExpr(expr.receiver); + this.checkMethodCallArity(expr); + const mi = this.methodInfo(expr.method); + this.applyArgFolder(mi?.argFolder, expr.args, expr.methodPos, `.${expr.method}()`); + this.resolveMethodArgs(expr.args, mi, expr.method); + break; + } + case "field_access": + this.resolveExpr(expr.receiver); + break; + case "index": + this.resolveExpr(expr.receiver); + this.resolveExpr(expr.index); + break; + case "array": + for (const elem of expr.elements) this.resolveExpr(elem); + break; + case "object": + for (const entry of expr.entries) { + this.resolveExpr(entry.key); + this.resolveExpr(entry.value); + } + break; + case "if_expr": + this.resolveIfExpr(expr); + break; + case "match_expr": + this.resolveMatchExpr(expr); + break; + case "lambda": + this.error(expr.pos, "lambda is only valid as a call argument (spec Section 3.4)"); + // Still resolve the body so downstream passes don't see unresolved + // parameter bindings; the emitted error will surface the problem. 
+ this.resolveLambda(expr); + break; + case "path": + this.resolvePath(expr); + break; + } + } + + private resolveIdent(e: IdentExpr): void { + if (e.namespace) { + if (!this.inMethodArg) { + this.error(e.pos, `${e.namespace}::${e.name} is not a valid expression (call it with parentheses or pass to a method)`); + } + this.resolveQualifiedIdent(e); + } else if (this.scope.isParam(e.name)) { + // Resolves to a parameter (map param, lambda param, match-as binding). + // Bare identifiers must NOT resolve to $variables (those require the $ + // prefix via VarExpr). + } else { + const isFn = this.knownFunctions.has(e.name); + if (this.isKnownMap(e.name) || isFn) { + if (!this.inMethodArg) { + this.error(e.pos, `${e.name} is not a valid expression (call it with parentheses or pass to a method)`); + } + } else { + this.error(e.pos, `undeclared identifier "${e.name}"`); + } + } + } + + private resolveQualifiedIdent(e: IdentExpr): void { + const maps = this.prog.namespaces.get(e.namespace); + if (!maps) { + this.error(e.pos, `unknown namespace "${e.namespace}"`); + return; + } + if (!maps.some((m) => m.name === e.name)) { + this.error(e.pos, `nonexistent map ${e.namespace}::${e.name}`); + } + } + + private resolveCall(e: CallExpr): void { + // Validate named arg consistency. 
+ if (e.named && e.args.length > 0) { + const seen = new Set(); + for (const arg of e.args) { + if (!arg.name) { + this.error(e.pos, "cannot mix positional and named arguments"); + break; + } + if (seen.has(arg.name)) { + this.error(e.pos, `duplicate named argument "${arg.name}"`); + } + seen.add(arg.name); + } + } + + if (!e.namespace) { + const m = this.findMap(e.name); + if (m) { + this.checkMapArity(e, m); + } else if (this.knownFunctions.has(e.name)) { + const fi = this.knownFunctions.get(e.name)!; + this.checkFunctionArity(e, fi); + this.applyArgFolder(fi.argFolder, e.args, e.pos, `${e.name}()`); + } else { + this.error(e.pos, `unknown function or map "${e.name}"`); + } + + if (e.name === "throw" && e.args.length === 1) { + const arg = e.args[0]!.value; + if (arg.kind === "literal" && arg.tokenType !== TokenType.STRING && arg.tokenType !== TokenType.RAW_STRING) { + this.error(e.pos, "throw() requires a string argument"); + } + } + } + + if (e.namespace) { + const maps = this.prog.namespaces.get(e.namespace); + if (!maps) { + this.error(e.pos, `unknown namespace "${e.namespace}"`); + } else if (!maps.some((m) => m.name === e.name)) { + this.error(e.pos, `nonexistent map ${e.namespace}::${e.name}()`); + } + } + + // No function or user map accepts a lambda argument. 
+ for (const arg of e.args) this.resolveArgValue(arg.value, false, e.name); + } + + private resolveArgValue(value: Expr, acceptsLambda: boolean, calleeName: string): void { + if (value.kind === "lambda") { + if (!acceptsLambda) { + this.error(value.pos, `${calleeName}() does not accept a lambda argument`); + } + this.resolveLambda(value); + return; + } + this.resolveExpr(value); + } + + private resolveMethodArgs( + args: CallArg[], + mi: MethodInfo | null, + calleeName: string, + ): void { + const saved = this.inMethodArg; + this.inMethodArg = true; + for (let i = 0; i < args.length; i++) { + const arg = args[i]!; + if (arg.value.kind === "ident") { + const ident = arg.value; + if (ident.namespace) { + const m = this.findNamespacedMap(ident.namespace, ident.name); + if (m) this.checkMapRefArity(ident.pos, `${ident.namespace}::${ident.name}`, m); + } else { + const m = this.findMap(ident.name); + if (m) this.checkMapRefArity(ident.pos, ident.name, m); + } + } + const acceptsLambda = mi === null ? true : paramAcceptsLambda(mi, i, arg.name); + this.resolveArgValue(arg.value, acceptsLambda, calleeName); + } + this.inMethodArg = saved; + } + + private resolvePath(expr: { + kind: "path"; + pos: Pos; + root: string; + varName: string; + segments: PathSegment[]; + }): void { + if (this.inMap) { + if (expr.root === "input" || expr.root === "input_meta") { + this.error(expr.pos, "cannot access input inside a map body"); + } + if (expr.root === "output" || expr.root === "output_meta") { + this.error(expr.pos, "cannot access output inside a map body"); + } + } + if (expr.root === "var" && !this.scope.isDeclared(expr.varName)) { + this.error(expr.pos, "undeclared variable $" + expr.varName); + } + for (const seg of expr.segments) { + if (seg.index) this.resolveExpr(seg.index); + if (seg.args.length > 0) { + const mi = seg.segKind === "method" ? 
this.methodInfo(seg.name) : null; + if (mi?.argFolder) { + this.applyArgFolder(mi.argFolder, seg.args, seg.pos, `.${seg.name}()`); + } + this.resolveMethodArgs(seg.args, mi, seg.name); + } + } + } + + /** + * applyArgFolder runs folder against args and, on success, attaches + * non-null folded values to the matching CallArg.folded field. A + * folder throwing an error is recorded as a resolver diagnostic at + * pos. Silently tolerates folder-returned arrays of the wrong length + * (contract violation we don't want to block compilation for). + */ + private applyArgFolder(folder: ArgFolder | undefined, args: CallArg[], pos: Pos, calleeLabel: string): void { + if (!folder || args.length === 0) return; + let folded: Array; + try { + folded = folder(args); + } catch (e) { + this.error(pos, `${calleeLabel}: ${(e as Error).message}`); + return; + } + if (folded.length !== args.length) return; + for (let i = 0; i < args.length; i++) { + if (folded[i] !== null && folded[i] !== undefined) { + args[i]!.folded = folded[i]; + } + } + } + + private resolveIfExpr(e: IfExpr): void { + for (const branch of e.branches) { + this.resolveExpr(branch.cond); + this.withScope(() => this.resolveExprBody(branch.body)); + } + if (e.else_) { + this.withScope(() => this.resolveExprBody(e.else_!)); + } + } + + private resolveMatchExpr(e: MatchExpr): void { + if (e.subject) this.resolveExpr(e.subject); + const isEqualityMatch = e.subject !== null && !e.binding; + + for (const c of e.cases) { + if (c.pattern && !c.wildcard) { + if (isEqualityMatch && c.pattern.kind === "literal") { + if (c.pattern.tokenType === TokenType.TRUE || c.pattern.tokenType === TokenType.FALSE) { + this.error(c.pattern.pos, "boolean literal as case value in equality match (use 'as' for boolean conditions)"); + } + } + this.withScope(() => { + if (e.binding) this.scope.params.add(e.binding); + this.resolveExpr(c.pattern!); + }); + } + this.withScope(() => { + if (e.binding) this.scope.params.add(e.binding); + if ("kind" in 
c.body) { + this.resolveExpr(c.body); + } else { + this.resolveExprBody(c.body); + } + }); + } + } + + private resolveLambda(e: { params: Param[]; body: ExprBody; pos: Pos }): void { + this.validateParams(e.params); + this.withScope(() => { + for (const p of e.params) { + if (!p.discard) this.scope.params.add(p.name); + } + this.resolveExprBody(e.body); + }); + } + + private withScope(fn: () => void): void { + const saved = this.scope; + this.scope = new ResolveScope(this.scope); + fn(); + this.scope = saved; + } + + private findMap(name: string): MapDecl | null { + return this.mapIndex.get(name) ?? null; + } + + private findNamespacedMap(namespace: string, name: string): MapDecl | null { + const maps = this.prog.namespaces.get(namespace); + return maps?.find((m) => m.name === name) ?? null; + } + + private isKnownMap(name: string): boolean { + return this.mapIndex.has(name); + } + + private checkMapArity(e: CallExpr, m: MapDecl): void { + let required = 0; + let total = 0; + let hasDiscard = false; + for (const p of m.params) { + total++; + if (p.discard) { + hasDiscard = true; + required++; + } else if (!p.default_) { + required++; + } + } + + if (e.named && hasDiscard) { + this.error(e.pos, "cannot use named arguments with discard parameters"); + return; + } + + if (e.named) { + const paramNames = new Set(m.params.filter((p) => !p.discard).map((p) => p.name)); + for (const arg of e.args) { + if (!paramNames.has(arg.name)) { + this.error(e.pos, `unknown named argument "${arg.name}"`); + } + } + const provided = new Set(e.args.map((a) => a.name)); + for (const p of m.params) { + if (!p.discard && !provided.has(p.name) && !p.default_) { + this.error(e.pos, `arity mismatch: missing required named argument "${p.name}"`); + } + } + } else { + if (e.args.length < required) { + this.error(e.pos, `arity mismatch: ${e.name}() requires at least ${required} arguments, got ${e.args.length}`); + } + if (e.args.length > total) { + this.error(e.pos, `arity mismatch: ${e.name}() 
accepts at most ${total} arguments, got ${e.args.length}`); + } + } + } + + private checkFunctionArity(e: CallExpr, fi: FunctionInfo): void { + if (fi.total < 0) return; + if (e.args.length < fi.required) { + this.error(e.pos, `${e.name}() requires at least ${fi.required} arguments, got ${e.args.length}`); + } + if (e.args.length > fi.total) { + this.error(e.pos, `${e.name}() accepts at most ${fi.total} arguments, got ${e.args.length}`); + } + } + + private checkMethodCallArity(e: { + method: string; + methodPos: Pos; + args: { name: string; value: Expr }[]; + }): void { + const info = this.methodInfo(e.method); + if (!info || info.total < 0) return; + if (e.args.length < info.required) { + this.error( + e.methodPos, + `.${e.method}() requires at least ${info.required} arguments, got ${e.args.length}`, + ); + } + if (e.args.length > info.total) { + this.error( + e.methodPos, + `.${e.method}() accepts at most ${info.total} arguments, got ${e.args.length}`, + ); + } + } + + private checkMapRefArity(pos: Pos, displayName: string, m: MapDecl): void { + let required = 0; + for (const p of m.params) { + if (!p.default_ && !p.discard) required++; + } + if (required !== 1) { + this.error(pos, `arity mismatch: ${displayName}() requires ${required} arguments, but higher-order methods pass 1`); + } + } +} diff --git a/internal/bloblang2/ts/src/scanner.ts b/internal/bloblang2/ts/src/scanner.ts new file mode 100644 index 000000000..030f4d964 --- /dev/null +++ b/internal/bloblang2/ts/src/scanner.ts @@ -0,0 +1,597 @@ +// Tokenizer for Bloblang V2 source code. 
+ +import { + type Token, + type Pos, + type PosError, + TokenType, + lookupIdent, + isReservedName, + suppressesFollowingNL, + isPostfixContinuation, +} from "./token.js"; + +export interface ScannerState { + pos: number; + line: number; + col: number; + prevTok: TokenType; + parenDepth: number; + bracketDepth: number; + peeked: Token | null; + errorLen: number; +} + +export class Scanner { + private src: string; + private file: string; + + private pos = 0; + private line = 1; + private col = 1; + private prevTok: TokenType = TokenType.NL; // suppress leading newlines + + private parenDepth = 0; + private bracketDepth = 0; + + private peeked: Token | null = null; + + errors: PosError[] = []; + + constructor(src: string, file: string) { + this.src = src; + this.file = file; + } + + /** Save scanner state for lookahead/backtracking. */ + saveState(): ScannerState { + return { + pos: this.pos, + line: this.line, + col: this.col, + prevTok: this.prevTok, + parenDepth: this.parenDepth, + bracketDepth: this.bracketDepth, + peeked: this.peeked, + errorLen: this.errors.length, + }; + } + + /** Restore scanner state from a previous save. */ + restoreState(state: ScannerState): void { + this.pos = state.pos; + this.line = state.line; + this.col = state.col; + this.prevTok = state.prevTok; + this.parenDepth = state.parenDepth; + this.bracketDepth = state.bracketDepth; + this.peeked = state.peeked; + this.errors.length = state.errorLen; + } + + /** Returns the next token. Returns EOF repeatedly after input is exhausted. 
*/ + next(): Token { + if (this.peeked !== null) { + const tok = this.peeked; + this.peeked = null; + this.trackToken(tok); + return tok; + } + return this.scan(); + } + + private trackToken(tok: Token): void { + if (tok.type !== TokenType.NL) { + this.prevTok = tok.type; + } + switch (tok.type) { + case TokenType.LPAREN: + this.parenDepth++; + break; + case TokenType.RPAREN: + if (this.parenDepth > 0) this.parenDepth--; + break; + case TokenType.LBRACKET: + case TokenType.QLBRACKET: + this.bracketDepth++; + break; + case TokenType.RBRACKET: + if (this.bracketDepth > 0) this.bracketDepth--; + break; + } + } + + /** Produces the next token with newline suppression applied. */ + private scan(): Token { + for (;;) { + const tok = this.scanRaw(); + if (tok.type !== TokenType.NL) { + this.trackToken(tok); + return tok; + } + + // Mechanism 1: inside () or []. + if (this.parenDepth > 0 || this.bracketDepth > 0) continue; + + // Mechanism 3: previous token suppresses NL. + if (suppressesFollowingNL(this.prevTok)) continue; + + // Mechanism 2: next token is postfix continuation. + const nextTok = this.peekNextNonNL(); + if (isPostfixContinuation(nextTok.type)) continue; + + // Collapse consecutive NLs. + if (this.prevTok === TokenType.NL) continue; + + this.prevTok = TokenType.NL; + return tok; + } + } + + /** Peek ahead past NL tokens to find the next substantive token. */ + private peekNextNonNL(): Token { + const saved = this.saveState(); + for (;;) { + const tok = this.scanRaw(); + if (tok.type !== TokenType.NL) { + this.restoreState(saved); + return tok; + } + } + } + + /** Produces the next raw token without newline suppression. */ + private scanRaw(): Token { + this.skipWhitespaceAndComments(); + + if (this.pos >= this.src.length) { + return this.makeToken(TokenType.EOF, ""); + } + + const ch = this.src[this.pos]!; + + // Newlines. 
+ if (ch === "\n") { + const tok = this.makeToken(TokenType.NL, "\n"); + this.advance(); + return tok; + } + if (ch === "\r") { + const tok = this.makeToken(TokenType.NL, "\n"); + this.advance(); + if (this.pos < this.src.length && this.src[this.pos] === "\n") { + this.advance(); + } + return tok; + } + + // String literals. + if (ch === '"') return this.scanString(); + if (ch === "`") return this.scanRawString(); + + // Numbers. + if (isDigit(ch)) return this.scanNumber(); + + // Variable $name. + if (ch === "$") return this.scanVar(); + + // Identifiers and keywords. + if (isIdentStart(ch)) return this.scanWord(); + + // Operators and delimiters. + return this.scanOperator(); + } + + private scanString(): Token { + const startPos = this.currentPos(); + this.advance(); // skip opening " + + let s = ""; + while (this.pos < this.src.length) { + const ch = this.src[this.pos]!; + if (ch === '"') { + this.advance(); // skip closing " + return { type: TokenType.STRING, literal: s, pos: startPos }; + } + if (ch === "\n" || ch === "\r") { + this.addError(this.currentPos(), "unterminated string literal"); + return { type: TokenType.ILLEGAL, literal: s, pos: startPos }; + } + if (ch === "\\") { + this.advance(); + const escaped = this.scanEscapeSeq(); + if (escaped === null) { + return { type: TokenType.ILLEGAL, literal: s, pos: startPos }; + } + s += escaped; + continue; + } + // Regular character — read full codepoint. + const cp = this.src.codePointAt(this.pos)!; + s += String.fromCodePoint(cp); + this.advanceN(cp > 0xffff ? 
2 : 1); + } + this.addError(startPos, "unterminated string literal"); + return { type: TokenType.ILLEGAL, literal: s, pos: startPos }; + } + + private scanEscapeSeq(): string | null { + if (this.pos >= this.src.length) { + this.addError(this.currentPos(), "unterminated escape sequence"); + return null; + } + const ch = this.src[this.pos]!; + const chPos = this.currentPos(); + this.advance(); + switch (ch) { + case '"': + return '"'; + case "\\": + return "\\"; + case "n": + return "\n"; + case "t": + return "\t"; + case "r": + return "\r"; + case "u": + return this.scanUnicodeEscape(); + default: + this.addError(chPos, `invalid escape character '${ch}'`); + return null; + } + } + + private scanUnicodeEscape(): string | null { + if (this.pos >= this.src.length) { + this.addError(this.currentPos(), "unterminated unicode escape"); + return null; + } + + // \u{X...} form: 1-6 hex digits. + if (this.src[this.pos] === "{") { + this.advance(); // skip { + const start = this.pos; + while (this.pos < this.src.length && isHexDigit(this.src[this.pos]!)) { + this.advance(); + } + const hexStr = this.src.slice(start, this.pos); + if (hexStr.length === 0 || hexStr.length > 6) { + this.addError(this.currentPos(), "\\u{} requires 1-6 hex digits"); + return null; + } + if (this.pos >= this.src.length || this.src[this.pos] !== "}") { + this.addError(this.currentPos(), "unterminated \\u{} escape"); + return null; + } + this.advance(); // skip } + const codepoint = parseInt(hexStr, 16); + if (codepoint > 0x10ffff) { + this.addError( + this.currentPos(), + `unicode codepoint U+${codepoint.toString(16).toUpperCase()} out of range`, + ); + return null; + } + if (codepoint >= 0xd800 && codepoint <= 0xdfff) { + this.addError( + this.currentPos(), + `surrogate codepoint U+${codepoint.toString(16).toUpperCase()} is invalid`, + ); + return null; + } + return String.fromCodePoint(codepoint); + } + + // \uXXXX form: exactly 4 hex digits. 
+ if (this.pos + 4 > this.src.length) { + this.addError(this.currentPos(), "\\uXXXX requires exactly 4 hex digits"); + return null; + } + const hexStr = this.src.slice(this.pos, this.pos + 4); + for (const c of hexStr) { + if (!isHexDigit(c)) { + this.addError(this.currentPos(), `invalid hex digit '${c}' in \\uXXXX`); + return null; + } + } + this.advanceN(4); + const codepoint = parseInt(hexStr, 16); + if (codepoint >= 0xd800 && codepoint <= 0xdfff) { + this.addError( + this.currentPos(), + `surrogate codepoint U+${codepoint.toString(16).toUpperCase().padStart(4, "0")} is invalid`, + ); + return null; + } + return String.fromCodePoint(codepoint); + } + + private scanRawString(): Token { + const startPos = this.currentPos(); + this.advance(); // skip opening ` + + const start = this.pos; + while (this.pos < this.src.length) { + if (this.src[this.pos] === "`") { + const lit = this.src.slice(start, this.pos); + this.advance(); // skip closing ` + return { type: TokenType.RAW_STRING, literal: lit, pos: startPos }; + } + this.advance(); // handles newline tracking + } + this.addError(startPos, "unterminated raw string literal"); + return { + type: TokenType.ILLEGAL, + literal: this.src.slice(start), + pos: startPos, + }; + } + + private scanNumber(): Token { + const startPos = this.currentPos(); + const start = this.pos; + + while (this.pos < this.src.length && isDigit(this.src[this.pos]!)) { + this.advance(); + } + + // Check for float: digits.digits + if (this.pos < this.src.length && this.src[this.pos] === ".") { + if ( + this.pos + 1 < this.src.length && + isDigit(this.src[this.pos + 1]!) + ) { + this.advance(); // skip . + while (this.pos < this.src.length && isDigit(this.src[this.pos]!)) { + this.advance(); + } + return { + type: TokenType.FLOAT, + literal: this.src.slice(start, this.pos), + pos: startPos, + }; + } + } + + // Integer — validate range. 
+ const lit = this.src.slice(start, this.pos); + try { + const n = BigInt(lit); + if (n > 9223372036854775807n || n < -9223372036854775808n) { + this.addError(startPos, `integer literal ${lit} exceeds int64 range`); + return { type: TokenType.ILLEGAL, literal: lit, pos: startPos }; + } + } catch { + this.addError(startPos, `invalid integer literal ${lit}`); + return { type: TokenType.ILLEGAL, literal: lit, pos: startPos }; + } + return { type: TokenType.INT, literal: lit, pos: startPos }; + } + + private scanVar(): Token { + const startPos = this.currentPos(); + this.advance(); // skip $ + + if (this.pos >= this.src.length || !isIdentStart(this.src[this.pos]!)) { + this.addError(startPos, "expected identifier after $"); + return { type: TokenType.ILLEGAL, literal: "$", pos: startPos }; + } + + const start = this.pos; + while (this.pos < this.src.length && isIdentContinue(this.src[this.pos]!)) { + this.advance(); + } + + const name = this.src.slice(start, this.pos); + if (isReservedName(name)) { + this.addError( + startPos, + `"${name}" is a reserved function name and cannot be used as a variable name`, + ); + } + return { + type: TokenType.VAR, + literal: name, + pos: startPos, + }; + } + + private scanWord(): Token { + const startPos = this.currentPos(); + const start = this.pos; + while (this.pos < this.src.length && isIdentContinue(this.src[this.pos]!)) { + this.advance(); + } + const word = this.src.slice(start, this.pos); + return { type: lookupIdent(word), literal: word, pos: startPos }; + } + + private scanOperator(): Token { + const startPos = this.currentPos(); + const ch = this.src[this.pos]!; + this.advance(); + + switch (ch) { + case ".": + return { type: TokenType.DOT, literal: ".", pos: startPos }; + case "@": + return { type: TokenType.AT, literal: "@", pos: startPos }; + case "(": + return { type: TokenType.LPAREN, literal: "(", pos: startPos }; + case ")": + return { type: TokenType.RPAREN, literal: ")", pos: startPos }; + case "{": + return { 
type: TokenType.LBRACE, literal: "{", pos: startPos }; + case "}": + return { type: TokenType.RBRACE, literal: "}", pos: startPos }; + case "[": + return { type: TokenType.LBRACKET, literal: "[", pos: startPos }; + case "]": + return { type: TokenType.RBRACKET, literal: "]", pos: startPos }; + case ",": + return { type: TokenType.COMMA, literal: ",", pos: startPos }; + case "+": + return { type: TokenType.PLUS, literal: "+", pos: startPos }; + case "*": + return { type: TokenType.STAR, literal: "*", pos: startPos }; + case "/": + return { type: TokenType.SLASH, literal: "/", pos: startPos }; + case "%": + return { type: TokenType.PERCENT, literal: "%", pos: startPos }; + + case "?": + if (this.pos < this.src.length) { + if (this.src[this.pos] === ".") { + this.advance(); + return { type: TokenType.QDOT, literal: "?.", pos: startPos }; + } + if (this.src[this.pos] === "[") { + this.advance(); + return { type: TokenType.QLBRACKET, literal: "?[", pos: startPos }; + } + } + this.addError(startPos, "unexpected character '?'"); + return { type: TokenType.ILLEGAL, literal: "?", pos: startPos }; + + case ":": + if (this.pos < this.src.length && this.src[this.pos] === ":") { + this.advance(); + return { type: TokenType.DCOLON, literal: "::", pos: startPos }; + } + return { type: TokenType.COLON, literal: ":", pos: startPos }; + + case "=": + if (this.pos < this.src.length) { + if (this.src[this.pos] === "=") { + this.advance(); + return { type: TokenType.EQ, literal: "==", pos: startPos }; + } + if (this.src[this.pos] === ">") { + this.advance(); + return { type: TokenType.FATARROW, literal: "=>", pos: startPos }; + } + } + return { type: TokenType.ASSIGN, literal: "=", pos: startPos }; + + case "!": + if (this.pos < this.src.length && this.src[this.pos] === "=") { + this.advance(); + return { type: TokenType.NE, literal: "!=", pos: startPos }; + } + return { type: TokenType.BANG, literal: "!", pos: startPos }; + + case ">": + if (this.pos < this.src.length && 
this.src[this.pos] === "=") { + this.advance(); + return { type: TokenType.GE, literal: ">=", pos: startPos }; + } + return { type: TokenType.GT, literal: ">", pos: startPos }; + + case "<": + if (this.pos < this.src.length && this.src[this.pos] === "=") { + this.advance(); + return { type: TokenType.LE, literal: "<=", pos: startPos }; + } + return { type: TokenType.LT, literal: "<", pos: startPos }; + + case "&": + if (this.pos < this.src.length && this.src[this.pos] === "&") { + this.advance(); + return { type: TokenType.AND, literal: "&&", pos: startPos }; + } + this.addError(startPos, "unexpected character '&', did you mean '&&'?"); + return { type: TokenType.ILLEGAL, literal: "&", pos: startPos }; + + case "|": + if (this.pos < this.src.length && this.src[this.pos] === "|") { + this.advance(); + return { type: TokenType.OR, literal: "||", pos: startPos }; + } + this.addError(startPos, "unexpected character '|', did you mean '||'?"); + return { type: TokenType.ILLEGAL, literal: "|", pos: startPos }; + + case "-": + if (this.pos < this.src.length && this.src[this.pos] === ">") { + this.advance(); + return { type: TokenType.THINARROW, literal: "->", pos: startPos }; + } + return { type: TokenType.MINUS, literal: "-", pos: startPos }; + } + + this.addError(startPos, `unexpected character '${ch}'`); + return { type: TokenType.ILLEGAL, literal: ch, pos: startPos }; + } + + private skipWhitespaceAndComments(): void { + while (this.pos < this.src.length) { + const ch = this.src[this.pos]!; + if (ch === " " || ch === "\t") { + this.advance(); + continue; + } + if (ch === "#") { + // Comment: skip to end of line (don't consume newline). 
+ while ( + this.pos < this.src.length && + this.src[this.pos] !== "\n" && + this.src[this.pos] !== "\r" + ) { + this.advance(); + } + continue; + } + break; + } + } + + private currentPos(): Pos { + return { file: this.file, line: this.line, column: this.col }; + } + + private makeToken(type: TokenType, literal: string): Token { + return { type, literal, pos: this.currentPos() }; + } + + private advance(): void { + if (this.pos < this.src.length) { + if (this.src[this.pos] === "\n") { + this.line++; + this.col = 1; + } else { + this.col++; + } + this.pos++; + } + } + + private advanceN(n: number): void { + for (let i = 0; i < n; i++) { + this.advance(); + } + } + + private addError(pos: Pos, msg: string): void { + this.errors.push({ pos, msg }); + } +} + +function isDigit(ch: string): boolean { + return ch >= "0" && ch <= "9"; +} + +function isHexDigit(ch: string): boolean { + return ( + (ch >= "0" && ch <= "9") || + (ch >= "a" && ch <= "f") || + (ch >= "A" && ch <= "F") + ); +} + +function isIdentStart(ch: string): boolean { + return (ch >= "a" && ch <= "z") || (ch >= "A" && ch <= "Z") || ch === "_"; +} + +function isIdentContinue(ch: string): boolean { + return isIdentStart(ch) || isDigit(ch); +} diff --git a/internal/bloblang2/ts/src/scope.ts b/internal/bloblang2/ts/src/scope.ts new file mode 100644 index 000000000..c982050dd --- /dev/null +++ b/internal/bloblang2/ts/src/scope.ts @@ -0,0 +1,49 @@ +// Scope for the Bloblang V2 interpreter. +// +// Two modes: +// - "statement": assigning to an existing outer variable modifies it; +// new variables are block-scoped. +// - "expression": assigning always shadows (local); used for lambdas and maps. 
+ +import type { Value } from "./value.js"; + +export type ScopeMode = "statement" | "expression"; + +export class Scope { + parent: Scope | null; + mode: ScopeMode; + vars: Map<string, Value>; + + constructor(parent: Scope | null, mode: ScopeMode) { + this.parent = parent; + this.mode = mode; + this.vars = new Map(); + } + + /** Look up a variable by walking the scope chain. */ + get(name: string): Value | undefined { + for (let cur: Scope | null = this; cur !== null; cur = cur.parent) { + const v = cur.vars.get(name); + if (v !== undefined) return v; + } + return undefined; + } + + /** + * Assign a variable, respecting the scope mode: + * - Expression mode: always writes locally (shadow). + * - Statement mode: if variable exists in an ancestor, update the ancestor. + * Otherwise, create locally. + */ + set(name: string, value: Value): void { + if (this.mode === "statement") { + for (let cur = this.parent; cur !== null; cur = cur.parent) { + if (cur.vars.has(name)) { + cur.vars.set(name, value); + return; + } + } + } + this.vars.set(name, value); + } +} diff --git a/internal/bloblang2/ts/src/stdlib/array_methods.ts b/internal/bloblang2/ts/src/stdlib/array_methods.ts new file mode 100644 index 000000000..a79dd9636 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/array_methods.ts @@ -0,0 +1,423 @@ +// Array/sequence methods: length, append, concat, flatten, reverse, sort, +// unique, contains, enumerate, sum, min, max, join, collect, values, +// iter (object→entries array). 
+ +import type { Interpreter, MethodSpec } from "../interpreter.js"; +import { TokenType } from "../token.js"; +import { evalBinaryOp } from "../arithmetic.js"; +import { + type Value, + mkInt64, + mkBool, + mkString, + mkArray, + mkObject, + mkError, + isString, + isInt64, + isInt32, + isUint32, + isUint64, + isFloat32, + isFloat64, + isArray, + isObject, + isBytes, + isTimestamp, + isNumeric, + isError as isErrorV, + typeName, + valuesEqual, + promoteChecked, +} from "../value.js"; + +// --------------------------------------------------------------------------- +// Sort helpers +// --------------------------------------------------------------------------- + +function isNaNValue(v: Value): boolean { + return (isFloat64(v) && Number.isNaN(v.value)) || (isFloat32(v) && Number.isNaN(v.value)); +} + +function isSortable(v: Value): boolean { + return ( + isInt32(v) || isInt64(v) || isUint32(v) || isUint64(v) || + isFloat32(v) || isFloat64(v) || isString(v) || isTimestamp(v) + ); +} + +function compareForSort(a: Value, b: Value): Value { + const aNaN = isNaNValue(a); + const bNaN = isNaNValue(b); + if (aNaN && bNaN) return mkInt64(0n); + if (aNaN) return mkInt64(1n); + if (bNaN) return mkInt64(-1n); + + // Numeric comparison. + if (isNumeric(a) && isNumeric(b)) { + const result = promoteChecked(a, b); + if (result === null) { + return mkError(`cannot sort: promotion failed for ${a.tag} and ${b.tag}`); + } + const [pa, pb, kind] = result; + switch (kind) { + case "int64": { + const av = (pa as { value: bigint }).value; + const bv = (pb as { value: bigint }).value; + return mkInt64(av < bv ? -1n : av > bv ? 1n : 0n); + } + case "int32": { + const av = (pa as { value: number }).value; + const bv = (pb as { value: number }).value; + return mkInt64(av < bv ? -1n : av > bv ? 1n : 0n); + } + case "uint32": { + const av = (pa as { value: number }).value; + const bv = (pb as { value: number }).value; + return mkInt64(av < bv ? -1n : av > bv ? 
1n : 0n); + } + case "uint64": { + const av = (pa as { value: bigint }).value; + const bv = (pb as { value: bigint }).value; + return mkInt64(av < bv ? -1n : av > bv ? 1n : 0n); + } + case "float64": + case "float32": { + const av = (pa as { value: number }).value; + const bv = (pb as { value: number }).value; + return mkInt64(av < bv ? -1n : av > bv ? 1n : 0n); + } + } + return mkInt64(0n); + } + + // String comparison. + if (isString(a) && isString(b)) { + return mkInt64( + a.value < b.value ? -1n : a.value > b.value ? 1n : 0n, + ); + } + + // Timestamp comparison. + if (isTimestamp(a) && isTimestamp(b)) { + return mkInt64( + a.value < b.value ? -1n : a.value > b.value ? 1n : 0n, + ); + } + + return mkError(`cannot sort: incompatible types ${a.tag} and ${b.tag}`); +} + +// --------------------------------------------------------------------------- +// Registration +// --------------------------------------------------------------------------- + +export function registerArrayMethods(interp: Interpreter): void { + const m = ( + fn: (interp: Interpreter, receiver: Value, args: Value[]) => Value, + ): MethodSpec => ({ + fn, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + }); + + // --- length --- + interp.registerMethod( + "length", + m((_i, recv) => { + if (isString(recv)) { + // Count codepoints, not UTF-16 code units. 
+ return mkInt64(BigInt([...recv.value].length)); + } + if (isArray(recv)) return mkInt64(BigInt(recv.value.length)); + if (isBytes(recv)) return mkInt64(BigInt(recv.value.length)); + if (isObject(recv)) return mkInt64(BigInt(recv.value.size)); + return mkError(`length() not supported on ${typeName(recv)}`); + }), + ); + + // --- contains (string + array + bytes) --- + interp.registerMethod( + "contains", + m((_i, recv, args) => { + if (args.length !== 1) { + return mkError("contains() requires exactly one argument"); + } + if (isString(recv)) { + const target = args[0]!; + if (!isString(target)) { + return mkError("string contains() requires string argument"); + } + return mkBool(recv.value.includes(target.value)); + } + if (isArray(recv)) { + for (const elem of recv.value) { + if (valuesEqual(elem, args[0]!)) return mkBool(true); + } + return mkBool(false); + } + if (isBytes(recv)) { + const target = args[0]!; + if (!isBytes(target)) { + return mkError("bytes contains() requires bytes argument"); + } + // Search for subsequence. 
+ const haystack = recv.value; + const needle = target.value; + outer: for (let i = 0; i <= haystack.length - needle.length; i++) { + for (let j = 0; j < needle.length; j++) { + if (haystack[i + j] !== needle[j]) continue outer; + } + return mkBool(true); + } + return mkBool(false); + } + return mkError(`contains() not supported on ${typeName(recv)}`); + }), + ); + + // --- reverse --- + interp.registerMethod( + "reverse", + m((_i, recv) => { + if (isString(recv)) { + return mkString([...recv.value].reverse().join("")); + } + if (isArray(recv)) { + return mkArray([...recv.value].reverse()); + } + if (isBytes(recv)) { + const result = new Uint8Array(recv.value.length); + for (let i = 0, j = recv.value.length - 1; j >= 0; i++, j--) { + result[i] = recv.value[j]!; + } + return { tag: "bytes", value: result }; + } + return mkError(`reverse() not supported on ${typeName(recv)}`); + }), + ); + + // --- append --- + interp.registerMethod( + "append", + m((_i, recv, args) => { + if (!isArray(recv)) { + return mkError(`append() requires array, got ${typeName(recv)}`); + } + if (args.length !== 1) return mkError("append() requires one argument"); + return mkArray([...recv.value, args[0]!]); + }), + ); + + // --- concat --- + interp.registerMethod( + "concat", + m((_i, recv, args) => { + if (!isArray(recv)) { + return mkError(`concat() requires array, got ${typeName(recv)}`); + } + if (args.length !== 1) return mkError("concat() requires one argument"); + const other = args[0]!; + if (!isArray(other)) return mkError("concat() argument must be array"); + return mkArray([...recv.value, ...other.value]); + }), + ); + + // --- flatten --- + interp.registerMethod( + "flatten", + m((_i, recv) => { + if (!isArray(recv)) { + return mkError(`flatten() requires array, got ${typeName(recv)}`); + } + const result: Value[] = []; + for (const elem of recv.value) { + if (isArray(elem)) { + result.push(...elem.value); + } else { + result.push(elem); + } + } + return mkArray(result); + }), + 
); + + // --- enumerate --- + interp.registerMethod( + "enumerate", + m((_i, recv) => { + if (!isArray(recv)) { + return mkError(`enumerate() requires array, got ${typeName(recv)}`); + } + return mkArray( + recv.value.map((v, i) => + mkObject( + new Map([ + ["index", mkInt64(BigInt(i))], + ["value", v], + ]), + ), + ), + ); + }), + ); + + // --- join --- + interp.registerMethod( + "join", + m((_i, recv, args) => { + if (!isArray(recv)) { + return mkError(`join() requires array, got ${typeName(recv)}`); + } + if (args.length !== 1) return mkError("join() requires one argument"); + const delim = args[0]!; + if (!isString(delim)) return mkError("join() delimiter must be string"); + const parts: string[] = []; + for (let i = 0; i < recv.value.length; i++) { + const elem = recv.value[i]!; + if (!isString(elem)) { + return mkError( + `join() requires all elements to be strings, element ${i} is ${typeName(elem)}`, + ); + } + parts.push(elem.value); + } + return mkString(parts.join(delim.value)); + }), + ); + + // --- sum --- + interp.registerMethod( + "sum", + m((_i, recv) => { + if (!isArray(recv)) { + return mkError(`sum() requires array, got ${typeName(recv)}`); + } + if (recv.value.length === 0) return mkInt64(0n); + let result = recv.value[0]!; + if (!isNumeric(result)) { + return mkError( + `sum() requires numeric elements, got ${typeName(result)}`, + ); + } + for (let i = 1; i < recv.value.length; i++) { + result = evalBinaryOp(TokenType.PLUS, result, recv.value[i]!); + if (isErrorV(result)) return result; + } + return result; + }), + ); + + // --- min --- + interp.registerMethod( + "min", + m((_i, recv) => { + if (!isArray(recv)) { + return mkError(`min() requires array, got ${typeName(recv)}`); + } + if (recv.value.length === 0) { + return mkError("min() requires non-empty array"); + } + let result = recv.value[0]!; + let widest = recv.value[0]!; + for (let i = 1; i < recv.value.length; i++) { + const elem = recv.value[i]!; + const cmp = compareForSort(result, 
elem); + if (isErrorV(cmp)) return cmp; + if ((cmp as { value: bigint }).value > 0n) { + result = elem; + } + const promoted = promoteChecked(widest, elem); + if (promoted !== null) { + widest = promoted[0]; + } + } + const finalP = promoteChecked(result, widest); + if (finalP !== null) result = finalP[0]; + return result; + }), + ); + + // --- max --- + interp.registerMethod( + "max", + m((_i, recv) => { + if (!isArray(recv)) { + return mkError(`max() requires array, got ${typeName(recv)}`); + } + if (recv.value.length === 0) { + return mkError("max() requires non-empty array"); + } + let result = recv.value[0]!; + let widest = recv.value[0]!; + for (let i = 1; i < recv.value.length; i++) { + const elem = recv.value[i]!; + const cmp = compareForSort(result, elem); + if (isErrorV(cmp)) return cmp; + if ((cmp as { value: bigint }).value < 0n) { + result = elem; + } + const promoted = promoteChecked(widest, elem); + if (promoted !== null) { + widest = promoted[0]; + } + } + const finalP = promoteChecked(result, widest); + if (finalP !== null) result = finalP[0]; + return result; + }), + ); + + // --- collect --- + interp.registerMethod( + "collect", + m((_i, recv) => { + if (!isArray(recv)) { + return mkError(`collect() requires array, got ${typeName(recv)}`); + } + const result = new Map(); + for (const elem of recv.value) { + if (!isObject(elem)) { + return mkError("collect() requires array of {key, value} objects"); + } + const key = elem.value.get("key"); + if (key === undefined || !isString(key)) { + return mkError("collect() entry missing string 'key' field"); + } + const val = elem.value.get("value"); + if (val === undefined) { + return mkError("collect() entry missing 'value' field"); + } + result.set(key.value, val); + } + return mkObject(result); + }), + ); + + // --- iter (object → array of {key, value}) --- + interp.registerMethod( + "iter", + m((_i, recv) => { + if (!isObject(recv)) { + return mkError(`iter() requires object, got ${typeName(recv)}`); + } 
+ const result: Value[] = []; + for (const [k, v] of recv.value) { + result.push( + mkObject( + new Map([ + ["key", mkString(k)], + ["value", v], + ]), + ), + ); + } + return mkArray(result); + }), + ); +} + +// Export compareForSort and isSortable for lambda_methods to use. +export { compareForSort, isSortable, isNaNValue }; diff --git a/internal/bloblang2/ts/src/stdlib/encoding.ts b/internal/bloblang2/ts/src/stdlib/encoding.ts new file mode 100644 index 000000000..fa02ed28d --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/encoding.ts @@ -0,0 +1,424 @@ +// Encoding methods: parse_json, format_json, encode, decode. + +/* eslint-disable @typescript-eslint/no-explicit-any */ +declare const TextEncoder: { new (): { encode(s: string): Uint8Array } }; +declare const TextDecoder: { + new (label?: string, options?: { fatal?: boolean }): { + decode(input: Uint8Array): string; + }; +}; +declare const Buffer: { + from(data: Uint8Array): { toString(encoding: string): string }; + from(data: string, encoding: string): Uint8Array; +}; +declare function btoa(s: string): string; +declare function atob(s: string): string; +/* eslint-enable @typescript-eslint/no-explicit-any */ + +import type { Interpreter, MethodSpec } from "../interpreter.js"; +import { + type Value, + mkInt64, + mkFloat64, + mkString, + mkBool, + mkBytes, + mkArray, + mkObject, + mkError, + isNull, + isString, + isBool, + isInt32, + isInt64, + isUint32, + isUint64, + isFloat32, + isFloat64, + isArray, + isObject, + isBytes, + isTimestamp, + isNumeric, + typeName, + NULL, + toJSON, +} from "../value.js"; +import { strftimeFormat, DEFAULT_TIMESTAMP_FORMAT } from "./timestamp.js"; + +// --------------------------------------------------------------------------- +// JSON normalization (like Go's json.Number → int64/float64) +// --------------------------------------------------------------------------- + +/** + * Parse JSON string preserving Go-like number semantics: + * - Numbers with decimal point or exponent 
notation → float64 + * - Integer numbers → int64 + */ +function parseJSONToValue(data: string): Value { + // Use a two-pass approach: first find all number literals and check if they have + // decimal points or exponent notation, then parse normally. + const floatPositions = new Set<number>(); + + // Walk the JSON string to find number tokens that contain '.', 'e', or 'E'. + let i = 0; + const len = data.length; + let path: string[] = []; + let arrayIndices: number[] = []; + + // Simple approach: parse with reviver to detect the raw text. + // JSON.parse raw source is available via the `source` parameter in modern Node. + // Fallback: use regex to detect exponent numbers in the source. + // Actually, simplest: custom reviver that gets the key, check source text. + + // Use JSON.parse with a reviver that gets the raw value. + // In Node 22+, JSON.parse has `context.source` in the reviver. + // For compatibility, use a different approach: parse, then re-scan source for number tokens. + + // Simplest correct approach: scan JSON for number tokens, tag any with e/E/. as float. + const numberRegex = /(?<=[[{:,\s]|^)-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?(?=[\s,}\]\n]|$)/g; + const floatNumbers = new Set<number>(); + let match; + while ((match = numberRegex.exec(data)) !== null) { + const numStr = match[0]; + if (numStr.includes('.') || numStr.includes('e') || numStr.includes('E')) { + floatNumbers.add(parseFloat(numStr)); + } + } + + // Parse with awareness of float numbers. + const parsed = JSON.parse(data); + return normalizeJSONValueWithFloats(parsed, floatNumbers); +} + +function normalizeJSONValueWithFloats(v: unknown, floatNumbers: Set<number>): Value { + if (v === null || v === undefined) return NULL; + if (typeof v === "boolean") return mkBool(v); + if (typeof v === "string") return mkString(v); + if (typeof v === "number") { + // If this number was written with exponent or decimal point, it's float64.
+ if (!Number.isInteger(v) || floatNumbers.has(v)) { + return mkFloat64(v); + } + return mkInt64(BigInt(v)); + } + if (Array.isArray(v)) { + return mkArray(v.map(e => normalizeJSONValueWithFloats(e, floatNumbers))); + } + if (typeof v === "object") { + const m = new Map<string, Value>(); + for (const [key, val] of Object.entries(v as Record<string, unknown>)) { + m.set(key, normalizeJSONValueWithFloats(val, floatNumbers)); + } + return mkObject(m); + } + return mkError(`parse_json(): unsupported type ${typeof v}`); +} + +function normalizeJSONValue(v: unknown): Value { + if (v === null || v === undefined) return NULL; + if (typeof v === "boolean") return mkBool(v); + if (typeof v === "string") return mkString(v); + if (typeof v === "number") { + if (Number.isInteger(v)) { + return mkInt64(BigInt(v)); + } + return mkFloat64(v); + } + if (Array.isArray(v)) { + return mkArray(v.map(normalizeJSONValue)); + } + if (typeof v === "object") { + const m = new Map<string, Value>(); + for (const [key, val] of Object.entries(v as Record<string, unknown>)) { + m.set(key, normalizeJSONValue(val)); + } + return mkObject(m); + } + return mkError(`parse_json(): unsupported type ${typeof v}`); +} + +// --------------------------------------------------------------------------- +// format_json helpers +// --------------------------------------------------------------------------- + +function checkJSONSerializable(v: Value): string { + if (isFloat64(v) || isFloat32(v)) { + const f = v.value; + if (Number.isNaN(f)) return "format_json(): NaN is not representable in JSON"; + if (!Number.isFinite(f)) return "format_json(): Infinity is not representable in JSON"; + } + if (isBytes(v)) { + return "format_json(): bytes have no implicit JSON serialization"; + } + if (isArray(v)) { + for (const elem of v.value) { + const err = checkJSONSerializable(elem); + if (err) return err; + } + } + if (isObject(v)) { + for (const [, val] of v.value) { + const err = checkJSONSerializable(val); + if (err) return err; + } + } + return ""; +} + +/** + * Convert a Value
to a JSON-compatible JS object with sorted keys + * and timestamps formatted as strings. + */ +function sortedJSONValue(v: Value): unknown { + if (isNull(v)) return null; + if (isBool(v)) return v.value; + if (isInt32(v)) return v.value; + if (isInt64(v)) return Number(v.value); + if (isUint32(v)) return v.value; + if (isUint64(v)) return Number(v.value); + if (isFloat32(v)) return v.value; + if (isFloat64(v)) return v.value; + if (isString(v)) return v.value; + if (isTimestamp(v)) return strftimeFormat(v.value, DEFAULT_TIMESTAMP_FORMAT, v.offsetMinutes); + if (isArray(v)) return v.value.map(sortedJSONValue); + if (isObject(v)) { + // Sort keys for deterministic output. + const obj: Record<string, unknown> = {}; + const keys = [...v.value.keys()].sort(); + for (const k of keys) { + obj[k] = sortedJSONValue(v.value.get(k)!); + } + return obj; + } + return toJSON(v); +} + +// --------------------------------------------------------------------------- +// Base64 helpers (works in both browser and Node.js) +// --------------------------------------------------------------------------- + +function base64Encode(data: Uint8Array): string { + if (typeof Buffer !== "undefined") { + return Buffer.from(data).toString("base64"); + } + // Browser. + let binary = ""; + for (const byte of data) binary += String.fromCharCode(byte); + return btoa(binary); +} + +function base64Decode(s: string): Uint8Array | null { + try { + // Validate base64 characters before decoding (Node's Buffer silently ignores invalid chars).
+ if (!/^[A-Za-z0-9+/]*={0,2}$/.test(s)) { + return null; + } + if (typeof Buffer !== "undefined") { + return new Uint8Array(Buffer.from(s, "base64")); + } + const binary = atob(s); + const bytes = new Uint8Array(binary.length); + for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i); + return bytes; + } catch { + return null; + } +} + +function base64UrlEncode(data: Uint8Array): string { + const standard = base64Encode(data); + return standard.replace(/\+/g, "-").replace(/\//g, "_"); +} + +function base64UrlDecode(s: string): Uint8Array | null { + // Add padding if missing. + let padded = s.replace(/-/g, "+").replace(/_/g, "/"); + while (padded.length % 4 !== 0) padded += "="; + return base64Decode(padded); +} + +function base64RawUrlEncode(data: Uint8Array): string { + return base64UrlEncode(data).replace(/=/g, ""); +} + +function base64RawUrlDecode(s: string): Uint8Array | null { + return base64UrlDecode(s); +} + +function hexEncode(data: Uint8Array): string { + return Array.from(data, (b) => b.toString(16).padStart(2, "0")).join(""); +} + +function hexDecode(s: string): Uint8Array | null { + if (s.length % 2 !== 0) return null; + const bytes = new Uint8Array(s.length / 2); + for (let i = 0; i < s.length; i += 2) { + const byte = parseInt(s.slice(i, i + 2), 16); + if (Number.isNaN(byte)) return null; + bytes[i / 2] = byte; + } + return bytes; +} + +// --------------------------------------------------------------------------- +// Registration +// --------------------------------------------------------------------------- + +export function registerEncoding(interp: Interpreter): void { + const m = ( + fn: (interp: Interpreter, receiver: Value, args: Value[]) => Value, + opts?: Partial<MethodSpec>, + ): MethodSpec => ({ + fn, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + ...opts, + }); + + // --- parse_json --- + interp.registerMethod( + "parse_json", + m((_i, recv) => { + let data: string; + if (isString(recv)) { + data =
recv.value; + } else if (isBytes(recv)) { + data = new TextDecoder().decode(recv.value); + } else { + return mkError(`parse_json() requires string or bytes, got ${typeName(recv)}`); + } + try { + return parseJSONToValue(data); + } catch (e) { + return mkError("parse_json() failed: " + (e as Error).message); + } + }), + ); + + // --- format_json --- + interp.registerMethod("format_json", { + fn: (_interp: Interpreter, receiver: Value, args: Value[]): Value => { + let indent = ""; + let escapeHTML = true; + + if (args.length > 0 && isString(args[0]!)) { + indent = args[0]!.value; + } + if (args.length > 1 && isBool(args[1]!) && args[1]!.value) { + indent = ""; // no_indent overrides indent + } + if (args.length > 2 && isBool(args[2]!)) { + escapeHTML = args[2]!.value; + } + + const err = checkJSONSerializable(receiver); + if (err) return mkError(err); + + const jsValue = sortedJSONValue(receiver); + let result: string; + if (indent !== "") { + result = JSON.stringify(jsValue, null, indent); + } else { + result = JSON.stringify(jsValue); + } + + // HTML escaping: JSON.stringify doesn't escape <, >, & by default. 
+ if (escapeHTML) { + result = result + .replace(/&/g, "\\u0026") + .replace(/</g, "\\u003c") + .replace(/>/g, "\\u003e"); + } + + return mkString(result); + }, + lambdaFn: null, + intrinsic: false, + acceptsNull: true, + params: [ + { name: "indent", default_: mkString(""), hasDefault: true }, + { name: "no_indent", default_: mkBool(false), hasDefault: true }, + { name: "escape_html", default_: mkBool(true), hasDefault: true }, + ], + }); + + // --- encode --- + interp.registerMethod("encode", { + fn: (_interp: Interpreter, receiver: Value, args: Value[]): Value => { + if (args.length !== 1) return mkError("encode() requires one argument (scheme)"); + const scheme = args[0]!; + if (!isString(scheme)) return mkError("encode() scheme must be string"); + + let data: Uint8Array; + if (isString(receiver)) { + data = new TextEncoder().encode(receiver.value); + } else if (isBytes(receiver)) { + data = receiver.value; + } else { + return mkError(`encode() requires string or bytes, got ${typeName(receiver)}`); + } + + switch (scheme.value) { + case "base64": + return mkString(base64Encode(data)); + case "base64url": + return mkString(base64UrlEncode(data)); + case "base64rawurl": + return mkString(base64RawUrlEncode(data)); + case "hex": + return mkString(hexEncode(data)); + default: + return mkError("encode(): unknown scheme " + scheme.value); + } + }, + lambdaFn: null, + intrinsic: false, + acceptsNull: false, + params: [{ name: "scheme", default_: null, hasDefault: false }], + }); + + // --- decode --- + interp.registerMethod("decode", { + fn: (_interp: Interpreter, receiver: Value, args: Value[]): Value => { + if (!isString(receiver)) { + return mkError(`decode() requires string, got ${typeName(receiver)}`); + } + if (args.length !== 1) return mkError("decode() requires one argument (scheme)"); + const scheme = args[0]!; + if (!isString(scheme)) return mkError("decode() scheme must be string"); + + const s = receiver.value; + switch (scheme.value) { + case "base64": { + const b = base64Decode(s);
+ if (b === null) return mkError("decode() base64 failed"); + return mkBytes(b); + } + case "base64url": { + const b = base64UrlDecode(s); + if (b === null) return mkError("decode() base64url failed"); + return mkBytes(b); + } + case "base64rawurl": { + const b = base64RawUrlDecode(s); + if (b === null) return mkError("decode() base64rawurl failed"); + return mkBytes(b); + } + case "hex": { + const b = hexDecode(s); + if (b === null) return mkError("decode() hex failed"); + return mkBytes(b); + } + default: + return mkError("decode(): unknown scheme " + scheme.value); + } + }, + lambdaFn: null, + intrinsic: false, + acceptsNull: false, + params: [{ name: "scheme", default_: null, hasDefault: false }], + }); +} diff --git a/internal/bloblang2/ts/src/stdlib/functions.ts b/internal/bloblang2/ts/src/stdlib/functions.ts new file mode 100644 index 000000000..9ae36deae --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/functions.ts @@ -0,0 +1,344 @@ +// Stdlib functions: deleted, throw, uuid_v4, now, random_int, range, +// timestamp_unix, timestamp_unix_milli, timestamp_unix_nano, second, minute, +// hour, day, timestamp. + +declare const crypto: { + randomUUID?: () => string; + getRandomValues?: (buf: Uint8Array) => Uint8Array; +}; + +import type { Interpreter, FunctionSpec } from "../interpreter.js"; +import { + type Value, + mkInt64, + mkFloat64, + mkString, + mkArray, + mkTimestamp, + mkError, + DELETED, + VOID, + isString, + isInt64, + isUint64, + isInt32, + isUint32, + isFloat32, + isFloat64, +} from "../value.js"; + +function toInt64(v: Value): bigint | null { + if (isInt64(v)) return v.value; + if (isInt32(v)) return BigInt(v.value); + if (isUint32(v)) return BigInt(v.value); + if (isUint64(v)) return v.value; + if (isFloat64(v)) return isFinite(v.value) ? BigInt(Math.trunc(v.value)) : null; + if (isFloat32(v)) return isFinite(v.value) ? 
BigInt(Math.trunc(v.value)) : null; + return null; +} + +export function registerFunctions(interp: Interpreter): void { + interp.registerFunction("deleted", { + fn: () => DELETED, + params: [], + }); + + interp.registerFunction("void", { + fn: () => VOID, + params: [], + }); + + interp.registerFunction("throw", { + fn: (args: Value[]): Value => { + if (args.length !== 1) { + return mkError("throw() requires exactly one string argument"); + } + const msg = args[0]!; + if (!isString(msg)) { + return mkError(`throw() requires a string argument, got ${msg.tag}`); + } + return mkError(msg.value); + }, + params: [{ name: "message", default_: null, hasDefault: false }], + }); + + interp.registerFunction("uuid_v4", { + fn: (): Value => { + // crypto.randomUUID() is available in modern browsers and Node 19+. + if (typeof crypto !== "undefined" && crypto.randomUUID) { + return mkString(crypto.randomUUID()); + } + // Fallback: manual v4 UUID. + const bytes = new Uint8Array(16); + if (typeof crypto !== "undefined" && crypto.getRandomValues) { + crypto.getRandomValues(bytes); + } else { + for (let i = 0; i < 16; i++) bytes[i] = Math.floor(Math.random() * 256); + } + bytes[6] = (bytes[6]! & 0x0f) | 0x40; + bytes[8] = (bytes[8]! 
& 0x3f) | 0x80; + const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(""); + return mkString( + `${hex.slice(0, 8)}-${hex.slice(8, 12)}-${hex.slice(12, 16)}-${hex.slice(16, 20)}-${hex.slice(20)}`, + ); + }, + params: [], + }); + + interp.registerFunction("now", { + fn: (): Value => { + const ms = Date.now(); + return mkTimestamp(BigInt(ms) * 1000000n); + }, + params: [], + }); + + interp.registerFunction("random_int", { + fn: (args: Value[]): Value => { + if (args.length !== 2) { + return mkError("random_int() requires min and max arguments"); + } + const minVal = toInt64(args[0]!); + const maxVal = toInt64(args[1]!); + if (minVal === null || maxVal === null) { + return mkError("random_int() requires integer arguments"); + } + if (minVal > maxVal) { + return mkError("random_int(): min must be <= max"); + } + const range = maxVal - minVal + 1n; + const rand = BigInt(Math.floor(Math.random() * Number(range))); + return mkInt64(minVal + rand); + }, + params: [ + { name: "min", default_: null, hasDefault: false }, + { name: "max", default_: null, hasDefault: false }, + ], + }); + + interp.registerFunction("range", { + fn: (args: Value[]): Value => { + if (args.length < 2 || args.length > 3) { + return mkError("range() requires 2 or 3 arguments"); + } + const start = toInt64(args[0]!); + const stop = toInt64(args[1]!); + if (start === null || stop === null) { + return mkError("range() requires integer arguments"); + } + let step: bigint; + if (args.length === 3) { + const s = toInt64(args[2]!); + if (s === null) return mkError("range() step must be integer"); + if (s === 0n) return mkError("range() step cannot be zero"); + if ((start < stop && s < 0n) || (start > stop && s > 0n)) { + return mkError("range() step direction contradicts start/stop"); + } + step = s; + } else { + step = start <= stop ? 
1n : -1n; + } + if (start === stop) return mkArray([]); + const result: Value[] = []; + if (step > 0n) { + for (let i = start; i < stop; i += step) { + result.push(mkInt64(i)); + } + } else { + for (let i = start; i > stop; i += step) { + result.push(mkInt64(i)); + } + } + return mkArray(result); + }, + params: [ + { name: "start", default_: null, hasDefault: false }, + { name: "stop", default_: null, hasDefault: false }, + { name: "step", default_: null, hasDefault: true }, + ], + }); + + // Duration constants (nanoseconds). + interp.registerFunction("second", { + fn: () => mkInt64(1_000_000_000n), + params: [], + }); + interp.registerFunction("minute", { + fn: () => mkInt64(60_000_000_000n), + params: [], + }); + interp.registerFunction("hour", { + fn: () => mkInt64(3_600_000_000_000n), + params: [], + }); + interp.registerFunction("day", { + fn: () => mkInt64(86_400_000_000_000n), + params: [], + }); + + interp.registerFunction("timestamp", { + fn: (args: Value[]): Value => { + if (args.length < 3) { + return mkError("timestamp() requires at least year, month, day"); + } + const year = toInt64(args[0]!); + const month = toInt64(args[1]!); + const day = toInt64(args[2]!); + if (year === null || month === null || day === null) { + return mkError("timestamp() requires integer year, month, day"); + } + let hour = 0n, + minute = 0n, + sec = 0n, + nano = 0n; + let tz = "UTC"; + if (args.length > 3) { + const h = toInt64(args[3]!); + if (h !== null) hour = h; + } + if (args.length > 4) { + const m = toInt64(args[4]!); + if (m !== null) minute = m; + } + if (args.length > 5) { + const s = toInt64(args[5]!); + if (s !== null) sec = s; + } + if (args.length > 6) { + const n = toInt64(args[6]!); + if (n !== null) nano = n; + } + if (args.length > 7) { + const tzArg = args[7]!; + if (isString(tzArg)) tz = tzArg.value; + } + + if (month < 1n || month > 12n) { + return mkError(`timestamp(): month ${month} out of range (1-12)`); + } + if (day < 1n || day > 31n) { + return 
mkError(`timestamp(): day ${day} out of range (1-31)`); + } + if (hour < 0n || hour > 23n) { + return mkError(`timestamp(): hour ${hour} out of range (0-23)`); + } + if (minute < 0n || minute > 59n) { + return mkError(`timestamp(): minute ${minute} out of range (0-59)`); + } + if (sec < 0n || sec > 59n) { + return mkError(`timestamp(): second ${sec} out of range (0-59)`); + } + if (nano < 0n || nano > 999999999n) { + return mkError( + `timestamp(): nano ${nano} out of range (0-999999999)`, + ); + } + + // Build the Date. For non-UTC, try Intl API. + // JavaScript Date doesn't natively support arbitrary IANA timezones + // for construction, so we build in UTC and adjust for offset. + let date: Date; + if (tz === "UTC") { + date = new Date( + Date.UTC( + Number(year), + Number(month) - 1, + Number(day), + Number(hour), + Number(minute), + Number(sec), + ), + ); + // Fix year < 100. + if (year >= 0n && year < 100n) { + date.setUTCFullYear(Number(year)); + } + } else { + // Use a best-effort approach: construct in UTC, then try to find offset. + try { + // Build an ISO string and parse with the timezone. + const isoStr = + `${String(year).padStart(4, "0")}-${String(month).padStart(2, "0")}-${String(day).padStart(2, "0")}T` + + `${String(hour).padStart(2, "0")}:${String(minute).padStart(2, "0")}:${String(sec).padStart(2, "0")}`; + const formatter = new Intl.DateTimeFormat("en-US", { + timeZone: tz, + year: "numeric", + month: "2-digit", + day: "2-digit", + hour: "2-digit", + minute: "2-digit", + second: "2-digit", + hour12: false, + }); + // Verify timezone is valid by formatting. + formatter.format(new Date()); + + // Build a UTC date then find the offset at that point in time. + const utcDate = new Date( + Date.UTC( + Number(year), + Number(month) - 1, + Number(day), + Number(hour), + Number(minute), + Number(sec), + ), + ); + // Get the offset by formatting the UTC date in the target timezone. 
+ const parts = new Intl.DateTimeFormat("en-US", { + timeZone: tz, + year: "numeric", + month: "numeric", + day: "numeric", + hour: "numeric", + minute: "numeric", + second: "numeric", + hour12: false, + }).formatToParts(utcDate); + const getPart = (type: string) => + parseInt( + parts.find((p) => p.type === type)?.value ?? "0", + 10, + ); + const tzDate = new Date( + Date.UTC( + getPart("year"), + getPart("month") - 1, + getPart("day"), + getPart("hour"), + getPart("minute"), + getPart("second"), + ), + ); + const offsetMs = tzDate.getTime() - utcDate.getTime(); + // The local time in the tz is utcDate + offset. We want the UTC time + // such that utc + offset = desired local time. So utc = local - offset. + date = new Date(utcDate.getTime() - offsetMs); + const tzOffsetMinutes = Math.round(offsetMs / 60000); + + void isoStr; // suppress unused warning + + const ms = BigInt(date.getTime()); + const nanos = ms * 1000000n + nano; + return mkTimestamp(nanos, tzOffsetMinutes); + } catch { + return mkError("timestamp(): unknown timezone " + tz); + } + } + + const ms = BigInt(date.getTime()); + const nanos = ms * 1000000n + nano; + return mkTimestamp(nanos); + }, + params: [ + { name: "year", default_: null, hasDefault: false }, + { name: "month", default_: null, hasDefault: false }, + { name: "day", default_: null, hasDefault: false }, + { name: "hour", default_: mkInt64(0n), hasDefault: true }, + { name: "minute", default_: mkInt64(0n), hasDefault: true }, + { name: "second", default_: mkInt64(0n), hasDefault: true }, + { name: "nano", default_: mkInt64(0n), hasDefault: true }, + { name: "timezone", default_: mkString("UTC"), hasDefault: true }, + ], + }); +} diff --git a/internal/bloblang2/ts/src/stdlib/index.ts b/internal/bloblang2/ts/src/stdlib/index.ts new file mode 100644 index 000000000..5672af679 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/index.ts @@ -0,0 +1,112 @@ +// Stdlib entry point: registers all functions, methods, and lambda methods. 
+ +import type { Interpreter, MethodSpec, FunctionSpec } from "../interpreter.js"; +import type { FunctionInfo, MethodInfo } from "../resolver.js"; + +import { registerFunctions } from "./functions.js"; +import { registerTypeConversion } from "./type_conversion.js"; +import { registerStringMethods } from "./string_methods.js"; +import { registerArrayMethods } from "./array_methods.js"; +import { registerObjectMethods } from "./object_methods.js"; +import { registerNumericMethods } from "./numeric_methods.js"; +import { registerEncoding } from "./encoding.js"; +import { registerTimestamp } from "./timestamp.js"; +import { registerLambdaMethods } from "./lambda_methods.js"; + +/** + * Register all standard library functions and methods on the interpreter. + */ +export function registerStdlib(interp: Interpreter): void { + // Functions. + registerFunctions(interp); + + // Regular methods. + registerTypeConversion(interp); + registerStringMethods(interp); + registerArrayMethods(interp); + registerObjectMethods(interp); + registerNumericMethods(interp); + registerEncoding(interp); + registerTimestamp(interp); + + // Lambda methods (higher-order). + registerLambdaMethods(interp); +} + +/** + * Return the method and function registries needed by the resolver. + * This creates a lightweight stub with just the registration surface, + * avoiding a dependency on the full Interpreter constructor. + * + * Method infos carry arity (required / total). `required === 0` with + * `total === -1` is used for methods that declare no params array + * (e.g. intrinsic methods like `.catch` / `.or` whose arity is + * handled specially at call sites). + */ +export function stdlibNames(): { + methods: Map<string, MethodInfo>; + functions: Map<string, FunctionInfo>; +} { + // Minimal stub that satisfies registerStdlib's registration calls.
+ const methods = new Map<string, MethodSpec>(); + const functions = new Map<string, FunctionSpec>(); + const stub = { + methods, + functions, + maps: new Map(), + namespaces: new Map(), + registerMethod(name: string, spec: MethodSpec) { + methods.set(name, spec); + }, + registerFunction(name: string, spec: FunctionSpec) { + functions.set(name, spec); + }, + } as unknown as Interpreter; + + registerStdlib(stub); + + const methodInfos = new Map<string, MethodInfo>(); + for (const [name, spec] of methods) { + const methodAcceptsLambda = spec.lambdaFn !== null || spec.acceptsLambda === true; + if (!spec.params) { + methodInfos.set(name, { + required: 0, + total: -1, + acceptsLambda: methodAcceptsLambda, + argFolder: spec.argFolder, + }); + continue; + } + let required = 0; + let total = 0; + const params = spec.params.map((p) => { + total++; + if (!p.hasDefault) required++; + return { + name: p.name, + hasDefault: p.hasDefault, + acceptsLambda: p.acceptsLambda === true, + }; + }); + methodInfos.set(name, { + required, + total, + acceptsLambda: methodAcceptsLambda, + params, + argFolder: spec.argFolder, + }); + } + + const functionInfos = new Map<string, FunctionInfo>(); + for (const [name, spec] of functions) { + let required = 0; + let total = 0; + for (const p of spec.params) { + total++; + if (!p.hasDefault) required++; + } + functionInfos.set(name, { required, total, argFolder: spec.argFolder }); + } + + return { methods: methodInfos, functions: functionInfos }; +} diff --git a/internal/bloblang2/ts/src/stdlib/lambda_methods.ts b/internal/bloblang2/ts/src/stdlib/lambda_methods.ts new file mode 100644 index 000000000..f43c6a502 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/lambda_methods.ts @@ -0,0 +1,669 @@ +// Higher-order lambda methods: filter, map, sort, sort_by, fold, any, all, +// find, unique, map_values, map_keys, map_entries, filter_entries, +// for_each, group_by, without_index, index_of, slice, or, catch.
+ +import type { Interpreter, MethodSpec, LambdaMethodFunc, MethodParam } from "../interpreter.js"; +import type { CallArg, LambdaExpr } from "../ast.js"; +import { + type Value, + mkInt64, + mkBool, + mkString, + mkArray, + mkObject, + mkError, + VOID, + isString, + isBool, + isInt64, + isInt32, + isUint32, + isUint64, + isFloat32, + isFloat64, + isArray, + isObject, + isBytes, + isError as isErrorV, + isVoid, + isDeleted, + isNumeric, + typeName, + valuesEqual, +} from "../value.js"; +import { compareForSort, isSortable, isNaNValue } from "./array_methods.js"; + +// --------------------------------------------------------------------------- +// Helpers +// --------------------------------------------------------------------------- + +function toInt64(v: Value): bigint | null { + if (isInt64(v)) return v.value; + if (isInt32(v)) return BigInt(v.value); + if (isUint32(v)) return BigInt(v.value); + if (isUint64(v)) return v.value; + if (isFloat64(v)) return isFinite(v.value) ? BigInt(Math.trunc(v.value)) : null; + if (isFloat32(v)) return isFinite(v.value) ? BigInt(Math.trunc(v.value)) : null; + return null; +} + +function clampSlice(low: bigint, high: bigint, length: bigint): [bigint, bigint] { + if (low < 0n) low += length; + if (high < 0n) high += length; + if (low < 0n) low = 0n; + if (high > length) high = length; + if (low > high) low = high; + return [low, high]; +} + +// --------------------------------------------------------------------------- +// Registration +// --------------------------------------------------------------------------- + +export function registerLambdaMethods(interp: Interpreter): void { + const lm = ( + fn: LambdaMethodFunc, + params?: MethodParam[], + ): MethodSpec => ({ + fn: null, + lambdaFn: fn, + intrinsic: false, + params: params ?? 
null, + acceptsNull: false, + }); + + const fnParam: MethodParam = { + name: "fn", + default_: null, + hasDefault: false, + acceptsLambda: true, + }; + + // --- filter --- + interp.registerMethod( + "filter", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`filter() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("filter() requires a lambda argument"); + + const result: Value[] = []; + for (const elem of receiver.value) { + const val = interp.callLambda(lambda, [elem]); + if (isErrorV(val)) return val; + if (isVoid(val)) return mkError("filter() lambda returned void"); + if (!isBool(val)) { + return mkError(`filter() lambda must return bool, got ${val.tag}`); + } + if (val.value) result.push(elem); + } + return mkArray(result); + }, [fnParam]), + ); + + // --- map --- + interp.registerMethod( + "map", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`map() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("map() requires a lambda argument"); + + const result: Value[] = []; + for (const elem of receiver.value) { + const val = interp.callLambda(lambda, [elem]); + if (isErrorV(val)) return val; + if (isVoid(val)) { + return mkError("map() lambda returned void (must return a value for every element)"); + } + if (isDeleted(val)) continue; + result.push(val); + } + return mkArray(result); + }, [fnParam]), + ); + + // --- sort (lambda version — no required params) --- + interp.registerMethod( + "sort", + lm((interp, receiver, _args) => { + if (!isArray(receiver)) { + return mkError(`sort() requires array, got ${typeName(receiver)}`); + } + const arr = receiver.value; + if (arr.length === 0) return mkArray([]); + if (!isSortable(arr[0]!)) { + return mkError(`sort(): ${arr[0]!.tag} is not a sortable type`); + } + + const sorted = 
[...arr]; + let sortErr: Value | null = null; + sorted.sort((a, b) => { + if (sortErr !== null) return 0; + const cmp = compareForSort(a, b); + if (isErrorV(cmp)) { + sortErr = cmp; + return 0; + } + return Number((cmp as { value: bigint }).value); + }); + if (sortErr !== null) return sortErr; + return mkArray(sorted); + }), + ); + + // --- sort_by --- + interp.registerMethod( + "sort_by", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`sort_by() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("sort_by() requires a lambda argument"); + + const arr = receiver.value; + // Extract keys. + const keys: Value[] = new Array(arr.length); + for (let i = 0; i < arr.length; i++) { + const key = interp.callLambda(lambda, [arr[i]!]); + if (isErrorV(key)) return key; + keys[i] = key; + } + + const indices = Array.from({ length: arr.length }, (_, i) => i); + let sortErr: Value | null = null; + indices.sort((i, j) => { + if (sortErr !== null) return 0; + const cmp = compareForSort(keys[i]!, keys[j]!); + if (isErrorV(cmp)) { + sortErr = cmp; + return 0; + } + return Number((cmp as { value: bigint }).value); + }); + if (sortErr !== null) return sortErr; + + return mkArray(indices.map((i) => arr[i]!)); + }, [fnParam]), + ); + + // --- any --- + interp.registerMethod( + "any", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`any() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("any() requires a lambda argument"); + + for (const elem of receiver.value) { + const val = interp.callLambda(lambda, [elem]); + if (isErrorV(val)) return val; + if (isVoid(val)) return mkError("any() lambda returned void"); + if (!isBool(val)) { + return mkError(`any() lambda must return bool, got ${val.tag}`); + } + if (val.value) return mkBool(true); + } + return 
mkBool(false); + }, [fnParam]), + ); + + // --- all --- + interp.registerMethod( + "all", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`all() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("all() requires a lambda argument"); + + for (const elem of receiver.value) { + const val = interp.callLambda(lambda, [elem]); + if (isErrorV(val)) return val; + if (isVoid(val)) return mkError("all() lambda returned void"); + if (!isBool(val)) { + return mkError(`all() lambda must return bool, got ${val.tag}`); + } + if (!val.value) return mkBool(false); + } + return mkBool(true); + }, [fnParam]), + ); + + // --- find --- + interp.registerMethod( + "find", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`find() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("find() requires a lambda argument"); + + for (const elem of receiver.value) { + const val = interp.callLambda(lambda, [elem]); + if (isErrorV(val)) return val; + if (isVoid(val)) return mkError("find() lambda returned void"); + if (!isBool(val)) { + return mkError(`find() lambda must return bool, got ${val.tag}`); + } + if (val.value) return elem; + } + return VOID; + }, [fnParam]), + ); + + // --- fold --- + interp.registerMethod( + "fold", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`fold() requires array, got ${typeName(receiver)}`); + } + if (args.length !== 2) { + return mkError("fold() requires initial value and lambda arguments"); + } + const initial = interp.evalExpr(args[0]!.value); + if (isErrorV(initial)) return initial; + + const lambdaArg = args[1]!.value; + if (lambdaArg.kind !== "lambda") { + return mkError("fold() second argument must be a lambda"); + } + const lambda = lambdaArg as LambdaExpr; + + let tally: Value = initial; 
+ for (const elem of receiver.value) { + tally = interp.callLambda(lambda, [tally, elem]); + if (isErrorV(tally)) return tally; + if (isVoid(tally)) return mkError("fold() lambda returned void"); + } + return tally; + }, [ + { name: "initial", default_: null, hasDefault: false }, + fnParam, + ]), + ); + + // --- unique --- + interp.registerMethod( + "unique", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`unique() requires array, got ${typeName(receiver)}`); + } + + let keyFn: LambdaExpr | null = null; + if (args.length > 0) { + keyFn = interp.extractLambdaOrMapRef(args); + } + + const seenList: Value[] = []; + let seenNaN = false; + const contains = (key: Value): boolean => { + if (isNaNValue(key)) { + if (seenNaN) return true; + seenNaN = true; + return false; + } + for (const s of seenList) { + if (valuesEqual(s, key)) return true; + } + return false; + }; + + const result: Value[] = []; + for (const elem of receiver.value) { + let key: Value; + if (keyFn !== null) { + key = interp.callLambda(keyFn, [elem]); + if (isErrorV(key)) return key; + } else { + key = elem; + } + if (!contains(key)) { + seenList.push(key); + result.push(elem); + } + } + return mkArray(result); + }, [{ name: "fn", default_: null, hasDefault: true, acceptsLambda: true }]), + ); + + // --- without_index --- + interp.registerMethod( + "without_index", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`without_index() requires array, got ${typeName(receiver)}`); + } + if (args.length !== 1) return mkError("without_index() requires one argument"); + const idxVal = interp.evalExpr(args[0]!.value); + if (isErrorV(idxVal)) return idxVal; + let idx = toInt64(idxVal); + if (idx === null) return mkError("without_index() argument must be integer"); + const len = BigInt(receiver.value.length); + if (idx < 0n) idx += len; + if (idx < 0n || idx >= len) { + return mkError("without_index(): index out of bounds"); + } + const n = Number(idx); 
+ return mkArray([ + ...receiver.value.slice(0, n), + ...receiver.value.slice(n + 1), + ]); + }, [{ name: "index", default_: null, hasDefault: false }]), + ); + + // --- index_of --- + interp.registerMethod( + "index_of", + lm((interp, receiver, args) => { + if (args.length !== 1) return mkError("index_of() requires one argument"); + const target = interp.evalExpr(args[0]!.value); + if (isErrorV(target)) return target; + + if (isString(receiver)) { + if (!isString(target)) { + return mkError("string index_of() requires string argument"); + } + // Codepoint-based index. + const runes = [...receiver.value]; + const targetRunes = [...target.value]; + for (let i = 0; i <= runes.length - targetRunes.length; i++) { + if (runes.slice(i, i + targetRunes.length).join("") === target.value) { + return mkInt64(BigInt(i)); + } + } + return mkInt64(-1n); + } + if (isArray(receiver)) { + for (let i = 0; i < receiver.value.length; i++) { + if (valuesEqual(receiver.value[i]!, target)) { + return mkInt64(BigInt(i)); + } + } + return mkInt64(-1n); + } + if (isBytes(receiver)) { + if (!isBytes(target)) { + return mkError("bytes index_of() requires bytes argument"); + } + const haystack = receiver.value; + const needle = target.value; + outer: for (let i = 0; i <= haystack.length - needle.length; i++) { + for (let j = 0; j < needle.length; j++) { + if (haystack[i + j] !== needle[j]) continue outer; + } + return mkInt64(BigInt(i)); + } + return mkInt64(-1n); + } + return mkError(`index_of() not supported on ${typeName(receiver)}`); + }, [{ name: "target", default_: null, hasDefault: false }]), + ); + + // --- slice --- + interp.registerMethod( + "slice", + lm((interp, receiver, args) => { + if (args.length < 1 || args.length > 2) { + return mkError("slice() requires 1 or 2 arguments"); + } + const lowVal = interp.evalExpr(args[0]!.value); + if (isErrorV(lowVal)) return lowVal; + const low = toInt64(lowVal); + if (low === null) return mkError("slice() low must be integer"); + + if 
(isString(receiver)) { + const runes = [...receiver.value]; + const length = BigInt(runes.length); + let high = length; + if (args.length === 2) { + const hVal = interp.evalExpr(args[1]!.value); + if (isErrorV(hVal)) return hVal; + const h = toInt64(hVal); + if (h === null) return mkError("slice() high must be integer"); + high = h; + } + const [lo, hi] = clampSlice(low, high, length); + return mkString(runes.slice(Number(lo), Number(hi)).join("")); + } + if (isArray(receiver)) { + const length = BigInt(receiver.value.length); + let high = length; + if (args.length === 2) { + const hVal = interp.evalExpr(args[1]!.value); + if (isErrorV(hVal)) return hVal; + const h = toInt64(hVal); + if (h === null) return mkError("slice() high must be integer"); + high = h; + } + const [lo, hi] = clampSlice(low, high, length); + return mkArray(receiver.value.slice(Number(lo), Number(hi))); + } + if (isBytes(receiver)) { + const length = BigInt(receiver.value.length); + let high = length; + if (args.length === 2) { + const hVal = interp.evalExpr(args[1]!.value); + if (isErrorV(hVal)) return hVal; + const h = toInt64(hVal); + if (h === null) return mkError("slice() high must be integer"); + high = h; + } + const [lo, hi] = clampSlice(low, high, length); + return { tag: "bytes", value: receiver.value.slice(Number(lo), Number(hi)) }; + } + return mkError(`slice() not supported on ${typeName(receiver)}`); + }, [ + { name: "low", default_: null, hasDefault: false }, + { name: "high", default_: null, hasDefault: true }, + ]), + ); + + // --- map_values --- + interp.registerMethod( + "map_values", + lm((interp, receiver, args) => { + if (!isObject(receiver)) { + return mkError(`map_values() requires object, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("map_values() requires a lambda argument"); + + const result = new Map(); + for (const [k, v] of receiver.value) { + const val = interp.callLambda(lambda, [v]); + 
if (isErrorV(val)) return val; + if (isVoid(val)) return mkError("map_values() lambda returned void"); + if (isDeleted(val)) continue; + result.set(k, val); + } + return mkObject(result); + }, [fnParam]), + ); + + // --- map_keys --- + interp.registerMethod( + "map_keys", + lm((interp, receiver, args) => { + if (!isObject(receiver)) { + return mkError(`map_keys() requires object, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("map_keys() requires a lambda argument"); + + const result = new Map(); + for (const [k, v] of receiver.value) { + const newKey = interp.callLambda(lambda, [mkString(k)]); + if (isErrorV(newKey)) return newKey; + if (isVoid(newKey)) return mkError("map_keys() lambda returned void"); + if (isDeleted(newKey)) continue; + if (!isString(newKey)) { + return mkError(`map_keys() lambda must return string, got ${newKey.tag}`); + } + result.set(newKey.value, v); + } + return mkObject(result); + }, [fnParam]), + ); + + // --- map_entries --- + interp.registerMethod( + "map_entries", + lm((interp, receiver, args) => { + if (!isObject(receiver)) { + return mkError(`map_entries() requires object, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("map_entries() requires a lambda argument"); + + const result = new Map(); + for (const [k, v] of receiver.value) { + const entry = interp.callLambda(lambda, [mkString(k), v]); + if (isErrorV(entry)) return entry; + if (isVoid(entry)) return mkError("map_entries() lambda returned void"); + if (isDeleted(entry)) continue; + if (!isObject(entry)) { + return mkError("map_entries() lambda must return {key, value} object"); + } + const keyVal = entry.value.get("key"); + if (keyVal === undefined || !isString(keyVal)) { + return mkError("map_entries() returned entry missing string 'key'"); + } + const valVal = entry.value.get("value"); + if (valVal === undefined) { + return 
mkError("map_entries() returned entry missing 'value'"); + } + result.set(keyVal.value, valVal); + } + return mkObject(result); + }, [fnParam]), + ); + + // --- filter_entries --- + interp.registerMethod( + "filter_entries", + lm((interp, receiver, args) => { + if (!isObject(receiver)) { + return mkError(`filter_entries() requires object, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("filter_entries() requires a lambda argument"); + + const result = new Map(); + for (const [k, v] of receiver.value) { + const val = interp.callLambda(lambda, [mkString(k), v]); + if (isErrorV(val)) return val; + if (isVoid(val)) return mkError("filter_entries() lambda returned void"); + if (!isBool(val)) { + return mkError(`filter_entries() lambda must return bool, got ${val.tag}`); + } + if (val.value) result.set(k, v); + } + return mkObject(result); + }, [fnParam]), + ); + + // --- for_each --- + interp.registerMethod( + "for_each", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`for_each() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("for_each() requires a lambda argument"); + + for (const elem of receiver.value) { + const val = interp.callLambda(lambda, [elem]); + if (isErrorV(val)) return val; + } + return receiver; // Return the original array. 
+ }, [fnParam]), + ); + + // --- group_by --- + interp.registerMethod( + "group_by", + lm((interp, receiver, args) => { + if (!isArray(receiver)) { + return mkError(`group_by() requires array, got ${typeName(receiver)}`); + } + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("group_by() requires a lambda argument"); + + const groups = new Map(); + for (const elem of receiver.value) { + const key = interp.callLambda(lambda, [elem]); + if (isErrorV(key)) return key; + if (!isString(key)) { + return mkError(`group_by() lambda must return string, got ${key.tag}`); + } + const existing = groups.get(key.value); + if (existing !== undefined) { + existing.push(elem); + } else { + groups.set(key.value, [elem]); + } + } + const result = new Map(); + for (const [k, v] of groups) { + result.set(k, mkArray(v)); + } + return mkObject(result); + }, [fnParam]), + ); + + // --- into --- + // Pass the receiver to a single-parameter lambda and return the result. + // Errors / void / deleted() from the lambda propagate unchanged. + // Accepts null receivers (any value type is valid per spec §13.12); + // only void / deleted / error receivers are rejected, which the + // interpreter's dispatch already handles before the method runs. 
+ interp.registerMethod( + "into", + { + fn: null, + lambdaFn: (interp, receiver, args) => { + const lambda = interp.extractLambdaOrMapRef(args); + if (lambda === null) return mkError("into() requires a lambda argument"); + if (lambda.params.length !== 1) { + return mkError( + `into() requires a one-parameter lambda, got ${lambda.params.length} parameters`, + ); + } + return interp.callLambda(lambda, [receiver]); + }, + intrinsic: false, + params: [fnParam], + acceptsNull: true, + }, + ); + + // --- Intrinsic methods (registered for name resolution only) --- + interp.registerMethod("catch", { + fn: null, + lambdaFn: null, + intrinsic: true, + params: [{ name: "fn", default_: null, hasDefault: false, acceptsLambda: true }], + acceptsNull: false, + }); + + interp.registerMethod("or", { + fn: null, + lambdaFn: null, + intrinsic: true, + params: [{ name: "default", default_: null, hasDefault: false }], + acceptsNull: false, + }); +} diff --git a/internal/bloblang2/ts/src/stdlib/numeric_methods.ts b/internal/bloblang2/ts/src/stdlib/numeric_methods.ts new file mode 100644 index 000000000..8624ccca1 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/numeric_methods.ts @@ -0,0 +1,123 @@ +// Numeric methods: abs, ceil, floor, round, min (scalar), max (scalar). + +import type { Interpreter, MethodSpec } from "../interpreter.js"; +import { + type Value, + mkInt32, + mkInt64, + mkFloat32, + mkFloat64, + mkError, + isInt32, + isInt64, + isUint32, + isUint64, + isFloat32, + isFloat64, + typeName, + MIN_INT64, + MIN_INT32, +} from "../value.js"; + +function toInt64(v: Value): bigint | null { + if (isInt64(v)) return v.value; + if (isInt32(v)) return BigInt(v.value); + if (isUint32(v)) return BigInt(v.value); + if (isUint64(v)) return v.value; + if (isFloat64(v)) return isFinite(v.value) ? BigInt(Math.trunc(v.value)) : null; + if (isFloat32(v)) return isFinite(v.value) ? 
BigInt(Math.trunc(v.value)) : null; + return null; +} + +function roundFloat(f: number, decimals: bigint): number { + const shift = Math.pow(10, Number(decimals)); + // Banker's rounding (round half to even), done by hand; Math.round rounds halves toward +Infinity. + const shifted = f * shift; + const floored = Math.floor(shifted); + const diff = shifted - floored; + let rounded: number; + if (diff > 0.5) { + rounded = floored + 1; + } else if (diff < 0.5) { + rounded = floored; + } else { + // Round to even. + rounded = floored % 2 === 0 ? floored : floored + 1; + } + return rounded / shift; +} + +export function registerNumericMethods(interp: Interpreter): void { + const m = ( + fn: (interp: Interpreter, receiver: Value, args: Value[]) => Value, + ): MethodSpec => ({ + fn, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + }); + + // --- abs --- + interp.registerMethod( + "abs", + m((_i, recv) => { + if (isInt64(recv)) { + if (recv.value === MIN_INT64) { + return mkError("int64 overflow in abs()"); + } + return mkInt64(recv.value < 0n ? -recv.value : recv.value); + } + if (isInt32(recv)) { + if (recv.value === MIN_INT32) { + return mkError("int32 overflow in abs()"); + } + return mkInt32(recv.value < 0 ? 
-recv.value : recv.value); + } + if (isFloat64(recv)) return mkFloat64(Math.abs(recv.value)); + if (isFloat32(recv)) return mkFloat32(Math.abs(recv.value)); + if (isUint32(recv)) return recv; + if (isUint64(recv)) return recv; + return mkError(`abs() requires numeric, got ${typeName(recv)}`); + }), + ); + + // --- floor --- + interp.registerMethod( + "floor", + m((_i, recv) => { + if (isFloat64(recv)) return mkFloat64(Math.floor(recv.value)); + if (isFloat32(recv)) return mkFloat32(Math.floor(recv.value)); + if (isInt32(recv) || isInt64(recv) || isUint32(recv) || isUint64(recv)) return recv; + return mkError(`floor() requires numeric, got ${typeName(recv)}`); + }), + ); + + // --- ceil --- + interp.registerMethod( + "ceil", + m((_i, recv) => { + if (isFloat64(recv)) return mkFloat64(Math.ceil(recv.value)); + if (isFloat32(recv)) return mkFloat32(Math.ceil(recv.value)); + if (isInt32(recv) || isInt64(recv) || isUint32(recv) || isUint64(recv)) return recv; + return mkError(`ceil() requires numeric, got ${typeName(recv)}`); + }), + ); + + // --- round --- + interp.registerMethod( + "round", + m((_i, recv, args) => { + let decimals = 0n; + if (args.length > 0) { + const d = toInt64(args[0]!); + if (d === null) return mkError("round() argument must be integer"); + decimals = d; + } + if (isFloat64(recv)) return mkFloat64(roundFloat(recv.value, decimals)); + if (isFloat32(recv)) return mkFloat32(roundFloat(recv.value, decimals)); + if (isInt32(recv) || isInt64(recv) || isUint32(recv) || isUint64(recv)) return recv; + return mkError(`round() requires numeric, got ${typeName(recv)}`); + }), + ); +} diff --git a/internal/bloblang2/ts/src/stdlib/object_methods.ts b/internal/bloblang2/ts/src/stdlib/object_methods.ts new file mode 100644 index 000000000..7344886b2 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/object_methods.ts @@ -0,0 +1,134 @@ +// Object methods: keys, values, has_key, merge, without, assign. 
+ +import type { Interpreter, MethodSpec } from "../interpreter.js"; +import { + type Value, + mkString, + mkBool, + mkArray, + mkObject, + mkError, + isString, + isArray, + isObject, + typeName, +} from "../value.js"; + +export function registerObjectMethods(interp: Interpreter): void { + const m = ( + fn: (interp: Interpreter, receiver: Value, args: Value[]) => Value, + ): MethodSpec => ({ + fn, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + }); + + // --- keys --- + interp.registerMethod( + "keys", + m((_i, recv) => { + if (!isObject(recv)) { + return mkError(`keys() requires object, got ${typeName(recv)}`); + } + const keys = [...recv.value.keys()].sort(); + return mkArray(keys.map(mkString)); + }), + ); + + // --- values --- + interp.registerMethod( + "values", + m((_i, recv) => { + if (!isObject(recv)) { + return mkError(`values() requires object, got ${typeName(recv)}`); + } + // Sort by keys for deterministic order. + const keys = [...recv.value.keys()].sort(); + return mkArray(keys.map((k) => recv.value.get(k)!)); + }), + ); + + // --- has_key --- + interp.registerMethod( + "has_key", + m((_i, recv, args) => { + if (!isObject(recv)) { + return mkError(`has_key() requires object, got ${typeName(recv)}`); + } + if (args.length !== 1) return mkError("has_key() requires one argument"); + const key = args[0]!; + if (!isString(key)) return mkError("has_key() argument must be string"); + return mkBool(recv.value.has(key.value)); + }), + ); + + // --- merge --- + interp.registerMethod( + "merge", + m((_i, recv, args) => { + if (!isObject(recv)) { + return mkError(`merge() requires object, got ${typeName(recv)}`); + } + if (args.length !== 1) return mkError("merge() requires one argument"); + const other = args[0]!; + if (!isObject(other)) return mkError("merge() argument must be object"); + const result = new Map(recv.value); + for (const [k, v] of other.value) { + result.set(k, v); + } + return mkObject(result); + }), + ); + + // 
--- without (object version: takes array of string keys) --- + interp.registerMethod( + "without", + m((_i, recv, args) => { + if (!isObject(recv)) { + return mkError(`without() requires object, got ${typeName(recv)}`); + } + if (args.length !== 1) return mkError("without() requires one argument"); + const keys = args[0]!; + if (!isArray(keys)) { + return mkError("without() argument must be array of strings"); + } + const exclude = new Set(); + for (let i = 0; i < keys.value.length; i++) { + const k = keys.value[i]!; + if (!isString(k)) { + return mkError( + `without() keys must be strings, element ${i} is ${typeName(k)}`, + ); + } + exclude.add(k.value); + } + const result = new Map(); + for (const [k, v] of recv.value) { + if (!exclude.has(k)) result.set(k, v); + } + return mkObject(result); + }), + ); + + // --- assign (alias for merge, may accept multiple objects) --- + interp.registerMethod( + "assign", + m((_i, recv, args) => { + if (!isObject(recv)) { + return mkError(`assign() requires object, got ${typeName(recv)}`); + } + const result = new Map(recv.value); + for (const arg of args) { + if (!isObject(arg)) { + return mkError("assign() arguments must be objects"); + } + for (const [k, v] of arg.value) { + result.set(k, v); + } + } + return mkObject(result); + }), + ); +} diff --git a/internal/bloblang2/ts/src/stdlib/string_methods.ts b/internal/bloblang2/ts/src/stdlib/string_methods.ts new file mode 100644 index 000000000..4d8a840e7 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/string_methods.ts @@ -0,0 +1,362 @@ +// String methods: uppercase, lowercase, trim, trim_prefix, trim_suffix, +// has_prefix, has_suffix, split, replace_all, contains (string overload), +// repeat, re_match, re_find_all, re_replace_all, parse_int. 
+ +import type { Interpreter, MethodSpec } from "../interpreter.js"; +import type { ArgFolder } from "../resolver.js"; +import type { CallArg } from "../ast.js"; +import { TokenType } from "../token.js"; + +/** + * Convert Go regex replacement syntax to JS replacement syntax. + * Go uses $0 for whole match, ${name} for named groups. + * JS uses $& for whole match, $<name> for named groups. + * We also need to escape $$ (literal $) properly. + */ +function goReplacementToJS(s: string): string { + let result = ""; + for (let i = 0; i < s.length; i++) { + if (s[i] === "$") { + if (i + 1 < s.length && s[i + 1] === "0") { + result += "$&"; + i++; // skip the '0' + } else if (i + 1 < s.length && s[i + 1] === "{") { + // Find closing brace. + const close = s.indexOf("}", i + 2); + if (close !== -1) { + const name = s.substring(i + 2, close); + result += "$<" + name + ">"; + i = close; // skip to '}' + } else { + result += s[i]; + } + } else { + result += s[i]; + } + } else { + result += s[i]; + } + } + return result; +} +import { + type Value, + mkString, + mkBool, + mkInt64, + mkArray, + mkError, + isString, + isInt64, + isInt32, + isUint32, + isUint64, + isFloat32, + isFloat64, + isFolded, + typeName, +} from "../value.js"; + +function toInt64(v: Value): bigint | null { + if (isInt64(v)) return v.value; + if (isInt32(v)) return BigInt(v.value); + if (isUint32(v)) return BigInt(v.value); + if (isUint64(v)) return v.value; + if (isFloat64(v)) return isFinite(v.value) ? BigInt(Math.trunc(v.value)) : null; + if (isFloat32(v)) return isFinite(v.value) ? 
BigInt(Math.trunc(v.value)) : null; + return null; +} + +function requireString( + methodName: string, + receiver: Value, +): string | Value { + if (!isString(receiver)) { + return mkError(`${methodName}() requires string, got ${typeName(receiver)}`); + } + return receiver.value; +} + +function requireStringArg( + methodName: string, + args: Value[], + index: number, +): string | Value { + const arg = args[index]; + if (arg === undefined || !isString(arg)) { + return mkError(`${methodName}() argument must be string`); + } + return arg.value; +} + +/** + * foldRegexPattern is the ArgFolder shared by re_match, re_find_all, + * and re_replace_all. If arg 0 is a string literal, it's compiled into + * a RegExp at parse time (using `flags` — "" for re_match, "g" for the + * two find/replace variants). Dynamic patterns (e.g. `.re_match($pat)`) + * are left untouched and compile on every call, matching the previous + * behaviour. Also applies the Go-to-JS syntax mapping for named + * capture groups: `(?P<name>...)` -> `(?<name>...)`. 
+ */ +function foldRegexPattern(flags: string): ArgFolder { + return (args: CallArg[]): Array => { + const out: Array = new Array(args.length).fill(null); + if (args.length === 0) return out; + const lit = args[0]!.value; + if (lit.kind !== "literal") return out; + if (lit.tokenType !== TokenType.STRING && lit.tokenType !== TokenType.RAW_STRING) { + return out; + } + const jsPattern = String(lit.value).replace(/\(\?P Value, + ): MethodSpec => ({ + fn, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + }); + + interp.registerMethod( + "uppercase", + m((_i, recv) => { + const s = requireString("uppercase", recv); + if (typeof s !== "string") return s; + return mkString(s.toUpperCase()); + }), + ); + + interp.registerMethod( + "lowercase", + m((_i, recv) => { + const s = requireString("lowercase", recv); + if (typeof s !== "string") return s; + return mkString(s.toLowerCase()); + }), + ); + + interp.registerMethod( + "trim", + m((_i, recv) => { + const s = requireString("trim", recv); + if (typeof s !== "string") return s; + return mkString(s.trim()); + }), + ); + + interp.registerMethod( + "trim_prefix", + m((_i, recv, args) => { + const s = requireString("trim_prefix", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) return mkError("trim_prefix() requires one argument"); + const prefix = requireStringArg("trim_prefix", args, 0); + if (typeof prefix !== "string") return prefix; + return mkString(s.startsWith(prefix) ? s.slice(prefix.length) : s); + }), + ); + + interp.registerMethod( + "trim_suffix", + m((_i, recv, args) => { + const s = requireString("trim_suffix", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) return mkError("trim_suffix() requires one argument"); + const suffix = requireStringArg("trim_suffix", args, 0); + if (typeof suffix !== "string") return suffix; + return mkString( + s.endsWith(suffix) ? 
s.slice(0, s.length - suffix.length) : s, + ); + }), + ); + + interp.registerMethod( + "has_prefix", + m((_i, recv, args) => { + const s = requireString("has_prefix", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) return mkError("has_prefix() requires one argument"); + const prefix = requireStringArg("has_prefix", args, 0); + if (typeof prefix !== "string") return prefix; + return mkBool(s.startsWith(prefix)); + }), + ); + + interp.registerMethod( + "has_suffix", + m((_i, recv, args) => { + const s = requireString("has_suffix", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) return mkError("has_suffix() requires one argument"); + const suffix = requireStringArg("has_suffix", args, 0); + if (typeof suffix !== "string") return suffix; + return mkBool(s.endsWith(suffix)); + }), + ); + + interp.registerMethod( + "split", + m((_i, recv, args) => { + const s = requireString("split", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) return mkError("split() requires one argument"); + const delim = requireStringArg("split", args, 0); + if (typeof delim !== "string") return delim; + + if (delim === "") { + if (s === "") return mkArray([]); + // Split by codepoint. 
+ const codepoints = [...s]; + return mkArray(codepoints.map(mkString)); + } + return mkArray(s.split(delim).map(mkString)); + }), + ); + + interp.registerMethod( + "replace_all", + m((_i, recv, args) => { + const s = requireString("replace_all", recv); + if (typeof s !== "string") return s; + if (args.length !== 2) { + return mkError("replace_all() requires old and new arguments"); + } + const old = requireStringArg("replace_all", args, 0); + if (typeof old !== "string") return old; + const new_ = requireStringArg("replace_all", args, 1); + if (typeof new_ !== "string") return new_; + return mkString(s.replaceAll(old, new_)); + }), + ); + + interp.registerMethod( + "repeat", + m((_i, recv, args) => { + const s = requireString("repeat", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) return mkError("repeat() requires one argument"); + const count = toInt64(args[0]!); + if (count === null) return mkError("repeat() argument must be integer"); + if (count < 0n) return mkError("repeat() count must be non-negative"); + if (count > 1_000_000n) return mkError("repeat() count too large"); + return mkString(s.repeat(Number(count))); + }), + ); + + interp.registerMethod( + "re_match", + { + fn: (_i, recv, args) => { + const s = requireString("re_match", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) return mkError("re_match() requires one argument"); + const re = resolveRegex("re_match", args[0]!, ""); + if (!(re instanceof RegExp)) return re; + return mkBool(re.test(s)); + }, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + argFolder: foldRegexPattern(""), + }, + ); + + interp.registerMethod( + "re_find_all", + { + fn: (_i, recv, args) => { + const s = requireString("re_find_all", recv); + if (typeof s !== "string") return s; + if (args.length !== 1) { + return mkError("re_find_all() requires one argument"); + } + const re = resolveRegex("re_find_all", args[0]!, "g"); + if (!(re instanceof RegExp)) 
return re; + const matches = s.match(re); + if (matches === null) return mkArray([]); + return mkArray(matches.map(mkString)); + }, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + argFolder: foldRegexPattern("g"), + }, + ); + + interp.registerMethod( + "re_replace_all", + { + fn: (_i, recv, args) => { + const s = requireString("re_replace_all", recv); + if (typeof s !== "string") return s; + if (args.length !== 2) { + return mkError( + "re_replace_all() requires pattern and replacement arguments", + ); + } + const re = resolveRegex("re_replace_all", args[0]!, "g"); + if (!(re instanceof RegExp)) return re; + const replacement = requireStringArg("re_replace_all", args, 1); + if (typeof replacement !== "string") return replacement; + // Convert Go replacement syntax to JS: + // $0 → $& (whole match), ${name} → $<name> (named group) + const jsReplacement = goReplacementToJS(replacement); + return mkString(s.replace(re, jsReplacement)); + }, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + argFolder: foldRegexPattern("g"), + }, + ); + + interp.registerMethod( + "parse_int", + m((_i, recv) => { + const s = requireString("parse_int", recv); + if (typeof s !== "string") return s; + try { + const n = BigInt(s.trim()); + return mkInt64(n); + } catch { + return mkError("parse_int() cannot parse: " + s); + } + }), + ); +} diff --git a/internal/bloblang2/ts/src/stdlib/timestamp.ts b/internal/bloblang2/ts/src/stdlib/timestamp.ts new file mode 100644 index 000000000..37d920826 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/timestamp.ts @@ -0,0 +1,500 @@ +// Timestamp methods: ts_unix, ts_unix_milli, ts_unix_nano, ts_format, +// ts_parse, ts_add, ts_from_unix, ts_from_unix_milli, ts_from_unix_nano, +// ts_from_unix_micro, ts_unix_micro. +// +// Also exports strftime helpers used by type_conversion.ts and encoding.ts. 
+ +import type { Interpreter, MethodSpec } from "../interpreter.js"; +import { + type Value, + mkInt64, + mkFloat64, + mkString, + mkTimestamp, + mkError, + isString, + isInt64, + isInt32, + isUint32, + isUint64, + isFloat32, + isFloat64, + isTimestamp, + isNumeric, + typeName, + MAX_INT64, +} from "../value.js"; + +// --------------------------------------------------------------------------- +// Constants +// --------------------------------------------------------------------------- + +export const DEFAULT_TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S%f%z"; +const NANOS_PER_SECOND = 1_000_000_000n; +const NANOS_PER_MILLI = 1_000_000n; +const NANOS_PER_MICRO = 1_000n; + +// --------------------------------------------------------------------------- +// Helpers +// --------------------------------------------------------------------------- + +function toInt64(v: Value): bigint | null { + if (isInt64(v)) return v.value; + if (isInt32(v)) return BigInt(v.value); + if (isUint32(v)) return BigInt(v.value); + if (isUint64(v)) { + if (v.value > MAX_INT64) return null; + return v.value; + } + if (isFloat64(v)) return isFinite(v.value) ? BigInt(Math.trunc(v.value)) : null; + if (isFloat32(v)) return isFinite(v.value) ? BigInt(Math.trunc(v.value)) : null; + return null; +} + +function toFloat64(v: Value): number | null { + if (isFloat64(v)) return v.value; + if (isFloat32(v)) return v.value; + if (isInt64(v)) return Number(v.value); + if (isInt32(v)) return v.value; + if (isUint32(v)) return v.value; + if (isUint64(v)) return Number(v.value); + return null; +} + +/** Decompose bigint nanos into Date + remaining nanos. */ +function nanosToDateParts(nanos: bigint): { date: Date; subMilliNanos: bigint } { + const millis = nanos / NANOS_PER_MILLI; + const remainder = nanos - millis * NANOS_PER_MILLI; + return { + date: new Date(Number(millis)), + subMilliNanos: remainder < 0n ? 
remainder + NANOS_PER_MILLI : remainder, + }; +} + +// --------------------------------------------------------------------------- +// Strftime implementation +// --------------------------------------------------------------------------- + +function padN(n: number, width: number): string { + let s = String(n); + while (s.length < width) s = "0" + s; + return s; +} + +/** + * Format a timestamp (bigint nanos since epoch) with a strftime format. + * Supported directives: %Y, %m, %d, %H, %M, %S, %f, %z, %Z, %%. + */ +export function strftimeFormat(nanos: bigint, format: string, offsetMinutes: number = 0): string { + // Apply offset to get local time for display. + const displayNanos = nanos + BigInt(offsetMinutes) * 60n * NANOS_PER_SECOND; + const { date, subMilliNanos } = nanosToDateParts(displayNanos); + const totalNanos = Number( + (displayNanos % NANOS_PER_SECOND + NANOS_PER_SECOND) % NANOS_PER_SECOND, + ); + + let result = ""; + let i = 0; + while (i < format.length) { + if (format[i] === "%" && i + 1 < format.length) { + const directive = format[i + 1]!; + switch (directive) { + case "Y": + result += padN(date.getUTCFullYear(), 4); + break; + case "m": + result += padN(date.getUTCMonth() + 1, 2); + break; + case "d": + result += padN(date.getUTCDate(), 2); + break; + case "H": + result += padN(date.getUTCHours(), 2); + break; + case "M": + result += padN(date.getUTCMinutes(), 2); + break; + case "S": + result += padN(date.getUTCSeconds(), 2); + break; + case "f": { + // Fractional seconds: shortest with leading dot, trimmed trailing zeros. + // Empty when zero. + if (totalNanos === 0) { + // No fractional part. + } else { + let s = padN(totalNanos, 9); + // Trim trailing zeros. + s = s.replace(/0+$/, ""); + result += "." + s; + } + break; + } + case "z": { + if (offsetMinutes === 0) { + result += "Z"; + } else { + const sign = offsetMinutes >= 0 ? 
"+" : "-"; + const absOff = Math.abs(offsetMinutes); + const h = Math.floor(absOff / 60); + const m = absOff % 60; + result += sign + padN(h, 2) + ":" + padN(m, 2); + } + break; + } + case "Z": { + if (offsetMinutes === 0) { + result += "UTC"; + } else { + const sign = offsetMinutes >= 0 ? "+" : "-"; + const absOff = Math.abs(offsetMinutes); + const h = Math.floor(absOff / 60); + const m = absOff % 60; + result += sign + padN(h, 2) + ":" + padN(m, 2); + } + break; + } + case "%": + result += "%"; + break; + default: + // Unknown directive: pass through. + result += "%" + directive; + break; + } + i += 2; + } else { + result += format[i]!; + i++; + } + } + + void subMilliNanos; // used indirectly via totalNanos + return result; +} + +/** + * Parse a string with a strftime format into bigint nanos. + * Supported directives: %Y, %m, %d, %H, %M, %S, %f, %z, %%. + */ +export function strftimeParse(input: string, format: string): { nanos: bigint; offsetMinutes: number } | string { + let pos = 0; + let year = 0, + month = 1, + day = 1, + hour = 0, + minute = 0, + second = 0; + let fracNanos = 0; + let tzOffsetMinutes = 0; + let hasTz = false; + + let fi = 0; + while (fi < format.length) { + if (format[fi] === "%" && fi + 1 < format.length) { + const directive = format[fi + 1]!; + fi += 2; + + switch (directive) { + case "Y": { + const m = input.slice(pos).match(/^(\d{4})/); + if (!m) return "expected 4-digit year"; + year = parseInt(m[1]!, 10); + pos += m[1]!.length; + break; + } + case "m": { + const m = input.slice(pos).match(/^(\d{1,2})/); + if (!m) return "expected month"; + month = parseInt(m[1]!, 10); + pos += m[1]!.length; + break; + } + case "d": { + const m = input.slice(pos).match(/^(\d{1,2})/); + if (!m) return "expected day"; + day = parseInt(m[1]!, 10); + pos += m[1]!.length; + break; + } + case "H": { + const m = input.slice(pos).match(/^(\d{1,2})/); + if (!m) return "expected hour"; + hour = parseInt(m[1]!, 10); + pos += m[1]!.length; + break; + } + case 
"M": { + const m = input.slice(pos).match(/^(\d{1,2})/); + if (!m) return "expected minute"; + minute = parseInt(m[1]!, 10); + pos += m[1]!.length; + break; + } + case "S": { + const m = input.slice(pos).match(/^(\d{1,2})/); + if (!m) return "expected second"; + second = parseInt(m[1]!, 10); + pos += m[1]!.length; + break; + } + case "f": { + // Optional fractional seconds: '.' followed by 1-9 digits. + const m = input.slice(pos).match(/^\.(\d{1,9})/); + if (m) { + let digits = m[1]!; + while (digits.length < 9) digits += "0"; + fracNanos = parseInt(digits.slice(0, 9), 10); + pos += m[0]!.length; + } + // If no match, that's fine — %f is optional. + break; + } + case "z": { + hasTz = true; + const rest = input.slice(pos); + if (rest.startsWith("Z")) { + tzOffsetMinutes = 0; + pos += 1; + } else { + // Match ±HH:MM or ±HHMM. + const m = rest.match(/^([+-])(\d{2}):?(\d{2})/); + if (!m) return "expected timezone offset (Z, +HH:MM, or -HHMM)"; + const sign = m[1] === "+" ? 1 : -1; + const h = parseInt(m[2]!, 10); + const min = parseInt(m[3]!, 10); + tzOffsetMinutes = sign * (h * 60 + min); + pos += m[0]!.length; + } + break; + } + case "Z": { + // Named timezone — just consume alphabetic chars. + const m = input.slice(pos).match(/^([A-Za-z/_]+)/); + if (m) { + if (m[1] === "UTC" || m[1] === "GMT") { + tzOffsetMinutes = 0; + hasTz = true; + } + pos += m[1]!.length; + } + break; + } + case "%": { + if (input[pos] !== "%") return "expected literal %"; + pos++; + break; + } + default: + // Unknown: try to match literal. + if (input[pos] === directive) { + pos++; + } + break; + } + } else { + // Literal character. + if (input[pos] !== format[fi]) { + return `expected '${format[fi]}' at position ${pos}, got '${input[pos] ?? "EOF"}'`; + } + pos++; + fi++; + } + } + + // Build UTC date. + const d = Date.UTC(year, month - 1, day, hour, minute, second); + // Adjust for timezone offset. 
+ const adjustedMs = d - tzOffsetMinutes * 60 * 1000; + const nanos = BigInt(adjustedMs) * NANOS_PER_MILLI + BigInt(fracNanos); + return { nanos, offsetMinutes: hasTz ? tzOffsetMinutes : 0 }; +} + +// --------------------------------------------------------------------------- +// Registration +// --------------------------------------------------------------------------- + +export function registerTimestamp(interp: Interpreter): void { + const m = ( + fn: (interp: Interpreter, receiver: Value, args: Value[]) => Value, + opts?: Partial<MethodSpec>, + ): MethodSpec => ({ + fn, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + ...opts, + }); + + // --- ts_unix --- + interp.registerMethod( + "ts_unix", + m((_i, recv) => { + if (!isTimestamp(recv)) { + return mkError(`ts_unix() requires timestamp, got ${typeName(recv)}`); + } + return mkInt64(recv.value / NANOS_PER_SECOND); + }), + ); + + // --- ts_unix_milli --- + interp.registerMethod( + "ts_unix_milli", + m((_i, recv) => { + if (!isTimestamp(recv)) { + return mkError(`ts_unix_milli() requires timestamp, got ${typeName(recv)}`); + } + return mkInt64(recv.value / NANOS_PER_MILLI); + }), + ); + + // --- ts_unix_micro --- + interp.registerMethod( + "ts_unix_micro", + m((_i, recv) => { + if (!isTimestamp(recv)) { + return mkError(`ts_unix_micro() requires timestamp, got ${typeName(recv)}`); + } + return mkInt64(recv.value / NANOS_PER_MICRO); + }), + ); + + // --- ts_unix_nano --- + interp.registerMethod( + "ts_unix_nano", + m((_i, recv) => { + if (!isTimestamp(recv)) { + return mkError(`ts_unix_nano() requires timestamp, got ${typeName(recv)}`); + } + return mkInt64(recv.value); + }), + ); + + // --- ts_from_unix --- + interp.registerMethod( + "ts_from_unix", + m((_i, recv) => { + if (!isNumeric(recv)) { + return mkError(`ts_from_unix() requires numeric, got ${typeName(recv)}`); + } + if (isUint64(recv) && recv.value > MAX_INT64) { + return mkError("ts_from_unix(): uint64 value exceeds int64 range"); + } + 
const f = toFloat64(recv); + if (f === null) return mkError("ts_from_unix() requires numeric"); + const sec = Math.trunc(f); + const nsec = Math.round((f - sec) * 1e9); + return mkTimestamp(BigInt(sec) * NANOS_PER_SECOND + BigInt(nsec)); + }), + ); + + // --- ts_from_unix_milli --- + interp.registerMethod( + "ts_from_unix_milli", + m((_i, recv) => { + const n = toInt64(recv); + if (n === null) { + return mkError(`ts_from_unix_milli() requires integer, got ${typeName(recv)}`); + } + return mkTimestamp(n * NANOS_PER_MILLI); + }), + ); + + // --- ts_from_unix_micro --- + interp.registerMethod( + "ts_from_unix_micro", + m((_i, recv) => { + const n = toInt64(recv); + if (n === null) { + return mkError(`ts_from_unix_micro() requires integer, got ${typeName(recv)}`); + } + return mkTimestamp(n * NANOS_PER_MICRO); + }), + ); + + // --- ts_from_unix_nano --- + interp.registerMethod( + "ts_from_unix_nano", + m((_i, recv) => { + const n = toInt64(recv); + if (n === null) { + return mkError(`ts_from_unix_nano() requires integer, got ${typeName(recv)}`); + } + return mkTimestamp(n); + }), + ); + + // --- ts_parse --- + interp.registerMethod("ts_parse", { + fn: (_interp: Interpreter, receiver: Value, args: Value[]): Value => { + if (!isString(receiver)) { + return mkError(`ts_parse() requires string, got ${typeName(receiver)}`); + } + let format = DEFAULT_TIMESTAMP_FORMAT; + if (args.length > 0 && isString(args[0]!)) { + format = args[0]!.value; + } + const result = strftimeParse(receiver.value, format); + if (typeof result === "string") { + return mkError("ts_parse() failed: " + result); + } + return mkTimestamp(result.nanos, result.offsetMinutes); + }, + lambdaFn: null, + intrinsic: false, + acceptsNull: false, + params: [ + { + name: "format", + default_: mkString(DEFAULT_TIMESTAMP_FORMAT), + hasDefault: true, + }, + ], + }); + + // --- ts_format --- + interp.registerMethod("ts_format", { + fn: (_interp: Interpreter, receiver: Value, args: Value[]): Value => { + if 
(!isTimestamp(receiver)) { + return mkError(`ts_format() requires timestamp, got ${typeName(receiver)}`); + } + let format = DEFAULT_TIMESTAMP_FORMAT; + if (args.length > 0 && isString(args[0]!)) { + format = args[0]!.value; + } + return mkString(strftimeFormat(receiver.value, format, receiver.offsetMinutes)); + }, + lambdaFn: null, + intrinsic: false, + acceptsNull: false, + params: [ + { + name: "format", + default_: mkString(DEFAULT_TIMESTAMP_FORMAT), + hasDefault: true, + }, + ], + }); + + // --- ts_add --- + interp.registerMethod("ts_add", { + fn: (_interp: Interpreter, receiver: Value, args: Value[]): Value => { + if (!isTimestamp(receiver)) { + return mkError(`ts_add() requires timestamp, got ${typeName(receiver)}`); + } + if (args.length !== 1) { + return mkError("ts_add() requires one argument (nanoseconds)"); + } + const nanos = toInt64(args[0]!); + if (nanos === null) { + return mkError("ts_add() argument must be integer nanoseconds"); + } + return mkTimestamp(receiver.value + nanos, receiver.offsetMinutes); + }, + lambdaFn: null, + intrinsic: false, + acceptsNull: false, + params: [{ name: "nanos", default_: null, hasDefault: false }], + }); +} diff --git a/internal/bloblang2/ts/src/stdlib/type_conversion.ts b/internal/bloblang2/ts/src/stdlib/type_conversion.ts new file mode 100644 index 000000000..f8fe68c87 --- /dev/null +++ b/internal/bloblang2/ts/src/stdlib/type_conversion.ts @@ -0,0 +1,441 @@ +// Type conversion methods: type, string, int32, int64, uint32, uint64, +// float32, float64, bool, bytes, not_null, char. 
+ +declare const TextDecoder: { + new (label?: string, options?: { fatal?: boolean }): { + decode(input: Uint8Array): string; + }; +}; +declare const TextEncoder: { new (): { encode(s: string): Uint8Array } }; + +import type { Interpreter, MethodSpec } from "../interpreter.js"; +import { + type Value, + mkInt32, + mkInt64, + mkUint32, + mkUint64, + mkFloat32, + mkFloat64, + mkString, + mkBool, + mkBytes, + mkError, + isNull, + isString, + isBool, + isInt32, + isInt64, + isUint32, + isUint64, + isFloat32, + isFloat64, + isBytes, + isTimestamp, + isArray, + isObject, + isNumeric, + typeName, + toJSON, + MAX_INT32, + MIN_INT32, + MAX_UINT32, + MAX_INT64, + MAX_UINT64, +} from "../value.js"; +import { strftimeFormat, DEFAULT_TIMESTAMP_FORMAT } from "./timestamp.js"; + +// --------------------------------------------------------------------------- +// Helpers +// --------------------------------------------------------------------------- + +/** Check if a Value tree contains bytes anywhere. */ +function containsBytes(v: Value): boolean { + if (isBytes(v)) return true; + if (isArray(v)) { + for (const elem of v.value) { + if (containsBytes(elem)) return true; + } + } + if (isObject(v)) { + for (const [, val] of v.value) { + if (containsBytes(val)) return true; + } + } + return false; +} + +function formatFloat(f: number): string { + if (Number.isNaN(f)) return "NaN"; + if (f === Infinity) return "Infinity"; + if (f === -Infinity) return "-Infinity"; + if (f === 0 && 1 / f === -Infinity) return "0.0"; // negative zero + let s = String(f); + // Ensure the string contains a decimal point or exponent. + if (!s.includes(".") && !s.includes("e") && !s.includes("E")) { + s += ".0"; + } + return s; +} + +/** Format a float32 value using the shortest representation that round-trips through float32. 
 */ +function formatFloat32(f: number): string { + if (Number.isNaN(f)) return "NaN"; + if (f === Infinity) return "Infinity"; + if (f === -Infinity) return "-Infinity"; + if (f === 0 && 1 / f === -Infinity) return "0.0"; // negative zero + // Find shortest representation that round-trips through float32. + for (let prec = 1; prec <= 9; prec++) { + const s = f.toPrecision(prec); + if (Math.fround(parseFloat(s)) === f) { + let result = cleanupTrailingZeros(s); + // Ensure decimal point. + if (!result.includes(".") && !result.includes("e") && !result.includes("E")) { + result += ".0"; + } + return result; + } + } + let s = String(f); + if (!s.includes(".") && !s.includes("e") && !s.includes("E")) { + s += ".0"; + } + return s; +} + +/** Convert a Value to JSON with object keys sorted (matches Go's json.Marshal behavior). */ +function sortedToJSON(v: Value): unknown { + if (isArray(v)) return v.value.map(sortedToJSON); + if (isObject(v)) { + const obj: Record<string, unknown> = {}; + const keys = [...v.value.keys()].sort(); + for (const k of keys) { + obj[k] = sortedToJSON(v.value.get(k)!); + } + return obj; + } + return toJSON(v); +} + +function cleanupTrailingZeros(s: string): string { + if (!s.includes(".")) return s; + s = s.replace(/(\.\d*?)0+$/, "$1"); + s = s.replace(/\.$/, ""); + return s; +} + +function valueToString(v: Value): Value { + if (isNull(v)) return mkString("null"); + if (isString(v)) return v; + if (isInt32(v)) return mkString(String(v.value)); + if (isInt64(v)) return mkString(String(v.value)); + if (isUint32(v)) return mkString(String(v.value)); + if (isUint64(v)) return mkString(String(v.value)); + if (isFloat32(v)) return mkString(formatFloat32(v.value)); + if (isFloat64(v)) return mkString(formatFloat(v.value)); + if (isBool(v)) return mkString(v.value ? "true" : "false"); + if (isTimestamp(v)) { + return mkString(formatTimestampValue(v.value, v.offsetMinutes)); + } + if (isBytes(v)) { + // Check valid UTF-8: in JS, TextDecoder with fatal option. 
+ try { + const decoder = new TextDecoder("utf-8", { fatal: true }); + return mkString(decoder.decode(v.value)); + } catch { + return mkError("bytes are not valid UTF-8"); + } + } + if (isArray(v)) { + if (containsBytes(v)) { + return mkError( + "cannot convert array to string: contains bytes value (convert bytes explicitly before embedding in containers)", + ); + } + return mkString(JSON.stringify(toJSON(v))); + } + if (isObject(v)) { + if (containsBytes(v)) { + return mkError( + "cannot convert object to string: contains bytes value (convert bytes explicitly before embedding in containers)", + ); + } + return mkString(JSON.stringify(sortedToJSON(v))); + } + return mkError(`cannot convert ${typeName(v)} to string`); +} + +function formatTimestampValue(nanos: bigint, offsetMinutes: number = 0): string { + return strftimeFormat(nanos, DEFAULT_TIMESTAMP_FORMAT, offsetMinutes); +} + +function valueToInt64(v: Value): Value { + if (isInt64(v)) return v; + if (isInt32(v)) return mkInt64(BigInt(v.value)); + if (isUint32(v)) return mkInt64(BigInt(v.value)); + if (isUint64(v)) { + if (v.value > BigInt(MAX_INT64)) { + return mkError("uint64 value exceeds int64 range"); + } + return mkInt64(BigInt(v.value)); + } + if (isFloat64(v)) return isFinite(v.value) ? mkInt64(BigInt(Math.trunc(v.value))) : mkError("cannot convert NaN/Infinity to int64"); + if (isFloat32(v)) return isFinite(v.value) ? 
mkInt64(BigInt(Math.trunc(v.value))) : mkError("cannot convert NaN/Infinity to int64"); + if (isString(v)) { + try { + const n = BigInt(v.value); + if (n > MAX_INT64 || n < BigInt("-9223372036854775808")) { + return mkError("cannot convert string to int64: value out of range"); + } + return mkInt64(n); + } catch { + return mkError("cannot convert string to int64: " + v.value); + } + } + if (isBool(v)) return mkError("cannot convert bool to int64"); + return mkError(`cannot convert ${typeName(v)} to int64`); +} + +// --------------------------------------------------------------------------- +// Registration +// --------------------------------------------------------------------------- + +export function registerTypeConversion(interp: Interpreter): void { + const m = ( + fn: (interp: Interpreter, receiver: Value, args: Value[]) => Value, + opts?: Partial<MethodSpec>, + ): MethodSpec => ({ + fn, + lambdaFn: null, + intrinsic: false, + params: null, + acceptsNull: false, + ...opts, + }); + + interp.registerMethod( + "type", + m((_interp, receiver) => mkString(typeName(receiver)), { + acceptsNull: true, + }), + ); + + interp.registerMethod( + "string", + m((_interp, receiver) => valueToString(receiver), { acceptsNull: true }), + ); + + interp.registerMethod( + "int64", + m((_interp, receiver) => valueToInt64(receiver)), + ); + + interp.registerMethod( + "int32", + m((_interp, receiver) => { + const i64 = valueToInt64(receiver); + if (i64.tag === "error") return i64; + const n = (i64 as { tag: "int64"; value: bigint }).value; + if (n > BigInt(MAX_INT32) || n < BigInt(MIN_INT32)) { + return mkError("int32 overflow"); + } + return mkInt32(Number(n)); + }), + ); + + interp.registerMethod( + "uint32", + m((_interp, receiver) => { + if (isUint32(receiver)) return receiver; + if (isInt64(receiver)) { + if (receiver.value < 0n || receiver.value > BigInt(MAX_UINT32)) { + return mkError("uint32 overflow"); + } + return mkUint32(Number(receiver.value)); + } + if (isString(receiver)) { + try 
{ + const n = BigInt(receiver.value); + if (n < 0n || n > BigInt(MAX_UINT32)) { + return mkError("cannot convert string to uint32: value out of range"); + } + return mkUint32(Number(n)); + } catch { + return mkError("cannot convert string to uint32: " + receiver.value); + } + } + const i64 = valueToInt64(receiver); + if (i64.tag === "error") return i64; + const n = (i64 as { tag: "int64"; value: bigint }).value; + if (n < 0n || n > BigInt(MAX_UINT32)) { + return mkError("uint32 overflow"); + } + return mkUint32(Number(n)); + }), + ); + + interp.registerMethod( + "uint64", + m((_interp, receiver) => { + if (isUint64(receiver)) return receiver; + if (isInt64(receiver)) { + if (receiver.value < 0n) { + return mkError("uint64 overflow: negative value"); + } + return mkUint64(receiver.value); + } + if (isString(receiver)) { + try { + const n = BigInt(receiver.value); + if (n < 0n || n > MAX_UINT64) { + return mkError("uint64 overflow: " + receiver.value); + } + return mkUint64(n); + } catch { + return mkError("uint64 overflow: " + receiver.value); + } + } + const i64 = valueToInt64(receiver); + if (i64.tag === "error") return i64; + const n = (i64 as { tag: "int64"; value: bigint }).value; + if (n < 0n) { + return mkError("uint64 overflow: negative value"); + } + return mkUint64(n); + }), + ); + + interp.registerMethod( + "float64", + m((_interp, receiver) => { + if (isFloat64(receiver)) return receiver; + if (isFloat32(receiver)) return mkFloat64(receiver.value); + if (isInt64(receiver)) return mkFloat64(Number(receiver.value)); + if (isInt32(receiver)) return mkFloat64(receiver.value); + if (isUint32(receiver)) return mkFloat64(receiver.value); + if (isUint64(receiver)) return mkFloat64(Number(receiver.value)); + if (isString(receiver)) { + const f = Number(receiver.value); + if (receiver.value.trim() === "" || Number.isNaN(f)) { + return mkError( + "cannot convert string to float64: " + receiver.value, + ); + } + return mkFloat64(f); + } + return mkError(`cannot 
convert ${typeName(receiver)} to float64`); + }), + ); + + interp.registerMethod( + "float32", + m((_interp, receiver) => { + if (isFloat32(receiver)) return receiver; + // Go through float64 first. + if (isFloat64(receiver)) return mkFloat32(receiver.value); + if (isInt64(receiver)) return mkFloat32(Number(receiver.value)); + if (isInt32(receiver)) return mkFloat32(receiver.value); + if (isUint32(receiver)) return mkFloat32(receiver.value); + if (isUint64(receiver)) return mkFloat32(Number(receiver.value)); + if (isString(receiver)) { + const f = Number(receiver.value); + if (receiver.value.trim() === "" || Number.isNaN(f)) { + return mkError( + "cannot convert string to float32: " + receiver.value, + ); + } + return mkFloat32(f); + } + return mkError(`cannot convert ${typeName(receiver)} to float32`); + }), + ); + + interp.registerMethod( + "bool", + m((_interp, receiver) => { + if (isBool(receiver)) return receiver; + if (isString(receiver)) { + if (receiver.value === "true") return mkBool(true); + if (receiver.value === "false") return mkBool(false); + return mkError( + `cannot convert string "${receiver.value}" to bool`, + ); + } + if (isInt64(receiver)) return mkBool(receiver.value !== 0n); + if (isInt32(receiver)) return mkBool(receiver.value !== 0); + if (isUint32(receiver)) return mkBool(receiver.value !== 0); + if (isUint64(receiver)) return mkBool(receiver.value !== 0n); + if (isFloat64(receiver)) { + if (Number.isNaN(receiver.value)) { + return mkError("NaN cannot be converted to bool"); + } + return mkBool(receiver.value !== 0); + } + if (isFloat32(receiver)) { + if (Number.isNaN(receiver.value)) { + return mkError("NaN cannot be converted to bool"); + } + return mkBool(receiver.value !== 0); + } + return mkError(`cannot convert ${typeName(receiver)} to bool`); + }), + ); + + interp.registerMethod( + "bytes", + m( + (_interp, receiver) => { + if (isBytes(receiver)) return receiver; + if (isString(receiver)) { + return mkBytes(new 
TextEncoder().encode(receiver.value)); + } + // Fall through to string conversion then to bytes. + const s = valueToString(receiver); + if (s.tag === "error") return s; + return mkBytes( + new TextEncoder().encode((s as { tag: "string"; value: string }).value), + ); + }, + { acceptsNull: true }, + ), + ); + + interp.registerMethod( + "char", + m((_interp, receiver) => { + let n: bigint | null = null; + if (isInt64(receiver)) n = receiver.value; + else if (isInt32(receiver)) n = BigInt(receiver.value); + else if (isUint32(receiver)) n = BigInt(receiver.value); + else if (isUint64(receiver)) n = receiver.value; + else + return mkError(`char() requires integer, got ${typeName(receiver)}`); + + if (n < 0n || n > 0x10ffffn) { + return mkError("codepoint out of valid Unicode range"); + } + return mkString(String.fromCodePoint(Number(n))); + }), + ); + + interp.registerMethod("not_null", { + fn: (_interp: Interpreter, receiver: Value, args: Value[]): Value => { + if (!isNull(receiver)) return receiver; + let msg = "unexpected null value"; + if (args.length > 0 && isString(args[0]!)) { + msg = args[0]!.value; + } + return mkError(msg); + }, + lambdaFn: null, + intrinsic: false, + acceptsNull: true, + params: [ + { + name: "message", + default_: mkString("unexpected null value"), + hasDefault: true, + }, + ], + }); +} diff --git a/internal/bloblang2/ts/src/token.ts b/internal/bloblang2/ts/src/token.ts new file mode 100644 index 000000000..1544d7a7a --- /dev/null +++ b/internal/bloblang2/ts/src/token.ts @@ -0,0 +1,167 @@ +// Token types for the Bloblang V2 lexer. 
+ +export enum TokenType { + ILLEGAL = "ILLEGAL", + EOF = "EOF", + NL = "NL", + + // Literals + INT = "INT", + FLOAT = "FLOAT", + STRING = "STRING", + RAW_STRING = "RAW_STRING", + + // Identifiers and variables + IDENT = "IDENT", + VAR = "VAR", + + // Keywords + INPUT = "input", + OUTPUT = "output", + IF = "if", + ELSE = "else", + MATCH = "match", + AS = "as", + MAP = "map", + IMPORT = "import", + TRUE = "true", + FALSE = "false", + NULL = "null", + UNDERSCORE = "_", + + // Reserved function names + DELETED = "deleted", + THROW = "throw", + VOID = "void", + + // Operators + DOT = ".", + QDOT = "?.", + AT = "@", + DCOLON = "::", + ASSIGN = "=", + PLUS = "+", + MINUS = "-", + STAR = "*", + SLASH = "/", + PERCENT = "%", + BANG = "!", + GT = ">", + GE = ">=", + EQ = "==", + NE = "!=", + LT = "<", + LE = "<=", + AND = "&&", + OR = "||", + FATARROW = "=>", + THINARROW = "->", + + // Delimiters + LPAREN = "(", + RPAREN = ")", + LBRACE = "{", + RBRACE = "}", + LBRACKET = "[", + RBRACKET = "]", + QLBRACKET = "?[", + COMMA = ",", + COLON = ":", +} + +const keywords: Record<string, TokenType> = { + input: TokenType.INPUT, + output: TokenType.OUTPUT, + if: TokenType.IF, + else: TokenType.ELSE, + match: TokenType.MATCH, + as: TokenType.AS, + map: TokenType.MAP, + import: TokenType.IMPORT, + true: TokenType.TRUE, + false: TokenType.FALSE, + null: TokenType.NULL, + _: TokenType.UNDERSCORE, +}; + +const reservedNames: Record<string, TokenType> = { + deleted: TokenType.DELETED, + throw: TokenType.THROW, + void: TokenType.VOID, +}; + +export function isReservedName(name: string): boolean { + return name in reservedNames; +} + +export function lookupIdent(word: string): TokenType { + return keywords[word] ?? reservedNames[word] ?? TokenType.IDENT; +} + +const KEYWORD_SET: ReadonlySet<TokenType> = new Set(Object.values(keywords)); + +export function isKeyword(t: TokenType): boolean { + return KEYWORD_SET.has(t); +} + +/** Reports whether this token suppresses a following newline. 
*/ +export function suppressesFollowingNL(t: TokenType): boolean { + switch (t) { + case TokenType.PLUS: + case TokenType.MINUS: + case TokenType.STAR: + case TokenType.SLASH: + case TokenType.PERCENT: + case TokenType.EQ: + case TokenType.NE: + case TokenType.GT: + case TokenType.GE: + case TokenType.LT: + case TokenType.LE: + case TokenType.AND: + case TokenType.OR: + case TokenType.BANG: + case TokenType.ASSIGN: + case TokenType.FATARROW: + case TokenType.THINARROW: + case TokenType.COLON: + return true; + default: + return false; + } +} + +/** Reports whether this token triggers postfix continuation. */ +export function isPostfixContinuation(t: TokenType): boolean { + switch (t) { + case TokenType.DOT: + case TokenType.QDOT: + case TokenType.LBRACKET: + case TokenType.QLBRACKET: + case TokenType.ELSE: + return true; + default: + return false; + } +} + +export interface Pos { + file: string; + line: number; + column: number; +} + +export function posToString(p: Pos): string { + return p.file ? `${p.file}:${p.line}:${p.column}` : `${p.line}:${p.column}`; +} + +export interface Token { + type: TokenType; + literal: string; + pos: Pos; +} + +export interface PosError { + pos: Pos; + msg: string; +} diff --git a/internal/bloblang2/ts/src/value.ts b/internal/bloblang2/ts/src/value.ts new file mode 100644 index 000000000..87781eb3c --- /dev/null +++ b/internal/bloblang2/ts/src/value.ts @@ -0,0 +1,593 @@ +// Tagged value type for the Bloblang V2 runtime. +// +// TypeScript doesn't have native int32/int64/uint32/uint64, so we use a +// discriminated union. 32-bit integers use `number`, 64-bit use `bigint`. +// float32 values are stored as `number` but rounded with Math.fround(). 
+ +// --- Value types --- + +export type Value = + | NullValue + | BoolValue + | Int32Value + | Int64Value + | Uint32Value + | Uint64Value + | Float32Value + | Float64Value + | StringValue + | BytesValue + | ArrayValue + | ObjectValue + | TimestampValue + | VoidValue + | DeletedValue + | ErrorValue + | FoldedValue; + +export interface NullValue { + tag: "null"; +} +export interface BoolValue { + tag: "bool"; + value: boolean; +} +export interface Int32Value { + tag: "int32"; + value: number; +} +export interface Int64Value { + tag: "int64"; + value: bigint; +} +export interface Uint32Value { + tag: "uint32"; + value: number; +} +export interface Uint64Value { + tag: "uint64"; + value: bigint; +} +export interface Float32Value { + tag: "float32"; + value: number; +} +export interface Float64Value { + tag: "float64"; + value: number; +} +export interface StringValue { + tag: "string"; + value: string; +} +export interface BytesValue { + tag: "bytes"; + value: Uint8Array; +} +export interface ArrayValue { + tag: "array"; + value: Value[]; +} +export interface ObjectValue { + tag: "object"; + value: Map<string, Value>; +} +export interface TimestampValue { + tag: "timestamp"; + /** Nanoseconds since Unix epoch (always UTC). */ + value: bigint; + /** Original timezone offset in minutes (0 = UTC). Used by ts_format %z. */ + offsetMinutes: number; +} +export interface VoidValue { + tag: "void"; +} +export interface DeletedValue { + tag: "deleted"; +} +export interface ErrorValue { + tag: "error"; + message: string; +} +/** + * FoldedValue carries a parse-time-precomputed native value (e.g. a + * compiled RegExp) through the arg-evaluation path. Produced by the + * interpreter when a CallArg has .folded set; consumed by the specific + * method/function that opted into folding via its ArgFolder hook. 
+ * Methods that don't know about a folded value naturally reject it + * with a type error, which is the correct behaviour: an argFolder + * should only be registered on methods that can consume the folded + * form. + */ +export interface FoldedValue { + tag: "folded"; + value: unknown; +} + +// --- Singletons --- + +export const NULL: NullValue = { tag: "null" }; +export const TRUE: BoolValue = { tag: "bool", value: true }; +export const FALSE: BoolValue = { tag: "bool", value: false }; +export const VOID: VoidValue = { tag: "void" }; +export const DELETED: DeletedValue = { tag: "deleted" }; + +// --- Constructors --- + +export function mkBool(v: boolean): BoolValue { + return v ? TRUE : FALSE; +} + +export function mkInt32(v: number): Int32Value { + return { tag: "int32", value: v | 0 }; +} + +export function mkInt64(v: bigint): Int64Value { + return { tag: "int64", value: v }; +} + +export function mkUint32(v: number): Uint32Value { + return { tag: "uint32", value: v >>> 0 }; +} + +export function mkUint64(v: bigint): Uint64Value { + return { tag: "uint64", value: v }; +} + +export function mkFloat32(v: number): Float32Value { + return { tag: "float32", value: Math.fround(v) }; +} + +export function mkFloat64(v: number): Float64Value { + return { tag: "float64", value: v }; +} + +export function mkString(v: string): StringValue { + return { tag: "string", value: v }; +} + +export function mkBytes(v: Uint8Array): BytesValue { + return { tag: "bytes", value: v }; +} + +export function mkArray(v: Value[]): ArrayValue { + return { tag: "array", value: v }; +} + +export function mkObject(v: Map<string, Value>): ObjectValue { + return { tag: "object", value: v }; +} + +export function mkTimestamp(nanos: bigint, offsetMinutes: number = 0): TimestampValue { + return { tag: "timestamp", value: nanos, offsetMinutes }; +} + +export function mkError(msg: string): ErrorValue { + return { tag: "error", message: msg }; +} + +// --- Type guards --- + +export function isNull(v: Value): v is 
NullValue { + return v.tag === "null"; +} +export function isBool(v: Value): v is BoolValue { + return v.tag === "bool"; +} +export function isInt32(v: Value): v is Int32Value { + return v.tag === "int32"; +} +export function isInt64(v: Value): v is Int64Value { + return v.tag === "int64"; +} +export function isUint32(v: Value): v is Uint32Value { + return v.tag === "uint32"; +} +export function isUint64(v: Value): v is Uint64Value { + return v.tag === "uint64"; +} +export function isFloat32(v: Value): v is Float32Value { + return v.tag === "float32"; +} +export function isFloat64(v: Value): v is Float64Value { + return v.tag === "float64"; +} +export function isString(v: Value): v is StringValue { + return v.tag === "string"; +} +export function isBytes(v: Value): v is BytesValue { + return v.tag === "bytes"; +} +export function isArray(v: Value): v is ArrayValue { + return v.tag === "array"; +} +export function isObject(v: Value): v is ObjectValue { + return v.tag === "object"; +} +export function isTimestamp(v: Value): v is TimestampValue { + return v.tag === "timestamp"; +} +export function isVoid(v: Value): v is VoidValue { + return v.tag === "void"; +} +export function isDeleted(v: Value): v is DeletedValue { + return v.tag === "deleted"; +} +export function isError(v: Value): v is ErrorValue { + return v.tag === "error"; +} +export function isFolded(v: Value): v is FoldedValue { + return v.tag === "folded"; +} +export function mkFolded(value: unknown): FoldedValue { + return { tag: "folded", value }; +} + +export function isNumeric(v: Value): boolean { + switch (v.tag) { + case "int32": + case "int64": + case "uint32": + case "uint64": + case "float32": + case "float64": + return true; + default: + return false; + } +} + +// --- Type name --- + +export function typeName(v: Value): string { + return v.tag; +} + +// --- Deep clone --- + +export function deepClone(v: Value): Value { + switch (v.tag) { + case "array": + return mkArray(v.value.map(deepClone)); + 
case "object": { + const m = new Map(); + for (const [k, val] of v.value) { + m.set(k, deepClone(val)); + } + return mkObject(m); + } + case "bytes": + return mkBytes(new Uint8Array(v.value)); + default: + // Immutable types: null, bool, numbers, string, timestamp, void, deleted, error. + return v; + } +} + +// --- JSON conversion --- + +const MIN_SAFE_INT64 = -9007199254740991n; + +/** Convert a JSON-compatible JavaScript value to a Bloblang Value. */ +export function fromJSON(v: unknown): Value { + if (v === null || v === undefined) return NULL; + if (typeof v === "boolean") return mkBool(v); + if (typeof v === "string") return mkString(v); + if (typeof v === "number") { + // Integer or float? + if (Number.isInteger(v)) { + return mkInt64(BigInt(v)); + } + return mkFloat64(v); + } + if (Array.isArray(v)) { + return mkArray(v.map(fromJSON)); + } + if (typeof v === "object") { + const m = new Map(); + for (const [key, val] of Object.entries(v as Record<string, unknown>)) { + m.set(key, fromJSON(val)); + } + return mkObject(m); + } + return mkError(`cannot convert ${typeof v} to Bloblang value`); +} + +/** Convert a Bloblang Value to a JSON-compatible JavaScript value. */ +export function toJSON(v: Value): unknown { + switch (v.tag) { + case "null": + return null; + case "bool": + return v.value; + case "int32": + return v.value; + case "int64": + return Number(v.value); + case "uint32": + return v.value; + case "uint64": + return Number(v.value); + case "float32": + return v.value; + case "float64": + return v.value; + case "string": + return v.value; + case "bytes": + // Bytes are not directly JSON-serializable. Return base64 or array. + return Array.from(v.value); + case "array": + return v.value.map(toJSON); + case "object": { + const obj: Record<string, unknown> = {}; + for (const [k, val] of v.value) { + obj[k] = toJSON(val); + } + return obj; + } + case "timestamp": + // ISO 8601 string representation.
+ return new Date(Number(v.value / 1000000n)).toISOString(); + case "void": + return undefined; + case "deleted": + return undefined; + case "error": + return `error: ${v.message}`; + } +} + +/** Deep equality comparison following Bloblang semantics. */ +export function valuesEqual(a: Value, b: Value): boolean { + if (a.tag === "null" && b.tag === "null") return true; + if (a.tag === "null" || b.tag === "null") return false; + + // Numeric equality with promotion. + if (isNumeric(a) && isNumeric(b)) { + return numericEqual(a, b); + } + + // Timestamp equality. + if (a.tag === "timestamp" && b.tag === "timestamp") { + return a.value === b.value; + } + + // Bytes equality. + if (a.tag === "bytes" && b.tag === "bytes") { + if (a.value.length !== b.value.length) return false; + for (let i = 0; i < a.value.length; i++) { + if (a.value[i] !== b.value[i]) return false; + } + return true; + } + + // Same tag required for non-numeric. + if (a.tag !== b.tag) return false; + + switch (a.tag) { + case "string": + return a.value === (b as StringValue).value; + case "bool": + return a.value === (b as BoolValue).value; + case "array": { + const bArr = (b as ArrayValue).value; + if (a.value.length !== bArr.length) return false; + for (let i = 0; i < a.value.length; i++) { + if (!valuesEqual(a.value[i]!, bArr[i]!)) return false; + } + return true; + } + case "object": { + const bObj = (b as ObjectValue).value; + if (a.value.size !== bObj.size) return false; + for (const [k, v] of a.value) { + const bv = bObj.get(k); + if (bv === undefined || !valuesEqual(v, bv)) return false; + } + return true; + } + default: + return false; + } +} + +// --- Numeric helpers --- + +function numericEqual(a: Value, b: Value): boolean { + // Same type: direct comparison. 
+ if (a.tag === b.tag) { + switch (a.tag) { + case "int32": + return a.value === (b as Int32Value).value; + case "int64": + return a.value === (b as Int64Value).value; + case "uint32": + return a.value === (b as Uint32Value).value; + case "uint64": + return a.value === (b as Uint64Value).value; + case "float32": + return !isNaN(a.value) && !isNaN((b as Float32Value).value) && a.value === (b as Float32Value).value; + case "float64": + return !isNaN(a.value) && !isNaN((b as Float64Value).value) && a.value === (b as Float64Value).value; + } + } + + // Different numeric types: checked promotion. + const result = promoteChecked(a, b); + if (result === null) return false; // promotion failed + + const [pa, pb, kind] = result; + switch (kind) { + case "int64": + return (pa as Int64Value).value === (pb as Int64Value).value; + case "int32": + return (pa as Int32Value).value === (pb as Int32Value).value; + case "uint32": + return (pa as Uint32Value).value === (pb as Uint32Value).value; + case "uint64": + return (pa as Uint64Value).value === (pb as Uint64Value).value; + case "float64": { + const af = (pa as Float64Value).value; + const bf = (pb as Float64Value).value; + return !isNaN(af) && !isNaN(bf) && af === bf; + } + case "float32": { + const af = (pa as Float32Value).value; + const bf = (pb as Float32Value).value; + return !isNaN(af) && !isNaN(bf) && af === bf; + } + default: + return false; + } +} + +// --- Numeric promotion --- + +type PromoteKind = "int32" | "int64" | "uint32" | "uint64" | "float32" | "float64"; + +function numericKind(v: Value): PromoteKind | null { + return isNumeric(v) ? 
(v.tag as PromoteKind) : null; +} + +function isFloatKind(k: PromoteKind): boolean { + return k === "float32" || k === "float64"; +} + +export const MAX_SAFE_FLOAT64 = 9007199254740992n; // 2^53 +export const MAX_INT64 = 9223372036854775807n; +export const MIN_INT64 = -9223372036854775808n; +export const MAX_INT32 = 2147483647; +export const MIN_INT32 = -2147483648; +export const MAX_UINT32 = 4294967295; +export const MAX_UINT64 = 18446744073709551615n; + +/** + * Promote two numeric values to a common type. + * Returns [promoted_a, promoted_b, kind] or null on failure. + */ +export const promoteChecked = promote; + +/** + * Returns a specific error message for promotion failure, or null on success. + */ +export function promoteWithError( + a: Value, + b: Value, +): { promoted: [Value, Value, PromoteKind] } | { error: string } { + const result = promote(a, b); + if (result !== null) return { promoted: result }; + + const ak = numericKind(a); + const bk = numericKind(b); + if ( + (ak === "uint64" || bk === "uint64") && + ak !== null && + bk !== null && + !isFloatKind(ak) && + !isFloatKind(bk) + ) { + return { error: "uint64 value exceeds int64 range" }; + } + return { error: "integer exceeds float64 exact range (magnitude > 2^53)" }; +} + +function promote( + a: Value, + b: Value, +): [Value, Value, PromoteKind] | null { + const ak = numericKind(a); + const bk = numericKind(b); + if (ak === null || bk === null) return null; + + // Same type: no promotion needed. + if (ak === bk) return [a, b, ak]; + + // Same signedness, different width: widen. + // uint32 + uint64 → uint64. + if ( + (ak === "uint32" && bk === "uint64") || + (ak === "uint64" && bk === "uint32") + ) { + return [toU64(a), toU64(b), "uint64"]; + } + + // Any float involved → float64. 
+ if (isFloatKind(ak) || isFloatKind(bk)) { + const af = checkedToFloat64(a); + const bf = checkedToFloat64(b); + if (af === null || bf === null) return null; + return [af, bf, "float64"]; + } + + // Both integers: widen to int64. + const ai = toI64(a); + const bi = toI64(b); + if (ai === null || bi === null) return null; + return [ai, bi, "int64"]; +} + +function toU64(v: Value): Uint64Value { + switch (v.tag) { + case "uint32": + return mkUint64(BigInt(v.value)); + case "uint64": + return v; + default: + throw new Error(`toU64: unexpected tag ${v.tag}`); + } +} + +function toI64(v: Value): Int64Value | null { + switch (v.tag) { + case "int32": + return mkInt64(BigInt(v.value)); + case "int64": + return v; + case "uint32": + return mkInt64(BigInt(v.value)); + case "uint64": + if (v.value > MAX_INT64) return null; + return mkInt64(BigInt(v.value)); + default: + return null; + } +} + +function checkedToFloat64(v: Value): Float64Value | null { + switch (v.tag) { + case "int64": + if (v.value > MAX_SAFE_FLOAT64 || v.value < -MAX_SAFE_FLOAT64) + return null; + return mkFloat64(Number(v.value)); + case "int32": + return mkFloat64(v.value); + case "uint32": + return mkFloat64(v.value); + case "uint64": + if (v.value > MAX_SAFE_FLOAT64) return null; + return mkFloat64(Number(v.value)); + case "float64": + return v; + case "float32": + return mkFloat64(v.value); + default: + return null; + } +} + +/** Convert any numeric Value to float64, unchecked. 
*/ +export function toFloat64Unchecked(v: Value): number { + switch (v.tag) { + case "int32": + case "uint32": + case "float32": + case "float64": + return v.value; + case "int64": + case "uint64": + return Number(v.value); + default: + return NaN; + } +} diff --git a/internal/bloblang2/ts/test/argfold.test.ts b/internal/bloblang2/ts/test/argfold.test.ts new file mode 100644 index 000000000..3c1e9e881 --- /dev/null +++ b/internal/bloblang2/ts/test/argfold.test.ts @@ -0,0 +1,111 @@ +// Verifies the parse-time argument-folding mechanism works end-to-end +// in the TS implementation. Mirrors the Go unit tests in +// ../../go/pratt/eval/argfold_test.go. +// +// Each test compiles a small mapping, then either inspects the +// resolved AST to confirm the folder fired, checks the resolver +// produced an error for an invalid literal, or runs the mapping and +// asserts the expected output. + +import { describe, it, expect } from "vitest"; +import { parse } from "../src/parser.js"; +import { optimize } from "../src/optimizer.js"; +import { resolve } from "../src/resolver.js"; +import { Interpreter } from "../src/interpreter.js"; +import { registerStdlib, stdlibNames } from "../src/stdlib/index.js"; +import { mkString, toJSON } from "../src/value.js"; +import type { PathExpr, Assignment } from "../src/ast.js"; + +function compile(src: string) { + const { program, errors } = parse(src, "", null); + if (errors.length > 0) throw new Error("parse: " + JSON.stringify(errors)); + optimize(program); + const { methods, functions } = stdlibNames(); + const rerrs = resolve(program, methods, functions); + return { program, rerrs }; +} + +function fresh(): Interpreter { + const i = new Interpreter({ stmts: [], maps: [], imports: [], namespaces: new Map(), maxSlots: 0, readsOutput: false }); + return i; +} + +describe("ArgFolder", () => { + it("folds a literal regex pattern into a RegExp on the AST", () => { + const { program, rerrs } = compile(`output = input.re_match("[0-9]+")`); + 
expect(rerrs).toEqual([]); + + // Walk: Assignment.value is a PathExpr whose last segment is + // .re_match(...). Expect seg.args[0].folded to be a RegExp. + let found = false; + const walk = (node: unknown) => { + if (found || node === null || typeof node !== "object") return; + const n = node as { kind?: string }; + if (n.kind === "path") { + const p = node as PathExpr; + for (const seg of p.segments) { + if (seg.segKind === "method" && seg.name === "re_match") { + if (seg.args.length > 0 && seg.args[0]!.folded instanceof RegExp) { + found = true; + return; + } + } + } + } + if (n.kind === "assignment") { + walk((node as Assignment).value); + } + }; + for (const s of program.stmts) walk(s); + expect(found).toBe(true); + + // Runtime correctness. + const interp = fresh(); + Object.assign(interp, new Interpreter(program)); + registerStdlib(interp); + const result = interp.run(mkString("abc123"), new Map()); + expect(result.error).toBeFalsy(); + expect(toJSON(result.output)).toBe(true); + }); + + it("rejects an invalid regex literal at parse time", () => { + const { rerrs } = compile(`output = input.re_match("[unclosed")`); + expect(rerrs.length).toBeGreaterThan(0); + const allMsgs = rerrs.map((e) => e.msg).join(" "); + expect(allMsgs).toMatch(/re_match|invalid regex|Invalid regular expression/); + }); + + it("leaves dynamic patterns unfolded (compiled per call)", () => { + const src = `$pat = "[0-9]+"\noutput = input.re_match($pat)`; + const { program, rerrs } = compile(src); + expect(rerrs).toEqual([]); + + // The re_match arg's folded should remain undefined because the + // pattern came from a variable reference, not a literal. 
+ let sawArg = false; + let argWasFolded = false; + const walk = (node: unknown) => { + if (node === null || typeof node !== "object") return; + const n = node as { kind?: string }; + if (n.kind === "path") { + const p = node as PathExpr; + for (const seg of p.segments) { + if (seg.segKind === "method" && seg.name === "re_match" && seg.args.length > 0) { + sawArg = true; + if (seg.args[0]!.folded !== undefined) argWasFolded = true; + } + } + } + if (n.kind === "assignment") walk((node as Assignment).value); + }; + for (const s of program.stmts) walk(s); + expect(sawArg).toBe(true); + expect(argWasFolded).toBe(false); + + const interp = new Interpreter(program); + registerStdlib(interp); + const result = interp.run(mkString("abc123"), new Map()); + expect(result.error).toBeFalsy(); + expect(toJSON(result.output)).toBe(true); + }); +}); diff --git a/internal/bloblang2/ts/test/exec.test.ts b/internal/bloblang2/ts/test/exec.test.ts new file mode 100644 index 000000000..b10670e1d --- /dev/null +++ b/internal/bloblang2/ts/test/exec.test.ts @@ -0,0 +1,584 @@ +import { describe, it, expect } from "vitest"; +import { readFileSync, readdirSync, statSync } from "fs"; +import { join } from "path"; +import { parse as parseYaml } from "yaml"; +import { parse } from "../src/parser.js"; +import { optimize } from "../src/optimizer.js"; +import { resolve } from "../src/resolver.js"; +import { Interpreter } from "../src/interpreter.js"; +import { registerStdlib, stdlibNames } from "../src/stdlib/index.js"; +import { + fromJSON, toJSON, Value, mkInt32, mkInt64, mkUint32, mkUint64, + mkFloat32, mkFloat64, mkBytes, mkTimestamp, mkString, mkBool, mkArray, mkObject, + isTimestamp, isBytes, isFloat64, isFloat32, isInt64, isInt32, isUint32, isUint64, + NULL, VOID, DELETED, mkError, +} from "../src/value.js"; + +const SPEC_DIR = join(__dirname, "..", "..", "spec", "tests"); + +/** + * Convert a spec JSON value (which may contain _type annotations) to a Bloblang Value. 
+ */ +function specFromJSON(v: unknown): Value { + if (v === null || v === undefined) return NULL; + if (typeof v === "boolean") return mkBool(v); + if (typeof v === "string") return mkString(v); + if (typeof v === "number") { + if (Number.isInteger(v)) return mkInt64(BigInt(v)); + return mkFloat64(v); + } + if (Array.isArray(v)) { + return mkArray(v.map(specFromJSON)); + } + if (typeof v === "object") { + const obj = v as Record<string, unknown>; + // Check for _type annotation. + if (typeof obj._type === "string" && "value" in obj) { + const strVal = String(obj.value); + switch (obj._type) { + case "int32": return mkInt32(parseInt(strVal, 10)); + case "int64": return mkInt64(BigInt(strVal)); + case "uint32": return mkUint32(parseInt(strVal, 10) >>> 0); + case "uint64": return mkUint64(BigInt(strVal)); + case "float32": return mkFloat32(parseFloat(strVal)); + case "float64": { + if (strVal === "NaN") return mkFloat64(NaN); + if (strVal === "Infinity" || strVal === "+Infinity") return mkFloat64(Infinity); + if (strVal === "-Infinity") return mkFloat64(-Infinity); + return mkFloat64(parseFloat(strVal)); + } + case "bytes": { + // Base64 decode. + const buf = Buffer.from(strVal, "base64"); + return mkBytes(new Uint8Array(buf)); + } + case "timestamp": { + // Parse RFC 3339 to nanoseconds since epoch. + const d = new Date(strVal); + return mkTimestamp(BigInt(d.getTime()) * 1000000n); + } + } + } + const m = new Map(); + for (const [key, val] of Object.entries(obj)) { + m.set(key, specFromJSON(val)); + } + return mkObject(m); + } + return mkError(`cannot convert ${typeof v} to Bloblang value`); +} + +/** + * Convert a Bloblang Value to a spec JSON value (with _type annotations for non-trivial types). + */ +function specToJSON(v: Value): unknown { + switch (v.tag) { + case "null": return null; + case "bool": return v.value; + case "int32": return { _type: "int32", value: String(v.value) }; + case "int64": { + // Plain integers within safe range are just numbers.
+ const n = Number(v.value); + if (Number.isSafeInteger(n)) return n; + return { _type: "int64", value: String(v.value) }; + } + case "uint32": return { _type: "uint32", value: String(v.value) }; + case "uint64": { + const n = Number(v.value); + if (Number.isSafeInteger(n) && n >= 0) return n; + return { _type: "uint64", value: String(v.value) }; + } + case "float32": return { _type: "float32", value: float32Str(v.value) }; + case "float64": { + if (isNaN(v.value)) return { _type: "float64", value: "NaN" }; + if (v.value === Infinity) return { _type: "float64", value: "Infinity" }; + if (v.value === -Infinity) return { _type: "float64", value: "-Infinity" }; + if (Object.is(v.value, -0)) return { _type: "float64", value: "-0" }; + return v.value; + } + case "string": return v.value; + case "bytes": { + const b64 = Buffer.from(v.value).toString("base64"); + return { _type: "bytes", value: b64 }; + } + case "array": return v.value.map(specToJSON); + case "object": { + const obj: Record<string, unknown> = {}; + for (const [k, val] of v.value) { + obj[k] = specToJSON(val); + } + return obj; + } + case "timestamp": { + // Format as RFC 3339 with full nanosecond precision. + const nanos = v.value; + const ms = Number(nanos / 1000000n); + const d = new Date(ms); + // Build base: YYYY-MM-DDTHH:MM:SS + const pad2 = (n: number) => String(n).padStart(2, "0"); + const base = `${d.getUTCFullYear()}-${pad2(d.getUTCMonth()+1)}-${pad2(d.getUTCDate())}T${pad2(d.getUTCHours())}:${pad2(d.getUTCMinutes())}:${pad2(d.getUTCSeconds())}`; + // Fractional seconds from nanoseconds. + let subSecNanos = Number(((nanos % 1000000000n) + 1000000000n) % 1000000000n); + let frac = ""; + if (subSecNanos > 0) { + // Format as up to 9 digits, trimming trailing zeros. + const nanoStr = String(subSecNanos).padStart(9, "0"); + frac = "."
+ nanoStr.replace(/0+$/, ""); + } + return { _type: "timestamp", value: `${base}${frac}Z` }; + } + case "void": return undefined; + case "deleted": return undefined; + case "error": return `error: ${v.message}`; + } +} + +function floatStr(v: number): string { + if (isNaN(v)) return "NaN"; + if (v === Infinity) return "Infinity"; + if (v === -Infinity) return "-Infinity"; + if (Object.is(v, -0)) return "-0"; + // Go-style shortest representation: ensure decimal point for whole numbers. + const s = String(v); + if (Number.isInteger(v) && !s.includes(".") && !s.includes("e")) { + return s + ".0"; + } + return s; +} + +function float32Str(v: number): string { + if (isNaN(v)) return "NaN"; + if (v === Infinity) return "Infinity"; + if (v === -Infinity) return "-Infinity"; + if (Object.is(v, -0)) return "-0"; + // For float32, use Go-style shortest representation for the float32 value. + // Math.fround ensures it's the nearest float32, then format the float32 value. + const f = Math.fround(v); + // Use toPrecision to find shortest representation that round-trips. + // Go's strconv.FormatFloat with 'G' format for float32 uses up to 8 significant digits. + let s = formatShortest32(f); + if (Number.isInteger(f) && !s.includes(".") && !s.includes("e")) { + return s + ".0"; + } + return s; +} + +function formatShortest32(v: number): string { + // Find shortest decimal representation that round-trips through float32. + for (let prec = 1; prec <= 9; prec++) { + const s = v.toPrecision(prec); + if (Math.fround(parseFloat(s)) === v) { + // Remove trailing zeros after decimal point, but keep at least one digit. + return cleanupFloat(s); + } + } + return String(v); +} + +function cleanupFloat(s: string): string { + if (!s.includes(".")) return s; + // Remove trailing zeros. + s = s.replace(/(\.\d*?)0+$/, "$1"); + // Remove trailing dot. 
+ s = s.replace(/\.$/, ""); + return s; +} + +function buildMeta(input_metadata?: Record<string, unknown>): Value { + const m = new Map(); + if (input_metadata) { + for (const [k, v] of Object.entries(input_metadata)) { + m.set(k, specFromJSON(v)); + } + } + return mkObject(m); +} + +interface SpecFile { + description?: string; + files?: Record<string, string>; + tests: SpecTest[]; +} + +interface SpecCase { + name: string; + input?: unknown; + input_metadata?: Record<string, unknown>; + output?: unknown; + output_metadata?: Record<string, unknown>; + output_type?: string; + no_output_check?: boolean; + deleted?: boolean; + error?: string; +} + +interface SpecTest { + name: string; + mapping?: string; + input?: unknown; + input_metadata?: Record<string, unknown>; + output?: unknown; + output_metadata?: Record<string, unknown>; + output_type?: string; + no_output_check?: boolean; + deleted?: boolean; + error?: string; + compile_error?: string; + runtime_error?: string; + files?: Record<string, string>; + cases?: SpecCase[]; +} + +function collectYamlFiles(dir: string): string[] { + const result: string[] = []; + for (const entry of readdirSync(dir)) { + const full = join(dir, entry); + if (statSync(full).isDirectory()) { + result.push(...collectYamlFiles(full)); + } else if (entry.endsWith(".yaml")) { + result.push(full); + } + } + return result; +} + +/** + * Normalize a _type-annotated value to a plain JS value for comparison. + * Returns the original value if not a _type annotation.
+ */ +function normalizeTyped(v: unknown): unknown { + if (v === null || v === undefined || typeof v !== "object" || Array.isArray(v)) return v; + const obj = v as Record<string, unknown>; + if (typeof obj._type !== "string" || !("value" in obj)) return v; + const strVal = String(obj.value); + switch (obj._type) { + case "int32": return { _type: "int32", value: strVal }; + case "int64": return { _type: "int64", value: strVal }; + case "uint32": return { _type: "uint32", value: strVal }; + case "uint64": return { _type: "uint64", value: strVal }; + case "float32": return { _type: "float32", value: strVal }; + case "float64": return { _type: "float64", value: strVal }; + case "bytes": return { _type: "bytes", value: strVal }; + case "timestamp": return { _type: "timestamp", value: strVal }; + default: return v; + } +} + +/** + * Check if two values are equivalent considering _type annotations. + * A _type-annotated integer can match a plain number if numeric values are equal. + */ +function typedNumericEqual(a: unknown, b: unknown): boolean | null { + // One is a _type annotation, the other is a plain number. + const aTyped = isTypedAnnotation(a); + const bTyped = isTypedAnnotation(b); + if (!aTyped && !bTyped) return null; // both plain, use regular comparison + if (aTyped && bTyped) return null; // both typed, use regular comparison + + const typed = aTyped ? a : b; + const plain = aTyped ?
b : a; + if (typeof plain !== "number") return null; + + const obj = typed as Record<string, unknown>; + const strVal = String(obj.value); + const typeName = obj._type as string; + + switch (typeName) { + case "int32": + case "int64": + case "uint32": + case "uint64": { + const n = Number(strVal); + return Math.abs(n - plain) < 1e-9; + } + case "float32": + case "float64": { + const n = parseFloat(strVal); + if (isNaN(n) && isNaN(plain)) return true; + return Math.abs(n - plain) < 1e-9; + } + default: + return null; + } +} + +function isTypedAnnotation(v: unknown): boolean { + if (v === null || v === undefined || typeof v !== "object" || Array.isArray(v)) return false; + const obj = v as Record<string, unknown>; + return typeof obj._type === "string" && "value" in obj && Object.keys(obj).length === 2; +} + +/** + * Compare two timestamp _type annotations, treating them as equal if they + * represent the same instant in time (ignoring timezone presentation). + */ +function timestampEqual(a: unknown, b: unknown): boolean | null { + if (!isTypedAnnotation(a) || !isTypedAnnotation(b)) return null; + const aObj = a as Record<string, unknown>; + const bObj = b as Record<string, unknown>; + if (aObj._type !== "timestamp" || bObj._type !== "timestamp") return null; + const aTime = new Date(String(aObj.value)).getTime(); + const bTime = new Date(String(bObj.value)).getTime(); + if (isNaN(aTime) || isNaN(bTime)) return false; + return aTime === bTime; +} + +/** + * Compare two float32 _type annotations with tolerance for representation differences.
+ */ +function float32Equal(a: unknown, b: unknown): boolean | null { + if (!isTypedAnnotation(a) || !isTypedAnnotation(b)) return null; + const aObj = a as Record<string, unknown>; + const bObj = b as Record<string, unknown>; + if (aObj._type !== "float32" || bObj._type !== "float32") return null; + const aVal = parseFloat(String(aObj.value)); + const bVal = parseFloat(String(bObj.value)); + if (isNaN(aVal) && isNaN(bVal)) return true; + return Math.abs(aVal - bVal) < 1e-9; +} + +function deepEqual(a: unknown, b: unknown): boolean { + if (a === b) return true; + if (a === null || b === null) return a === b; + + // Check typed numeric equality (e.g., {_type: "int64", value: "42"} vs 42) + const typedResult = typedNumericEqual(a, b); + if (typedResult !== null) return typedResult; + + // Check timestamp equality (timezone-insensitive) + const tsResult = timestampEqual(a, b); + if (tsResult !== null) return tsResult; + + // Check float32 equality (representation-insensitive) + const f32Result = float32Equal(a, b); + if (f32Result !== null) return f32Result; + + if (typeof a !== typeof b) return false; + if (typeof a === "number" && typeof b === "number") { + if (isNaN(a) && isNaN(b)) return true; + if (Math.abs(a - b) < 1e-9) return true; + return false; + } + if (Array.isArray(a) && Array.isArray(b)) { + if (a.length !== b.length) return false; + for (let i = 0; i < a.length; i++) { + if (!deepEqual(a[i], b[i])) return false; + } + return true; + } + if (typeof a === "object" && typeof b === "object") { + const aObj = a as Record<string, unknown>; + const bObj = b as Record<string, unknown>; + const aKeys = Object.keys(aObj); + const bKeys = Object.keys(bObj); + if (aKeys.length !== bKeys.length) return false; + for (const key of aKeys) { + if (!deepEqual(aObj[key], bObj[key])) return false; + } + return true; + } + return false; +} + +const { methods, functions } = stdlibNames(); +const files = collectYamlFiles(SPEC_DIR); + +describe("spec compatibility — execute", () => { + 
for (const file of files) { + const relPath =
file.slice(SPEC_DIR.length + 1); + const data = readFileSync(file, "utf-8"); + const spec = parseYaml(data) as SpecFile; + + for (const tc of spec.tests) { + if (!tc.mapping) continue; + + const testName = `${relPath}/${tc.name}`; + + // Build files map for imports. + const filesMap = new Map(); + if (spec.files) { + for (const [name, content] of Object.entries(spec.files)) { + filesMap.set(name, content); + } + } + if (tc.files) { + for (const [name, content] of Object.entries(tc.files)) { + filesMap.set(name, content); + } + } + + if (tc.compile_error) { + it(`compile error: ${testName}`, () => { + const { program, errors: parseErrors } = parse(tc.mapping!, "", filesMap); + if (parseErrors.length > 0) return; // parse error counts as compile error + + optimize(program); + const resolveErrors = resolve(program, methods, functions); + expect(resolveErrors.length).toBeGreaterThan(0); + }); + continue; + } + + // Multi-case test: compile once, run each case. + if (tc.cases && tc.cases.length > 0) { + const { program, errors: parseErrors } = parse(tc.mapping!, "", filesMap); + if (parseErrors.length > 0) { + it(`exec: ${testName}/${tc.cases[0].name}`, () => { + const msgs = parseErrors.map(e => `${e.pos.line}:${e.pos.column}: ${e.msg}`); + expect.fail(`Parse errors:\n${msgs.join("\n")}`); + }); + continue; + } + optimize(program); + const resolveErrors = resolve(program, methods, functions); + if (resolveErrors.length > 0) { + it(`exec: ${testName}/${tc.cases[0].name}`, () => { + const msgs = resolveErrors.map(e => `${e.pos.line}:${e.pos.column}: ${e.msg}`); + expect.fail(`Resolve errors:\n${msgs.join("\n")}`); + }); + continue; + } + + for (const c of tc.cases) { + const caseName = `${testName}/${c.name}`; + + if (c.error !== undefined) { + it(`runtime error: ${caseName}`, () => { + const interp = new Interpreter(program); + registerStdlib(interp); + const input = c.input !== undefined ? 
specFromJSON(c.input) : NULL; + const meta = buildMeta(c.input_metadata); + const { error } = interp.run(input, meta); + if (error === null) { + expect.fail(`Expected runtime error containing "${c.error}" but execution succeeded`); + } + }); + continue; + } + + if (c.deleted) { + it(`exec: ${caseName}`, () => { + const interp = new Interpreter(program); + registerStdlib(interp); + const input = c.input !== undefined ? specFromJSON(c.input) : NULL; + const { deleted, error } = interp.run(input, buildMeta(c.input_metadata)); + if (error) expect.fail(`Runtime error: ${error}`); + expect(deleted).toBe(true); + }); + continue; + } + + it(`exec: ${caseName}`, () => { + const interp = new Interpreter(program); + registerStdlib(interp); + const input = c.input !== undefined ? specFromJSON(c.input) : NULL; + const meta = buildMeta(c.input_metadata); + const { output, error, deleted } = interp.run(input, meta); + + if (error) expect.fail(`Runtime error: ${error}`); + if (c.no_output_check) return; + if (deleted) { + expect(c.output).toBeUndefined(); + return; + } + + const actual = specToJSON(output); + if (!deepEqual(actual, c.output)) { + expect.fail( + `Output mismatch:\n expected: ${JSON.stringify(c.output, null, 2)}\n actual: ${JSON.stringify(actual, null, 2)}`, + ); + } + }); + } + continue; + } + + if (tc.runtime_error !== undefined || tc.error !== undefined) { + const errorSubstring = tc.runtime_error ?? tc.error!; + it(`runtime error: ${testName}`, () => { + const { program, errors: parseErrors } = parse(tc.mapping!, "", filesMap); + expect(parseErrors).toHaveLength(0); + + optimize(program); + const resolveErrors = resolve(program, methods, functions); + expect(resolveErrors).toHaveLength(0); + + const interp = new Interpreter(program); + registerStdlib(interp); + + const input = tc.input !== undefined ? 
specFromJSON(tc.input) : NULL; + const meta = buildMeta(tc.input_metadata); + const { error } = interp.run(input, meta); + if (error === null) { + expect.fail(`Expected runtime error containing "${errorSubstring}" but execution succeeded`); + } + }); + continue; + } + + if (tc.deleted) { + it(`exec: ${testName}`, () => { + const { program, errors: parseErrors } = parse(tc.mapping!, "", filesMap); + if (parseErrors.length > 0) { + const msgs = parseErrors.map(e => `${e.pos.line}:${e.pos.column}: ${e.msg}`); + expect.fail(`Parse errors:\n${msgs.join("\n")}`); + } + optimize(program); + const resolveErrors = resolve(program, methods, functions); + if (resolveErrors.length > 0) { + const msgs = resolveErrors.map(e => `${e.pos.line}:${e.pos.column}: ${e.msg}`); + expect.fail(`Resolve errors:\n${msgs.join("\n")}`); + } + const interp = new Interpreter(program); + registerStdlib(interp); + const input = tc.input !== undefined ? specFromJSON(tc.input) : NULL; + const { deleted, error } = interp.run(input, buildMeta(tc.input_metadata)); + if (error) expect.fail(`Runtime error: ${error}`); + expect(deleted).toBe(true); + }); + continue; + } + + // Normal test: execute and compare output. + it(`exec: ${testName}`, () => { + const { program, errors: parseErrors } = parse(tc.mapping!, "", filesMap); + if (parseErrors.length > 0) { + const msgs = parseErrors.map(e => `${e.pos.line}:${e.pos.column}: ${e.msg}`); + expect.fail(`Parse errors:\n${msgs.join("\n")}`); + } + + optimize(program); + const resolveErrors = resolve(program, methods, functions); + if (resolveErrors.length > 0) { + const msgs = resolveErrors.map(e => `${e.pos.line}:${e.pos.column}: ${e.msg}`); + expect.fail(`Resolve errors:\n${msgs.join("\n")}`); + } + + const interp = new Interpreter(program); + registerStdlib(interp); + + const input = tc.input !== undefined ? 
specFromJSON(tc.input) : NULL; + const meta = buildMeta(tc.input_metadata); + const { output, error, deleted } = interp.run(input, meta); + + if (error) { + expect.fail(`Runtime error: ${error}`); + } + + if (tc.no_output_check) return; + + if (deleted) { + expect(tc.output).toBeUndefined(); + return; + } + + const actual = specToJSON(output); + if (!deepEqual(actual, tc.output)) { + expect.fail( + `Output mismatch:\n expected: ${JSON.stringify(tc.output, null, 2)}\n actual: ${JSON.stringify(actual, null, 2)}`, + ); + } + }); + } + } +}); diff --git a/internal/bloblang2/ts/test/parse.test.ts b/internal/bloblang2/ts/test/parse.test.ts new file mode 100644 index 000000000..0f0e2644d --- /dev/null +++ b/internal/bloblang2/ts/test/parse.test.ts @@ -0,0 +1,72 @@ +import { describe, it, expect } from "vitest"; +import { readFileSync, readdirSync, statSync } from "fs"; +import { join } from "path"; +import { parse as parseYaml } from "yaml"; +import { parse } from "../src/parser.js"; + +const SPEC_DIR = join(__dirname, "..", "..", "spec", "tests"); + +interface SpecFile { + description?: string; + files?: Record<string, string>; + tests: SpecTest[]; +} + +interface SpecTest { + name: string; + mapping?: string; + compile_error?: string; + files?: Record<string, string>; +} + +function collectYamlFiles(dir: string): string[] { + const result: string[] = []; + for (const entry of readdirSync(dir)) { + const full = join(dir, entry); + if (statSync(full).isDirectory()) { + result.push(...collectYamlFiles(full)); + } else if (entry.endsWith(".yaml")) { + result.push(full); + } + } + return result; +} + +const files = collectYamlFiles(SPEC_DIR); + +describe("spec compatibility — parse", () => { + for (const file of files) { + const relPath = file.slice(SPEC_DIR.length + 1); + const data = readFileSync(file, "utf-8"); + const spec = parseYaml(data) as SpecFile; + + for (const tc of spec.tests) { + if (!tc.mapping || tc.compile_error) continue; + + const testName = `${relPath}/${tc.name}`; + + it(`parses: 
${testName}`, () => { + const filesMap = new Map(); + if (spec.files) { + for (const [name, content] of Object.entries(spec.files)) { + filesMap.set(name, content); + } + } + if (tc.files) { + for (const [name, content] of Object.entries(tc.files)) { + filesMap.set(name, content); + } + } + const { errors } = parse(tc.mapping!, "", filesMap); + if (errors.length > 0) { + const msgs = errors.map( + (e) => ` ${e.pos.line}:${e.pos.column}: ${e.msg}`, + ); + expect.fail( + `Parse errors in "${testName}":\n${msgs.join("\n")}\n\nMapping:\n${tc.mapping}`, + ); + } + }); + } + } +}); diff --git a/internal/bloblang2/ts/tsconfig.json b/internal/bloblang2/ts/tsconfig.json new file mode 100644 index 000000000..d5b94c4a7 --- /dev/null +++ b/internal/bloblang2/ts/tsconfig.json @@ -0,0 +1,15 @@ +{ + "compilerOptions": { + "target": "ES2022", + "module": "ES2022", + "moduleResolution": "bundler", + "lib": ["ES2022"], + "outDir": "dist", + "rootDir": "src", + "declaration": true, + "strict": true, + "noUncheckedIndexedAccess": true, + "skipLibCheck": true + }, + "include": ["src"] +} diff --git a/internal/bloblang2/ts/vitest.config.ts b/internal/bloblang2/ts/vitest.config.ts new file mode 100644 index 000000000..c4700dc40 --- /dev/null +++ b/internal/bloblang2/ts/vitest.config.ts @@ -0,0 +1,13 @@ +import { defineConfig } from "vitest/config"; + +export default defineConfig({ + test: { + testTimeout: 30_000, + pool: "forks", + poolOptions: { + forks: { + execArgv: ["--stack-size=65536"], + }, + }, + }, +}); From 45106488ab34b9e0d564b8276e4126d62de1fdaa Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Wed, 22 Apr 2026 12:16:20 +0100 Subject: [PATCH 10/20] bloblang(v2): Add V1 reference specification and conformance corpus Adds internal/bloblang2/migrator/bloblang_v1_spec.md, a reference specification for Bloblang V1 derived from the existing V1 implementation and tightened via adversarial review and a test-driven verification pass. 
This is the source-of-truth document the migrator's translator rules are written against. Adds internal/bloblang2/migrator/v1spec/, a V1 conformance test corpus organised by topic (access, case_studies, control_flow, edge_cases, error_handling, imports, input_output, lambdas, maps, operators, optimizations, stdlib, types, variables) together with a runner that exercises the V1 corpus against internal/impl/pure for typed-numeric coverage. --- internal/bloblang2/migrator/README.md | 159 +++ .../bloblang2/migrator/bloblang_v1_spec.md | 1100 +++++++++++++++++ internal/bloblang2/migrator/v1spec/README.md | 46 + internal/bloblang2/migrator/v1spec/interp.go | 113 ++ internal/bloblang2/migrator/v1spec/runner.go | 129 ++ .../v1spec/tests/access/dynamic_access.yaml | 158 +++ .../v1spec/tests/access/field_access.yaml | 161 +++ .../tests/access/negative_indexing.yaml | 103 ++ .../v1spec/tests/access/null_safe.yaml | 149 +++ .../v1spec/tests/access/out_of_bounds.yaml | 172 +++ .../cloudformation_inventory.yaml | 160 +++ .../tests/case_studies/debezium_cdc.yaml | 143 +++ .../tests/case_studies/ecommerce_order.yaml | 142 +++ .../tests/case_studies/ga4_clickstream.yaml | 197 +++ .../tests/case_studies/github_webhook.yaml | 96 ++ .../tests/case_studies/kubernetes_pod.yaml | 141 +++ .../tests/case_studies/nlp_enrichment.yaml | 82 ++ .../tests/case_studies/otel_traces.yaml | 189 +++ .../tests/case_studies/stripe_invoice.yaml | 114 ++ .../case_studies/v2_feature_showcase.yaml | 18 + .../tests/case_studies/vpc_flow_logs.yaml | 145 +++ .../tests/control_flow/block_scoping.yaml | 149 +++ .../tests/control_flow/if_else_chains.yaml | 115 ++ .../tests/control_flow/if_expression.yaml | 133 ++ .../tests/control_flow/if_statement.yaml | 175 +++ .../v1spec/tests/control_flow/match_as.yaml | 47 + .../tests/control_flow/match_block_body.yaml | 95 ++ .../tests/control_flow/match_boolean.yaml | 133 ++ .../tests/control_flow/match_edge_cases.yaml | 138 +++ .../tests/control_flow/match_equality.yaml 
| 150 +++ .../v1spec/tests/control_flow/match_void.yaml | 137 ++ .../tests/edge_cases/deeply_nested.yaml | 108 ++ .../tests/edge_cases/empty_collections.yaml | 117 ++ .../v1spec/tests/edge_cases/infinity.yaml | 145 +++ .../tests/edge_cases/integer_overflow.yaml | 121 ++ .../edge_cases/integer_overflow_ops.yaml | 134 ++ .../tests/edge_cases/interpreter_reuse.yaml | 39 + .../v1spec/tests/edge_cases/nan_behavior.yaml | 140 +++ .../tests/edge_cases/precision_loss.yaml | 89 ++ .../tests/edge_cases/string_codepoints.yaml | 158 +++ .../v1spec/tests/edge_cases/unicode.yaml | 114 ++ .../tests/edge_cases/whitespace_newlines.yaml | 144 +++ .../v1spec/tests/error_handling/catch.yaml | 177 +++ .../v1spec/tests/error_handling/not_null.yaml | 146 +++ .../v1spec/tests/error_handling/or.yaml | 172 +++ .../error_handling/or_catch_composition.yaml | 129 ++ .../tests/error_handling/propagation.yaml | 162 +++ .../v1spec/tests/error_handling/throw.yaml | 135 ++ .../v1spec/tests/imports/basic_import.yaml | 144 +++ .../v1spec/tests/imports/circular_import.yaml | 54 + .../tests/imports/duplicate_namespace.yaml | 89 ++ .../v1spec/tests/imports/nested_import.yaml | 106 ++ .../input_output/conditional_deletion.yaml | 172 +++ .../v1spec/tests/input_output/deletion.yaml | 209 ++++ .../tests/input_output/dynamic_metadata.yaml | 109 ++ .../tests/input_output/input_access.yaml | 168 +++ .../v1spec/tests/input_output/metadata.yaml | 215 ++++ .../tests/input_output/output_assignment.yaml | 171 +++ .../tests/input_output/output_root.yaml | 149 +++ .../migrator/v1spec/tests/lambdas/basic.yaml | 128 ++ .../tests/lambdas/complex_iterators.yaml | 111 ++ .../v1spec/tests/lambdas/defaults.yaml | 56 + .../v1spec/tests/lambdas/discard_params.yaml | 95 ++ .../v1spec/tests/lambdas/fold_patterns.yaml | 87 ++ .../v1spec/tests/lambdas/outer_capture.yaml | 91 ++ .../migrator/v1spec/tests/lambdas/purity.yaml | 112 ++ .../v1spec/tests/lambdas/return_values.yaml | 173 +++ .../migrator/v1spec/tests/maps/basic.yaml | 
183 +++ .../migrator/v1spec/tests/maps/defaults.yaml | 167 +++ .../v1spec/tests/maps/discard_params.yaml | 68 + .../v1spec/tests/maps/higher_order.yaml | 125 ++ .../migrator/v1spec/tests/maps/isolation.yaml | 141 +++ .../v1spec/tests/maps/named_args.yaml | 79 ++ .../tests/maps/parameter_shadowing.yaml | 66 + .../migrator/v1spec/tests/maps/recursion.yaml | 158 +++ .../v1spec/tests/maps/recursion_advanced.yaml | 173 +++ .../tests/maps/recursive_with_iterators.yaml | 106 ++ .../v1spec/tests/maps/transitive_calls.yaml | 97 ++ .../v1spec/tests/maps/void_returns.yaml | 117 ++ .../v1spec/tests/operators/arithmetic.yaml | 205 +++ .../v1spec/tests/operators/comparison.yaml | 210 ++++ .../tests/operators/division_modulo.yaml | 123 ++ .../v1spec/tests/operators/equality.yaml | 216 ++++ .../v1spec/tests/operators/logical.yaml | 266 ++++ .../tests/operators/numeric_promotion.yaml | 189 +++ .../operators/numeric_promotion_edge.yaml | 179 +++ .../v1spec/tests/operators/precedence.yaml | 265 ++++ .../v1spec/tests/operators/string_concat.yaml | 201 +++ .../tests/optimizations/constant_folding.yaml | 240 ++++ .../optimizations/dead_code_elimination.yaml | 150 +++ .../tests/optimizations/path_collapse.yaml | 171 +++ .../v1spec/tests/stdlib/any_all_methods.yaml | 98 ++ .../v1spec/tests/stdlib/array_modify.yaml | 158 +++ .../v1spec/tests/stdlib/array_query.yaml | 331 +++++ .../v1spec/tests/stdlib/array_transform.yaml | 273 ++++ .../v1spec/tests/stdlib/collect_method.yaml | 36 + .../v1spec/tests/stdlib/core_functions.yaml | 210 ++++ .../v1spec/tests/stdlib/encoding.yaml | 343 +++++ .../v1spec/tests/stdlib/enumerate_method.yaml | 131 ++ .../v1spec/tests/stdlib/find_method.yaml | 74 ++ .../tests/stdlib/iter_chain_patterns.yaml | 63 + .../tests/stdlib/method_composition.yaml | 137 ++ .../v1spec/tests/stdlib/numeric_methods.yaml | 273 ++++ .../v1spec/tests/stdlib/object_methods.yaml | 220 ++++ .../v1spec/tests/stdlib/object_transform.yaml | 153 +++ 
.../v1spec/tests/stdlib/sequence_methods.yaml | 300 +++++ .../v1spec/tests/stdlib/sort_edge_cases.yaml | 126 ++ .../v1spec/tests/stdlib/string_methods.yaml | 267 ++++ .../v1spec/tests/stdlib/string_regex.yaml | 158 +++ .../tests/stdlib/timestamp_methods.yaml | 188 +++ .../v1spec/tests/stdlib/type_conversion.yaml | 276 +++++ .../v1spec/tests/stdlib/unique_flatten.yaml | 94 ++ .../migrator/v1spec/tests/types/array.yaml | 172 +++ .../v1spec/tests/types/bool_null.yaml | 255 ++++ .../migrator/v1spec/tests/types/bytes.yaml | 274 ++++ .../migrator/v1spec/tests/types/floats.yaml | 260 ++++ .../migrator/v1spec/tests/types/integers.yaml | 290 +++++ .../migrator/v1spec/tests/types/object.yaml | 204 +++ .../migrator/v1spec/tests/types/string.yaml | 306 +++++ .../v1spec/tests/types/timestamp.yaml | 247 ++++ .../tests/types/timestamp_arithmetic.yaml | 195 +++ .../tests/types/type_introspection.yaml | 187 +++ .../migrator/v1spec/tests/types/void.yaml | 194 +++ .../variables/bare_ident_resolution.yaml | 93 ++ .../v1spec/tests/variables/copy_on_write.yaml | 203 +++ .../v1spec/tests/variables/declaration.yaml | 156 +++ .../tests/variables/dynamic_assignment.yaml | 80 ++ .../variables/expr_body_path_assign.yaml | 109 ++ .../variables/nested_scope_mutations.yaml | 164 +++ .../tests/variables/path_assignment.yaml | 189 +++ .../v1spec/tests/variables/reassignment.yaml | 208 ++++ .../tests/variables/scope_boundaries.yaml | 140 +++ .../v1spec/tests/variables/shadowing.yaml | 235 ++++ .../migrator/v1spec/v1quirks_test.go | 443 +++++++ .../bloblang2/migrator/v1spec/v1spec_test.go | 14 + 135 files changed, 22080 insertions(+) create mode 100644 internal/bloblang2/migrator/README.md create mode 100644 internal/bloblang2/migrator/bloblang_v1_spec.md create mode 100644 internal/bloblang2/migrator/v1spec/README.md create mode 100644 internal/bloblang2/migrator/v1spec/interp.go create mode 100644 internal/bloblang2/migrator/v1spec/runner.go create mode 100644 
internal/bloblang2/migrator/v1spec/tests/access/dynamic_access.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/access/field_access.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/access/negative_indexing.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/access/null_safe.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/access/out_of_bounds.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/cloudformation_inventory.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/debezium_cdc.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/ecommerce_order.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/ga4_clickstream.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/github_webhook.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/kubernetes_pod.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/nlp_enrichment.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/otel_traces.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/stripe_invoice.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/v2_feature_showcase.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/case_studies/vpc_flow_logs.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/block_scoping.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/if_else_chains.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/if_expression.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/if_statement.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/match_as.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/match_block_body.yaml 
create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/match_boolean.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/match_edge_cases.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/match_equality.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/control_flow/match_void.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/deeply_nested.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/empty_collections.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/infinity.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow_ops.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/interpreter_reuse.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/nan_behavior.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/precision_loss.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/string_codepoints.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/unicode.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/edge_cases/whitespace_newlines.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/error_handling/catch.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/error_handling/not_null.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/error_handling/or.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/error_handling/or_catch_composition.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/error_handling/propagation.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/error_handling/throw.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/imports/basic_import.yaml 
create mode 100644 internal/bloblang2/migrator/v1spec/tests/imports/circular_import.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/imports/duplicate_namespace.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/imports/nested_import.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/input_output/conditional_deletion.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/input_output/deletion.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/input_output/dynamic_metadata.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/input_output/input_access.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/input_output/metadata.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/input_output/output_assignment.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/input_output/output_root.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/basic.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/complex_iterators.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/defaults.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/discard_params.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/fold_patterns.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/outer_capture.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/purity.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/lambdas/return_values.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/basic.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/defaults.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/discard_params.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/higher_order.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/isolation.yaml 
create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/named_args.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/parameter_shadowing.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/recursion.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/recursion_advanced.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/recursive_with_iterators.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/transitive_calls.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/maps/void_returns.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/arithmetic.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/comparison.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/division_modulo.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/equality.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/logical.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion_edge.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/precedence.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/operators/string_concat.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/optimizations/constant_folding.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/optimizations/dead_code_elimination.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/optimizations/path_collapse.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/any_all_methods.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/array_modify.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/array_query.yaml create mode 100644 
internal/bloblang2/migrator/v1spec/tests/stdlib/array_transform.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/collect_method.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/core_functions.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/encoding.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/enumerate_method.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/find_method.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/iter_chain_patterns.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/method_composition.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/numeric_methods.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/object_methods.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/object_transform.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/sequence_methods.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/sort_edge_cases.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/string_methods.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/string_regex.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/timestamp_methods.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/type_conversion.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/stdlib/unique_flatten.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/array.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/bool_null.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/bytes.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/floats.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/integers.yaml create mode 100644 
internal/bloblang2/migrator/v1spec/tests/types/object.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/string.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/timestamp.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/timestamp_arithmetic.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/type_introspection.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/types/void.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/bare_ident_resolution.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/copy_on_write.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/declaration.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/dynamic_assignment.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/expr_body_path_assign.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/nested_scope_mutations.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/path_assignment.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/reassignment.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/scope_boundaries.yaml create mode 100644 internal/bloblang2/migrator/v1spec/tests/variables/shadowing.yaml create mode 100644 internal/bloblang2/migrator/v1spec/v1quirks_test.go create mode 100644 internal/bloblang2/migrator/v1spec/v1spec_test.go diff --git a/internal/bloblang2/migrator/README.md b/internal/bloblang2/migrator/README.md new file mode 100644 index 000000000..5686e3d04 --- /dev/null +++ b/internal/bloblang2/migrator/README.md @@ -0,0 +1,159 @@ +# Bloblang V1 → V2 Migrator + +A Go library that takes a Bloblang V1 mapping and produces an equivalent +Bloblang V2 mapping plus a Report describing every semantic divergence it +had to introduce. 
100% fidelity is not a goal — V2 is a deliberate redesign +that fixes V1 ambiguities, so some mappings will intentionally shift +semantics. The migrator's job is to make every shift visible so a human can +audit before cutover. + +## Usage + +```go +import "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" + +rep, err := translator.Migrate(v1Source, translator.Options{ + Verbose: true, // include Info-severity notes + Files: imports, // optional virtual filesystem for imports +}) +if err != nil { + // *CoverageError when the weighted translation ratio drops below + // Options.MinCoverage (default 0.75). The Report is still reachable + // via err.(*translator.CoverageError).Report. + return err +} + +fmt.Println(rep.V2Mapping) // translated mapping +for _, c := range rep.Changes { // per-site divergences + fmt.Printf("%d:%d %s [%s %s] %s\n", c.Line, c.Column, c.Severity, c.Category, c.RuleID, c.Explanation) +} +fmt.Printf("coverage: %.2f (%d/%d translated exactly)\n", + rep.Coverage.Ratio, rep.Coverage.Translated, rep.Coverage.Total) +``` + +A zero-valued `Options` means defaults (75% minimum coverage, terse +reporting). Imports declared in the V1 source must be supplied via +`Options.Files`; the migrator translates each imported file to V2 too and +surfaces them on `Report.V2Files`. 
+ +## Pipeline + +``` +V1 source + │ + ├─► v1ast.Parse — V1 parser (full reimplementation, handles the lenient + │ grammar as it is, not as the docs describe it) + ├─► translator walk — each V1 node → V2 node; every shift records a Change + ├─► syntax.Print — V2 AST → V2 source text + └─► syntax.Parse — non-fatal sanity check; failures become Changes + tagged RuleEmittedInvalidV2 rather than errors + (V1-invalid inputs produce V2-invalid outputs, and + real translator bugs still surface at the caller's + Compile) +``` + +## What is in the box + +| Path | What it holds | +|---|---| +| `bloblang_v1_spec.md` | Reference spec for V1 reconciled from the parser source, the config-test corpus, and the official docs. Includes a `§14` catalogue of migration-critical quirks. | +| `v1ast/` | Self-contained V1 parser and AST. Exports `NodePos()` so translation rules can cite source positions on every Change. | +| `v1spec/` | V1 spec-compliance corpus — V2 spec tests translated to V1, run against the V1 interpreter. Acts as a pin on V1 behaviour so the translator has a stable target. | +| `translator/` | The migrator itself. `Migrate`, `Options`, `Report`, `Change`, `RuleID`. Translation rules live in `translate.go` (expression / statement walker) and `methods.go` (V2 method-shape rewrites). | + +## Testing + +Five layers. Run with `task test` or `go test ./internal/bloblang2/migrator/...`. + +- **Layer 1 — V1 parser conformance** (`v1ast/parser_test.go`): parse every + non-skipped YAML case in `v1spec/tests/`, then re-print → re-parse to + check round-trip integrity. +- **Layer 2 — Per-rule unit tests** (`translator/rules_test.go`): one case + per core RuleID. A regression here pinpoints the affected rule directly + rather than showing up as an anonymous corpus drop. +- **Layer 3 — Contract tests** (`translator/migrate_test.go`, + `change_test.go`): the `Migrate` surface and `Change` serialisation. 
+- **Layer 4 — End-to-end corpus regression** + (`translator/corpus_test.go`): translate every V1 mapping in + `v1spec/tests/`, compile the V2 output, run it through the V2 + interpreter, compare against the test's expected output. + - **OK** — V2 matched V1's expected output. + - **Flagged** — V1 and V2 diverged, but the translator warned via a + `SemanticChange` or `Unsupported` Change, so the caller was told. + - **Unexpected** — V2 diverged silently. Test failure territory. + - Pass rate (OK + Flagged) is pinned via a floor in the test so real + regressions trip the gate. +- **Layer 5 — Property tests** (`translator/property_test.go`): + never-panic on junk input, valid-V2 output always parses, Coverage.Ratio + always in [0,1], every Change has a non-zero position and non-empty + explanation. + +## Design choices worth knowing about + +- **Adopt V2, flag the shift**. Where V1 and V2 diverge, the translator + picks V2 semantics and records a `SemanticChange` Change pointing at the + `§14` anchor in the spec. Faithfully preserving V1 quirks (e.g. `.or()` + catching errors, `%` truncating floats, bare-ident shadowing) would mean + writing V2 code that exists to ape the old language's mistakes. +- **Null-safe by default on bare paths**. V2 errors on field access over + non-object receivers; V1 returned null. Every bare-ident rewrite and + translated numeric path segment emits `?.` / `?[]` so V1's silent-null + behaviour carries over, and the divergence is still flagged for audits + where the receiver type matters. +- **Non-fatal sanity parse**. The final `syntax.Parse` on the emitted V2 + text is recorded as a Change rather than an error. Several V1 compile + errors (chained comparisons, missing imports, duplicate namespaces) echo + as V2 parse errors — treating them as translator bugs was noisy and + wrong. +- **Fixpoint import translation**. 
Imported V1 files are translated in + dependency order so transitively imported files finish before their + dependants; cycles take a final pass with whatever siblings did + translate. Nested imports that aren't statically resolvable stay + unqualified and get a `RuleImportStatement` note — V2's namespaces don't + re-export transitively the way V1's flat map table did. +- **`v1ast` is deliberately separate**. The official V1 parser is buried + in Benthos's `bloblang/parser/` and isn't easy to use as an AST source. + Reimplementing it as a standalone package here keeps the migrator + independent of runtime concerns and lets us round-trip V1 source + verbatim for printing. + +## Contributing + +- **Spec corrections**: edit `bloblang_v1_spec.md` and cite the source + (`bloblang/parser/...` or a config-test fixture). The spec is authoritative + for the translator — if the spec is wrong, the translator encodes a bug. +- **New rules**: add the RuleID to `translator/change.go` (append only; never + reuse values), emit the Change from the translation site, and add a case to + `translator/rules_test.go` asserting both the V2 substring and the RuleID. +- **New V2 output behaviours**: update `go/pratt/syntax/print.go`, not the + translator — the translator emits V2 AST nodes, not text. + +## V1 spec provenance + +The V1 spec was reconciled from three sources, with the implementation +winning whenever the implementation and docs disagreed: + +1. **Reference implementation** — `../../bloblang/` in this repo. Key + files: `parser/mapping_parser.go`, `parser/query_*.go`, + `query/arithmetic.go`, `mapping/assignment.go`, `environment.go`. +2. **Conformance-ish corpus** — `../../../config/test/bloblang/`: real + mappings and expected outputs. Inline `*_test.go` files next to the + parser packages were the best source for *rejected* syntax. +3. **Official documentation** — `docs.redpanda.com/redpanda-connect/` + pages under `guides/bloblang/`. 
Treated as a strong prior, superseded + by the implementation where they differed. + +Documentation disagreements with the implementation are called out in +`bloblang_v1_spec.md` itself. + +## Known gaps + +- **No `bloblang2:` benthos processor yet.** `internal/bloblang2/` ships + a full parallel language implementation (Go engine, TypeScript + engine, tree-sitter grammar, LSP) but it isn't wired into the + pipeline runtime as a registerable processor. Migrated V2 mappings + can be compiled and run through `internal/bloblang2` directly, and + the `demo/` directory has a local web playground, but there's no + config-level `processors: [{ bloblang2: "..." }]` entry. Adding a + minimal processor wrapper so V2 can replace `bloblang` / `mapping` / + `mutation` inside a real Benthos config is outstanding work. diff --git a/internal/bloblang2/migrator/bloblang_v1_spec.md b/internal/bloblang2/migrator/bloblang_v1_spec.md new file mode 100644 index 000000000..4ae367b2c --- /dev/null +++ b/internal/bloblang2/migrator/bloblang_v1_spec.md @@ -0,0 +1,1100 @@ +# Bloblang V1 — Migration Reference Specification + +This document is a complete, self-contained description of Bloblang V1 as implemented in `internal/bloblang/` of `redpanda-data/benthos`. It is intended as the source of truth for tooling that reads, analyses, or rewrites V1 mappings — in particular, V1 → V2 migration. It is deliberately descriptive (what V1 *does*), not prescriptive (what V1 *should* do): every accepted construct, quirk, and legacy form should be documented here even when it is undesirable, because a migration tool must recognise it. + +Sources reconciled for this spec: + +- Official documentation: the `about/`, `arithmetic/`, `walkthrough/`, `advanced/` pages under `docs.redpanda.com/redpanda-connect/guides/bloblang/`, plus `configuration/interpolation/`. +- Reference implementation: `../bloblang/` (parser, query, mapping, field packages) and `public/bloblang/` (host API surface). 
+- Conformance-ish corpus: `../../../config/test/bloblang/*.yaml` and `*.blobl`, plus inline `_test.go` files alongside each parser and query package. + +Where the implementation and docs disagree, the implementation wins and the docs claim is called out. See `README.md` in this directory for the full source list. + +--- + +## 1. Overview + +A Bloblang V1 source file is a **mapping**: an ordered sequence of statements that are executed once per input message to produce an output message. The output has two channels: + +- **`root`** — the message payload (structured data: any JSON-compatible value, or raw bytes). +- **`meta`** — the message metadata (a map of keys to values). + +Mappings read from: + +- **`this`** — the (structured) input payload. +- **`@` / `meta(...)`** — input metadata. +- **`content()`** — the raw byte content of the input. +- **`$var`** — locally bound variables (`let name = expr`). +- Environment (`env`), clock (`now`, `timestamp_*`), batch (`batch_index`, `batch_size`), etc. + +A mapping is **the whole file**. There is no module system beyond `import`/`from` for named-map reuse. + +### 1.1 Two dialects: full mapping vs. field interpolation + +Bloblang appears in two grammatically distinct contexts: + +1. **Full mapping**, used in the `bloblang` processor, `mapping` fields, test files, and `.blobl` files. Multiple statements, assignments, `let`, `map`, `import`. +2. **Field interpolation**, used inside string config fields as `${! expression }`. A single query expression only — no statements. Literal `$` is kept verbatim unless followed by `{!`; the escape for a literal `${!` is `${{!...}}`. + +Both share the same expression grammar. Everything below is in the full-mapping grammar unless explicitly marked *"interpolation only"*. + +--- + +## 2. Lexical Structure + +### 2.1 Encoding and whitespace + +- Source is UTF-8 (runes). The parser operates on `[]rune`. +- **Whitespace** is spaces and tabs. +- **Newlines** are `\n` or `\r\n`. 
+- Newlines **are significant as statement separators** in mappings (see §7). +- Newlines are **conditionally significant inside expressions** — the parser is not uniformly newline-tolerant. The rules: + + **Positions that accept newlines freely** (between tokens, effectively ignored): + - Inside array literals between elements (`[a,\n b,\n c]`). + - Inside object literals between members (`{"a": 1,\n "b": 2}`). + - Inside function/method argument lists between arguments and around commas (`fn(\n a,\n b\n)`). + - After a binary operator in an arithmetic/comparison/logical chain (`a +\n b`). + + **Positions that reject newlines**: + - **Immediately before a binary operator** (`a\n + b` is a parse error; the arithmetic chain delimiter uses `SpacesAndTabs` before the operator and `DiscardedWhitespaceNewlineComments` only after). + - **Around the `->` in a lambda** (`x\n-> body` and `x ->\n body` are parse errors; both sides of `->` use `SpacesAndTabs` only — `parser/query_expression_parser.go:222-228`). + - **Immediately before a `.` in a path or method chain** (`a\n.b` is a parse error; `parseWithTails` in `parser/query_function_parser.go:95-137` expects the `.` with no leading whitespace). Newlines AFTER the `.` are fine: `a.\n b` and `a.\n method()` both work. + - **Any whitespace before a `.`** (`a .b` also fails — the parser requires the dot to immediately follow the preceding expression). Space/newline after the `.` is fine. + + **Migration implications**: + - Long method chains must break *after* the `.`, never before. `root = items.\n filter(...).\n map_each(...)` is idiomatic V1; `root = items\n .filter(...)` is a parse error. + - Lambda bodies must stay on the same line as `->`. Wrap long bodies in parens + named capture (`foo.(name -> body)`) or use a `let` binding to split work. + - Long expressions break *after* the binary operator. `a +\n b` works; `a\n + b` does not. 
+ +- **Assignment `=` requires whitespace on both sides.** `root.a = 1` works; `root.a=1`, `root.a =1`, `root.a= 1`, and `let x=5` all parse-error with `expected whitespace`. This is specific to `=` at statement position — binary operators (`+`, `==`, `&&`, etc.) have no such requirement (`1+2` is fine). + +### 2.2 Comments + +Line comments start with `#` and run to the end of the line. + +```blobl +# top-level comment +root.foo = this.bar # trailing comment +``` + +Comments are allowed wherever whitespace is allowed, including between the tokens of a single expression and between arguments. + +A leading `#!` (shebang-style) on line 1 is simply a line comment — the parser does not treat it specially, but `.blobl` files in the wild use it for tooling. + +### 2.3 Identifiers + +Two overlapping lexical classes are used for different positions: + +``` +IDENT = [A-Za-z0-9_]+ ; lenient (path segments, lambda params, @meta shortforms) +SNAKE_CASE = [a-z0-9_]+ ; lowercase-only (function/method names, named args, meta bare keys) +``` + +- **`IDENT`** — any run of ASCII letters, digits, and underscores (`fieldReferencePattern` in `parser/query_function_parser.go:152`; `contextNameParser` at `parser/query_expression_parser.go:207-215`). Permits digits as the leading character — this is how `this.0` works for array indexing (see §6.3), and it also allows uppercase letters in path segments. +- **`SNAKE_CASE`** — lowercase letters, digits, underscores (`parser/combinators.go:417`). Used for function/method names, named-arg names, and bare meta keys. Note: a source comment claims "very strict: no double underscores, prefix or suffix underscores" but the actual parser does not enforce these — `_foo`, `foo__bar`, `foo_` all match. Migration tools should treat the lenient pattern as authoritative. 
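The two lexical classes above can be sketched as anchored regular expressions. This is a minimal Go illustration, not the parser's own code: the names `identPattern`, `snakeCasePattern`, `isIdent`, and `isSnakeCase` are hypothetical, and the patterns are transcribed directly from the `IDENT` and `SNAKE_CASE` definitions in this section.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns transcribed from the IDENT and SNAKE_CASE classes.
// They are anchored so the whole candidate name must match.
var (
	identPattern     = regexp.MustCompile(`^[A-Za-z0-9_]+$`)
	snakeCasePattern = regexp.MustCompile(`^[a-z0-9_]+$`)
)

// isIdent reports whether s is acceptable as a path segment or lambda param.
func isIdent(s string) bool { return identPattern.MatchString(s) }

// isSnakeCase reports whether s is acceptable as a function/method name,
// named-arg name, or bare meta key. Note that it does not reject leading,
// trailing, or doubled underscores, matching the lenient behaviour described.
func isSnakeCase(s string) bool { return snakeCasePattern.MatchString(s) }

func main() {
	for _, name := range []string{"0", "Foo", "_foo", "foo__bar", "foo-bar"} {
		fmt.Printf("%q ident=%t snake_case=%t\n", name, isIdent(name), isSnakeCase(name))
	}
}
```

A migration tool could use a check like this to decide whether a path segment can stay unquoted or must be emitted as a quoted segment.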
+ +### 2.4 Reserved keywords + +The following identifiers are recognised as keywords in at least one position: + +``` +true false null # literals +this root # context references +if else # conditionals (else if is two tokens) +match # pattern expression +let # variable binding +map # named-map definition +import from # module loading +meta # metadata keyword + function +_ # wildcard in match cases and lambda params +``` + +Keywords are not reserved from being *field names*: `this.match` and `root.if` are valid path accesses because the path segment parser accepts any identifier. + +### 2.5 Operator and punctuation tokens + +``` ++ - * / % ; arithmetic +== != < <= > >= ; comparison +&& || ; logical (word-level not; !expr for prefix) +! ; prefix logical not (see §5.3) +| ; coalesce (high precedence; see §5.2) += ; assignment (statement level only) +-> ; lambda arrow +=> ; match-case arrow +( ) [ ] { } ; grouping / literals / blocks +. , : ; selector / separator / named-arg +$ @ ; variable / metadata sigils +# ; comment +``` + +There is no `;` statement separator — use a newline. + +--- + +## 3. Types + +Bloblang values at runtime are one of: + +| Type | Notes | +|------------|-----------------------------------------------------------------------| +| `null` | The literal `null`. | +| `bool` | `true` / `false`. | +| `number` | Either integer (`int64`) or floating (`float64`) at runtime. A single runtime type in user-facing sense; internally degrades between `int64` / `uint64` / `float64` (`query/arithmetic.go`). JSON numbers land as `json.Number` and are coerced as needed. **`.number()` as a coercion method always returns `float64`** — even for `"42".number()` — so use `.floor()` / `.round()` if an integer is required. Arithmetic preserves integer-ness (`2 + 3 → 5` as int64), but integer `/` always promotes to `float64` (see §5.3). | +| `string` | Go `string`, arbitrary UTF-8. | +| `bytes` | Raw `[]byte`. 
Returned by `content()` and some decode methods; assignable to `root`. Implicit coercions to/from `string` occur in several methods. | +| `timestamp`| `time.Time`. Produced by `ts_parse()` and some `ts_*` methods. **Note**: `now()` returns a `string` (RFC3339Nano-formatted), not a timestamp — see Quirk #57 in §14. | +| `array` | Ordered heterogeneous list. | +| `object` | Map from string keys to values (sometimes called "structured" data). | +| `delete` | Sentinel from `deleted()` (see §9.4). | +| `nothing` | Sentinel from `nothing()` / omitted match arms (see §9.4). | +| `error` | Not a first-class value, but a separate runtime channel (see §11). | + +The type test method `.type()` returns one of `"null"`, `"bool"`, `"number"`, `"string"`, `"bytes"`, `"timestamp"`, `"array"`, `"object"`, plus `"delete"` and `"nothing"` when called directly on the sentinel values (`deleted().type()` → `"delete"`, `nothing().type()` → `"nothing"`). There is no user-level distinction between int and float for `.type()`; both report `"number"`. + +--- + +## 4. Literals + +### 4.1 Null, booleans + +```blobl +null true false +``` + +Lowercase only. + +### 4.2 Numbers + +```blobl +0 42 -7 +3.14 -0.5 +``` + +- Integer literals (no sign) parse as `int64`. A leading `-` is parsed not as part of the literal but as a unary minus (`parser/query_arithmetic_parser.go:76`), applied to the operand it precedes. +- Float literals require a decimal point (`.`) **with digits on both sides** — `.5` and `5.` are both parse errors (`parser/combinators.go:233`; digits are consumed, then `.` + digits is optional but *both* must be present if either is). +- **No hex (`0x...`), octal, or binary literals.** +- **No scientific/exponent notation** (`1e10`, `1.5E-3`) — the number parser only consumes `[0-9]+` optionally followed by `.[0-9]+`. +- `NaN` and `Infinity` are not literals; they can only arise from arithmetic (and division by zero is an error, not an `Inf` — see §12). 
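The number-literal rules above fit in a few lines. The following Go sketch is an illustration under stated assumptions, not the code in `parser/combinators.go`: `numberLiteral` and `parseNumberLiteral` are hypothetical names, and the sign is deliberately rejected because a leading `-` is described as a separate unary minus rather than part of the literal.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// numberLiteral is a hypothetical pattern for the documented shape:
// one or more digits, optionally followed by "." and one or more digits.
var numberLiteral = regexp.MustCompile(`^[0-9]+(\.[0-9]+)?$`)

// parseNumberLiteral returns an int64 for integer literals and a float64 for
// literals with a decimal point. Anything else (sign, exponent, hex, ".5",
// "5.") is rejected.
func parseNumberLiteral(src string) (any, error) {
	if !numberLiteral.MatchString(src) {
		return nil, fmt.Errorf("not a V1 number literal: %q", src)
	}
	if strings.Contains(src, ".") {
		f, err := strconv.ParseFloat(src, 64)
		if err != nil {
			return nil, err
		}
		return f, nil
	}
	i, err := strconv.ParseInt(src, 10, 64)
	if err != nil {
		return nil, err
	}
	return i, nil
}

func main() {
	for _, src := range []string{"42", "3.14", ".5", "5.", "1e10", "0x1F", "-7"} {
		v, err := parseNumberLiteral(src)
		fmt.Printf("%-5s value=%v err=%v\n", src, v, err)
	}
}
```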
+ +### 4.3 Strings + +Two forms: + +**Double-quoted** (escaped): +```blobl +"hello" "line\nbreak" "quote: \"x\"" "é" +``` +Processed via `strconv.Unquote`. Supported escapes: `\a \b \f \n \r \t \v \\ \' \" \xHH \uHHHH \UHHHHHHHH \NNN` (the full set `strconv.Unquote` accepts). Notably, `\/` is **not** supported — `"\/"` is a parse error (unlike JSON). + +**Triple-quoted** (raw, multi-line): +```blobl +"""first line +second line with a literal \n in it""" +``` +Everything between the opening `"""` and closing `"""` is taken verbatim — backslash escapes are **not** processed, and newlines are preserved. There is no escape for `"""` within a triple-quoted string; if you need that, use the double-quoted form. + +There is no single-quote string form. There is no string interpolation — concatenate with `+` or use `format("...%v...", args...)` / `${!...}` (interpolation context only). + +### 4.4 Arrays + +```blobl +[] +[1, 2, 3] +["a", null, true, {"k": "v"}] +[ this.a, this.b + 1 ] # elements are full expressions +``` + +Any expression is allowed as an element. Whitespace and newlines are permitted between elements. A trailing comma after the last element is tolerated by the parser. + +### 4.5 Objects + +```blobl +{} +{"k": "v"} +{ + "a": 1, + "b": this.x, + foo: this.y, # bare ident key — treated as `this.foo` at runtime + ("dyn_" + this.suffix): this.y # computed key (parens recommended for clarity) +} +``` + +Keys are parsed by `OneOf(QuotedString, queryParser)` in `parser/query_literal_parser.go:42-45`. At build time, `NewMapLiteral` (`query/literals.go:20-38`) classifies each key: + +- **Quoted string literal** (`"foo"`) → static key, used verbatim. +- **Non-string literal** (`5`, `true`, `null`, array/object literals) → **parse error**: `object keys must be strings, received: ` where `` is the Go type name (e.g. `int64`, `bool`, `` for null). +- **Any other expression** (bare ident, path access, function call, parenthesised expression, etc.) 
→ **dynamic key**; evaluated at runtime, result must be a string (runtime error otherwise). + +Consequences for migration tools: + +- `{a: 1}` **parses** — `a` is the legacy bare-ident form for `this.a`, and if `this.a` is a string at runtime the key works. This is not idiomatic and should be rewritten to `{"a": 1}` (if the intent was a literal key `a`) or `{(this.a): 1}` (if truly dynamic). +- Computed keys **do not strictly require outer parens** in the object literal — `{foo.bar(): 1}` parses fine. Parens are only needed when the expression starts with a quoted string literal that would otherwise be consumed as the key alone — e.g. `{("foo".uppercase()): 1}` needs parens; without them, the parser commits to `"foo"` as the key and then fails on the `.uppercase()` tail. +- Values are any expression. +- Duplicate keys: last wins. +- Empty-string key (`""`) is permitted. + +### 4.6 Literal composition in expressions + +Array and object literals can appear anywhere an expression is valid — as method arguments, right-hand sides, match arms, lambda bodies, etc. + +--- + +## 5. Operators + +### 5.1 Unary operators + +- **Prefix `!`** — logical not. Applies to an entire **method-chained term** and may appear anywhere that term can (left of `fn`, left of a lambda body, inside a `match` case expression, etc.). Parsed in `parseWithTails` at `parser/query_function_parser.go:98`. Operand must be `bool`; non-bool is a type error (`query/methods.go:388-398`, the `notMethod`). **Only one `!` can be stacked** — `!!x` is a parse error because the parser uses `Optional(!)` (zero or one), not repetition. For double negation, write `!(!x)`. +- **Prefix `-`** — unary minus. Parsed as an optional prefix of **any operand** in an arithmetic chain (`parser/query_arithmetic_parser.go:74-77`, `Optional(charMinus)` is applied per term inside `Delimited`). `1 + -2`, `true && -this.x > 0`, and `-fn()` are all legal. Implemented as `0 - operand`. 
+ +There is no prefix `+` and no postfix operators. + +### 5.2 Binary operators and precedence + +V1 parses arithmetic-level expressions **flat** — a sequence of operands separated by any of the binary operators — and then resolves precedence in a four-pass reduction at build time (`query.NewArithmeticExpression` in `query/arithmetic.go:457`). The effective precedence table is: + +| Level | Operators | Associativity | Notes | +|-------|-----------------------|---------------|------------------------------------------------| +| 1 (tightest) | `*` `/` `%` `\|` | left | Coalesce `\|` binds as tightly as multiplicative ops — surprising | +| 2 | `+` `-` | left | `+` also works for strings (concatenation) | +| 3 | `<` `<=` `>` `>=` `==` `!=` | left | Chains (`a == b == c`) parse but rarely make sense — see note below | +| 4 (loosest) | `&&` `\|\|` | left (flat) | `&&` and `\|\|` share a level — see warning below | + +Warning on level 4: unlike almost every other language, `&&` does **not** bind tighter than `||`. They are resolved in a single left-to-right pass (`query/arithmetic.go:524-540`). `a || b && c` parses as `(a || b) && c`, not the conventional `a || (b && c)`. Migration tools must preserve original parenthesisation to avoid semantic drift. + +Warning on level 1: the coalesce operator `|` is **high precedence**. `a + b | c` parses as `a + (b | c)`. This is the opposite of, for example, Kotlin's `?:` (low precedence) or Rust's `||` fallback pattern. Parenthesise when in doubt. + +Comparison operators at level 3 are technically left-associative (`a == b == c` parses as `(a == b) == c`), but such chains rarely make semantic sense and migration tools should treat them as likely bugs. 
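To make the flat parse and per-level reduction concrete, here is a small Go sketch that parenthesises a flat operand/operator sequence using the four levels from the table above. It is an illustration only and not the code of `query.NewArithmeticExpression`: the `levels` table, `contains`, and `group` are hypothetical names, and the sketch assumes at least one operand with exactly one operator between each adjacent pair.

```go
package main

import "fmt"

// levels lists the operator groups from tightest to loosest binding,
// mirroring the precedence table above.
var levels = [][]string{
	{"*", "/", "%", "|"},
	{"+", "-"},
	{"<", "<=", ">", ">=", "==", "!="},
	{"&&", "||"},
}

func contains(set []string, op string) bool {
	for _, s := range set {
		if s == op {
			return true
		}
	}
	return false
}

// group reduces a flat sequence one level at a time, left to right,
// and returns a fully parenthesised rendering of the result.
// It assumes len(operands) >= 1 and len(ops) == len(operands)-1.
func group(operands, ops []string) string {
	for _, level := range levels {
		outOperands := []string{operands[0]}
		var outOps []string
		for i, op := range ops {
			rhs := operands[i+1]
			if contains(level, op) {
				// Fold the right-hand operand into the most recent operand.
				last := len(outOperands) - 1
				outOperands[last] = "(" + outOperands[last] + " " + op + " " + rhs + ")"
			} else {
				// Leave this operator for a later, looser pass.
				outOps = append(outOps, op)
				outOperands = append(outOperands, rhs)
			}
		}
		operands, ops = outOperands, outOps
	}
	return operands[0]
}

func main() {
	fmt.Println(group([]string{"a", "b", "c"}, []string{"||", "&&"}))
	fmt.Println(group([]string{"a", "b", "c"}, []string{"+", "|"}))
}
```

The two calls in `main` correspond to the two warnings above: `&&` and `||` fold in a single left-to-right pass, and `|` folds in the same pass as the multiplicative operators.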
+ +### 5.3 Semantics per operator + +| Operator | Accepts | Result | +|----------|--------------------------------------------|----------------| +| `+` | number+number, string+string, string/bytes pair | number, or **string** (see note) | +| `-` | number only | number | +| `*` | number only | number | +| `/` | number only | **always `float64`** | +| `%` | any numeric pair — **floats are truncated to `int64`** silently via `IGetInt` (`value/type_helpers.go:175-178`). `7.5 % 2.5` evaluates as `7 % 2 == 1`, not a type error. | integer; divide-by-zero errors like `/` | +| `==` | any two values | bool; **asymmetric in coercion** (see note); `5 == 5.0` is true; structural for object/array; `null == null` is true | +| `!=` | any two values | bool; inverse of `==` | +| `<`, `<=`, `>`, `>=` | number–number or string–string pair | bool; non-numeric/non-string operands are type errors. **Booleans are not orderable.** Timestamps compare as RFC3339Nano strings (works for well-formed timestamps). Also asymmetric — see note. | +| `&&`, `\|\|` | bool — or any **numeric** that coerces via `IGetBool` (non-zero → true). Strings / null / arrays / objects are type errors. | bool; **Short-circuit is guaranteed** (`boolAnd` / `boolOr` in `query/arithmetic.go:396-442` return before evaluating RHS). So `true && 1` → `true`, `0 \|\| true` → `true`, `true && "x"` → error. | +| `\|` (coalesce) | any pair | left if not `null` and not error; otherwise right. The `deleted()` and `nothing()` sentinels register as null for this test, so `deleted() \| "x" == "x"`. | +| `!` | bool | bool | + +Note on `+` with bytes: when either operand is `[]byte`, both are coerced to `string` via `IGetString` and concatenated (`query/arithmetic.go:231-240`). The result is `string`, not `bytes`. 
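Two rows of the operator table are easy to get wrong when porting: `/` always yields a float, and `%` truncates float operands to integers first. The Go sketch below mirrors those two rules under stated assumptions. The names `div`, `mod`, and `errDivideByZero` (including its message text) are hypothetical, and checking for a zero divisor after truncation is an assumption of this sketch rather than something the table states.

```go
package main

import (
	"errors"
	"fmt"
)

// errDivideByZero is a hypothetical stand-in for the runtime's
// divide-by-zero error; the message text is illustrative only.
var errDivideByZero = errors.New("attempted to divide by zero")

// div mirrors the documented rule that `/` always produces a float64,
// even when both operands are integers.
func div(a, b int64) (float64, error) {
	if b == 0 {
		return 0, errDivideByZero
	}
	return float64(a) / float64(b), nil
}

// mod mirrors the documented rule that `%` truncates float operands to
// int64 before taking the remainder, so 7.5 % 2.5 behaves like 7 % 2.
// Assumption: the zero check applies to the truncated divisor.
func mod(a, b float64) (int64, error) {
	ia, ib := int64(a), int64(b)
	if ib == 0 {
		return 0, errDivideByZero
	}
	return ia % ib, nil
}

func main() {
	q, _ := div(7, 2)
	r, _ := mod(7.5, 2.5)
	_, err := div(1, 0)
	fmt.Println(q, r, err)
}
```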
+ +**Note on comparison asymmetry** (`query/arithmetic.go:346-392`, `compareOp`): all comparison operators (`==`, `!=`, `<`, `<=`, `>`, `>=`) dispatch on the **left operand's restricted type** and then attempt to coerce the right operand to that type before comparing. Only if that coercion fails do they fall back to a generic structural compare (which usually returns `false` for type mismatches). The key consequences: + +- **`bool == number` is not symmetric**. Left side `bool`: the number is coerced to `bool` via `IGetBool` (non-zero → true), so `true == 1`, `false == 0`, `true == 3.14` all return **`true`**. Left side `number`: `IGetNumber` rejects bools, so `1 == true` returns **`false`**. Migration tools must preserve operand order. +- **`bool == string`**: no coercion either way, falls to generic compare → **`false`** (e.g. `true == "true"` is `false`). +- **`string == number`**: no coercion either way, returns **`false`** (e.g. `"5" == 5` is `false`). +- **`null == null`** is `true`; **`null`** compared with any non-null value is `false` (including `null == false` and `null == 0`). + +The asymmetry matters most for `==`/`!=` with bool-number pairs. For `<`, `<=`, `>`, `>=` the asymmetry is less visible because those operators don't admit bool coercion — a bool operand on either side errors. + +Integer overflow in `+`, `-`, `*` is **not checked** — results wrap per Go int64 semantics. Division and modulo by zero return an `ErrDivideByZero` (`query/arithmetic.go:188`, `204`), not `Inf`/`NaN`. + +**Compile-time vs. runtime errors**. Constant folding applies to **arithmetic** (`+ - * / %`) and **comparison** (`== != < <= > >=`) operators only. When both operands are literals (including literal expressions like `(2+1)` that fold to a literal first), V1 evaluates at parse time and any type mismatch, divide-by-zero, etc. is raised as a **compile-time (fatal) parse error**. Examples: `root = 5 + "foo"`, `root = 5 / 0`, `root = null < 3` fail at parse. 
The same operations with non-literal operands (e.g. `this.x + "foo"`) fail at runtime. The logical operators **`&&`, `||`, and the coalesce `|` are NOT constant-folded** — `root = false || "x"`, `root = true && null`, `root = null | "fallback"` all defer to runtime (and the coalesce case succeeds with `"fallback"`). + +### 5.4 Path-scoped sub-expression: `foo.(expr)` and `foo.(name -> expr)` + +Inside a path, `.(...)` introduces a **sub-expression with rebound context** (`parser/query_function_parser.go:53`, `query.NewMapMethod`). Two forms: + +**Plain form** — `this` is rebound to the preceding value: + +```blobl +root.type = this.thing.(article | comment | this).type +``` + +Within the parentheses, `this` refers to `this.thing`. Inside that scope, bare identifiers (`article`, `comment`) are the legacy shorthand for `this.article`, `this.comment` (§6.1). This is the canonical place to use `|` coalesce. + +**Named-capture form** — a named context is introduced, but `this` is **unchanged**: + +```blobl +root.sum = this.foo.bar.(thing -> thing.baz + thing.buz) +``` + +Here `thing` binds to `this.foo.bar`, and inside the body `this` is still the outer top-level value. Useful when you need both the capture and the outer `this` in the same expression. The lambda-like `name -> body` is a generic expression; it may appear anywhere `.(...)` accepts an expression, including nested chains. Names must not collide with `this`/`root` or an enclosing named context. + +Outside of a path, `this.x | this.y` works identically to the plain form due to `|`'s high precedence. + +### 5.5 Assignment operator `=` + +`=` is a **statement-level** token only (§7). It never appears inside expressions. There are no compound assignments (`+=`, `||=`, etc.) and no destructuring. + +--- + +## 6. Paths and References + +### 6.1 Root context references + +| Token | Meaning | +|-------|---------| +| `this` | Current query context. At the top level of a mapping this is the input document. 
Inside a lambda, match arm, `.(...)` scope, or `.apply(...)` body, `this` rebinds to the local value (see §6.5). | +| `root` | The output payload being constructed. Read-only in expressions (reads the partial result so far); write via assignment statements. | +| `$name` | Reference to a variable bound by `let name = ...`. | +| `@` (bare) | The whole metadata object as a value. | +| `@name` | Shorthand for `meta("name")`. Also written `@"quoted name"`. Parsed in `metadataReferenceParser`, `parser/query_function_parser.go:230`. | +| `meta("key")` | Function-call form for metadata; takes an expression key. | +| *any other ident at root* | **Legacy**: treated as `this.`. E.g. `foo` alone parses as `this.foo`. `parser/query_function_parser.go:271` — "TODO V5: Remove this and force this, root, or named contexts". Migration tools should normalise these to explicit `this.` references. | +| *named context identifier* | Introduced by any lambda (`x -> body`) or by `Environment.WithNamedContext`. E.g. inside `things.map_each(x -> x.name)`, `x` is a first-class root reference. Unlike plain `this`, named-context references survive further lambda nesting without being popped. | + +### 6.2 Field path syntax + +Paths are chains of `.`-separated segments after a root reference: + +```blobl +this.foo.bar.baz +root."foo bar"."baz.qux" # quoted segments for special chars +this.0 # numeric segment — array index +this.foo."weird.key".0.name # mix freely +``` + +Segment grammar (`fieldReferenceParser` + `quotedPathSegmentParser`): + +- **Unquoted segment** — one or more characters from `[A-Za-z0-9_]`. Leading digits are allowed. This enables `this.0` style indexing. +- **Quoted segment** — a standard double-quoted string literal. Internally `.` is encoded as `~1` and `~` as `~0` to survive the dot-joined JSON-pointer-ish representation; user code never sees this. + +There is **no bracket-indexing syntax**: `this[0]` is a **parse error**. 
For dynamic indexing use `.index(i)` (`this.items.index(i)`) or build a path via `.get(path_expr)` style methods. + +### 6.3 Path access on arrays and numeric-segment writes + +**Reading**: an unquoted numeric segment on an array value is treated as an index: + +```blobl +this.items.0 # first element +this.items.5 # sixth element; null if out of range +``` + +**Negative indices via path are not supported** — the identifier character class is `[A-Za-z0-9_]`, so `this.items.-1` is a parse error. Use `.index(-1)` for negative indices. `.index()` errors out-of-range rather than returning null — use `.or(null)` / `.catch(null)` for null-on-OOB. + +**Writing with a numeric segment** has two cases: + +- **Parent does not yet exist**: the path creates an **object** with the numeric segment as a string key. `root.items.0 = "x"` (when `root.items` is unset) produces `{"items": {"0": "x"}}`. There is **no automatic array creation** and no array gap-filling. +- **Parent is already an array**: the numeric segment **indexes into the existing array**. `root.items = [1, 2, 3]; root.items.0 = "x"` produces `{"items": ["x", 2, 3]}`. Out-of-range numeric writes error with `"found array but index 'N' exceeded target array size of 'M'"`. + +To build arrays from scratch, seed `root.items = []` then either reassign the whole array (`root.items = root.items.append(x)`) or build it in a single literal. Because numeric-path writes never *grow* an existing array, the `.append()` / full-assignment pattern is the only option. + +### 6.4 Writable paths (assignment targets) + +Targets in `target = expr` statements are a **restricted** subset of path expressions. 
The full grammar is in `parser/mapping_parser.go` (`assignmentTargetParser`); the accepted forms are: + +```blobl +root # replace payload wholesale +root.(.)* # write a nested field (quoted segments allowed on subsequent positions) +meta # replace metadata wholesale +meta # single metadata entry; bare identifier or quoted string ONLY +(.)* # legacy bare-path form: equivalent to root.. + # FIRST segment must be a bare (unquoted) identifier; + # only subsequent segments may be quoted. +``` + +**`this` and `this.` are NOT aliased to `root`/`root.` at target position.** The parser only strips a leading `root` segment (`mapping_parser.go:471-473`); `this.foo = "bar"` produces `{"this": {"foo": "bar"}}`, with `this` as a literal top-level key. Migration tools must not assume the legacy `this = …` form works. + +**`meta() = v` is NOT a valid assignment target** even though `meta("key")` reads work in expressions. The parser (`parser/mapping_parser.go` `metaStatementParser`) accepts only a bare identifier or a quoted string after `meta` at the target position. To write a dynamic metadata key, assemble an object and do a wholesale replacement: `meta = @.merge({(dynamic_key): value})` or `meta = {(key_expr): value}` (the latter wipes other entries). + +Variable bindings (`let name = ...`) are a **separate statement form** (§7.2), not an assignment with a target — they are not interchangeable with `=`-targets. + +Key constraints: + +- **No dynamic fields in assignment paths**: `root.(expr)` and `root[expr]` are not assignment targets. To write a dynamic key, use `root = root.merge({(key_expr): value_expr})` or build an object literal. +- **No variable reassignment via `=`**: `$x = ...` is invalid. Use `let x = ...` to re-bind (which overwrites — see §7.2). +- `meta "foo" = deleted()` is the canonical way to remove a single metadata entry. +- `meta = deleted()` wipes all metadata. +- `root = deleted()` marks the whole message for deletion (the processor drops it). 
+- **Whole-meta assignment requires an object**: `meta = "string"`, `meta = 5`, `meta = [1,2]` all raise a runtime error `"setting root meta object requires object value, received: "` — the `` is a Go type name (`string`, `int64`, `[]interface {}`, etc.), not a Bloblang type name. The only permitted right-hand sides are `deleted()` (clears meta) and an object value. + +### 6.5 `this` rebinding + +`this` is reassigned by many constructs. Lambdas interact with `this` in a surprising way (see the warning below). + +- **Iterator-method argument** — for collection methods like `.map_each`, `.filter`, `.fold`, `.sort`, `.any`, `.all`, the argument is evaluated once per element with `this` **rebound to the current element**. This applies whether or not the argument is a lambda. +- **Non-iterator method argument** — for methods like `.slice(start, end)`, `.format(...)`, `.index(n)`, `.get(path)`, the argument is evaluated once with `this` as the *outer* context — NOT rebound to the receiver. `this.foo.slice(this.start, this.end)` reads `start` / `end` from the top-level `this`, not from `this.foo`. + - Rule of thumb: only methods that iterate rebind `this` in their arguments. If in doubt, consult the method's docs or test empirically. +- **Lambda argument** — `items.map_each(x -> body)`: the lambda is parsed as a `NamedContextFunction`. Its `Exec` (`query/expression.go:166-175`) pops the current value off the context stack and binds it to `x`, so `body` executes with **`this` reverted to the outer (parent) context** and `x` holding the element. Idiomatic code inside a lambda always references the named parameter, never `this`. +- **`_` lambda parameter** — `items.filter(_ -> expr)`: pops the value but binds no name. Inside `expr`, `this` has reverted to the outer context and there is no name for the element. This form is rarely useful; prefer naming the parameter. +- **`.apply("map_name")`** — inside the named map's body, `this` is the receiver value. 
Variables are cleared on entry (see §10.2). +- **`.(expr)` plain form** — inside `this.foo.(bar | baz)`, `this` is `this.foo` (no context pop; inner expression sees the new context directly). +- **`.(name -> expr)` named-capture form** — the inner lambda pops the just-rebound value, so `this` effectively reverts to the outer context and `name` is bound to `this.foo`. See §5.4. +- **`match subject { ... }`** — inside each case (the pattern *and* the result), `this` is `subject`. For subject-less `match { ... }`, `this` is unchanged. +- **`if cond { ... } else { ... }` blocks** (statement form) — `this` is unchanged; blocks execute with the outer context. + +**Warning for migration tools**: the iterator-vs-non-iterator and lambda-vs-non-lambda splits mean semantically different rewrites. `items.map_each(this.value)` (element rebound) and `items.map_each(x -> this.value)` (lambda pops, outer `this`) have different meanings. Never mechanically wrap a non-lambda iterator argument in `x -> ...`. + +--- + +## 7. Statements + +A mapping is a sequence of statements separated by newlines. Each statement is one of: + +``` +Statement + = Assignment + | LetBinding + | MapDefinition + | Import + | FromImport + | RootLevelIf + | BareExpression +``` + +### 7.1 Assignment + +```blobl + = +``` + +Targets are the writable paths of §6.4. The expression is any query (§8). Multiple assignments to `root.a.b` across separate statements are **incremental**: each is a write to the (evolving) output document. + +### 7.2 `let` bindings + +```blobl +let = +let "quoted name with spaces" = +``` + +Quoted names are permitted for the binding target (matching the metadata pattern). Access is always `$` — a dollar sign followed by `[A-Za-z0-9_]+`. **Quoted bindings with non-identifier characters become write-only**: `let "has space" = 5` parses, but `$"has space"` is a parse error, so the value is unreachable. Avoid quoted names unless the name is a valid identifier. 
Variables: + +- Are stored in a **flat per-execution map** — `ctx.Vars` of type `map[string]any` (`query/package.go:50`). There is **no block scope**: `let` statements inside `if` or other blocks mutate the same map, and the re-binding remains visible after the block exits. +- Are **overwritten on re-binding**: `let x = 1` followed by `let x = 2` leaves `$x == 2`. There is no shadowing stack. +- Are **deleted by `let x = deleted()`** — `VarAssignment.Apply` in `mapping/assignment.go:57-61` explicitly checks for the delete sentinel and calls `delete(ctx.Vars, name)`. This is the way to remove a variable binding; a subsequent read of `$x` then errors as unset. +- Are **cleared at `apply` boundaries** — inside a `.apply("foo")` body, the variable environment is reset to an empty map (`query/methods.go:64`). Bindings set inside the apply do not leak out; bindings set outside are not visible inside. +- Are evaluated **eagerly at the `let` statement's position** — the right-hand side runs once at that point, not on each read. + +### 7.3 Root-level `if` / `else if` / `else` + +At the statement level an `if` block groups multiple statements: + +```blobl +if this.type == "cat" { + root.sound = "meow" + meta category = "pet" +} else if this.type == "dog" { + root.sound = "woof" +} else { + root.sound = "?" +} +``` + +Semantics: + +- Exactly one branch executes. +- Each block body is itself a sequence of mapping statements (recursive). +- Conditions must be boolean. **A `null` condition errors** in statement form with `"null literal resolved to a non-boolean value"` — unlike the expression form (§8.3) which treats `null` as falsy. Non-bool non-null values also error. +- `else` is optional; `else if` chains freely. + +Distinct from the expression form (§8.3) — statement `if` has `{...}` blocks of statements; expression `if` has `{...}` blocks of a single expression. 
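The difference in how the two `if` forms treat a `null` condition can be sketched as a pair of Go helpers. This is an illustration under stated assumptions, not the V1 implementation: `statementCondition` and `expressionCondition` are hypothetical names, the null error text is the one quoted above, and the message for other non-boolean values is invented for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errNullCondition carries the message quoted above for a null condition
// in the statement form.
var errNullCondition = errors.New("null literal resolved to a non-boolean value")

// statementCondition models the statement form: only a bool is accepted,
// and null is an error.
func statementCondition(v any) (bool, error) {
	switch c := v.(type) {
	case bool:
		return c, nil
	case nil:
		return false, errNullCondition
	default:
		// Hypothetical message; the real wording is not given in this spec.
		return false, fmt.Errorf("non-boolean condition of type %T", v)
	}
}

// expressionCondition models the expression form: null is treated as falsy,
// every other value follows the statement-form rules.
func expressionCondition(v any) (bool, error) {
	if v == nil {
		return false, nil
	}
	return statementCondition(v)
}

func main() {
	_, stmtErr := statementCondition(nil)
	exprOK, exprErr := expressionCondition(nil)
	fmt.Println("statement form:", stmtErr)
	fmt.Println("expression form:", exprOK, exprErr)
}
```

A migrator that moves a condition between the two forms would need to account for this difference, for example by keeping a nullable condition in whichever form it started in.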
+
+### 7.4 Bare expression statements
+
+A standalone expression at the top level is treated as `root = <expression>`:
+
+```blobl
+this.foo.uppercase()  # equivalent to: root = this.foo.uppercase()
+```
+
+This is a **legacy convenience** and has one wart: it may only appear as the *sole* statement in the mapping. If any other statement follows, the bare form is rejected — the parser requires an explicit `root = ...`. Migration tools should normalise to explicit `root = <expression>`.
+
+### 7.5 `map` definition
+
+See §10.
+
+### 7.6 `import` / `from`
+
+See §10.4 and §10.5.
+
+### 7.7 Statement separators
+
+- Newlines separate statements.
+- Multiple statements on one line are **not allowed** — there is no `;`.
+- Blank lines are allowed anywhere.
+- Trailing comments are allowed on any line.
+
+---
+
+## 8. Expressions
+
+### 8.1 Primary expressions
+
+```
+Primary
+  = Literal                        # §4
+  | "this" | "root" | Ident        # root references (Ident legacy-captures this.ident)
+  | "$" Ident                      # variable
+  | "@" (Ident | QuotedString)?    # metadata (bare @ = whole meta object)
+  | "(" Expression ")"             # grouping
+  | ArrayLiteral | ObjectLiteral
+  | FunctionCall                   # ident "(" args ")"
+  | IfExpression                   # §8.3
+  | MatchExpression                # §8.4
+  | LambdaExpression               # §8.5
+```
+
+### 8.2 Tails (method chains, field access, map expression)
+
+Any primary expression can be followed by a chain of:
+
+```
+Tail
+  = "." Ident                      # field access
+  | "." QuotedString               # quoted field access
+  | "." Ident "(" args ")"         # method call
+  | "." "(" Expression ")"         # map expression (rebinds this; often used with |)
+```
+
+Combined, these compose the full expression grammar. `parseWithTails` (`parser/query_function_parser.go:95`) loops building a left-to-right chain.
+
+### 8.3 `if` expression form
+
+```blobl
+root.label = if this.score > 50 { "high" } else { "low" }
+```
+
+- Each branch body is a **single expression** (not a statement list).
+- `else if` chains work the same as in statement form.
+- If there is no `else` and no branch matches, the expression produces the **`nothing` sentinel** (`value.Nothing`), not `null`. When this sentinel is assigned to a target, **the assignment is silently skipped** (prior value preserved, target absent if never set). See §9.4.
+  - Example: `root.a = 1` followed on the next line by `root.a = if false { "x" }` leaves `root.a == 1` (the second assignment is elided).
+  - This also means `.catch()` cannot rescue a non-matching `if` — there is no error to catch; the whole assignment just vanishes.
+- A **null** condition is treated as falsy (branch doesn't fire) — V1 does not error on a null condition, only on non-bool non-null values.
+
+### 8.4 `match` expression
+
+**With subject**:
+```blobl
+root.kind = match this.thing {
+  this.type == "article" => this.article
+  this.type == "comment" => this.comment
+  _ => this
+}
+```
+
+**Without subject** — `this` context is unchanged:
+```blobl
+root.kind = match {
+  this.doc.type == "article" => this.doc
+  this.doc.type == "comment" => this.doc
+}
+```
+
+**Literal equality pattern**:
+```blobl
+root.kind = match this.type {
+  "doc" => "document"
+  "art" => "article"
+  _ => this
+}
+```
+
+Matching rules (`parser/query_expression_parser.go:44-58` and the match executor):
+
+- Each arm is `<pattern> => <result>`. Arms are separated by newlines **or** commas (both accepted, and they may be mixed). A trailing comma after the last arm is tolerated.
+- The pattern is classified **at parse time**, not runtime, based on its AST shape:
+  - `_` — wildcard, always matches.
+  - A **literal** value (string, number, bool, null, literal object, literal array) — converted to an equality check against the subject via `value.ICompare` (representation-agnostic, so `5` matches `5.0`).
+  - **Any other expression** — evaluated per-arm; the **result must be `bool`**, and a `true` result indicates a match. If a non-literal expression evaluates to a non-bool (e.g.
a string or number), the match simply does not fire — no case-specific error is raised, and the next arm is tried.
+- Evaluation is top-to-bottom; first match wins. **All arm patterns are evaluated eagerly** until one matches, so a `throw()` in a later arm's pattern still fires whenever no earlier arm has matched. V1 does not short-circuit arm evaluation at the pattern level.
+- If no case matches, the expression produces the **`nothing` sentinel**, not `null`. When assigned, the assignment is silently skipped (same semantics as a non-matching `if` — see §8.3).
+- Inside each arm (both pattern and result), `this` is the subject (when present) or the outer `this` (subject-less form). This is a common footgun: `match this.user { this.name == "alice" => this.email }` evaluates `this.name` and `this.email` against the **subject** (i.e. `user.name`, `user.email`) — not against the outer top-level document. To reference the outer `this` from inside an arm, capture it in a variable first and read it back with the `$` prefix (§7.2): `let u = this` on its own line, then `match this.user { $u.role == "admin" => ... }`. A bare `u.role` would be parsed as the legacy form `this.u.role`.
+- A **null** subject (or null condition via boolean pattern) is tolerated, not an error. V1 does not raise on `match null { ... }`.
+
+Note: the literal/expression classification happens **after constant folding**. Parenthesised literals (`(5) => ...`) and arithmetic over literals (`(2+1) => ...`) collapse to `*Literal` during parse and are therefore treated as literal-equality patterns. Only patterns that cannot be fully constant-folded — function calls (`some_fn() => ...`), variable references (`$target => ...`), field references (`this.threshold => ...`), and method chains on non-literals — are treated as expression patterns that must evaluate to `bool`.
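+
+As a sketch of the arm-context rule, assuming hypothetical input fields `user.role` and `public` (neither comes from the spec): the outer document is captured in a variable before the `match`, because `this` inside every arm refers to the subject.
+
+```blobl
+let doc = this
+
+root.access = match this.user {
+  this.role == "admin" => "full"       # `this` is the subject (this.user)
+  $doc.public == true => "read-only"   # outer document via the captured variable
+  _ => "none"                          # without this arm, a miss skips the assignment
+}
+```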
+ +### 8.5 Lambdas + +```blobl +items.map_each(x -> x.value + 1) +items.filter(x -> x.active) +``` + +Lambda grammar: + +``` +Lambda = (Ident | "_") "->" Expression +``` + +- A lambda is a **first-class query expression** — the parser lists `lambdaExpressionParser` among the top-level alternatives in `queryParser` (`parser/query_parser.go:14`). In practice they are almost always used as method arguments, but they can appear anywhere a query can, and the `.(name -> body)` form (§5.4) exploits this. +- **Context handling** — a lambda compiles to a `NamedContextFunction`. At runtime, executing it *pops* the current value off the context stack and (if the name is not `_`) binds that value to the named parameter. The body therefore executes with `this` reverted to the **parent** context, not with `this` = the element. See §6.5 for the full rule and the migration warning. +- Named parameters cannot be `root` or `this` — the parser rejects those names (`parser/query_expression_parser.go:246-251`). +- Named parameters cannot **shadow** a parent lambda's parameter in the same chain — the parser tracks named contexts via `pCtx.HasNamedContext` and rejects collisions with a "would shadow a parent context" error. +- The `_` parameter is special: the context pop still happens, but no name is bound, so inside the body there is no way to reference the popped element. This is only useful when the body doesn't need the element (e.g. `items.map_each(_ -> uuid_v4())` generates a list of UUIDs with one per element, ignoring the element itself). +- Some methods pre-declare named parameters via their params spec rather than using user-named lambdas. `.sort(left > right)` passes a comparison **query** that references the implicit `left` and `right` identifiers injected by the method. V1 does **not** accept `.sort(left, right -> left > right)` — that multi-param lambda form is a parse error (`wrong number of arguments`). 
The exact invocation shape per method is declared in `query/docs.go` / the method's `Params`; migration tools should treat the argument as method-specific and not assume general lambda shape. + +### 8.6 Named arguments + +Most functions and methods accept **positional** arguments. Some also accept **named** arguments, distinguished syntactically by `name: value`: + +```blobl +range(start: 0, stop: 10, step: 2) +range(0, 10, 2) # equivalent positional form +``` + +- Positional and named arguments **cannot be mixed** in a single call (`parser/query_function_parser.go:299`). It's all-positional or all-named. +- Argument names use the stricter `snake_case` identifier form. +- Named-arg availability per function/method is declared in its `params` spec (`query/params.go`). + +--- + +## 9. Built-in Functions and Methods + +The full, authoritative list is too long to inline and is subject to version drift. Migration tools should treat these catalogues as the source of truth: + +- **Functions**: registered via `registerFunction` in `query/functions*.go`, aggregated in `query.AllFunctions`. Documented at `docs.redpanda.com/redpanda-connect/guides/bloblang/functions/`. +- **Methods**: registered via `registerMethod` in `query/methods*.go` (split by category — general, strings, numbers, structured, time, regexp, encoding, coercion, parsing, jwt, geoip). Aggregated in `query.AllMethods`. Documented at `docs.redpanda.com/redpanda-connect/guides/bloblang/methods/`. + +### 9.0 Core vs. impure-package methods + +V1 has **two tiers** of built-in methods, which confuses migration: + +- **Core methods** are registered directly in `internal/bloblang/query/*.go` (`methods.go`, `methods_structured.go`, `methods_general.go`, `methods_numbers.go`, `methods_strings.go`, etc.) and are available in every environment returned by `bloblang.NewEnvironment()` / `bloblang.GlobalEnvironment()`. Notable core-only entries include `round` (zero-arg), `ceil`, `floor`, and `count(name)`. 
+- **Extension methods** are registered in `internal/impl/pure/*.go` and only become available when that package is imported (e.g. via `_ "github.com/redpanda-data/benthos/v4/internal/impl/pure"` or through the full `cmd/benthos` binary). Notable extension-only entries include `abs`, the typed numeric coercers `int64`, `int32`, `int16`, `int8`, `uint64`, `uint32`, `uint16`, `uint8`, `float64`, `float32`, plus `pow`, `ts_parse`, `ts_format`, `ts_strptime`, `ts_strftime`, `ts_add_iso8601`, `ts_sub`, most `ts_*` formatters, `concat` (object merge variant), and `counter()`. +- **Methods that do NOT exist in V1 at all** (despite appearing in older docs or V2): `sqrt`, `map_values`, `map_entries`, `filter_entries`, `collect`, `chunk`, `char`, `ts_add_duration`, and `.round(N)` with a precision argument. `.reverse()` exists only as a string method; array-reverse does not exist in either tier — use `.sort(left > right)` to reverse an array. + +A migration tool that parses a V1 mapping against the bare `public/bloblang` environment will report extension methods as **unknown methods**. The tool should preserve those references verbatim and assume the host binary registers `internal/impl/pure` (as `cmd/benthos` and Redpanda Connect do). For methods in the "do not exist" list, the tool should flag them as unmigratable rather than preserve verbatim. + +Each function/method spec (`query/docs.go`) carries: + +- A **status**: `StatusStable`, `StatusBeta`, `StatusExperimental`, or `StatusDeprecated`. Deprecated entries are the primary signal for migration. +- A **category**: used for documentation grouping (see below). +- **Impure** flag: whether the function has side effects or non-deterministic output (affects optimiser and pure-environment restrictions). 
+
+### 9.1 Function categories (`FunctionCategory*`)
+
+```
+General, Message, Environment, FakeData, Deprecated, Plugin
+```
+
+### 9.2 Method categories (`MethodCategory*`)
+
+```
+General, Strings, Numbers, Time, Regexp, Encoding, Coercion, Parsing,
+ObjectAndArray, JWT, GeoIP, Deprecated
+```
+
+### 9.3 Migration-critical idioms
+
+Rather than enumerate everything, the following idiomatic clusters appear almost universally in real mappings and should be recognisable to any migration tool:
+
+- **Presence / defaults**: `.or(default)`, `.exists()`, `.not_null()`, `.catch(default)`.
+- **Type checks**: `.type() == "array"`, `.type() == "object"`, etc. Also `.array()`, `.object()`, `.string()`, `.number()`, `.bool()` coercers.
+- **Collections**: `.map_each(x -> ...)`, `.filter(x -> ...)`, `.fold(init, item -> ...)` (the lambda parameter is an object with `tally` and `value` fields), `.sort(left > right)` (a comparison query over the implicit `left` and `right`, not a two-parameter lambda; see §8.5), `.sort_by(field_expr)`, `.unique()`, `.flatten()`, `.length()`, `.sum()`, `.min()`, `.max()`, `.keys()`, `.values()`, `.key_values()`, `.enumerated()`, `.index(i)`, `.slice(start, end)`.
+- **Object manipulation**: `.merge(other)`, `.assign(other)`, `.without("a", "b.c", ...)`, `.get(path_expr)`, `.not_empty()`.
+- **Strings**: `.uppercase()`, `.lowercase()`, `.capitalize()`, `.trim()`, `.split(sep)`, `.join(sep)`, `.replace_all(old, new)`, `.re_replace_all(pattern, repl)`, `.contains(s)`, `.has_prefix(s)`, `.has_suffix(s)`, `.quote()`, `.format(...)`, `.escape_html()` / `.unescape_html()`, `.escape_url_query()` / `.unescape_url_query()`.
+- **Encoding/parsing**: `.parse_json()`, `.parse_yaml()`, `.parse_csv()`, `.format_json()`, `.format_yaml()`, `.encode("base64"|"hex"|...)`, `.decode(...)`, `.hash("sha256", key?)`.
+- **Time**: `now()` (returns a string, not a timestamp), `ts_parse(format)`, `ts_format(format)`, `ts_unix()`, `ts_sub(t)`, `ts_add_iso8601(duration)`. (`ts_add_duration` does not exist — use `ts_add_iso8601` with an ISO-8601 duration string like `"PT1H"`.) Most `ts_*` methods are extension-only (§9.0).
+- **Message / batch**: `content()`, `batch_index()`, `batch_size()`, `.from(idx)`, `.from_all()`. `.from(idx)` with negative or out-of-range `idx` is **not clearly defined** — behaviour depends on the `MsgBatch` implementation; tools should treat such calls as suspect. +- **State / stateful**: `count("name")` (core; named counter shared across messages — **impure**, tracks state externally). `counter()` (extension-only, in `internal/impl/pure`; monotonic per-mapping). Both make a mapping non-re-runnable in isolation. +- **Env / identity / random**: `env("FOO")`, `hostname()`, `uuid_v4()`, `uuid_v7()`, `nanoid()`, `random_int()`. +- **Errors**: `error()` (last error in chain), `throw("msg")`. + +### 9.4 Sentinel-returning functions + +Two functions return special sentinel values, not regular data. Both are recognised as "null-like" by `value.IIsNull` (`internal/value/type_helpers.go:302`), which drives the behaviour of `.or(...)` and `|` below. + +- **`deleted()`** — the *delete* sentinel (`value.Delete`). + - Assigned to `root` → drops the whole message. + - Assigned to `root.path.to.x` → removes that field (or array element at that index). + - Assigned to `meta` → clears all metadata. + - Assigned to `meta key` → removes that metadata key. + - Returned from a `.map_each` lambda → that **element is dropped from the resulting array/object** (`query/iterator.go:242-247`). The result does not contain a `null` hole. + - Returned from a match arm → produces the `deleted()` value; the surrounding assignment then applies it according to the target rules above. + - As operand to arithmetic or comparison operators → type error (the value is not a number, string, etc.). + - As operand of `\|` or `.or(fallback)` → **replaced with fallback** (treated as null). + - As operand of `.catch(fallback)` → **preserved** (not an error). 
+  - In a path tail (`deleted().foo`) → returns `null`, per §12.5 universal null-tolerant path access (the sentinel is not an object, so any field/segment access yields null).
+- **`nothing()`** — the *no-op* sentinel (`value.Nothing`). **Sources** of this sentinel include:
+  - The explicit `nothing()` function call.
+  - `if <cond> { body }` with no `else`, when the condition is falsy (§8.3).
+  - `match { ... }` with no matching arm and no wildcard (§8.4).
+  - `if cond { body1 } else if cond2 { body2 }` where no arm's condition is truthy.
+
+  **Behaviour of the sentinel**:
+  - Assigned to anything → the assignment is **silently skipped** (`mapping/statement.go:64-67`); the target is left unchanged (prior value preserved, or the key is absent if never set). The field does **not** appear as `null` in the output.
+  - Returned from a `.map_each` lambda → the **original element is preserved unchanged** (`query/iterator.go:243-244`).
+  - As operand to arithmetic or comparison operators → type error (same as `deleted()`).
+  - As operand of `|` or `.or(fallback)` → replaced with fallback (treated as null).
+  - As operand of `.catch(fallback)` → preserved (not an error — there is nothing to catch).
+  - Assigned as the value of a `let` binding → the binding is **deleted**, not set to null. A subsequent `$name` read raises `"variable 'name' undefined"`. So `let x = nothing()` has the same effect as never declaring `x`.
+
+The distinction matters for migration: a match arm returning `deleted()` vs. `nothing()` at the **same position** produces different outputs (the field is removed vs. left at its prior value). Do not collapse them.
+
+**Sentinels inside array and object literals silently omit the entry**. `root = [1, nothing(), 3]` produces `[1, 3]` — the `nothing()` element is elided, not preserved as `null`. `root = {"a": 1, "b": deleted()}` produces `{"a": 1}` — the `"b"` key is omitted. Both sentinels behave identically here (array element dropped; object key dropped).
This matters for migration tools that rewrite literal construction: a naive translation that keeps the sentinel in place will change the array length or object shape. + +### 9.5 Plugin-registered functions and methods + +See §13. + +--- + +## 10. Maps and Modules + +### 10.1 Named maps + +```blobl +map things { + root.first = this.thing_one + root.second = this.thing_two +} + +root.foo = this.value_one.apply("things") +root.bar = this.value_two.apply("things") +``` + +A **named map** is a reusable mapping body, defined at the **top level** of a source file. It cannot: + +- Be defined inside another map, function, or block. +- Contain `meta` assignments (enforced by the parser — maps produce values, not metadata). +- Contain `import` or nested `map` statements (no nesting). +- Recurse without bound — `Environment.WithMaxMapRecursion(n)` configures a per-environment limit. The default behaviour (when the option is not set) depends on the host and is not documented at the API level; tools that evaluate mappings should set an explicit limit. + +Within a map body: + +- `this` is the **receiver** — the value passed to `.apply("name")`. +- `root` is a **fresh** value scoped to the map; the map's result is that fresh `root`. +- `$var`s are **reset** on entry (variables do not leak in or out). + +### 10.2 `.apply(name)` + +`.apply("name")` is the canonical invocation. The argument is an expression; a literal string is usual, but computed names (`.apply(this.kind)`) work and allow dynamic dispatch. + +### 10.3 Built-in catalogue integration + +Unlike functions and methods, maps are user-defined only. There is no built-in map library. + +### 10.4 `import` statement + +```blobl +import "./shared/common_maps.blobl" + +root.foo = this.bar.apply("some_map_from_that_file") +``` + +- The path is resolved **relative to the importing file** when importing a file from disk, or **relative to the process working directory** for the outermost file. 
+- Imported files typically contain `map` definitions (and further `import`s); whether the parser allows top-level statements inside an imported file is not directly exercised by the test corpus — treat as undefined and restrict imported files to map definitions for portability. +- A map-name collision across imports is a parse error. +- Imports are static: the path must be a string literal. + +### 10.5 `from "path"` — direct include + +```blobl +from "./shared/base_mapping.blobl" +``` + +- The referenced file is a full mapping (top-level statements allowed), and it **replaces** the current mapping body entirely. +- `from` must be the **only** statement in the file using it; it cannot be mixed with other statements. +- Rarely used in modern corpora; most migration targets are `import` + `.apply(...)`. + +### 10.6 Environment-level imports + +`Environment.WithDisabledImports()` rejects all `import`/`from` at parse time. `Environment.WithCustomImporter(fn)` overrides the filesystem resolution — useful for embedded tools. A migration tool that receives a mapping out of context may not know what `import` paths resolve to. + +--- + +## 11. Field Interpolation (`${! ... }`) + +Field interpolation is the **expression-only** dialect used inside string configuration values in Redpanda Connect YAML configs: + +```yaml +output: + kafka: + topic: "ingest.${! this.region }.${! meta(\"tenant\").or(\"default\") }" + key: "${! this.id }" +``` + +Rules (`internal/bloblang/field/`): + +- The substring between `${!` and the matching `}` is parsed as a **single query expression**. Statements, `let`, `map`, `import`, multi-statement `if` blocks, and assignments are not allowed. +- A single config string may contain **multiple interpolations**; each one is parsed independently. The surrounding literal text is emitted verbatim. +- To emit a literal `${!...}` without interpolation, use **double braces**: `${{! expression }}` is emitted as the literal `${! expression }`. 
+- Any other `$...` sequence (not `${!`) is left verbatim, including `$foo`, `${foo}` (environment-style braces), etc.
+- Interpolation results are **coerced to string** for concatenation with the surrounding literal text. Exact coercion rules for each type (particularly null and structured values) depend on the implementation in `internal/bloblang/field/`; treat as defined by the reference impl.
+- Errors in interpolation propagate to the enclosing component, which decides whether to fail the message or use a fallback.
+
+---
+
+## 12. Error and Nullability Model
+
+### 12.1 Errors are out-of-band
+
+At runtime, evaluating an expression returns either a value **or** an error. Errors propagate through the expression eagerly: the innermost operation that fails produces the error; outer operations pass it up unless explicitly caught.
+
+### 12.2 Catch vs. null-default
+
+| Construct | Catches errors | Catches nulls | Catches `deleted()` / `nothing()` sentinels | Notes |
+|-----------|:--------------:|:-------------:|:-------------:|-------|
+| `.catch(fallback)` | yes | no | no (passed through) | Preserves non-error nulls and sentinels. |
+| `.or(fallback)` | yes | yes | **yes** (sentinels register as null) | Fallback on null, error, or sentinel. |
+| `\|` operator | yes | yes | **yes** | Identical to `.or(...)` — `coalesce` at `query/arithmetic.go:444-452`. |
+
+A **recoverable error** is one that does not crash the mapping: type mismatches in operators, missing metadata, out-of-range indexes, divide-by-zero, `throw(...)`, etc. The default behaviour when a mapping reaches completion with an unhandled error is to **reject the message** at the processor level. Configure processor-level error handling outside Bloblang.
+
+### 12.3 `error()` function
+
+Inside a mapping, after an error has been produced further up the chain *and caught*, `error()` returns the stringified error message.
More commonly used in downstream processors (catching into a branch) than within a single mapping. + +### 12.4 `throw(msg)` + +`throw("something went wrong")` produces an error with the given message. + +### 12.5 Null-safe path access + +**Path access (`.field`, `.0`, `."quoted"`) is universally null-tolerant**: it returns `null` for every case where another language would raise a type error. Not just null — any non-object receiver. `5.foo`, `true.foo`, `"hello".foo`, `[1,2,3].foo`, and `null.foo` all yield `null`. The path machinery uses gabs which treats any non-object traversal as a missing key. + +**Method calls (`.method()`) are NOT null-tolerant by default**. They have per-method type requirements and generally error on a wrong-type receiver: +- `null.length()` → error `"expected string, array or object value, got null"`. +- `null.uppercase()` → error `"expected string value, got null"`. +- `5.length()` → error `"expected string, array or object value, got number"`. + +A migration tool should therefore: +- Treat any path access as returning null for missing/wrong-type data — never assume an error. +- Treat any method call as a potential type error if the receiver type is not guaranteed. +- Use `.catch(fallback)` to handle method-on-null errors, or `.or(fallback)` to handle both null results and errors uniformly. + +**`.index(n)` on an array is a method (not a path)** and follows method semantics: out-of-range indexes are a **runtime error** (`"variable arr: index '5' was out of bounds for array size: 3"`), not `null`. Use `.index(n).catch(null)` or `.index(n).or(null)` to get null-on-OOB. + +### 12.6 Sentinel interaction summary + +The `deleted()` and `nothing()` sentinels are distinct from errors. Their exact interaction with `.or(...)`, `|`, and `.catch(...)` is covered in §9.4 and §12.2. In short: + +- `.catch(x)` treats them as successful values and **passes them through** (they aren't errors). 
+- `.or(x)` and `|` treat them as null-like and **replace them with the fallback** (via `value.IIsNull`). + +This asymmetry is load-bearing: a mapping relying on `.catch(deleted())` to keep a deletion intent will behave differently if naively rewritten to `.or(deleted())`. + +--- + +## 13. Environment, Plugins, and Extensibility + +The language is parameterised by an **Environment** (`internal/bloblang/environment.go`) that holds: + +- The registered set of functions (`Environment.Functions`). +- The registered set of methods (`Environment.Methods`). +- The import resolver (filesystem by default; overridable). +- Optional restrictions: `WithoutFunctions(names...)`, `WithoutMethods(names...)`, `OnlyPure()`, `WithDisabledImports()`. +- Named contexts injected from the host (rare, but visible in the parser). + +From Go, hosts extend the language via the public `public/bloblang/` package. The primary entry points are: + +- `env.RegisterFunctionV2(name, spec, ctor)` / `env.RegisterMethodV2(name, spec, ctor)` — register a custom function/method with a `*PluginSpec` describing parameters and docs. +- `env.RegisterFunction(name, ctor)` / `env.RegisterMethod(name, ctor)` — legacy shorthand for simple cases without a rich spec. +- `env.Parse(blobl)` returns an `*Executor` that can be `Execute(...)`d against messages. + +From YAML-only contexts, the Redpanda Connect `bloblang` processor allows no direct plugin registration; plugins are added at the binary-build level. + +**Migration implication**: a mapping validated by a custom environment may contain function/method names that are *not* in the default environment. A migration tool should: + +1. Parse against the default environment first. +2. Treat "unknown function" errors for non-standard identifiers as *plugin names* rather than failures, and emit a migration note rather than erroring out. +3. Preserve unknown identifiers verbatim in the output. + +--- + +## 14. 
Quirks, Legacy Forms, and Migration Gotchas + +This section catalogues behaviours that are accepted by the V1 parser but that a migration tool **must** handle explicitly. Many are flagged in the V1 source with `TODO V5` comments. + +1. **Bare identifiers as `this.` paths**. `foo.bar` at the start of an expression is parsed as `this.foo.bar` (`parser/query_function_parser.go:271`). Migration should rewrite to explicit `this.foo.bar`. +2. **Bare paths as assignment targets**. `foo.bar = 1` is parsed as `root.foo.bar = 1` (`parser/mapping_parser.go` target parser). Rewrite to explicit `root.foo.bar = 1`. +3. **Unusual `&&`/`||` precedence**. `a || b && c` parses as `(a || b) && c`. Always preserve original parentheses; when adding parentheses in a rewrite, match V1 semantics. +4. **High-precedence `|`**. `a + b | c` is `a + (b | c)`. Parenthesise on rewrite if unsure. +5. **Integer division produces `float64` internally**. `4 / 2` is a `float64(2.0)`, though JSON output encodes it as `2` (no trailing `.0` for whole values). Code that relies on the result being `int64` — or that compares against an int64 via `.type()` or type-strict means — will break. Use `.floor()` / `.round()` to get an integer back, or rely on the representation-agnostic `==` (quirk #6) when only value-equality matters. +6. **`==` is representation-agnostic for numbers**. `5 == 5.0` is `true`. V2 may differ; check before rewriting comparisons. +7. **Triple-quoted strings are raw**. `\n` inside `"""..."""` is a literal backslash+n. Do not mechanically re-escape. +8. **Bare-identifier object keys are interpreted as dynamic keys, not literal keys**. `{a: 1}` **parses successfully** — `a` is resolved as the legacy bare-ident form (`this.a`) and used as a dynamic key at runtime. If `this.a` isn't a string at runtime, the mapping errors with `"mapping returned invalid key type: "`. To get a literal key, use `{"a": 1}`. 
This means `{a: 1}` and `{"a": 1}` are NOT interchangeable — the first depends on `this`. Auto-rewrite should always prefer the quoted form for literal keys. +9. **Computed keys require parentheses**. `{("k_" + x): v}`, not `{"k_" + x: v}`. +10. **`this[0]` is a parse error** — use `.index(0)` or the `this.0` path form. +11. **`this.-1` is a parse error**. The path segment charset is `[A-Za-z0-9_]`, which excludes `-`. Use `.index(-1)` for last-element access. `this.0`, `this.5`, etc. are fine. +12. **`from "file"` replaces the whole mapping** — treat as a distinct migration target from `import`. +13. **Named-map bodies forbid `meta`**. A migration that tries to promote a bulk mapping into a `map` must split out meta writes. +14. **Variables are cleared at `apply` boundaries.** Don't assume `$x` set before `apply(...)` is visible inside the applied map. +15. **`root` inside a map is not the outer `root`.** It's a fresh value scoped to the map. Inner `root.x = ...` does not write to the outer document — the outer caller writes the map's result. +16. **Bare expression shorthand is single-statement-only**. `this.x.y` alone is `root = this.x.y`; adding any other statement makes it a parse error. Always emit explicit `root = ...` on rewrite. +17. **`nothing()` silently no-ops assignments**. `root.x = nothing()` preserves any existing `root.x` value (or leaves the key absent). `root = nothing()` at top level emits the **input unchanged** — equivalent to "this message passes through". Mappings relying on conditional `nothing()` returns to "skip" an assignment must be preserved; a naive conditional rewrite that always writes `null` changes semantics. +18. **`deleted()` has different meaning at each target level**. Whole-message delete (`root = deleted()`) vs. field removal (`root.x = deleted()`) vs. meta removal (`meta key = deleted()`). Migrations must preserve the target. +19. **`meta` assignment with bare identifier vs. 
quoted string**: `meta foo = v` and `meta "foo" = v` are equivalent. The `meta(expr) = v` form is **not** a valid assignment target — see Quirk #45. +20. **`@` alone is the whole metadata object**; `@foo` is `meta("foo")`. Don't confuse with `this.@foo` (which isn't valid). +21. **Plugin-registered functions and methods** are invisible without the plugin context. Tools should preserve unknown identifiers rather than reject. +22. **Imports resolve relative to the file**, not the mapping's logical location. When rewriting, rebase paths if the file moves. +23. **Recursive map calls** are allowed up to an environment-dependent depth. Don't flatten recursion during rewrite without checking that depth is bounded. +24. **Short-circuit evaluation of `&&`/`||` IS guaranteed by the implementation** (`query/arithmetic.go:396-442`), even though older docs hedge on this. Example: `false && (1 / this.x)` returns `false` without ever evaluating the division. Note that this does NOT rescue type errors in path access — `this != null && this.foo > 0` still errors when `this = {}`, because both operands evaluate and `null > 0` is a type error (the `&&` short-circuits *only* when the LHS is `false`). +25. **Hex/binary/exponent numeric literals are not supported**. Source like `1e6`, `0x10`, or the short forms `.5` / `5.` is a parse error. +26. **Integer overflow is silent** — e.g. `9223372036854775800 + 100` wraps to `-9223372036854775716` per Go int64 semantics; there is no automatic promotion to float or big-int. V1 has no `^` or `**` exponentiation operator, so express large constants as literals directly. A migration tool should flag large-constant arithmetic for review. +27. **Division and modulo by zero raise an error**, not `Inf` or `NaN`. `1 / 0` is `ErrDivideByZero`. +28. **Booleans are not orderable**. `true > false` is a type error — V1 refuses the comparison rather than using Go's `false < true` convention. +29. 
**Timestamp comparisons work by accident**: timestamps are RFC3339Nano-formatted and compared as strings, which happens to produce the right order for well-formed timestamps in the same timezone. Mixed-timezone or mixed-format timestamps may compare incorrectly. +30. **`==` across types usually returns `false` rather than erroring**. `5 == "5"` is `false`, not a type error (`query/type_helpers.go:839-892`). Migration tools may choose to preserve or normalise these. +31. **`.from(idx)` with negative or out-of-range index is implementation-defined** — depending on the `MsgBatch` implementation, it may panic, return `nil`, or wrap. Treat as suspect on migration. +32. **Count/counter stateful functions** (`count("name")`) persist state between messages. A mapping that uses them behaves differently when run in isolation vs. in a running pipeline. Tooling that evaluates mappings for migration testing must seed this state explicitly. +33. **Bracketed named-capture form** — `foo.(name -> body)` binds `name` but leaves `this` unchanged; `foo.(body)` rebinds `this` to `foo`. The two forms are semantically different even when `body` looks the same. +34. **`.map_each` treats `deleted()` and `nothing()` differently** — `deleted()` drops the element; `nothing()` keeps the original element unchanged. Do not substitute one for the other during rewrite. +35. **Lambda arguments pop the context**. In `items.map_each(x -> body)`, `body` executes with `this` = the **outer** context, not the element. Only the named parameter `x` refers to the element. Contrast `items.map_each(this.value)` (no lambda), where `this` IS the element. Migration tools must never mechanically wrap a non-lambda argument in `x -> ...` — semantics change. See §6.5 for the rule and `query/expression.go:166-175` for the implementation (`NamedContextFunction.Exec` calls `ctx.PopValue`). +36. **`.catch(...)` and `.or(...)` treat sentinels differently**. 
`.catch(x)` passes `deleted()` / `nothing()` through untouched; `.or(x)` replaces them with the fallback. They cannot be used interchangeably when sentinels are in play. See §9.4 and §12.2. +37. **Constant folding turns some runtime errors into compile errors** — but only for **arithmetic** (`+ - * / %`) and **comparison** (`== != < <= > >=`) operators. With two literal operands, V1 evaluates these at parse time and raises any error fatally: `root = 5 / 0`, `root = true + false`, `root = null < 3` all fail at parse. The **logical operators `&&`, `||`, and the coalesce `|` are NOT constant-folded** — they defer to runtime even when both operands are literals. So `root = false || "x"` runs and errors at runtime; `root = null | "fallback"` runs and returns `"fallback"`. Expressions with one non-literal operand always defer to runtime. See §5.3. +38. **`==` is asymmetric for `bool` vs. `number` operands**. `true == 1` returns `true` (bool-path coerces number to bool via `IGetBool`); `1 == true` returns `false` (number-path cannot coerce bool to number). Never swap comparison operand order during rewrite. See §5.3 note on comparison asymmetry. +39. **`%` truncates float operands to `int64`** silently — `7.5 % 2.5` evaluates as `7 % 2 == 1`, not a type error. If the mapping intends an fmod-like operation, the V1 result will be wrong. See §5.3. +40. **String `.length()` returns byte count, not codepoint count**. `"héllo".length()` is `6` (é is 2 bytes UTF-8), `"🎉".length()` is `4`. Migration tools should flag any test or mapping that assumes codepoint semantics from `.length()` on a string. Array `.length()` and object `.length()` behave as expected (element / key count). For codepoint counts, V1 has no built-in equivalent to V2's string-length semantics — rewrite to `.split("").length()` or a regex-based count if migrated. +41. **Lambda bodies cannot start on a new line**. 
`items.map_each(x ->` followed by a newline and the body is a parse error — both sides of `->` use `SpacesAndTabs` only. Keep the lambda body on the same line as `->`, or move the whole expression inside `.(...)`/let-binding on one line. See §2.1. +42. **Arithmetic/comparison/logical operators reject a leading newline**. `a\n + b` is a parse error; `a +\n b` is fine. Migration tools pretty-printing long expressions must break *after* the operator, not before. +43. **Method/path dots cannot have whitespace before them**. `a.b` and `a.\n b` work; `a .b`, `a\n .b`, `a\n.b` all fail. A mapping that pretty-prints method chains across lines must break *after* each `.`. See §2.1. +44. **`if`-without-matching-branch and non-matching `match` produce `nothing`, not `null`**. The resulting assignment is silently skipped (§8.3, §8.4, §9.4). `.catch()` cannot rescue this — there is no error to catch. +45. **`meta() = v` is NOT a valid assignment target**, even though `meta("key")` as a read works. Only bare identifier or quoted string is accepted after `meta` on the LHS. See §6.4. +46. **Numeric path segments as write targets create OBJECT keys, not array indices**. `root.items.0 = "x"` produces `{"items": {"0": "x"}}`. No array gap-filling. See §6.3. +47. **`@` and `meta` refer to the SAME underlying map**; there is no copy-on-write separation between input-metadata reads and output-metadata writes. A later `@key` expression sees the most recent `meta key = …` write. +48. **`&&` and `||` coerce numbers to bool** via `IGetBool` — `true && 1` is `true`, `0 || true` is `true`. They are strict about strings/null/arrays/objects (which error). See §5.3. +49. **`.number()` always returns `float64`** — `"42".number()` is `42.0`, and `.sum()` / `.min()` / `.max()` on arrays return `float64` even when inputs are all-int. Integer methods like `.floor()` / `.round()` reduce to int64 when needed. See §3 and §9.0. +50. 
**V1 `assign` is a deep merge** (overrides per-key), and `merge` *combines* duplicate keys into arrays — this is the opposite of V2's naming. Read the method docs before rewriting either. +51. **`.all()` on an empty array returns `false`** in V1 (not `true` by vacuous truth). `.any()` on an empty array returns `false` as expected. +52. **`.fold()` is NOT curried in V1**. Despite occasional older docs showing `fold(init, tally -> value -> expr)`, the parser rejects that form as a name collision. The actual form is `.fold(init, item -> expr)` where `item` is an **object with `{tally, value}` fields**. + Similarly, `.sort(left, right -> ...)` (multi-parameter lambda) does not parse. Use `.sort(left > right)` — `left` and `right` are method-injected implicit names, not user-named lambda parameters. +53. **`.reverse()` is string-only in V1**. There is no array-reverse method in either the core registry or `internal/impl/pure` — grepping the source finds one `reverse` registration (for strings) at `methods_strings.go:1477`. The closest array idiom is a comparator sort, `.sort(left > right)`, but that orders elements descending: it matches a true reversal only when the array is already sorted ascending, and it does not reverse arbitrary element order. +54. **`.round()` takes no arguments in either tier**. The method is core-registered (`methods_numbers.go:226`) and does not accept a precision argument; `5.5.round(2)` produces `"wrong number of arguments, expected 0, got 1"`. There is **no** `.round(N)` form in V1. For decimal-precision rounding, scale manually: `(x * 100).round() / 100`. +55. **`.index()` silently truncates non-whole float arguments**. `[1,2,3].index(1.7)` behaves like `.index(1)`. +56. **`find()` on arrays returns Go `int`, not `int64`** — an unusual type-width quirk that can cause subtle comparison / arithmetic mismatches. Normalise with `.number()` if needed. +57. **`now()` returns a `string`**, not a `timestamp`. For a typed timestamp use `ts_parse(...)` (extension-only) on the string, or build a `time.Time` via other means. +58. **`range(a, b)` is a compile-time validated builtin**.
`range(5, 5)` errors at parse. `range(0, 10, 3)` yields `[0, 3, 6]` (integer truncation of `(stop-start)/step`, not inclusive of the stop bound). +59. **`random_int` validates arguments at compile time** — negative `min`, `min > max`, etc. fail to parse. +60. **`format_json` HTML-escapes by default** (`<` → `\u003c`) and returns the literal string `"null"` when called on an empty array. +61. **`.apply("name")` resolves the map name at runtime**, not compile time. `.apply("missing")` produces a runtime error, not a parse error — tools validating imports against a manifest should know this. +62. **Recursion-limit errors ARE catchable** via `.catch()` — they come through as ordinary runtime errors even though they originate deep in the interpreter. +63. **Error messages carry a prefix describing the failure source**. Common prefixes observed: `"field \`\`: "` (e.g. `"field \`this.b\`: cannot divide types number (from number literal) and null (from field \`this.b\`)"`) and `"number literal: "` (e.g. `"number literal: attempted to divide by zero"` — specifically when the divisor is a literal zero that triggers compile-time folding). Other forms such as `"null literal: value is null"` and `"string literal: "` appear in the code but are only exercised under specific conditions. Migration tools that substring-match on V1 errors should anchor on the inner phrase rather than the prefix. +64. **The `throw` function's argument is named `why`** (not `msg`). Compile errors about wrong type/arity mention `why`, not `throw`: `"missing parameter: why"`, `"field why: wrong argument type, expected string, got number"`. +65. **Top-level statements in imported files** have two behaviours depending on whether the file contains any `map` definitions: + - **File with no `map` definitions** (only top-level statements or `let`s): `import "file"` **fails at parse time** with `"no maps to import from ''"` (`parser/mapping_parser.go:229`). The import cannot proceed.
+ - **File with at least one `map` and some top-level statements**: the import succeeds, but the top-level statements are **silently dropped** — their side effects don't run, and their `let` bindings are invisible to any `.apply()` call on the caller side. A caller referencing a var "defined" in the imported file raises `"variable 'x' undefined"` at runtime. + Migration tools that reformat imports should either reject files that rely on top-level statements or hoist those statements into the caller. +66. **Whole-meta assignment requires an object**. `meta = "str"`, `meta = 5`, `meta = [1,2]` all raise runtime `"setting root meta object requires object value, received: "`. See §6.4. +67. **Path collision on assignment raises a runtime error**. `root.user = "Alice"; root.user.name = "Jane"` fails with `"unable to set target path user.name as the value of user was a non-object type (string)"`. Migration tools should order/restructure assignments to avoid setting a scalar on a path prefix that is later extended. +68. **Assignment `=` requires whitespace on both sides**. `root.a = 1` is fine; `root.a=1`, `root.a =1`, `let x=5` are all parse errors (`"expected whitespace"`). This is specific to the statement-level `=` — binary operators have no such restriction (`1+2` works). See §2.1. +69. **Double-not `!!x` is a parse error**. The `!` prefix is an optional-once position (`parseWithTails` uses `Optional(!)` — zero or one), not repeatable. Write `!(!x)` if double negation is needed. See §5.1. +70. **`.type()` on sentinels returns `"delete"` or `"nothing"`**. In addition to the eight canonical type names (null, bool, number, string, bytes, timestamp, array, object), calling `.type()` directly on a sentinel exposes its sentinel kind: `deleted().type()` → `"delete"`, `nothing().type()` → `"nothing"`. Useful for defensive code; migration tools should not expect only the canonical eight. +71. **Sentinels inside array and object literals silently elide the entry**.
`[1, nothing(), 3]` → `[1, 3]`; `{"a": 1, "b": deleted()}` → `{"a": 1}`. See §9.4. This changes array length and object shape in ways a naive rewrite may not preserve. +72. **`this` and `this.` are not valid `root` aliases at assignment-target position**. `this = v` and `this.foo = v` parse but produce `{"this": v}` / `{"this": {"foo": v}}` — literal top-level `"this"` key — not `{..}` on root. The parser only strips the leading `root` segment. See §6.4. +73. **Statement-form `if null { ... }` errors**; expression-form `if null { ... }` treats null as falsy. Different code paths. `query/expression.go:99-105` has the null-is-falsy accommodation; `mapping/statement.go:143-146` does not. See §7.3 and §8.3. +74. **Method-argument context rebinding depends on the method**. Iterator methods (`map_each`, `filter`, `fold`, `sort`, `any`, `all`) rebind `this` to the current element in their non-lambda argument. Non-iterator methods (`slice`, `format`, `index`, `get`, most others) do not — their arguments see the outer `this`. See §6.5. +75. **Match pattern literal/expression classification happens after constant folding**. `(2+1) => ...` folds to `3` and becomes a literal-equality pattern. Only patterns that survive folding as a non-literal (function calls, variable refs, field refs on non-literals) are treated as boolean predicates. See §8.4. +76. **`let "" = …` parses but `$""` does not**. Quoted variable bindings with non-identifier characters are write-only — unreadable. Use only identifier-valid names for variables. See §7.2. + +--- + +## 15. Grammar Summary (Informal EBNF) + +``` +Mapping = { Statement (Newline | EOF) } + +Statement = Assignment + | LetBinding + | MapDefinition + | ImportStatement + | FromStatement + | RootLevelIf + | BareExpression # only if it's the sole statement + +Assignment = Target "=" Expression +Target = "root" ("." PathSegment)* + | "this" ("." 
PathSegment)* # legacy = root… + | "meta" [ BareKey | QuotedString ] # whole-meta or single key (no computed keys) + | BareKey ("." PathSegment)* # legacy bare path = root.… + +LetBinding = "let" (Ident | QuotedString) "=" Expression + +MapDefinition = "map" Ident "{" { Statement (Newline) } "}" + +ImportStatement = "import" StringLiteral +FromStatement = "from" StringLiteral + +RootLevelIf = "if" Expression "{" { Statement } "}" + { "else" "if" Expression "{" { Statement } "}" } + [ "else" "{" { Statement } "}" ] + +BareExpression = Expression # only as sole statement + +Expression = ArithmeticChain +ArithmeticChain = [ "-" ] Term { BinOp [ "-" ] Term } +BinOp = "+" | "-" | "*" | "/" | "%" + | "==" | "!=" | "<" | "<=" | ">" | ">=" + | "&&" | "||" | "|" + +Term = [ "!" ] Unary + +Unary = Primary { Tail } +Primary = Literal + | "this" | "root" + | Ident # legacy root-scoped access (= this.Ident) + | "$" Ident + | "@" [ Ident | QuotedString ] + | "(" Expression ")" + | ArrayLiteral + | ObjectLiteral + | FunctionCall + | IfExpression + | MatchExpression + | LambdaExpression + +Tail = "." PathSegment # field access + | "." Ident "(" [ Args ] ")" # method call + | "." 
"(" Expression ")" # map expression + +PathSegment = Ident | QuotedString + +FunctionCall = Ident "(" [ Args ] ")" +Args = Arg { "," Arg } # all-positional or all-named, not mixed +Arg = Expression | Ident ":" Expression + +IfExpression = "if" Expression "{" Expression "}" + { "else" "if" Expression "{" Expression "}" } + [ "else" "{" Expression "}" ] + +MatchExpression = "match" [ Expression ] "{" MatchCase { Sep MatchCase } "}" +MatchCase = ( "_" | Expression ) "=>" Expression +Sep = "," | Newline + +LambdaExpression= (Ident | "_") "->" Expression + +ArrayLiteral = "[" [ Expression { "," Expression } [ "," ] ] "]" +ObjectLiteral = "{" [ ObjectMember { "," ObjectMember } [ "," ] ] "}" +ObjectMember = (QuotedString | "(" Expression ")") ":" Expression + +Literal = IntLit | FloatLit | QuotedString | TripleQuotedString + | "true" | "false" | "null" + +IntLit = Digit+ +FloatLit = Digit+ "." Digit+ +QuotedString = "\"" { EscapedChar | NonQuote } "\"" +TripleQuotedString = "\"\"\"" { any } "\"\"\"" +Ident = [A-Za-z0-9_]+ # lenient (path segments, lambda params) +BareKey = [a-z0-9_]+ # snake_case (function/method names, named args, meta keys) +``` + +This EBNF is *informal* — the real parser is a hand-written combinator with specific ordering and lookahead choices. For corner cases, consult `internal/bloblang/parser/`. + +--- + +## 16. 
File Map + +| Concern | Source | +|---------|--------| +| Parser entry and dispatch | `internal/bloblang/parser/mapping_parser.go`, `parser/query_parser.go` | +| Expression tails and paths | `parser/query_function_parser.go` | +| Arithmetic, precedence, coalesce | `parser/query_arithmetic_parser.go`, `query/arithmetic.go` | +| If / match / lambda / parens | `parser/query_expression_parser.go`, `parser/root_expression_parser.go` | +| Literals | `parser/query_literal_parser.go`, `parser/combinators.go` | +| Field interpolation dialect | `parser/field_parser.go`, `field/` | +| Assignment semantics | `mapping/assignment.go`, `mapping/statement.go` | +| Built-in functions | `query/functions*.go`, registry in `query/function_set.go` | +| Built-in methods | `query/methods*.go`, registry in `query/method_set.go` | +| Docs metadata for each spec | `query/docs.go`, `query/params.go` | +| Environment / plugin API | `internal/bloblang/environment.go`, `plugins/` | + +--- + +## 17. Known Gaps in This Spec + +This document is descriptive, not exhaustive on individual built-ins. In particular: + +- **Per-function and per-method semantics** are not enumerated here. Use `query/docs.go` registrations and the online docs as the source of truth. +- **Deprecated builtins** are not individually listed. Enumerate by scanning the registry for `StatusDeprecated`. +- **Plugin-provided builtins** are inherently out of scope for a static document. +- **Implementation-defined behaviour** under extreme inputs (very large numbers, deep recursion, enormous strings) is not specified here; measure against the reference implementation. + +When a migration tool encounters a construct not described here, default to: parse with the reference parser, preserve verbatim, and flag for human review. 
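
Several quirks listed earlier are attributed to plain Go semantics rather than to Bloblang-specific logic: int64 wrap-around (Quirk #26), `%` truncating float operands to `int64` (Quirk #39), and byte-based string length (Quirk #40). When measuring a mapping against the reference implementation, it can help to reproduce the expected value in Go first. The following is a minimal sketch under that assumption. `wrapAdd`, `truncMod`, and `byteAndRuneLen` are hypothetical helper names introduced for illustration; they are not functions from the interpreter, and the sketch does not call any Bloblang code.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// wrapAdd adds two int64 values. Go defines signed overflow as
// two's-complement wrap-around, which is the behaviour Quirk #26 describes.
// Hypothetical helper, not part of the V1 interpreter.
func wrapAdd(a, b int64) int64 {
	return a + b
}

// truncMod converts both float operands to int64 (dropping the fraction)
// before taking the remainder, mirroring the result described in Quirk #39.
// Hypothetical helper, not part of the V1 interpreter.
func truncMod(a, b float64) int64 {
	return int64(a) % int64(b)
}

// byteAndRuneLen returns the UTF-8 byte length and the codepoint count of s.
// Quirk #40 says V1 string .length() corresponds to the first value.
// Hypothetical helper, not part of the V1 interpreter.
func byteAndRuneLen(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	fmt.Println(wrapAdd(9223372036854775800, 100)) // -9223372036854775716
	fmt.Println(truncMod(7.5, 2.5))                // 1
	fmt.Println(byteAndRuneLen("h\u00e9llo"))      // 6 5
}
```

Comparing these values with the output of the corresponding V1 mapping is one way to check that a migration fixture encodes the V1 result rather than the mathematically expected one.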
diff --git a/internal/bloblang2/migrator/v1spec/README.md b/internal/bloblang2/migrator/v1spec/README.md new file mode 100644 index 000000000..795612068 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/README.md @@ -0,0 +1,46 @@ +# V1 Spec Compliance Suite + +A Go test harness + YAML corpus that verifies the **Bloblang V1 interpreter** behaves the way `../bloblang_v1_spec.md` says it does. This is not a test of the migrator — the migrator doesn't exist yet — it's groundwork: the same corpus will later serve as fixture data for migrator round-trip tests, and the V1 spec itself was refined by investigating discrepancies this suite surfaced. + +## Layout + +- **`tests/`** — 128 YAML files mirroring `../../spec/tests/`. Each is a V1 equivalent of the corresponding V2 conformance test. The schema is identical (reuses `internal/bloblang2/go/spectest` for loading), with one added field: `skip: ""` on tests that have no direct V1 equivalent. +- **`interp.go`** — `V1Interp` implements `spectest.Interpreter` using `public/bloblang` for compilation and `mapping.Executor.ExecOnto` for execution. Executes directly rather than via `MapPart` to preserve raw scalar types (`MapPart` stringifies through the message body, which would re-parse `"true"` as a bool). +- **`runner.go`** — `RunT` wraps `spectest.RunT` and pre-scans each YAML for the `skip` field, surfacing those tests via `t.Skip` rather than compiling them. +- **`v1spec_test.go`** — `TestBloblangV1Spec` entrypoint. + +## Running + +From `internal/bloblang2/`: + +```sh +task test:v1spec # run the full suite +task test:v1spec -- -v # verbose +task test:v1spec -- -run 'TestBloblangV1Spec/types/bool_null' # one file +``` + +Or from the repo root: + +```sh +go test ./internal/bloblang2/migrator/v1spec/... -run TestBloblangV1Spec +``` + +A test **passes** when the V1 interpreter produces the `output` / `deleted` / `error` / `compile_error` the YAML expects. 
**Skips** are V2-only constructs that can't be expressed in V1 at all. Current state: 2090 pass, 0 fail, 984 skip across 3074 test cases. + +## Intended uses + +1. **Spec validation** — fixing a failure generally exposes a V1 behaviour the spec should document more clearly. The spec and this corpus evolved together. +2. **Migrator fixture data** — when the migrator tool is built, it will be fed the V2 source tests and asked to produce mappings equivalent to these V1 tests. Any divergence is a migration bug. + +## Schema extension + +The base schema is from `internal/bloblang2/go/spectest` (see its `TEST_PLAN.md` / `README.md`). One migrator-specific addition: + +```yaml +- name: "uses V2-only typed numeric width" + mapping: | + root = this.x.int32() + skip: "V1 has no typed numeric widths" +``` + +A test with `skip:` is surfaced via `t.Skip(reason)` and otherwise ignored. The runner does not compile or execute skipped mappings, so the mapping field can contain non-V1 code if needed for documentation. diff --git a/internal/bloblang2/migrator/v1spec/interp.go b/internal/bloblang2/migrator/v1spec/interp.go new file mode 100644 index 000000000..0e46c2162 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/interp.go @@ -0,0 +1,113 @@ +// Package v1spec provides a Bloblang V1 spec-test runner. It adapts the +// shared spectest schema (originally built for V2 conformance) so that the V1 +// equivalents under ./tests can be executed by the V1 interpreter and their +// outputs compared against the same expectations. 
+package v1spec + +import ( + "fmt" + "strings" + + "github.com/redpanda-data/benthos/v4/internal/bloblang/mapping" + "github.com/redpanda-data/benthos/v4/internal/bloblang/query" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/spectest" + "github.com/redpanda-data/benthos/v4/internal/message" + "github.com/redpanda-data/benthos/v4/internal/value" + "github.com/redpanda-data/benthos/v4/public/bloblang" + + // Side-effect import: registers the typed-numeric methods (int32, int64, + // uint32, uint64, float32, float64, abs, pow, round(N), etc.), the ts_* + // formatters, and other extension-only builtins that ship with Redpanda + // Connect but aren't in the bare public/bloblang environment. These are + // what most real V1 mappings depend on, so the spec-compliance suite + // should run against them. + _ "github.com/redpanda-data/benthos/v4/internal/impl/pure" +) + +// V1Interp implements spectest.Interpreter using the public V1 Bloblang API. +type V1Interp struct{} + +// Compile parses a V1 mapping, wiring any in-memory import files through a +// custom importer. 
+func (V1Interp) Compile(src string, files map[string]string) (spectest.Mapping, error) { + env := bloblang.NewEnvironment() + if len(files) > 0 { + env = env.WithCustomImporter(func(name string) ([]byte, error) { + if content, ok := files[name]; ok { + return []byte(content), nil + } + if content, ok := files[strings.TrimPrefix(name, "./")]; ok { + return []byte(content), nil + } + return nil, fmt.Errorf("import %q not found in test files", name) + }) + } + exec, err := env.Parse(src) + if err != nil { + return nil, &spectest.CompileError{Message: err.Error()} + } + uw, ok := exec.XUnwrapper().(interface{ Unwrap() *mapping.Executor }) + if !ok { + return nil, &spectest.CompileError{Message: "internal: executor does not expose unwrapper"} + } + return &v1Mapping{exec: uw.Unwrap()}, nil +} + +type v1Mapping struct { + exec *mapping.Executor +} + +// Exec runs the V1 mapping against the given input + metadata. It uses +// Executor.ExecOnto directly (rather than MapPart) to preserve the raw Go type +// of the mapped root value — MapPart stringifies scalars through a message +// body, which would re-parse `"true"` as a bool. +func (m *v1Mapping) Exec(input any, meta map[string]any) (any, map[string]any, bool, error) { + // message.Part holds output metadata (and batch-scoped meta reads). 
+ part := message.NewPart(nil) + if input != nil { + part.SetStructured(input) + } + for k, v := range meta { + part.MetaSetMut(k, v) + } + batch := message.Batch{part} + + vars := map[string]any{} + var newValue any = value.Nothing(nil) + + ctx := query.FunctionContext{ + Maps: m.exec.Maps(), + Vars: vars, + Index: 0, + MsgBatch: batch, + NewMeta: part, + NewValue: &newValue, + }.WithValue(input) + + if err := m.exec.ExecOnto(ctx, mapping.AssignmentContext{ + Vars: vars, + Meta: part, + Value: &newValue, + }); err != nil { + return nil, nil, false, err + } + + switch newValue.(type) { + case value.Delete: + return nil, nil, true, nil + case value.Nothing: + // Mapping made no payload assignment — preserve the input. + newValue = input + } + + outMeta := map[string]any{} + _ = part.MetaIterMut(func(key string, v any) error { + outMeta[key] = v + return nil + }) + + return newValue, outMeta, false, nil +} + +// Compile-time guard that V1Interp satisfies spectest.Interpreter. +var _ spectest.Interpreter = V1Interp{} diff --git a/internal/bloblang2/migrator/v1spec/runner.go b/internal/bloblang2/migrator/v1spec/runner.go new file mode 100644 index 000000000..9b6909cd8 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/runner.go @@ -0,0 +1,129 @@ +package v1spec + +import ( + "fmt" + "os" + "path/filepath" + "testing" + + "gopkg.in/yaml.v3" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/spectest" +) + +// skipProbe is used to read just the `skip` field of each test case so we can +// filter them out before handing the file to the shared spectest runner, which +// does not understand `skip`. +type skipProbe struct { + Tests []struct { + Name string `yaml:"name"` + Skip string `yaml:"skip"` + } `yaml:"tests"` +} + +// skippedNames returns a set of test-case names (and reasons) marked with +// `skip:` in the raw YAML. The spectest schema discards unknown fields, so we +// read the file twice — once for this probe, once for the real loader. 
+func skippedNames(path string) (map[string]string, error) { + data, err := os.ReadFile(path) + if err != nil { + return nil, err + } + var probe skipProbe + if err := yaml.Unmarshal(data, &probe); err != nil { + return nil, fmt.Errorf("parsing skip probe for %s: %w", path, err) + } + out := map[string]string{} + for _, t := range probe.Tests { + if t.Skip != "" { + out[t.Name] = t.Skip + } + } + return out, nil +} + +// RunT walks every YAML file under dir, filters tests marked `skip:` (recording +// them via t.Skip for visibility), and runs the remainder against the given +// interpreter through spectest. +// +// Layout matches spectest.RunT: a subtest per file, a sub-subtest per case. +func RunT(t *testing.T, dir string, interp spectest.Interpreter) { + t.Helper() + + files, err := spectest.DiscoverFiles(dir) + if err != nil { + t.Fatalf("discovering test files: %v", err) + } + if len(files) == 0 { + t.Fatalf("no test files found in %s", dir) + } + + for _, path := range files { + rel, relErr := filepath.Rel(dir, path) + if relErr != nil { + rel = path + } + t.Run(rel, func(t *testing.T) { + skips, err := skippedNames(path) + if err != nil { + t.Fatalf("loading skip probe: %v", err) + } + tf, err := spectest.LoadFile(path) + if err != nil { + t.Fatalf("loading test file: %v", err) + } + for i := range tf.Tests { + tc := &tf.Tests[i] + if reason, ok := skips[tc.Name]; ok { + t.Run(tc.Name, func(t *testing.T) { + t.Skipf("skipped by test file: %s", reason) + }) + continue + } + runSingle(t, tf, tc, rel, interp) + } + }) + } +} + +// runSingle replicates spectest.RunT's per-test dispatch for one test case, +// inline so we can interleave skip handling. Multi-case tests are handed off to +// spectest's internal runner via a slim TestFile copy. 
+func runSingle(t *testing.T, tf *spectest.TestFile, tc *spectest.TestCase, rel string, interp spectest.Interpreter) { + t.Helper() + + // Isolate this test case in a single-test TestFile so spectest.RunFile + // handles the compile/exec/compare dance — including multi-case + // (`cases:`) tests. + single := &spectest.TestFile{ + Description: tf.Description, + Files: tf.Files, + Tests: []spectest.TestCase{*tc}, + } + results := spectest.RunFile(single, rel, interp) + + if len(tc.Cases) > 0 { + t.Run(tc.Name, func(t *testing.T) { + for _, r := range results { + caseName := r.Case + if caseName == "" { + caseName = "(case)" + } + t.Run(caseName, func(t *testing.T) { + if r.Err != nil { + t.Fatal(r.Err) + } + }) + } + }) + return + } + + t.Run(tc.Name, func(t *testing.T) { + for _, r := range results { + if r.Err != nil { + t.Fatal(r.Err) + } + } + }) +} diff --git a/internal/bloblang2/migrator/v1spec/tests/access/dynamic_access.yaml b/internal/bloblang2/migrator/v1spec/tests/access/dynamic_access.yaml new file mode 100644 index 000000000..6a629c5b3 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/access/dynamic_access.yaml @@ -0,0 +1,158 @@ +description: "Dynamic access with [expr]: arrays, objects, strings, bytes; type errors for wrong index types" + +tests: + # --- Object dynamic access --- + + - name: "object dynamic access with string literal" + mapping: | + let obj = {"name": "Alice"} + root.v = $obj.get("name") + output: {"v": "Alice"} + + - name: "object dynamic access with variable" + mapping: | + let obj = {"color": "blue"} + let key = "color" + root.v = $obj.get($key) + output: {"v": "blue"} + + - name: "object dynamic access with expression" + mapping: | + let obj = {"key_a": "found"} + let suffix = "a" + root.v = $obj.get("key_" + $suffix) + output: {"v": "found"} + + - name: "object dynamic access non-existent key returns null" + mapping: | + let obj = {"a": 1} + root.v = $obj.get("missing") + output: {"v": null} + + - name: "object dynamic 
access with integer key is error" + mapping: | + let obj = {"name": "Alice"} + root.v = $obj.get(0) + compile_error: "expected string, got number" + + - name: "object dynamic access with bool key is error" + mapping: | + let obj = {"name": "Alice"} + root.v = $obj.get(true) + compile_error: "expected string, got bool" + + - name: "object dynamic access with null key is error" + mapping: | + let obj = {"name": "Alice"} + root.v = $obj.get(null) + compile_error: "expected string, got null" + + # --- Array dynamic access --- + + - name: "array index zero" + mapping: | + let arr = ["a", "b", "c"] + root.v = $arr.index(0) + output: {"v": "a"} + + - name: "array index middle" + mapping: | + let arr = ["a", "b", "c"] + root.v = $arr.index(1) + output: {"v": "b"} + + - name: "array index last" + mapping: | + let arr = ["a", "b", "c"] + root.v = $arr.index(2) + output: {"v": "c"} + + - name: "array index with float whole number accepted" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(2.0) + output: {"v": 30} # FIXME-v1: verify - V1 .index() may require int + + - name: "array index with non-whole float is error" + skip: "V1 .index() silently truncates non-whole floats to int (1.5 → 1); no error raised" + + - name: "array index with string is error" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index("0") + compile_error: "expected number value, got string" + + - name: "array index with bool is error" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(true) + compile_error: "expected number value, got bool" + + - name: "array index with variable" + mapping: | + let arr = [10, 20, 30] + let i = 1 + root.v = $arr.index($i) + output: {"v": 20} + + - name: "nested array and object dynamic access" + input: {"users": [{"name": "Alice"}, {"name": "Bob"}]} + mapping: | + root.v = this.users.index(1).name + output: {"v": "Bob"} + + # --- String dynamic access (codepoint) --- + + - name: "string index returns codepoint value" + skip: "V2-only: V1 has no 
string indexing (strings are not indexable as arrays of codepoints)" + + - name: "string index with float whole number accepted" + skip: "V2-only: V1 has no string indexing" + + - name: "string index with non-whole float is error" + skip: "V2-only: V1 has no string indexing" + + - name: "string index with string key is error" + skip: "V2-only: V1 has no string indexing" + + - name: "string index non-ascii codepoint value" + skip: "V2-only: V1 has no string indexing" + + # --- Bytes dynamic access --- + + - name: "bytes index returns byte value" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes index with float whole number accepted" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes index with non-whole float is error" + skip: "V2-only: V1 has no bytes indexing via brackets" + + # --- Indexing non-indexable types --- + + - name: "index on boolean is error" + skip: "V2-only: V1 has no bracket indexing syntax; closest analogue is .index()/.get() which do not apply to scalars" + + - name: "index on integer is error" + skip: "V2-only: V1 has no bracket indexing syntax" + + - name: "index on float is error" + skip: "V2-only: V1 has no bracket indexing syntax" + + - name: "index on null is error" + skip: "V2-only: V1 has no bracket indexing syntax; .index()/.get() on null returns null in V1" + + # --- Chained dynamic access --- + + - name: "chained bracket access on nested object" + mapping: | + let data = {"a": {"b": {"c": "deep"}}} + root.v = $data.get("a").get("b").get("c") + output: {"v": "deep"} + + - name: "mixed dot and bracket access" + input: {"users": [{"name": "Alice"}]} + mapping: | + root.v = this.users.index(0).name + output: {"v": "Alice"} diff --git a/internal/bloblang2/migrator/v1spec/tests/access/field_access.yaml b/internal/bloblang2/migrator/v1spec/tests/access/field_access.yaml new file mode 100644 index 000000000..541ae11a6 --- /dev/null +++ 
b/internal/bloblang2/migrator/v1spec/tests/access/field_access.yaml @@ -0,0 +1,161 @@ +description: "Static field access: dot notation, keywords as fields, quoted fields, nested access, null for missing, errors on non-objects" + +tests: + # --- Basic dot notation --- + + - name: "access input field" + input: {"name": "Alice"} + mapping: | + root.v = this.name + output: {"v": "Alice"} + + - name: "access nested input field" + input: {"user": {"name": "Alice"}} + mapping: | + root.v = this.user.name + output: {"v": "Alice"} + + - name: "deeply nested field access" + mapping: | + root.v = this.a.b.c.d + cases: + - name: "all present" + input: {"a": {"b": {"c": {"d": 42}}}} + output: {"v": 42} + - name: "null intermediate is error" + input: {"a": {}} + output: {"v": null} # FIXME-v1: verify - V1 path access into missing/null returns null (no error) per §12.5 + + - name: "access output field after assignment" + mapping: | + root.x = 10 + root.y = root.x + 5 + output: {"x": 10, "y": 15} + + - name: "access variable field" + mapping: | + let obj = {"name": "Bob", "age": 25} + root.name = $obj.name + root.age = $obj.age + output: {"name": "Bob", "age": 25} + + # --- Non-existent fields return null --- + + - name: "non-existent field on input returns null" + input: {"name": "Alice"} + mapping: | + root.v = this.missing + output: {"v": null} + + - name: "non-existent nested field returns null" + input: {"user": {"name": "Alice"}} + mapping: | + root.v = this.user.email + output: {"v": null} + + - name: "non-existent field on variable returns null" + mapping: | + let obj = {"x": 1} + root.v = $obj.y + output: {"v": null} + + - name: "non-existent field on empty object returns null" + input: {} + mapping: | + root.v = this.anything + output: {"v": null} + + # --- Keywords as valid field names --- + + - name: "keyword map as field name" + input: {"map": "value"} + mapping: | + root.v = this.map + output: {"v": "value"} + + - name: "keyword if as field name" + mapping: | + 
root.if = "conditional" + root.v = root.if + output: {"if": "conditional", "v": "conditional"} + + - name: "keyword match as field name" + input: {"match": 42} + mapping: | + root.v = this.match + output: {"v": 42} + + - name: "keyword true as field name" + input: {"true": "yes"} + mapping: | + root.v = this.true + output: {"v": "yes"} + + - name: "keyword null as field name" + input: {"null": "not actually null"} + mapping: | + root.v = this.null + output: {"v": "not actually null"} + + # --- Quoted field names --- + + - name: "quoted field with spaces" + input: {"field with spaces": "hello"} + mapping: | + root.v = this."field with spaces" + output: {"v": "hello"} + + - name: "quoted field with special characters" + input: {"special-chars!@#": "value"} + mapping: | + root.v = this."special-chars!@#" + output: {"v": "value"} + + - name: "quoted field starting with digit" + input: {"123": "numeric"} + mapping: | + root.v = this."123" + output: {"v": "numeric"} + + - name: "quoted field with dots" + input: {"a.b.c": "dotted"} + mapping: | + root.v = this."a.b.c" + output: {"v": "dotted"} + + - name: "quoted field on output" + mapping: | + root."my-field" = "works" + output: {"my-field": "works"} + + - name: "quoted field nested access" + input: {"top level": {"inner-key": "found"}} + mapping: | + root.v = this."top level"."inner-key" + output: {"v": "found"} + + # --- Field access on non-object types is error --- + + - name: "field access on string is error" + mapping: | + let s = "hello" + root.v = $s.field + output: {"v": null} # V1 silently returns null for field access on non-object values + + - name: "field access on integer is error" + mapping: | + let n = 42 + root.v = $n.field + output: {"v": null} # V1 silently returns null for field access on non-object values + + - name: "field access on boolean is error" + mapping: | + let b = true + root.v = $b.field + output: {"v": null} # V1 silently returns null for field access on non-object values + + - name: 
"field access on array is error" + mapping: | + let arr = [1, 2, 3] + root.v = $arr.field + output: {"v": null} # V1 silently returns null for field access on non-object values diff --git a/internal/bloblang2/migrator/v1spec/tests/access/negative_indexing.yaml b/internal/bloblang2/migrator/v1spec/tests/access/negative_indexing.yaml new file mode 100644 index 000000000..22ea64a9a --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/access/negative_indexing.yaml @@ -0,0 +1,103 @@ +description: "Negative indexing for arrays, strings, and bytes: -1 is last, -2 is second-to-last, out-of-bounds errors" + +tests: + # --- Array negative indexing --- + + - name: "array negative index -1 is last element" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-1) + output: {"v": 30} + + - name: "array negative index -2 is second to last" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-2) + output: {"v": 20} + + - name: "array negative index -3 is first element" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-3) + output: {"v": 10} + + - name: "array negative index on single element" + mapping: | + let arr = [42] + root.v = $arr.index(-1) + output: {"v": 42} + + - name: "array negative index out of bounds" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-4) + error: "out of bounds" + + - name: "array negative index far out of bounds" + mapping: | + let arr = [1, 2] + root.v = $arr.index(-100) + error: "out of bounds" + + - name: "array negative index on single element -2 is error" + mapping: | + let arr = [42] + root.v = $arr.index(-2) + error: "out of bounds" + + - name: "array negative index with float whole number" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-1.0) + output: {"v": 30} # FIXME-v1: verify - V1 may require int index + + # --- String negative indexing (codepoint) --- + + - name: "string negative index -1 is last codepoint" + skip: "V2-only: V1 has no string indexing" + + - name: "string negative 
index -2 is second to last" + skip: "V2-only: V1 has no string indexing" + + - name: "string negative index -5 is first codepoint" + skip: "V2-only: V1 has no string indexing" + + - name: "string negative index on single char" + skip: "V2-only: V1 has no string indexing" + + - name: "string negative index out of bounds" + skip: "V2-only: V1 has no string indexing" + + - name: "string negative index on non-ascii" + skip: "V2-only: V1 has no string indexing" + + - name: "string negative index round trip with char" + skip: "V2-only: V1 has no string indexing or .char() method" + + # --- Bytes negative indexing --- + + - name: "bytes negative index -1 is last byte" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes negative index -2 is second to last byte" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes negative index -5 is first byte" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes negative index out of bounds" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes negative index on multibyte utf8" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes negative index on single byte" + skip: "V2-only: V1 has no bytes indexing via brackets" + + # --- Negative indexing from input --- + + - name: "array from input with negative index" + input: {"items": ["first", "middle", "last"]} + mapping: | + root.v = this.items.index(-1) + output: {"v": "last"} diff --git a/internal/bloblang2/migrator/v1spec/tests/access/null_safe.yaml b/internal/bloblang2/migrator/v1spec/tests/access/null_safe.yaml new file mode 100644 index 000000000..57234f392 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/access/null_safe.yaml @@ -0,0 +1,149 @@ +description: "Null-safe navigation with ?. and ?[: short-circuits on null, type errors on non-null wrong type" + +tests: + # --- Basic ?. 
on null --- + + - name: "null-safe field access on null returns null" + mapping: | + let v = null + root.v = $v.name + output: {"v": null} # V1: path into null yields null natively; no ?. needed (§12.5) + + - name: "null-safe field access on non-null object works" + mapping: | + let v = {"name": "Alice"} + root.v = $v.name + output: {"v": "Alice"} + + - name: "null-safe on null input field" + input: {"user": null} + mapping: | + root.v = this.user.name + output: {"v": null} + + - name: "null-safe on missing input field returns null" + input: {} + mapping: | + root.v = this.missing.name + output: {"v": null} + + # --- Chained ?. --- + + - name: "chained null-safe field access" + mapping: | + root.v = this.user.address.city + cases: + - name: "all present" + input: {"user": {"address": {"city": "London"}}} + output: {"v": "London"} + - name: "middle is null" + input: {"user": {"address": null}} + output: {"v": null} + - name: "first is null" + input: {"user": null} + output: {"v": null} + - name: "first missing" + input: {} + output: {"v": null} + + # --- ?[ on null --- + + - name: "null-safe bracket on null returns null" + mapping: | + let v = null + root.v = $v.get("key") + output: {"v": null} # FIXME-v1: verify - V1 .get() on null returns null + + - name: "null-safe bracket on object works" + mapping: | + let v = {"key": "value"} + root.v = $v.get("key") + output: {"v": "value"} + + - name: "null-safe bracket on null array returns null" + mapping: | + let v = null + root.v = $v.index(0).or(null) + output: {"v": null} # V1 .index() on null errors; use .or(null) to mimic null-safe semantics + + - name: "null-safe bracket on array works" + mapping: | + let v = [10, 20, 30] + root.v = $v.index(1) + output: {"v": 20} + + # --- Null-safe method call --- + + - name: "null-safe method on null returns null" + mapping: | + let v = null + root.v = $v.length().catch(null) + output: {"v": null} # FIXME-v1: verify - V1 .length() on null errors; use .catch(null) to match ?. 
semantics + + - name: "null-safe method on non-null works" + mapping: | + let v = "hello" + root.v = $v.length() + output: {"v": 5} + + - name: "null-safe chained method on null" + mapping: | + let v = null + root.v = $v.trim().catch(null) + output: {"v": null} # FIXME-v1: verify - V1 .trim() on null errors; use .catch(null) + + # --- Type errors still throw on non-null wrong type --- + + - name: "null-safe field on integer is error" + mapping: | + root.v = 5.name + output: {"v": null} # V1 field access on scalars silently returns null + + - name: "null-safe field on string is error" + mapping: | + root.v = "hello".name + output: {"v": null} # V1 field access on scalars silently returns null + + - name: "null-safe field on boolean is error" + mapping: | + root.v = true.name + output: {"v": null} # V1 field access on scalars silently returns null + + - name: "null-safe field on array is error" + mapping: | + root.v = [1, 2].name + output: {"v": null} # V1 field access on scalars silently returns null + + - name: "non-null wrong type in chain is error" + input: {"value": "hello"} + mapping: | + root.v = this.value.nonfield.trim() + error: "field" # FIXME-v1: verify - field access on string may error or return null + + # --- Mixed null-safe and regular access --- + + - name: "regular then null-safe access" + input: {"user": {"profile": null}} + mapping: | + root.v = this.user.profile.bio + output: {"v": null} + + - name: "null-safe then regular access on present value" + input: {"user": {"profile": {"bio": "hello"}}} + mapping: | + root.v = this.user.profile.bio + output: {"v": "hello"} + + # --- Null-safe with dynamic access chains --- + + - name: "null-safe bracket then dot" + mapping: | + let data = null + root.v = $data.get("key").name + output: {"v": null} # FIXME-v1: verify - V1 .get() on null returns null, then path into null also null + + - name: "null-safe bracket on nested null" + input: {"items": null} + mapping: | + root.v = this.items.index(0).or(null) + 
output: {"v": null} # V1 .index() on null errors; use .or(null) for null-safe semantics diff --git a/internal/bloblang2/migrator/v1spec/tests/access/out_of_bounds.yaml b/internal/bloblang2/migrator/v1spec/tests/access/out_of_bounds.yaml new file mode 100644 index 000000000..dab1cb40c --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/access/out_of_bounds.yaml @@ -0,0 +1,172 @@ +description: "Out of bounds index errors for arrays, strings, and bytes; empty collections; boundary indices" + +tests: + # --- Array out of bounds --- + + - name: "array index beyond length" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(3) + error: "out of bounds" + + - name: "array index far beyond length" + mapping: | + let arr = [1, 2] + root.v = $arr.index(100) + error: "out of bounds" + + - name: "empty array index 0 is error" + mapping: | + let arr = [] + root.v = $arr.index(0) + error: "out of bounds" + + - name: "empty array negative index is error" + mapping: | + let arr = [] + root.v = $arr.index(-1) + error: "out of bounds" + + - name: "array last valid positive index" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(2) + output: {"v": 30} + + - name: "array first invalid positive index" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(3) + error: "out of bounds" + + - name: "array last valid negative index" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-3) + output: {"v": 10} + + - name: "array first invalid negative index" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-4) + error: "out of bounds" + + - name: "single element array valid indices" + mapping: | + let arr = [42] + root.a = $arr.index(0) + root.b = $arr.index(-1) + output: {"a": 42, "b": 42} + + - name: "single element array invalid positive" + mapping: | + let arr = [42] + root.v = $arr.index(1) + error: "out of bounds" + + # --- String out of bounds --- + + - name: "string index beyond length" + skip: "V2-only: V1 has no string indexing" + + 
- name: "string index far beyond length" + skip: "V2-only: V1 has no string indexing" + + - name: "empty string index 0 is error" + skip: "V2-only: V1 has no string indexing" + + - name: "empty string negative index is error" + skip: "V2-only: V1 has no string indexing" + + - name: "string last valid positive index" + skip: "V2-only: V1 has no string indexing" + + - name: "string first invalid positive index" + skip: "V2-only: V1 has no string indexing" + + - name: "string last valid negative index" + skip: "V2-only: V1 has no string indexing" + + - name: "string first invalid negative index" + skip: "V2-only: V1 has no string indexing" + + - name: "single char string boundary" + skip: "V2-only: V1 has no string indexing" + + - name: "single char string invalid positive" + skip: "V2-only: V1 has no string indexing" + + # --- Bytes out of bounds --- + + - name: "bytes index beyond length" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes index far beyond length" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "empty bytes index 0 is error" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "empty bytes negative index is error" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes last valid positive index" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes first invalid positive index" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes last valid negative index" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "bytes first invalid negative index" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "single byte boundary" + skip: "V2-only: V1 has no bytes indexing via brackets" + + - name: "single byte invalid positive" + skip: "V2-only: V1 has no bytes indexing via brackets" + + # --- Multibyte boundary --- + + - name: "multibyte string codepoint boundary" + skip: "V2-only: V1 has no string 
indexing" + + - name: "multibyte string codepoint out of bounds" + skip: "V2-only: V1 has no string indexing" + + # --- Non-integer and extreme float indices --- + + - name: "array index with NaN is error" + input: {"nan": {_type: "float64", value: "NaN"}} + mapping: | + root.v = [10, 20, 30].index(this.nan) + skip: "V2-only: V1 test harness cannot inject raw NaN via {_type} markers" + + - name: "array index with Infinity is error" + input: {"inf": {_type: "float64", value: "Infinity"}} + mapping: | + root.v = [10, 20, 30].index(this.inf) + skip: "V2-only: V1 test harness cannot inject raw Infinity via {_type} markers" + + - name: "array index with large float exceeding int64 range is error" + input: {"big": {_type: "float64", value: "1e19"}} + mapping: | + root.v = [10, 20, 30].index(this.big) + skip: "V2-only: V1 test harness cannot inject 1e19 float via {_type} markers" + + - name: "array index with whole-number float 2.0 is accepted" + mapping: | + root.v = [10, 20, 30].index(2.0) + output: {"v": 30} # FIXME-v1: verify - V1 .index() may require integer + + - name: "array index with fractional float 1.5 is error" + skip: "V1 .index() silently truncates non-whole floats to int (1.5 → 1); no error raised" + + - name: "multibyte bytes boundary differs from string" + mapping: | + let s = "café" + root.string_len = $s.length() + root.bytes_len = $s.bytes().length() + output: {"string_len": 5, "bytes_len": 5} # V1 .length() on a string returns byte count, not codepoint count diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/cloudformation_inventory.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/cloudformation_inventory.yaml new file mode 100644 index 000000000..74a0d737c --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/cloudformation_inventory.yaml @@ -0,0 +1,160 @@ +description: > + AWS CloudFormation resource inventory — transform a stack description and its + resources into a CMDB-friendly record. 
Parses account and region from the stack + ARN, converts KV-pair arrays (Parameters, Tags, Outputs) into objects, coerces + numeric string parameters to integers, groups resources by AWS service + category, and checks overall health status. + +tests: + - name: "build inventory record from CloudFormation stack" + input: + stack: + StackName: "prod-api-stack" + StackId: "arn:aws:cloudformation:us-east-1:123456789012:stack/prod-api-stack/guid" + StackStatus: "UPDATE_COMPLETE" + LastUpdatedTime: "2024-01-14T22:15:00Z" + Parameters: + - {ParameterKey: "environment", ParameterValue: "production"} + - {ParameterKey: "instance_type", ParameterValue: "m5.xlarge"} + - {ParameterKey: "min_capacity", ParameterValue: "3"} + - {ParameterKey: "max_capacity", ParameterValue: "12"} + Tags: + - {Key: "team", Value: "platform"} + - {Key: "cost-center", Value: "eng-1234"} + Outputs: + - {OutputKey: "LoadBalancerDNS", OutputValue: "prod-api-lb-123.us-east-1.elb.amazonaws.com"} + - {OutputKey: "ApiEndpoint", OutputValue: "https://api.example.com"} + resources: + - LogicalResourceId: "WebLoadBalancer" + PhysicalResourceId: "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/prod-api-lb/abc123" + ResourceType: "AWS::ElasticLoadBalancingV2::LoadBalancer" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "WebTargetGroup" + PhysicalResourceId: "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/prod-api-tg/def456" + ResourceType: "AWS::ElasticLoadBalancingV2::TargetGroup" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "AppAutoScalingGroup" + PhysicalResourceId: "prod-api-asg-XYZ789" + ResourceType: "AWS::AutoScaling::AutoScalingGroup" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "AppSecurityGroup" + PhysicalResourceId: "sg-0a1b2c3d" + ResourceType: "AWS::EC2::SecurityGroup" + ResourceStatus: "UPDATE_COMPLETE" + - LogicalResourceId: "AppLogGroup" + PhysicalResourceId: "/aws/ecs/prod-api" + ResourceType: "AWS::Logs::LogGroup" + 
ResourceStatus: "CREATE_COMPLETE" + mapping: | + # Map AWS resource type to a friendly service category. + # V1 maps take a single receiver (`this`) and have no parameters. + map service_category { + root = match this.split("::").1 { + "ElasticLoadBalancingV2" => "ELB", + "AutoScaling" => "AutoScaling", + "EC2" => "EC2", + "Logs" => "CloudWatch", + _ => this.split("::").1, + } + } + + # Extract the short resource type (e.g. "LoadBalancer" from + # "AWS::ElasticLoadBalancingV2::LoadBalancer"). + map short_type { + root = this.split("::").2 + } + + let stack = this.stack + + # Parse region and account from the stack ARN. + # V2 $arr[3] with literal index is equivalent to V1 $arr.3. + let arn_parts = $stack.StackId.split(":") + let region = $arn_parts.3 + let account = $arn_parts.4 + + # Convert Parameters KV array to object, parsing pure-numeric values. + # V2 .collect() → V1 fold of {key,value} records into an object. + # V2 .int64() → V1 .number() (returns int if the input parses as one). + # V2 block-body lambda with `let` → V1 inline. + let config = $stack.Parameters.map_each(p -> { + "key": p.ParameterKey, + "value": if p.ParameterValue.re_match("^[0-9]+$") { + p.ParameterValue.number() + } else { + p.ParameterValue + }, + }).fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + + # Convert Tags and Outputs KV arrays to objects. + let tags = $stack.Tags. + map_each(t -> {"key": t.Key, "value": t.Value}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + let endpoints = $stack.Outputs. + map_each(o -> {"key": o.OutputKey, "value": o.OutputValue}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + + # Group resources by service category. + # V2 .map() → V1 .map_each(). + let services = this.resources.map_each(r -> r.ResourceType.apply("service_category")).unique() + let resources_by_service = $services.map_each(svc -> { + "key": svc, + "value": this.resources. 
+ filter(r -> r.ResourceType.apply("service_category") == svc). + map_each(r -> { + "logical_id": r.LogicalResourceId, + "physical_id": r.PhysicalResourceId, + "type": r.ResourceType.apply("short_type"), + }), + }).fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + + root.stack = $stack.StackName + root.region = $region + root.account = $account + root.status = $stack.StackStatus + root.last_updated = $stack.LastUpdatedTime + root.team = $tags.team + root.cost_center = $tags."cost-center" + root.config = $config + root.endpoints = $endpoints + root.resources_by_service = $resources_by_service + root.resource_count = this.resources.length() + root.all_healthy = this.resources. + all(r -> r.ResourceStatus.has_suffix("COMPLETE")) + output: + stack: "prod-api-stack" + region: "us-east-1" + account: "123456789012" + status: "UPDATE_COMPLETE" + last_updated: "2024-01-14T22:15:00Z" + team: "platform" + cost_center: "eng-1234" + config: + environment: "production" + instance_type: "m5.xlarge" + min_capacity: 3.0 # V1 .number() always returns float64; no separate int parse method + max_capacity: 12.0 + endpoints: + LoadBalancerDNS: "prod-api-lb-123.us-east-1.elb.amazonaws.com" + ApiEndpoint: "https://api.example.com" + resources_by_service: + ELB: + - logical_id: "WebLoadBalancer" + physical_id: "arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/app/prod-api-lb/abc123" + type: "LoadBalancer" + - logical_id: "WebTargetGroup" + physical_id: "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/prod-api-tg/def456" + type: "TargetGroup" + AutoScaling: + - logical_id: "AppAutoScalingGroup" + physical_id: "prod-api-asg-XYZ789" + type: "AutoScalingGroup" + EC2: + - logical_id: "AppSecurityGroup" + physical_id: "sg-0a1b2c3d" + type: "SecurityGroup" + CloudWatch: + - logical_id: "AppLogGroup" + physical_id: "/aws/ecs/prod-api" + type: "LogGroup" + resource_count: 5 + all_healthy: true diff --git 
a/internal/bloblang2/migrator/v1spec/tests/case_studies/debezium_cdc.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/debezium_cdc.yaml new file mode 100644 index 000000000..2fe4ce2db --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/debezium_cdc.yaml @@ -0,0 +1,143 @@ +description: > + Debezium CDC change event processing — diff the before/after snapshots to + identify changed fields, parse embedded JSON columns (shipping address, line + items), convert Debezium epoch-day dates, convert cents to dollars, extract + provenance metadata, and set output metadata for downstream CDC routing. + +tests: + - name: "process Debezium MySQL update event with embedded JSON" + input: + payload: + before: + id: 10042 + customer_id: 1007 + status: "pending" + order_date: 19797 + shipping_address: '{"street":"742 Evergreen Terrace","city":"Springfield","state":"IL","zip":"62704"}' + line_items_json: '[{"sku":"WIDGET-A","qty":3,"unit_price_cents":1500},{"sku":"GADGET-B","qty":1,"unit_price_cents":4200}]' + total_cents: 8700 + currency: "USD" + updated_at: "2024-03-15T10:30:00Z" + after: + id: 10042 + customer_id: 1007 + status: "shipped" + order_date: 19797 + shipping_address: '{"street":"742 Evergreen Terrace","city":"Springfield","state":"IL","zip":"62704"}' + line_items_json: '[{"sku":"WIDGET-A","qty":3,"unit_price_cents":1500},{"sku":"GADGET-B","qty":1,"unit_price_cents":4200}]' + total_cents: 8700 + currency: "USD" + updated_at: "2024-03-15T14:22:17Z" + source: + connector: "mysql" + name: "dbserver1" + db: "inventory" + table: "orders" + ts_ms: 1710509137000 + gtid: "3e11fa47-71ca-11e1-9e33-c80aa9429562:58" + file: "mysql-bin.000003" + pos: 484 + version: "2.5.0.Final" + op: "u" + ts_ms: 1710509137425 + mapping: | + let before = this.payload.before + let after = this.payload.after + let src = this.payload.source + + # Diff: compare selected fields between before and after. + # V2 $before[f] with dynamic key → V1 $before.get(f). 
+ # V2 .collect() → V1 fold of {key,value} records into an object. + let diff_fields = ["status", "updated_at", "total_cents", "currency"] + let changed = $diff_fields. + filter(f -> $before.get(f) != $after.get(f)). + map_each(f -> {"key": f, "value": {"before": $before.get(f), "after": $after.get(f)}}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + + # Parse embedded JSON string columns from the after image. + # V2 .map() → V1 .map_each(). + let address = $after.shipping_address.parse_json() + let line_items = $after.line_items_json.parse_json(). + map_each(li -> li.merge({"subtotal_cents": li.qty * li.unit_price_cents})) + + # Debezium epoch-day (days since 1970-01-01) as unix seconds. + # The migrator V1 core env has no ts_format (lives in impl/pure), so we + # emit the raw numeric timestamp rather than formatting it. + let order_date_unix = $after.order_date * 86400 + + # Event timestamp from Debezium milliseconds; emit as numeric unix millis. + let event_ts_ms = this.payload.ts_ms + + # Map operation code to name + let op_name = match this.payload.op { + "c" => "create", + "u" => "update", + "d" => "delete", + "r" => "snapshot", + _ => "unknown", + } + + root.table = $src.db + "." + $src.table + root.operation = $op_name + root.key = {"id": $after.id} + root.timestamp_ms = $event_ts_ms + root.changed_fields = $changed + root.current = { + "id": $after.id, + "customer_id": $after.customer_id, + "status": $after.status, + "order_date_unix": $order_date_unix, + "shipping_address": $address, + "line_items": $line_items, + "total_dollars": $after.total_cents.number() / 100.0, + "currency": $after.currency, + } + root.provenance = { + "connector": $src.connector, + "server": $src.name, + "gtid": $src.gtid, + "binlog_file": $src.file, + "binlog_pos": $src.pos, + "version": $src.version, + } + + # V2 output@.x → V1 meta x + meta cdc_table = $src.db + "." 
+ $src.table + meta cdc_operation = $op_name + meta cdc_gtid = $src.gtid + output: + table: "inventory.orders" + operation: "update" + key: {id: 10042} + timestamp_ms: 1710509137425 + changed_fields: + status: {before: "pending", after: "shipped"} + updated_at: {before: "2024-03-15T10:30:00Z", after: "2024-03-15T14:22:17Z"} + current: + id: 10042 + customer_id: 1007 + status: "shipped" + order_date_unix: 1710460800 + shipping_address: + street: "742 Evergreen Terrace" + city: "Springfield" + state: "IL" + zip: "62704" + line_items: + # V1 parse_json decodes JSON numbers as float64; qty/unit_price_cents + # and arithmetic over them are therefore floats here. + - {sku: "WIDGET-A", qty: 3.0, unit_price_cents: 1500.0, subtotal_cents: 4500.0} + - {sku: "GADGET-B", qty: 1.0, unit_price_cents: 4200.0, subtotal_cents: 4200.0} + total_dollars: 87.0 + currency: "USD" + provenance: + connector: "mysql" + server: "dbserver1" + gtid: "3e11fa47-71ca-11e1-9e33-c80aa9429562:58" + binlog_file: "mysql-bin.000003" + binlog_pos: 484 + version: "2.5.0.Final" + output_metadata: + cdc_table: "inventory.orders" + cdc_operation: "update" + cdc_gtid: "3e11fa47-71ca-11e1-9e33-c80aa9429562:58" diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/ecommerce_order.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/ecommerce_order.yaml new file mode 100644 index 000000000..c43b30058 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/ecommerce_order.yaml @@ -0,0 +1,142 @@ +description: > + E-commerce order normalization — flatten a Shopify-style order into a + warehouse-friendly format. Joins fulfillment tracking back to line items, + computes net prices from string-encoded decimals, converts units, formats + addresses, and detects billing/shipping mismatches. 
+ +tests: + - name: "normalize Shopify order for warehouse ERP" + input: + order: + id: 5765806342 + email: "jane@example.com" + created_at: "2024-01-15T10:30:00-05:00" + currency: "USD" + billing_address: + first_name: "John" + last_name: "Smith" + address1: "123 Fake Street" + city: "Faketown" + province_code: "ON" + country_code: "CA" + zip: "K2P 1L4" + shipping_address: + first_name: "Jane" + last_name: "Smith" + address1: "123 Fake Street" + city: "Faketown" + province_code: "ON" + country_code: "CA" + zip: "K2P 1L4" + line_items: + - id: 1001 + title: "Red Leather Coat" + sku: "RLC-001" + quantity: 1 + price: "129.99" + grams: 1700 + tax_lines: + - {price: "7.80"} + discount_allocations: + - {amount: "13.00"} + - id: 1002 + title: "Blue Suede Shoes" + sku: "BSS-001" + quantity: 1 + price: "85.95" + grams: 750 + tax_lines: [] + discount_allocations: [] + - id: 1003 + title: "Raspberry Beret" + sku: "RB-001" + quantity: 2 + price: "19.99" + grams: 320 + tax_lines: + - {price: "2.40"} + discount_allocations: [] + fulfillments: + - id: 1 + tracking_numbers: ["1Z999AA10123456784"] + line_items: + - {id: 1001} + - {id: 1002} + discount_codes: + - {code: "FAKE30"} + mapping: | + # V1 maps take a single receiver (`this`); V2 parameterised maps become + # single-receiver maps invoked via .apply, with the argument reshaped to + # pass data through `this`. + map format_address { + root = this.first_name + " " + this.last_name + ", " + + this.address1 + ", " + this.city + " " + + this.province_code + " " + this.zip + ", " + this.country_code + } + + let order = this.order + let fulfillments = $order.fulfillments + let bill = $order.billing_address + let ship = $order.shipping_address + + # V2 .float64() → V1 .number(). + # V2 .find(lambda) → V1 .filter(lambda).index(0). + # V2 ?.x → V1 .x | null. + # V2 block-body lambdas with `let` do not exist in V1; inline each item's + # derived values through the .(capture -> body) named-capture form. 
+ let items = $order.line_items.map_each(item -> $fulfillments.filter(f -> f.line_items.any(li -> li.id == item.id)).index(0).or(null).(ful -> {"sku": item.sku, "title": item.title, "quantity": item.quantity, "unit_price": item.price.number(), "discount": item.discount_allocations.map_each(d -> d.amount.number()).sum(), "tax_total": item.tax_lines.map_each(t -> t.price.number()).sum(), "net_price": ((item.price.number() * item.quantity - item.discount_allocations.map_each(d -> d.amount.number()).sum() + item.tax_lines.map_each(t -> t.price.number()).sum()) * 100).round() / 100, "weight_kg": (item.grams * item.quantity).number() / 1000.0, "fulfilled": ful != null, "tracking": ful.tracking_numbers.0 | null})) + + root.order_id = $order.id + root.order_date = $order.created_at.slice(0, 10) + root.customer_email = $order.email + root.currency = $order.currency + root.shipping_differs_from_billing = $bill.first_name != $ship.first_name || + $bill.last_name != $ship.last_name || + $bill.address1 != $ship.address1 || + $bill.city != $ship.city + root.ship_to = $ship.apply("format_address") + root.items = $items + root.unfulfilled_count = $items.filter(i -> !i.fulfilled).length() + root.total_weight_kg = ($items.map_each(i -> i.weight_kg).sum() * 100).round() / 100 + root.discount_codes_used = $order.discount_codes.map_each(d -> d.code) + output: + order_id: 5765806342 + order_date: "2024-01-15" + customer_email: "jane@example.com" + currency: "USD" + shipping_differs_from_billing: true + ship_to: "Jane Smith, 123 Fake Street, Faketown ON K2P 1L4, CA" + items: + - sku: "RLC-001" + title: "Red Leather Coat" + quantity: 1 + unit_price: 129.99 + discount: 13.0 + tax_total: 7.8 + net_price: 124.79 + weight_kg: 1.7 + fulfilled: true + tracking: "1Z999AA10123456784" + - sku: "BSS-001" + title: "Blue Suede Shoes" + quantity: 1 + unit_price: 85.95 + discount: 0.0 + tax_total: 0.0 + net_price: 85.95 + weight_kg: 0.75 + fulfilled: true + tracking: "1Z999AA10123456784" + - sku: 
"RB-001" + title: "Raspberry Beret" + quantity: 2 + unit_price: 19.99 + discount: 0.0 + tax_total: 2.4 + net_price: 42.38 + weight_kg: 0.64 + fulfilled: false + tracking: null + unfulfilled_count: 1 + total_weight_kg: 3.09 + discount_codes_used: ["FAKE30"] diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/ga4_clickstream.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/ga4_clickstream.yaml new file mode 100644 index 000000000..93cbbec56 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/ga4_clickstream.yaml @@ -0,0 +1,197 @@ +description: > + Google Analytics 4 clickstream normalization — flatten the BigQuery export + format's typed-value union event_params and user_properties into plain + objects, parse microsecond timestamps, build item category hierarchies, + and compute per-item discount percentages and subtotals. + +tests: + - name: "flatten GA4 purchase event from BigQuery export" + input: + event_date: "20240315" + event_name: "purchase" + event_timestamp: "1710505200000000" + event_params: + - key: "session_id" + value: {string_value: null, int_value: 1710504000, double_value: null} + - key: "page_location" + value: {string_value: "https://shop.example.com/checkout", int_value: null, double_value: null} + - key: "transaction_id" + value: {string_value: "TXN-8A3F", int_value: null, double_value: null} + - key: "value" + value: {string_value: null, int_value: null, double_value: 127.5} + - key: "currency" + value: {string_value: "USD", int_value: null, double_value: null} + - key: "shipping" + value: {string_value: null, int_value: null, double_value: 7.5} + - key: "tax" + value: {string_value: null, int_value: null, double_value: 10.0} + - key: "coupon" + value: {string_value: "SPRING24", int_value: null, double_value: null} + user_id: "user-9f8e7d6c" + user_properties: + - key: "membership_tier" + value: {string_value: "gold", double_value: null} + - key: "lifetime_value" + value: {string_value: null, 
double_value: 1842.3} + device: + category: "mobile" + mobile_brand_name: "Apple" + mobile_model_name: "iPhone 15" + operating_system: "iOS" + operating_system_version: "17.3.1" + web_info: + browser: "Safari" + browser_version: "17.3" + geo: + country: "United States" + region: "California" + city: "San Francisco" + traffic_source: + source: "instagram" + medium: "paid_social" + name: "spring_collection_2024" + items: + - item_id: "SKU-W-1042" + item_name: "Merino Wool Cardigan" + item_brand: "HouseLabel" + item_category: "Apparel" + item_category2: "Women" + item_category3: "Knitwear" + item_variant: "Oatmeal / M" + price: 89.0 + quantity: 1 + discount: 13.35 + item_list_name: "Spring Collection" + - item_id: "SKU-A-2087" + item_name: "Leather Crossbody Bag" + item_brand: "HouseLabel" + item_category: "Accessories" + item_category2: "Bags" + item_category3: "Crossbody" + item_variant: "Tan" + price: 65.0 + quantity: 1 + discount: 0.0 + item_list_name: "Recommended For You" + mapping: | + # Extract the single non-null typed value from a GA4 value object. + # V1 maps take one receiver (`this`); no parameterised maps. + map ga4_value { + root = match { + this.string_value != null => this.string_value, + this.int_value != null => this.int_value, + this.double_value != null => this.double_value, + _ => null, + } + } + + # Collapse KV arrays with typed-value unions into plain objects. + # V2 .map() → V1 .map_each(). + # V2 .collect() → V1 fold of {key, value} records into an object. + let params = this.event_params. + map_each(p -> {"key": p.key, "value": p.value.apply("ga4_value")}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + let user_props = this.user_properties. + map_each(p -> {"key": p.key, "value": p.value.apply("ga4_value")}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + + # Event timestamp (microseconds since epoch) as unix seconds. 
+ # V1 core env (used by the migrator) has no ts_format; we emit the raw + # unix-seconds numeric value rather than formatting it. + let event_ts_seconds = this.event_timestamp.number() / 1000000 + + # Build item records with category hierarchy and computed fields. + # V2 block-body lambdas with `let` do not exist in V1; inline computations + # and use the .(capture -> body) form to reuse derived values. + let items = this.items.map_each(item -> [item.item_category, item.item_category2, item.item_category3].filter(c -> c != null && c != "").(cats -> {"sku": item.item_id, "name": item.item_name, "brand": item.item_brand, "categories": cats, "variant": item.item_variant, "price": item.price, "quantity": item.quantity, "discount": item.discount, "discount_pct": if item.price > 0.0 && item.discount > 0.0 { (item.discount / (item.price * item.quantity) * 1000.0).round() / 10 } else { 0.0 }, "subtotal": ((item.price * item.quantity - item.discount) * 100).round() / 100, "source_list": item.item_list_name})) + + root.event = this.event_name + root.timestamp_seconds = $event_ts_seconds + root.session_id = $params.session_id + root.user = { + "id": this.user_id, + "properties": $user_props, + } + root.page_url = $params.page_location + root.transaction = { + "id": $params.transaction_id, + "revenue": $params.value, + "shipping": $params.shipping, + "tax": $params.tax, + "coupon": $params.coupon, + "item_total": ($items.map_each(i -> i.subtotal).sum() * 100).round() / 100, + "total_discount": ($items.map_each(i -> i.discount).sum() * 100).round() / 100, + } + root.items = $items + root.device = { + "type": this.device.category, + "brand": this.device.mobile_brand_name, + "model": this.device.mobile_model_name, + "os": this.device.operating_system + " " + this.device.operating_system_version, + "browser": this.device.web_info.browser + " " + this.device.web_info.browser_version, + } + root.geo = { + "country": this.geo.country, + "region": this.geo.region, + "city": 
this.geo.city, + } + root.attribution = { + "source": this.traffic_source.source, + "medium": this.traffic_source.medium, + "campaign": this.traffic_source.name, + } + output: + event: "purchase" + timestamp_seconds: 1710505200.0 + session_id: 1710504000 + user: + id: "user-9f8e7d6c" + properties: + membership_tier: "gold" + lifetime_value: 1842.3 + page_url: "https://shop.example.com/checkout" + transaction: + id: "TXN-8A3F" + revenue: 127.5 + shipping: 7.5 + tax: 10.0 + coupon: "SPRING24" + item_total: 140.65 + total_discount: 13.35 + items: + - sku: "SKU-W-1042" + name: "Merino Wool Cardigan" + brand: "HouseLabel" + categories: ["Apparel", "Women", "Knitwear"] + variant: "Oatmeal / M" + price: 89.0 + quantity: 1 + discount: 13.35 + discount_pct: 15.0 + subtotal: 75.65 + source_list: "Spring Collection" + - sku: "SKU-A-2087" + name: "Leather Crossbody Bag" + brand: "HouseLabel" + categories: ["Accessories", "Bags", "Crossbody"] + variant: "Tan" + price: 65.0 + quantity: 1 + discount: 0.0 + discount_pct: 0.0 + subtotal: 65.0 + source_list: "Recommended For You" + device: + type: "mobile" + brand: "Apple" + model: "iPhone 15" + os: "iOS 17.3.1" + browser: "Safari 17.3" + geo: + country: "United States" + region: "California" + city: "San Francisco" + attribution: + source: "instagram" + medium: "paid_social" + campaign: "spring_collection_2024" diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/github_webhook.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/github_webhook.yaml new file mode 100644 index 000000000..5b9ff4289 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/github_webhook.yaml @@ -0,0 +1,96 @@ +description: > + GitHub pull request webhook normalization — extract fields from a deeply nested + PR event payload, parse issue references from the body with regex, merge + reviewer lists, categorize PR size, and build a notification-ready summary. 
+ +tests: + - name: "normalize PR opened webhook into notification event" + input: + action: "opened" + number: 42 + pull_request: + title: "feat: add retry logic to payment processor" + body: "## Summary\nAdds exponential backoff.\n\nCloses #38\nRelated: #35, #40" + html_url: "https://github.com/acme/payments/pull/42" + state: "open" + draft: false + additions: 347 + deletions: 42 + changed_files: 8 + created_at: "2024-01-15T14:30:00Z" + user: + login: "alice-dev" + head: + ref: "feat/payment-retry" + base: + ref: "main" + labels: + - {name: "enhancement"} + - {name: "payments"} + - {name: "needs-review"} + requested_reviewers: + - {login: "bob-reviewer"} + - {login: "carol-lead"} + requested_teams: + - {name: "platform-team"} + mapping: | + let pr = this.pull_request + let url_parts = $pr.html_url.split("/") + let repo = $url_parts.3 + "/" + $url_parts.4 + + # Categorize by total lines changed + let total_changes = $pr.additions + $pr.deletions + let size_category = match { + $total_changes > 300 => "large", + $total_changes > 100 => "medium", + _ => "small", + } + + # Parse issue references (#NNN) from PR body, deduplicate and sort + let issue_refs = $pr.body.re_find_all("#\\d+"). + map_each(ref -> ref.trim_prefix("#").number()). + sort(). + unique() + + # Merge individual reviewers and team reviewers into one sorted list + # V2 .concat(...) → V1 .merge(...) concatenates arrays + let reviewers = $pr.requested_reviewers.map_each(r -> r.login). + merge($pr.requested_teams.map_each(t -> t.name)). 
+ sort() + + root.event_type = "pr_" + this.action + root.repo = $repo + root.pr_number = this.number + root.title = $pr.title + root.author = $pr.user.login + root.url = $pr.html_url + root.branch = $pr.head.ref + " -> " + $pr.base.ref + root.labels = $pr.labels.map_each(l -> l.name).sort() + root.reviewers = $reviewers + root.size = { + "additions": $pr.additions, + "deletions": $pr.deletions, + "files": $pr.changed_files, + "category": $size_category, + } + root.referenced_issues = $issue_refs + root.is_feature = $pr.title.has_prefix("feat") + root.summary = "[" + $repo + "] " + $pr.user.login + " " + this.action + " #" + this.number.string() + ": " + $pr.title + " (" + $size_category + ", " + $pr.changed_files.string() + " files)" + output: + event_type: "pr_opened" + repo: "acme/payments" + pr_number: 42 + title: "feat: add retry logic to payment processor" + author: "alice-dev" + url: "https://github.com/acme/payments/pull/42" + branch: "feat/payment-retry -> main" + labels: ["enhancement", "needs-review", "payments"] + reviewers: ["bob-reviewer", "carol-lead", "platform-team"] + size: + additions: 347 + deletions: 42 + files: 8 + category: "large" + referenced_issues: [35.0, 38.0, 40.0] # V1 .number() always returns float64 + is_feature: true + summary: "[acme/payments] alice-dev opened #42: feat: add retry logic to payment processor (large, 8 files)" diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/kubernetes_pod.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/kubernetes_pod.yaml new file mode 100644 index 000000000..1259435b1 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/kubernetes_pod.yaml @@ -0,0 +1,141 @@ +description: > + Kubernetes pod health alert — zip container spec with status by name to + correlate resource limits with failure reasons, detect OOMKilled containers, + build a condition summary from the conditions array, and set output metadata + for alert routing. 
+ +tests: + - name: "build alert from unhealthy pod with OOMKilled container" + input: + metadata: + name: "order-processor-7b4f8d6c9-xk2lp" + namespace: "production" + labels: + app.kubernetes.io/name: "order-processor" + app.kubernetes.io/version: "2.4.1" + creationTimestamp: "2024-03-15T08:22:05Z" + spec: + nodeName: "ip-10-0-47-132.ec2.internal" + containers: + - name: "app" + image: "registry.internal/order-processor:2.4.1" + resources: + limits: {cpu: "1000m", memory: "1Gi"} + - name: "sidecar" + image: "envoyproxy/envoy:v1.28.0" + resources: + limits: {cpu: "500m", memory: "256Mi"} + status: + phase: "Running" + conditions: + - {type: "Initialized", status: "True"} + - {type: "ContainersReady", status: "False", reason: "ContainersNotReady"} + - {type: "Ready", status: "False", reason: "ContainersNotReady"} + startTime: "2024-03-15T08:22:05Z" + containerStatuses: + - name: "app" + ready: false + restartCount: 4 + started: false + state: + waiting: {reason: "CrashLoopBackOff"} + lastState: + terminated: + exitCode: 137 + reason: "OOMKilled" + startedAt: "2024-03-15T09:11:45Z" + finishedAt: "2024-03-15T09:14:32Z" + - name: "sidecar" + ready: true + restartCount: 0 + started: true + state: + running: {startedAt: "2024-03-15T08:22:20Z"} + lastState: {} + mapping: | + let meta = this.metadata + let spec = this.spec + let status = this.status + + # For each unhealthy container, correlate with spec and describe state. + # V2 block-body lambdas with inline `let` do not exist in V1; inline all + # intermediate computations instead. Uses the .(capture -> ...) named-capture + # form (§5.4) to bind a looked-up container for reuse in the object body. + # V2 .find(lambda) → V1 .filter(lambda).index(0) (null when no match). + # V2 .has_key(k) → V1 .keys().contains(k). + # V2 ?.x → V1 .x | null (coalesce catches null and errors). + let unhealthy = $status.containerStatuses. + filter(cs -> !cs.ready). 
+ map_each(cs -> $spec.containers.filter(c -> c.name == cs.name).index(0).or(null).(spec_c -> {"name": cs.name, "state": match { cs.state.keys().contains("waiting") => cs.state.waiting.reason, cs.state.keys().contains("terminated") => cs.state.terminated.reason, _ => "Running" }, "last_exit_code": if cs.lastState.keys().contains("terminated") { cs.lastState.terminated.exitCode } else { null }, "last_exit_reason": if cs.lastState.keys().contains("terminated") { cs.lastState.terminated.reason } else { null }, "restart_count": cs.restartCount, "memory_limit": spec_c.resources.limits.memory | null, "cpu_limit": spec_c.resources.limits.cpu | null, "oom_suspected": cs.lastState.keys().contains("terminated") && (cs.lastState.terminated.reason | "") == "OOMKilled"})) + + # Convert conditions array to a boolean summary object. + # V2 .collect() → V1 fold {key,value} records into an object. + let condition_summary = $status.conditions. + map_each(c -> {"key": c.type, "value": c.status == "True"}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + + # Severity: critical if any OOM or high restart count + let severity = if $unhealthy.any(c -> c.oom_suspected || c.restart_count > 3) { + "critical" + } else if $unhealthy.length() > 0 { + "warning" + } else { + "ok" + } + + root.alert = { + "severity": $severity, + "source": "k8s-pod-monitor", + } + root.pod = { + "name": $meta.name, + "namespace": $meta.namespace, + "node": $spec.nodeName, + "app": $meta.labels."app.kubernetes.io/name", + "version": $meta.labels."app.kubernetes.io/version", + "phase": $status.phase, + } + root.unhealthy_containers = $unhealthy + root.condition_summary = $condition_summary + root.not_ready_reason = $status.conditions. + filter(c -> c.type == "Ready" && c.status != "True"). + index(0).reason | null + root.healthy_container_count = $status.containerStatuses. 
+ filter(cs -> cs.ready).length() + root.total_container_count = $status.containerStatuses.length() + + # V2 output@.x → V1 meta x + meta alert_severity = $severity + meta k8s_namespace = $meta.namespace + meta k8s_pod = $meta.name + output: + alert: + severity: "critical" + source: "k8s-pod-monitor" + pod: + name: "order-processor-7b4f8d6c9-xk2lp" + namespace: "production" + node: "ip-10-0-47-132.ec2.internal" + app: "order-processor" + version: "2.4.1" + phase: "Running" + unhealthy_containers: + - name: "app" + state: "CrashLoopBackOff" + last_exit_code: 137 + last_exit_reason: "OOMKilled" + restart_count: 4 + memory_limit: "1Gi" + cpu_limit: "1000m" + oom_suspected: true + condition_summary: + Initialized: true + ContainersReady: false + Ready: false + not_ready_reason: "ContainersNotReady" + healthy_container_count: 1 + total_container_count: 2 + output_metadata: + alert_severity: "critical" + k8s_namespace: "production" + k8s_pod: "order-processor-7b4f8d6c9-xk2lp" diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/nlp_enrichment.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/nlp_enrichment.yaml new file mode 100644 index 000000000..8437c6273 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/nlp_enrichment.yaml @@ -0,0 +1,82 @@ +description: > + NLP enrichment merge — combine entity recognition and key phrase extraction + results into a unified structure. Groups entities by type, correlates key + phrases with overlapping entities by character offset, and ranks entities + by confidence score. 
+ +tests: + - name: "merge NER entities and key phrases into enriched document" + input: + source_text: "Bob ordered two sandwiches from Seattle Deli on January 5th for $24.99" + entities: + - {text: "Bob", score: 0.997, type: "PERSON", begin_offset: 0, end_offset: 3} + - {text: "two", score: 0.95, type: "QUANTITY", begin_offset: 12, end_offset: 15} + - {text: "Seattle Deli", score: 0.988, type: "ORGANIZATION", begin_offset: 33, end_offset: 45} + - {text: "January 5th", score: 0.999, type: "DATE", begin_offset: 49, end_offset: 60} + - {text: "$24.99", score: 0.93, type: "QUANTITY", begin_offset: 65, end_offset: 71} + sentiment: + label: "NEUTRAL" + scores: {positive: 0.12, negative: 0.03, neutral: 0.84, mixed: 0.01} + key_phrases: + - {text: "two sandwiches", score: 0.99, begin_offset: 12, end_offset: 27} + - {text: "Seattle Deli", score: 0.98, begin_offset: 33, end_offset: 45} + - {text: "January 5th", score: 0.97, begin_offset: 49, end_offset: 60} + mapping: | + let entities = this.entities + + # Group entities by type using unique types + filter. + # V2 .collect() → V1 fold {key,value} records into an object. + let entity_types = $entities.map_each(e -> e.type).unique() + let entities_by_type = $entity_types. + map_each(t -> { + "key": t, + "value": $entities.filter(e -> e.type == t).map_each(e -> e.text), + }). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + + # For each key phrase, find the first entity whose character span overlaps. + # V2 .find(lambda) → V1 .filter(lambda).index(0) (returns null if none). + # V1 lambdas must put body on the same line as `->`. 
+ let enriched_phrases = this.key_phrases.map_each(kp -> { + "phrase": kp.text, + "overlapping_entity_type": $entities.filter(e -> e.begin_offset < kp.end_offset && e.end_offset > kp.begin_offset).index(0).type.catch(null), + }) + + # High-confidence entities (>= 0.95), sorted by score descending + # V1 `reverse` only applies to strings, so sort descending by negating + # the sort key rather than reversing the result array. + let high_conf = $entities. + filter(e -> e.score >= 0.95). + sort_by(e -> -e.score). + map_each(e -> {"text": e.text, "type": e.type, "score": e.score}) + + # Pick the dominant sentiment score + let sent = this.sentiment + + root.text = this.source_text + root.sentiment = $sent.label + root.sentiment_confidence = $sent.scores.values().max() + root.entities_by_type = $entities_by_type + root.enriched_phrases = $enriched_phrases + root.high_confidence_entities = $high_conf + output: + text: "Bob ordered two sandwiches from Seattle Deli on January 5th for $24.99" + sentiment: "NEUTRAL" + sentiment_confidence: 0.84 + entities_by_type: + PERSON: ["Bob"] + QUANTITY: ["two", "$24.99"] + ORGANIZATION: ["Seattle Deli"] + DATE: ["January 5th"] + enriched_phrases: + - phrase: "two sandwiches" + overlapping_entity_type: "QUANTITY" + - phrase: "Seattle Deli" + overlapping_entity_type: "ORGANIZATION" + - phrase: "January 5th" + overlapping_entity_type: "DATE" + high_confidence_entities: + - {text: "January 5th", type: "DATE", score: 0.999} + - {text: "Bob", type: "PERSON", score: 0.997} + - {text: "Seattle Deli", type: "ORGANIZATION", score: 0.988} + - {text: "two", type: "QUANTITY", score: 0.95} diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/otel_traces.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/otel_traces.yaml new file mode 100644 index 000000000..a57ef9c18 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/otel_traces.yaml @@ -0,0 +1,189 @@ +description: > + OpenTelemetry trace flattening — 
collapse the deeply nested OTLP export + format (resourceSpans -> scopeSpans -> spans) into flat span records. + Denormalizes resource attributes, converts the OTLP key-value attribute + model to plain objects, maps numeric span kinds to names, computes + durations from nanosecond timestamps, and extracts error info from events. + +tests: + - name: "flatten OTLP trace export into span records" + input: + resourceSpans: + - resource: + attributes: + - key: "service.name" + value: {stringValue: "api-gateway"} + - key: "service.version" + value: {stringValue: "2.4.1"} + - key: "deployment.environment" + value: {stringValue: "production"} + scopeSpans: + - spans: + - traceId: "5B8EFFF798038103D269B633813FC60C" + spanId: "EEE19B7EC3C1B174" + parentSpanId: "" + name: "POST /api/v1/orders" + kind: 2 + startTimeUnixNano: "1705312260000000000" + endTimeUnixNano: "1705312260345000000" + attributes: + - key: "http.method" + value: {stringValue: "POST"} + - key: "http.url" + value: {stringValue: "/api/v1/orders"} + - key: "http.status_code" + value: {intValue: "201"} + events: + - name: "auth.verified" + attributes: + - key: "auth.method" + value: {stringValue: "jwt"} + - name: "exception" + attributes: + - key: "exception.message" + value: {stringValue: "Deprecated field used"} + status: {code: 1} + - resource: + attributes: + - key: "service.name" + value: {stringValue: "order-service"} + - key: "service.version" + value: {stringValue: "3.1.0"} + - key: "deployment.environment" + value: {stringValue: "production"} + scopeSpans: + - spans: + - traceId: "5B8EFFF798038103D269B633813FC60C" + spanId: "AAA19B7EC3C1B175" + parentSpanId: "EEE19B7EC3C1B174" + name: "OrderService/CreateOrder" + kind: 2 + startTimeUnixNano: "1705312260050000000" + endTimeUnixNano: "1705312260300000000" + attributes: + - key: "rpc.system" + value: {stringValue: "grpc"} + - key: "rpc.method" + value: {stringValue: "CreateOrder"} + events: [] + status: {code: 1} + - traceId: 
"5B8EFFF798038103D269B633813FC60C" + spanId: "BBB19B7EC3C1B176" + parentSpanId: "AAA19B7EC3C1B175" + name: "postgres.query" + kind: 3 + startTimeUnixNano: "1705312260100000000" + endTimeUnixNano: "1705312260250000000" + attributes: + - key: "db.system" + value: {stringValue: "postgresql"} + - key: "db.statement" + value: {stringValue: "INSERT INTO orders (id, customer_id, total) VALUES ($1, $2, $3)"} + events: [] + status: {code: 1} + mapping: | + # V1 maps take a single receiver (`this`); V2 parameterised maps become + # single-receiver maps invoked via .apply, with the argument as `this`. + + # Extract the typed value from an OTLP attribute value object. + # V2 .has_key(k) → V1 .keys().contains(k). + # V2 .int64() → V1 .number(). + map extract_value { + root = match { + this.keys().contains("stringValue") => this.stringValue, + this.keys().contains("intValue") => this.intValue.number(), + _ => null, + } + } + + # Convert an OTLP attribute array [{key, value}, ...] to a plain object. + # V2 .map().collect() → V1 .map_each() + fold. + map attrs_to_object { + root = this.map_each(a -> {"key": a.key, "value": a.value.apply("extract_value")}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) + } + + # Map OTLP numeric span kind to its name. + map span_kind_name { + root = match this { + 1 => "INTERNAL", + 2 => "SERVER", + 3 => "CLIENT", + 4 => "PRODUCER", + 5 => "CONSUMER", + _ => "UNSPECIFIED", + } + } + + # Flatten: resourceSpans -> scopeSpans -> spans, denormalizing resource + # attributes onto each span. + # V2 block-body lambdas with `let` do not exist in V1; use the + # .(capture -> body) form to thread derived values through nested lambdas. + # V2 .find(lambda) → V1 .filter(lambda).index(0) (null when no match). + # V2 .int64() → V1 .number(). 
+ root = this.resourceSpans.map_each(rs -> rs.resource.attributes.apply("attrs_to_object").(res_attrs -> rs.scopeSpans.map_each(ss -> ss.spans.map_each(span -> span.attributes.apply("attrs_to_object").(attrs -> span.events.filter(e -> e.name == "exception").index(0).or(null).(error_event -> { + "trace_id": span.traceId, + "span_id": span.spanId, + "parent_span_id": if span.parentSpanId == "" { null } else { span.parentSpanId }, + "service": res_attrs."service.name", + "service_version": res_attrs."service.version", + "environment": res_attrs."deployment.environment", + "operation": span.name, + "span_kind": span.kind.apply("span_kind_name"), + "duration_ms": (span.endTimeUnixNano.number() / 1000000.0 - span.startTimeUnixNano.number() / 1000000.0).round(), + "attributes": attrs, + "event_names": span.events.map_each(e -> e.name), + "has_error_event": span.events.any(e -> e.name == "exception"), + "error_message": if error_event != null { error_event.attributes.apply("attrs_to_object")."exception.message" | null } else { null }, + "status_ok": span.status.code == 1, + })))).flatten())).flatten() + output: + - trace_id: "5B8EFFF798038103D269B633813FC60C" + span_id: "EEE19B7EC3C1B174" + parent_span_id: null + service: "api-gateway" + service_version: "2.4.1" + environment: "production" + operation: "POST /api/v1/orders" + span_kind: "SERVER" + duration_ms: 345 + attributes: + http.method: "POST" + http.url: "/api/v1/orders" + http.status_code: 201.0 # V1 .number() returns float64 + event_names: ["auth.verified", "exception"] + has_error_event: true + error_message: "Deprecated field used" + status_ok: true + - trace_id: "5B8EFFF798038103D269B633813FC60C" + span_id: "AAA19B7EC3C1B175" + parent_span_id: "EEE19B7EC3C1B174" + service: "order-service" + service_version: "3.1.0" + environment: "production" + operation: "OrderService/CreateOrder" + span_kind: "SERVER" + duration_ms: 250 + attributes: + rpc.system: "grpc" + rpc.method: "CreateOrder" + event_names: [] + 
has_error_event: false + error_message: null + status_ok: true + - trace_id: "5B8EFFF798038103D269B633813FC60C" + span_id: "BBB19B7EC3C1B176" + parent_span_id: "AAA19B7EC3C1B175" + service: "order-service" + service_version: "3.1.0" + environment: "production" + operation: "postgres.query" + span_kind: "CLIENT" + duration_ms: 150 + attributes: + db.system: "postgresql" + db.statement: "INSERT INTO orders (id, customer_id, total) VALUES ($1, $2, $3)" + event_names: [] + has_error_event: false + error_message: null + status_ok: true diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/stripe_invoice.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/stripe_invoice.yaml new file mode 100644 index 000000000..3d1564052 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/stripe_invoice.yaml @@ -0,0 +1,114 @@ +description: > + Stripe invoice webhook normalization — parse embedded JSON from metadata, + convert Unix timestamps to RFC 3339, convert amounts from cents to dollars, + transform object keys, and restructure for a billing data warehouse. 
+ +tests: + - name: "normalize Stripe invoice.paid event" + input: + type: "invoice.paid" + created: 1709251200 + data: + object: + id: "in_1OqR3m" + number: "INV-2024-0218" + customer: "cus_PaB3xK" + customer_email: "ops@megacorp.io" + customer_name: "MegaCorp Engineering" + metadata: + internal_account_id: "acct-00482" + provisioning: '{"tier":"growth","seats":25,"features":["sso","audit_log"]}' + salesforce_opp_id: "006Dn000002XLPQ" + status: "paid" + subscription: "sub_1NrT7a" + currency: "usd" + subtotal: 14900 + tax: 1200 + total: 16100 + status_transitions: + paid_at: 1709251200 + voided_at: null + lines: + data: + - amount: 9900 + description: "Growth Plan" + quantity: 1 + product: "prod_Growth" + - amount: 5000 + description: "Extra seats" + quantity: 5 + product: "prod_Seats" + mapping: | + let inv = this.data.object + + # Parse embedded JSON from the provisioning metadata field + let provisioning = $inv.metadata.provisioning.parse_json() + + # Convert each line item: cents to dollars. + # V2 .float64() → V1 .number(); division produces float regardless. + let line_items = $inv.lines.data.map_each(item -> { + "description": item.description, + "amount_dollars": item.amount.number() / 100.0, + "quantity": item.quantity, + "product_id": item.product, + }) + + root.invoice_id = $inv.id + root.invoice_number = $inv.number + root.event_type = this.type + root.customer = { + "id": $inv.customer, + "name": $inv.customer_name, + "email": $inv.customer_email, + } + root.provisioning = $provisioning + root.currency = $inv.currency.uppercase() + root.subtotal_dollars = $inv.subtotal.number() / 100.0 + root.tax_dollars = $inv.tax.number() / 100.0 + root.total_dollars = $inv.total.number() / 100.0 + root.line_items = $line_items + root.status = $inv.status + + # Paid-at unix seconds; V1 core env (used by migrator) has no ts_format, + # so we emit the raw numeric timestamp rather than an RFC 3339 string. 
+ root.paid_at_unix = $inv.status_transitions.paid_at + + root.subscription_id = $inv.subscription + + # Strip the parsed provisioning field, then kebab-case the remaining keys. + # V1 has no map_keys method — use fold to rebuild the object. + root.external_refs = $inv.metadata. + without("provisioning"). + key_values(). + fold({}, item -> item.tally.merge({(item.value.key.replace_all("_", "-")): item.value.value})) + output: + invoice_id: "in_1OqR3m" + invoice_number: "INV-2024-0218" + event_type: "invoice.paid" + customer: + id: "cus_PaB3xK" + name: "MegaCorp Engineering" + email: "ops@megacorp.io" + provisioning: + tier: "growth" + seats: 25.0 # V1 parse_json decodes JSON numbers as float64 + features: ["sso", "audit_log"] + currency: "USD" + subtotal_dollars: 149.0 + tax_dollars: 12.0 + total_dollars: 161.0 + line_items: + - description: "Growth Plan" + amount_dollars: 99.0 + quantity: 1 + product_id: "prod_Growth" + - description: "Extra seats" + amount_dollars: 50.0 + quantity: 5 + product_id: "prod_Seats" + status: "paid" + paid_at_unix: 1709251200 + subscription_id: "sub_1NrT7a" + external_refs: + internal-account-id: "acct-00482" + salesforce-opp-id: "006Dn000002XLPQ" diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/v2_feature_showcase.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/v2_feature_showcase.yaml new file mode 100644 index 000000000..8400cc68b --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/v2_feature_showcase.yaml @@ -0,0 +1,18 @@ +description: > + Bloblang V2 feature showcase — demonstrates the major syntax and semantic + improvements over V1 in a single mapping: null-safe navigation, lambda + expressions, parameterized maps with defaults and named arguments, all three + match expression forms, separated error handling (.catch vs .or), metadata + access, and variable scoping rules. 
+ +tests: + - name: "V2 feature showcase: SaaS user event processing" + skip: > + This test is explicitly a V2 showcase — by design, nearly every section + relies on a V2-only feature with no direct V1 equivalent. Preserved as a + single skipped case rather than split because the original is one test + with one output: parameterised maps with defaults / named args, match + with `as` binding, .catch(err -> ...) lambda form, lexical variable + scoping inside if-expressions, and null-safe ?. chaining are all V2-only. + The inline v1: comments in the source describe the closest V1 equivalent + for each section. diff --git a/internal/bloblang2/migrator/v1spec/tests/case_studies/vpc_flow_logs.yaml b/internal/bloblang2/migrator/v1spec/tests/case_studies/vpc_flow_logs.yaml new file mode 100644 index 000000000..1fe98501b --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/case_studies/vpc_flow_logs.yaml @@ -0,0 +1,145 @@ +description: > + VPC Flow Logs parsing and enrichment — split space-delimited log records into + structured objects, map protocol numbers to names, classify IPs by address + range, identify well-known port services, detect anomalies (rejected SSH), + compute per-flow throughput, filter NODATA records, and aggregate a summary. 
+ +tests: + - name: "parse and enrich VPC flow log batch" + input: + owner: "123456789012" + logGroup: "/aws/vpc/flowlogs/vpc-0a1b2c3d" + logStream: "eni-0f1a2b3c-all" + logEvents: + - message: "2 123456789012 eni-0f1a2b3c 10.0.1.47 203.0.113.50 44832 443 6 12 1680 1710505200 1710505260 ACCEPT OK" + - message: "2 123456789012 eni-0f1a2b3c 198.51.100.22 10.0.1.47 55912 22 6 847 52140 1710505200 1710505260 REJECT OK" + - message: "2 123456789012 eni-0f1a2b3c 10.0.1.47 10.0.2.83 38210 5432 6 245 89200 1710505200 1710505260 ACCEPT OK" + - message: "2 123456789012 eni-0f1a2b3c - - - - - - - 1710505260 1710505320 - NODATA" + mapping: | + # V1 maps take a single receiver (`this`); V2 parameterised maps become + # single-receiver maps invoked via .apply. + map protocol_name { + root = match this { + "6" => "TCP", + "17" => "UDP", + "1" => "ICMP", + _ => "proto-" + this, + } + } + + map port_service { + root = match this { + "443" => "HTTPS", + "80" => "HTTP", + "22" => "SSH", + "5432" => "PostgreSQL", + "3306" => "MySQL", + _ => null, + } + } + + map classify_ip { + root = match { + this.has_prefix("10.") => "internal", + this.has_prefix("172.") => "internal", + this.has_prefix("192.168.") => "internal", + this.has_prefix("169.254.") => "link-local", + _ => "external", + } + } + + let nodata_count = this.logEvents. + filter(e -> e.message.has_suffix("NODATA")).length() + + # Parse space-delimited records, dropping NODATA. + # V2 $f[3] (bracket index into variable) → V1 $f.3 (literal numeric path). + # V2 block-body lambdas with `let` → V1 chain named-capture .(name -> ...) + # forms to bind multiple derived values. + # V2 .int64() → V1 .number(). + # V2 .split(" ")[-1] (negative index) → V1 .split(" ").index(-1). + # V1 lambda bodies must start on the same line as `->`. Deeply nested + # .(name -> body) named captures are chained inline. + let flows = this.logEvents. + filter(e -> !e.message.has_suffix("NODATA")). 
+ map_each(e -> e.message.split(" ").(f -> f.3.apply("classify_ip").(src_class -> f.4.apply("classify_ip").(dst_class -> f.7.apply("protocol_name").(proto -> (f.6.apply("port_service") | f.5.apply("port_service") | null).(service -> (match { src_class == "internal" && dst_class == "external" => "outbound", src_class == "external" && dst_class == "internal" => "inbound", _ => "internal" }).(direction -> { + "protocol": proto, + "src_addr": f.3, + "dst_addr": f.4, + "src_port": f.5.number(), + "dst_port": f.6.number(), + "service": service, + "direction": direction, + "traffic_class": match direction { "outbound" => dst_class, "inbound" => src_class, _ => "internal" }, + "action": f.12, + "packets": f.8.number(), + "bytes": f.9.number(), + # V1 .round() takes no args; scale then round and divide to keep 1 decimal. + "bytes_per_sec": (f.9.number() / (f.11.number() - f.10.number()) * 10).round() / 10, + "anomaly": f.12 == "REJECT" && service == "SSH", + }))))))) + + root.vpc_id = this.logGroup.split("/").index(-1) + root.account_id = this.owner + root.interface_id = this.logStream.split("-all").0 + root.flows = $flows + root.summary = { + "total_flows": $flows.length(), + "accepted": $flows.filter(r -> r.action == "ACCEPT").length(), + "rejected": $flows.filter(r -> r.action == "REJECT").length(), + "anomalies": $flows.filter(r -> r.anomaly).length(), + "total_bytes": $flows.map_each(r -> r.bytes).sum(), + "nodata_dropped": $nodata_count, + } + output: + vpc_id: "vpc-0a1b2c3d" + account_id: "123456789012" + interface_id: "eni-0f1a2b3c" + flows: + # V1 .number() always returns float64 — port, packet, and byte counts + # that we parse via .number() are floats here (not ints). 
+ - protocol: "TCP" + src_addr: "10.0.1.47" + dst_addr: "203.0.113.50" + src_port: 44832.0 + dst_port: 443.0 + service: "HTTPS" + direction: "outbound" + traffic_class: "external" + action: "ACCEPT" + packets: 12.0 + bytes: 1680.0 + bytes_per_sec: 28.0 + anomaly: false + - protocol: "TCP" + src_addr: "198.51.100.22" + dst_addr: "10.0.1.47" + src_port: 55912.0 + dst_port: 22.0 + service: "SSH" + direction: "inbound" + traffic_class: "external" + action: "REJECT" + packets: 847.0 + bytes: 52140.0 + bytes_per_sec: 869.0 + anomaly: true + - protocol: "TCP" + src_addr: "10.0.1.47" + dst_addr: "10.0.2.83" + src_port: 38210.0 + dst_port: 5432.0 + service: "PostgreSQL" + direction: "internal" + traffic_class: "internal" + action: "ACCEPT" + packets: 245.0 + bytes: 89200.0 + bytes_per_sec: 1486.7 + anomaly: false + summary: + total_flows: 3 + accepted: 2 + rejected: 1 + anomalies: 1 + total_bytes: 143020.0 # sum of floats remains a float + nodata_dropped: 1 diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/block_scoping.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/block_scoping.yaml new file mode 100644 index 000000000..b0dd0f564 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/block_scoping.yaml @@ -0,0 +1,149 @@ +description: "Block scoping — V1 has NO block scope for let variables; most tests intentionally behave differently from V2." + +tests: + # --- Expression context: shadowing (if expression) --- + # V1 has no block scope, so "shadowing" is really mutation — $value leaks out. 
+ + - name: "if expression shadows outer variable" + skip: "V1 does not block-scope let variables — $value reassignment inside if expression leaks to outer scope" + + - name: "if expression shadow does not leak to else branch" + mapping: | + let value = 10 + root.result = if false { + (let value = 20) | $value + } else { + $value + } + skip: "V1 does not block-scope let variables; V1 if-expression branches are single expressions not statement lists, so inline let is not supported" + + - name: "if expression shadow in else branch" + skip: "V1 does not block-scope let variables, and V1 if-expression branches are single expressions" + + # --- Statement context: mutation (if statement) --- + + - name: "if statement mutates outer variable" + mapping: | + let value = 10 + if true { + let value = 20 + } + root.result = $value + output: {"result": 20} + + - name: "if statement mutation persists after block" + mapping: | + let x = 1 + let y = 2 + if true { + let x = 10 + let y = 20 + } + root.x = $x + root.y = $y + output: {"x": 10, "y": 20} + + - name: "if statement mutation only happens if condition is true" + mapping: | + let value = "original" + if false { + let value = "changed" + } + root.result = $value + output: {"result": "original"} + + # --- New variables in statement context are block-scoped --- + # V1 does NOT block-scope let — the variable DOES leak out. 
+ + - name: "new variable in if statement not visible outside" + mapping: | + if true { + let new_var = 42 + } + root.result = $new_var + output: {"result": 42} + + - name: "new variable in else branch not visible outside" + mapping: | + if false { + let a = 1 + } else { + let b = 2 + } + root.result = $b + output: {"result": 2} + + - name: "variable in one branch not visible in sibling branch" + mapping: | + if false { + let x = 10 + } else { + root.result = $x + } + error: "variable 'x' undefined" + + # --- Expression context: match expression shadows --- + + - name: "match expression shadows outer variable" + skip: "V1 match-expression arms are single expressions, not statement lists; no inline let inside arm body" + + # --- Statement context: match statement mutates --- + + - name: "match statement mutates outer variable" + skip: "V1 match is an expression only — there is no statement form that holds a statement list per arm" + + # --- Nested scopes --- + + - name: "nested if expressions each shadow independently" + skip: "V1 does not block-scope let variables, and V1 if-expression branches are single expressions" + + - name: "if statement inside expression body is compile error" + skip: "V1 if-expression branches are always single expressions; nesting an if-statement inside is not even representable" + + # --- Variables declared in block not accessible after --- + + - name: "variable from match case not visible after match" + skip: "V1 match arms are single expressions; a let inside a match arm body is not expressible" + + # --- Else-if scoping --- + + - name: "else-if branches have independent scopes" + mapping: | + let result = "none" + if false { + let result = "first" + } else if true { + let result = "second" + let local = "only here" + } else { + let result = "third" + } + root.result = $result + root.local = $local + output: {"result": "second", "local": "only here"} + + - name: "else-if statement mutates outer variable from middle branch" + mapping: | 
+ let result = "none" + if false { + let result = "first" + } else if true { + let result = "second" + } else { + let result = "third" + } + root.result = $result + output: {"result": "second"} + + # --- Deep nesting --- + + - name: "three levels of nesting with correct scoping" + skip: "V1 does not block-scope let variables; V1 if-expression branches are single expressions" + + # --- Match-as binding scope --- + + - name: "match-as binding not visible after match expression" + skip: "V1 has no match-as syntax" + + - name: "match-as binding does not conflict with variable of different naming" + skip: "V1 has no match-as syntax" diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/if_else_chains.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/if_else_chains.yaml new file mode 100644 index 000000000..459d04f75 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/if_else_chains.yaml @@ -0,0 +1,115 @@ +description: > + If-else chains, nested conditionals, if-expression in various contexts, + and empty body handling. + +tests: + # --- if-else if-else chains --- + + - name: "if-else if-else first branch" + mapping: | + root.v = if true { "first" } else if true { "second" } else { "third" } + output: {"v": "first"} + + - name: "if-else if-else second branch" + mapping: | + root.v = if false { "first" } else if true { "second" } else { "third" } + output: {"v": "second"} + + - name: "if-else if-else third branch" + mapping: | + root.v = if false { "first" } else if false { "second" } else { "third" } + output: {"v": "third"} + + - name: "long if-else if chain" + mapping: | + let x = 3 + root.v = if $x == 1 { "one" } else if $x == 2 { "two" } else if $x == 3 { "three" } else { "other" } + output: {"v": "three"} + + # --- if-else if without final else produces void --- + # V1 if-expression with no matching branch and no final else yields no value, so the assignment is skipped and the prior value is kept.
+ + - name: "if-else if without else — void when none match" + mapping: | + root.v = "default" + root.v = if false { "a" } else if false { "b" } + output: {"v": "default"} + + - name: "if-else if without else — first matches" + mapping: | + root.v = if true { "a" } else if false { "b" } + output: {"v": "a"} + + - name: "if-else if without else — second matches" + mapping: | + root.v = if false { "a" } else if true { "b" } + output: {"v": "b"} + + # --- Nested if expressions --- + + - name: "nested if in then branch" + mapping: | + let x = 5 + root.v = if $x > 0 { + if $x > 10 { "big" } else { "small" } + } else { "negative" } + output: {"v": "small"} + + - name: "nested if in else branch" + mapping: | + let x = -5 + root.v = if $x > 0 { "positive" } else { + if $x == 0 { "zero" } else { "negative" } + } + output: {"v": "negative"} + + # --- If-expression in various contexts --- + + - name: "if-expression as map argument" + skip: "V1 named maps do not take function-style parameters; maps receive a value via .apply(\"name\")" + + - name: "if-expression in array literal" + mapping: | + root.v = [1, if true { 2 } else { 20 }, 3] + output: {"v": [1, 2, 3]} + + - name: "if-expression in object literal" + mapping: | + root.v = {"status": if true { "ok" } else { "error" }} + output: {"v": {"status": "ok"}} + + - name: "if-expression in method argument" + mapping: | + root.v = [3, 1, 2].sort_by(x -> if true { x } else { -x }) + output: {"v": [1, 2, 3]} + + # --- If-statement with empty body --- + + - name: "if-statement with empty body is no-op" + mapping: | + root.v = "unchanged" + if true { } + output: {"v": "unchanged"} + + - name: "if-statement empty body, else has content" + mapping: | + root.v = "init" + if false { } else { + root.v = "else ran" + } + output: {"v": "else ran"} + + # --- Boolean coercion not implicit --- + + - name: "non-boolean condition is error" + mapping: | + root.v = if "truthy" { "yes" } else { "no" } + error: "bool" # FIXME-v1: verify + + - 
name: "integer condition is error" + mapping: | + root.v = if 1 { "yes" } else { "no" } + error: "bool" # FIXME-v1: verify + + - name: "null condition is error" + skip: "V1 treats null as falsy in if conditions (no error raised)" diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/if_expression.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/if_expression.yaml new file mode 100644 index 000000000..a7fe7dcd5 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/if_expression.yaml @@ -0,0 +1,133 @@ +description: "If as expression — returns value, with/without else, else-if chains, void behavior" + +tests: + # --- Basic if-else expression --- + + - name: "if true with else returns then branch" + mapping: | + root.result = if true { "yes" } else { "no" } + output: {"result": "yes"} + + - name: "if false with else returns else branch" + mapping: | + root.result = if true == false { "yes" } else { "no" } + output: {"result": "no"} + + - name: "if expression returns computed value" + input: {"score": 95} + mapping: | + root.grade = if this.score >= 90 { "A" } else { "B" } + output: {"grade": "A"} + + # --- Without else: void behavior --- + # V1 yields no value when no branch matches, so the assignment is skipped (null is not written).
+ + - name: "if without else produces void when false — assignment skipped, field absent" + input: {} + mapping: | + root.x = if false { "hello" } + output: {} + + - name: "if without else produces value when true" + mapping: | + root.x = if true { "hello" } + output: {"x": "hello"} + + - name: "void preserves prior output value" + mapping: | + root.status = "pending" + root.status = if false { "override" } + output: {"status": "pending"} + + # --- Else-if chains --- + + - name: "else-if tier classification" + mapping: | + root.tier = if this.score >= 90 { + "gold" + } else if this.score >= 70 { + "silver" + } else { + "bronze" + } + cases: + - name: "selects middle branch" + input: {"score": 75} + output: {"tier": "silver"} + - name: "falls through to else" + input: {"score": 30} + output: {"tier": "bronze"} + - name: "selects first branch" + input: {"score": 95} + output: {"tier": "gold"} + + - name: "else-if without final else produces void when no branch matches" + input: {"score": 30} + mapping: | + root.tier = "default" + root.tier = if this.score >= 90 { "gold" } else if this.score >= 70 { "silver" } + output: {"tier": "default"} + + - name: "else-if without final else — branch matches" + mapping: | + root.tier = if this.score >= 90 { "gold" } else if this.score >= 70 { "silver" } + cases: + - name: "first branch" + input: {"score": 95} + output: {"tier": "gold"} + - name: "second branch" + input: {"score": 75} + output: {"tier": "silver"} + + # --- Variable declaration with if expression --- + + - name: "if expression assigned to variable" + mapping: | + let x = if true { 42 } else { 0 } + root.result = $x + output: {"result": 42} + + - name: "void in variable declaration is runtime error" + mapping: | + let x = if false { 42 } + root.result = $x + error: "variable 'x' undefined" + + - name: "void in variable reassignment is skipped" + mapping: | + let x = 10 + let x = if false { 42 } + root.result = $x + output: {"result": 10} + + # --- Variables inside if 
expression body --- + + - name: "variables declared inside if expression body" + skip: "V1 if-expression branches are single expressions, not statement lists — cannot have let statements inside" + + # --- Cannot contain output assignments (expression context) --- + + - name: "output assignment inside if expression is compile error" + skip: "V1 if-expression branches are single expressions; an assignment statement is not parseable inside" + + # --- Nested if expressions --- + + - name: "nested if expression" + input: {"a": true, "b": false} + mapping: | + root.result = if this.a { + if this.b { "both" } else { "only a" } + } else { + "neither" + } + output: {"result": "only a"} + + # --- If expression with non-boolean condition --- + + - name: "non-boolean condition is error" + mapping: | + root.x = if "hello" { 1 } else { 2 } + error: "bool" # FIXME-v1: verify + + - name: "null condition is error" + skip: "V1 treats null as falsy in if conditions (no error raised) — see nullable bool coercion quirk" diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/if_statement.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/if_statement.yaml new file mode 100644 index 000000000..d757b5213 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/if_statement.yaml @@ -0,0 +1,175 @@ +description: "If as statement — standalone with output assignments, empty body, else-if, trailing expression error" + +tests: + # --- Basic if statement with output assignment --- + + - name: "if statement conditional on input type" + mapping: | + if this.type == "user" { + root.role = "member" + } + cases: + - name: "assigns when true" + input: {"type": "user"} + output: {"role": "member"} + - name: "skips body when false" + input: {"type": "guest"} + output: {"type": "guest"} # V1 passes input through when no root assignment occurs + + - name: "if statement with multiple output assignments" + input: {"type": "admin"} + mapping: | + if this.type == "admin" 
{ + root.role = "admin" + root.level = 10 + } + output: {"role": "admin", "level": 10} + + # --- Empty body is valid no-op --- + + - name: "empty if body is valid no-op" + mapping: | + if true { } + root.x = "after" + output: {"x": "after"} + + - name: "empty else body is valid no-op" + input: {} + mapping: | + if false { + root.x = "then" + } else { } + output: {} + + # --- If-else statement --- + + - name: "if-else statement branch selection" + mapping: | + if this.type == "admin" { + root.role = "admin" + } else { + root.role = "user" + } + cases: + - name: "selects then branch" + input: {"type": "admin"} + output: {"role": "admin"} + - name: "takes else branch" + input: {"type": "guest"} + output: {"role": "user"} + + # --- Else-if chains --- + + - name: "else-if chain selects middle branch" + input: {"type": "mod"} + mapping: | + if this.type == "admin" { + root.role = "admin" + root.permissions = ["read", "write", "delete"] + } else if this.type == "mod" { + root.role = "moderator" + root.permissions = ["read", "write"] + } else { + root.role = "user" + root.permissions = ["read"] + } + output: {"role": "moderator", "permissions": ["read", "write"]} + + - name: "else-if chain falls to else" + input: {"type": "visitor"} + mapping: | + if this.type == "admin" { + root.role = "admin" + } else if this.type == "mod" { + root.role = "moderator" + } else { + root.role = "user" + } + output: {"role": "user"} + + - name: "else-if without final else — no branch matches is no-op" + input: {"type": "visitor"} + mapping: | + if this.type == "admin" { + root.role = "admin" + } else if this.type == "mod" { + root.role = "moderator" + } + output: {"type": "visitor"} + + # --- Statement context modifies outer variables --- + + - name: "if statement modifies outer variable" + mapping: | + let x = 10 + if true { + let x = 20 + } + root.result = $x + output: {"result": 20} + + - name: "if statement does not modify outer variable when false" + mapping: | + let x = 10 + if false { 
+ let x = 20 + } + root.result = $x + output: {"result": 10} + + # --- New variables in if statement are block-scoped --- + # V1 does NOT block-scope let; the variable leaks out. + + - name: "new variable in if statement body not visible outside" + mapping: | + if true { + let local = 42 + } + root.result = $local + output: {"result": 42} + + # --- Trailing expression in statement body is parse error --- + + - name: "trailing expression in if statement is compile error" + mapping: | + if true { + let x = 10 + $x + 5 + } + compile_error: "expected" # FIXME-v1: verify — V1 rejects bare expression as non-sole statement + + - name: "bare expression as only content in if statement is compile error" + skip: "V1 permits a single bare expression as an if-body — it is parsed as a value form and returned (sets root); only mixing bare expressions with statements produces a parse error" + + # --- Nested if statements --- + + - name: "nested if statements" + input: {"type": "admin", "active": true} + mapping: | + if this.type == "admin" { + if this.active { + root.status = "active admin" + } else { + root.status = "inactive admin" + } + } + output: {"status": "active admin"} + + # --- If statement preserves prior output --- + + - name: "if statement preserves output from before" + mapping: | + root.before = "exists" + if true { + root.added = "new" + } + output: {"before": "exists", "added": "new"} + + # --- Non-boolean condition error --- + + - name: "non-boolean condition in if statement is error" + mapping: | + if 42 { + root.x = "yes" + } + error: "bool" # FIXME-v1: verify diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/match_as.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_as.yaml new file mode 100644 index 000000000..34ab770b6 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_as.yaml @@ -0,0 +1,47 @@ +description: "Match with 'as' binding — V1 has no direct equivalent of match-as; all tests are 
skipped." + +tests: + - name: "match as score tier classification" + skip: "V1 has no match-as syntax" + + - name: "as binding used in result expression" + skip: "V1 has no match-as syntax" + + - name: "as binding used in both condition and result" + skip: "V1 has no match-as syntax" + + - name: "matched expression evaluated once" + skip: "V1 has no match-as syntax" + + - name: "non-boolean case in match-as is runtime error" + skip: "V1 has no match-as syntax" + + - name: "integer case in match-as is runtime error" + skip: "V1 has no match-as syntax" + + - name: "wildcard in match-as is not checked as boolean" + skip: "V1 has no match-as syntax" + + - name: "as binding not accessible after match" + skip: "V1 has no match-as syntax" + + - name: "as binding does not shadow outer variable — it is a new scope" + skip: "V1 has no match-as syntax" + + - name: "match-as with braced body using binding and local vars" + skip: "V1 has no match-as syntax" + + - name: "match-as on string value" + skip: "V1 has no match-as syntax" + + - name: "non-exhaustive match-as produces void" + skip: "V1 has no match-as syntax" + + - name: "match-as as statement with output assignments" + skip: "V1 has no match-as syntax" + + - name: "cases after first true not evaluated in match-as" + skip: "V1 has no match-as syntax" + + - name: "match-as with computed matched expression" + skip: "V1 has no match-as syntax" diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/match_block_body.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_block_body.yaml new file mode 100644 index 000000000..752ba1dc6 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_block_body.yaml @@ -0,0 +1,95 @@ +description: > + Match expression block bodies — V1 match arms are single expressions, + not statement lists. Braced block bodies with let assignments and a + trailing expression are a V2-only feature. 
+ +tests: + # --- Block body with assignments --- + + - name: "block body with variable and final expression" + skip: "V1 match arms are single expressions, not statement-list blocks" + + - name: "block body with multiple assignments" + skip: "V1 match arms are single expressions, not statement-list blocks" + + - name: "block body with path assignment" + skip: "V1 match arms are single expressions; also $var path assignment not supported" + + - name: "block body with dynamic key assignment" + skip: "V1 match arms are single expressions; also $obj[key] = v is not V1 syntax" + + # --- Object literals via parentheses --- + # V1 does allow object literals directly in match arm bodies — no parens needed. + # The V2 ambiguity between block-body {...} and object-literal {...} doesn't exist in V1. + + - name: "empty object via parentheses" + mapping: | + root.v = match "miss" { + "hit" => "found", + _ => {} + } + output: {"v": {}} + + - name: "object literal via parentheses" + mapping: | + root.v = match "fallback" { + "match" => {"status": "matched"}, + _ => {"status": "default"} + } + output: {"v": {"status": "default"}} + + - name: "bare string result needs no parens" + mapping: | + root.v = match "a" { + "a" => "alpha", + _ => "other" + } + output: {"v": "alpha"} + + - name: "bare expression result no block needed" + mapping: | + root.v = match 5 { + 5 => 5 * 10, + _ => 0 + } + output: {"v": 50} + + # --- Block body on wildcard arm --- + + - name: "wildcard arm with block body" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Block body with outer variable capture --- + + - name: "block body captures outer variable" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Block body with match-as binding --- + + - name: "match-as block body name truncation" + skip: "V1 has no match-as syntax, and arms are single expressions" + + # --- Block body returning deleted --- + + - name: "block body returning deleted 
removes field" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Block body with void from if-without-else --- + + - name: "block body with void skips assignment" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Nested match with block bodies --- + + - name: "nested match both with block bodies" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Block body in boolean match --- + + - name: "boolean match with block body" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Multiple arms with block bodies --- + + - name: "multiple arms all with block bodies" + skip: "V1 match arms are single expressions, not statement-list blocks" diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/match_boolean.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_boolean.yaml new file mode 100644 index 000000000..b29366674 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_boolean.yaml @@ -0,0 +1,133 @@ +description: "Match boolean form — with and without matched expression, non-boolean case error, first true wins" + +tests: + # --- Boolean match without expression --- + + - name: "boolean match grade classification" + mapping: | + root.grade = match { + this.score >= 90 => "A", + this.score >= 80 => "B", + this.score >= 70 => "C", + _ => "F", + } + cases: + - name: "selects B for 85" + input: {"score": 85} + output: {"grade": "B"} + - name: "selects A for 95" + input: {"score": 95} + output: {"grade": "A"} + - name: "falls to wildcard for 30" + input: {"score": 30} + output: {"grade": "F"} + + - name: "boolean match first true wins even if later also true" + input: {"score": 95} + mapping: | + root.result = match { + this.score >= 80 => "eighty plus", + this.score >= 90 => "ninety plus", + _ => "other", + } + output: {"result": "eighty plus"} + + # --- Non-boolean case is error --- + # In V1, a 
non-literal, non-wildcard pattern that evaluates to non-bool simply doesn't match + # (the arm is skipped); it's not a runtime error. Literal patterns use ICompare equality against + # the subject — but subject-less match has no subject, so a literal is compared against nothing. + # Verify: V1 subject-less match with a literal arm treats it as an expression pattern; non-bool result + # fails to match and falls through. So `match { "hello" => "yes", _ => "no" }` would return "no", not error. + + - name: "non-boolean case in boolean match is runtime error" + input: {} + mapping: | + root.result = match { + "hello" => "yes", + _ => "no", + } + output: {"result": "no"} # V1 subjectless match uses `this` as implicit subject; literal pattern "hello" compared against {} → falls through + + - name: "integer case in boolean match is runtime error" + input: {} + mapping: | + root.result = match { + 42 => "yes", + _ => "no", + } + output: {"result": "no"} # V1 subjectless match uses `this` as implicit subject; literal 42 compared against {} → falls through + + - name: "null case in boolean match is runtime error" + input: {} + mapping: | + root.result = match { + null => "yes", + _ => "no", + } + output: {"result": "no"} # V1 subjectless match uses `this` as implicit subject; literal null compared against {} → falls through + + # --- Wildcard exempt from boolean requirement --- + + - name: "wildcard is not checked as boolean" + mapping: | + root.result = match { + false => "no", + _ => "default", + } + output: {"result": "default"} + + # --- Cases evaluated in order, short-circuit --- + + - name: "cases after first true are not evaluated" + skip: "V1 eagerly evaluates every match case pattern before selecting a winner — a later pattern that throws aborts the whole expression even if an earlier arm matches" + + # --- Non-exhaustive boolean match produces void --- + # V1 match with no matching case yields no value, so the assignment is skipped and the prior value is kept.
+ + - name: "non-exhaustive boolean match produces void" + mapping: | + root.x = "prior" + root.x = match { + false => "nope", + } + output: {"x": "prior"} + + # --- Boolean match with complex conditions --- + + - name: "boolean match with compound conditions" + input: {"age": 25, "member": true} + mapping: | + root.discount = match { + this.age < 18 => "youth", + this.age >= 65 => "senior", + this.member && this.age >= 21 => "member", + _ => "none", + } + output: {"discount": "member"} + + # --- Boolean match with variables --- + + - name: "boolean match using variables in conditions" + mapping: | + let threshold = 50 + let value = 75 + root.result = match { + $value >= $threshold => "above", + _ => "below", + } + output: {"result": "above"} + + # --- Boolean match with braced case body --- + + - name: "boolean match case with braced body" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Boolean match as statement --- + + - name: "boolean match as statement assigns output" + skip: "V1 match is an expression only — arms cannot contain assignment statements" + + # --- Non-boolean skipped by earlier true case does not error --- + + - name: "non-boolean case after true case is never evaluated" + skip: "V1 does not short-circuit match case evaluation; this test also relies on first-true-wins semantics (see short-circuit skip)" diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/match_edge_cases.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_edge_cases.yaml new file mode 100644 index 000000000..fe95ad7f6 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_edge_cases.yaml @@ -0,0 +1,138 @@ +description: > + Match expression edge cases — equality match with various types, + match-as with complex expressions, wildcard semantics, and + interactions with void and deleted. 
+ +tests: + # --- Equality match with various types --- + + - name: "equality match on integer" + mapping: | + root.v = match 2 { + 1 => "one", + 2 => "two", + 3 => "three", + _ => "other", + } + output: {"v": "two"} + + - name: "equality match on string" + mapping: | + root.v = match "hello" { + "hi" => "informal", + "hello" => "formal", + _ => "unknown", + } + output: {"v": "formal"} + + - name: "equality match on null" + mapping: | + root.v = match null { + null => "is null", + _ => "not null", + } + output: {"v": "is null"} + + - name: "equality match falls through to wildcard" + mapping: | + root.v = match "xyz" { + "a" => 1, + "b" => 2, + _ => 0, + } + output: {"v": 0} + + - name: "equality match with float" + mapping: | + root.v = match 3.14 { + 3.14 => "pi", + 2.71 => "e", + _ => "other", + } + output: {"v": "pi"} + + # --- match subject evaluated once --- + + - name: "match subject expression evaluated once" + mapping: | + let counter = 0 + let counter = $counter + 1 + root.v = match $counter { + 1 => "one", + 2 => "two", + _ => "many", + } + output: {"v": "one"} + + # --- match-as binding --- + + - name: "match-as binds value to variable" + skip: "V1 has no match-as syntax" + + - name: "match-as variable used in result expression" + skip: "V1 has no match-as syntax" + + - name: "match-as variable not accessible outside match" + skip: "V1 has no match-as syntax" + + # --- Wildcard catches all --- + + - name: "wildcard matches any value" + mapping: | + root.v = match "anything" { + _ => "caught", + } + output: {"v": "caught"} + + # --- Match with deleted in result --- + + - name: "match result is deleted — field removed" + mapping: | + root.keep = "yes" + root.maybe = "testing" + root.maybe = match "remove" { + "remove" => deleted(), + _ => root.maybe, + } + output: {"keep": "yes"} + + # --- Match with void from if-without-else in arm --- + # V1 if-without-else yields no value when the condition is false, so the arm produces nothing and the assignment is skipped.
+ + - name: "match arm contains if-without-else producing void" + mapping: | + root.v = "default" + root.v = match "a" { + "a" => if false { "override" }, + _ => "fallback", + } + output: {"v": "default"} # V1 if-without-else produces void; matched arm returns void; assignment is skipped, preserving prior + + # --- Nested match --- + + - name: "nested match expressions" + mapping: | + let role = this.role + root.v = match this.type { + "user" => match $role { + "admin" => "full access", + "viewer" => "read only", + _ => "limited", + }, + _ => "unknown type", + } + cases: + - name: "user admin gets full access" + input: {"type": "user", "role": "admin"} + output: {"v": "full access"} + - name: "non-user type falls through" + input: {"type": "device", "role": "admin"} + output: {"v": "unknown type"} + + # --- Match in statement context --- + + - name: "match statement modifies output conditionally" + skip: "V1 match is an expression only — arms cannot contain assignment statements" + + - name: "match statement with no matching arm is no-op" + skip: "V1 match is an expression only — arms cannot contain assignment statements" diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/match_equality.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_equality.yaml new file mode 100644 index 000000000..23c6adfe3 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_equality.yaml @@ -0,0 +1,150 @@ +description: "Match equality form — expression evaluated once, == comparison, first match wins, boolean case error" + +tests: + # --- Basic equality match --- + + - name: "match equality on animal sound" + mapping: | + root.sound = match this.animal { + "cat" => "meow", + "dog" => "woof", + _ => "unknown", + } + cases: + - name: "selects first case" + input: {"animal": "cat"} + output: {"sound": "meow"} + - name: "selects second case" + input: {"animal": "dog"} + output: {"sound": "woof"} + - name: "falls to wildcard" + input: 
{"animal": "bird"} + output: {"sound": "unknown"} + + # --- First match wins --- + + - name: "first matching case wins when multiple could match" + mapping: | + let x = "hello" + root.result = match $x { + "hello" => "first", + "hello" => "second", + _ => "default", + } + output: {"result": "first"} + + # --- Numeric equality match --- + + - name: "match on integer value" + input: {"code": 200} + mapping: | + root.status = match this.code { + 200 => "ok", + 404 => "not found", + 500 => "error", + _ => "other", + } + output: {"status": "ok"} + + # --- Null matching --- + + - name: "match null case" + input: {"val": null} + mapping: | + root.result = match this.val { + null => "is null", + _ => "not null", + } + output: {"result": "is null"} + + # --- Case expressions are evaluated (not just literals) --- + + - name: "case expression uses variable" + skip: "V1 match arm patterns that are non-literal expressions (like $var) are evaluated as boolean predicates, not compared for equality against the subject — `$target => ...` with $target=\"hello\" is treated as if(string), which fails" + + - name: "case expression uses concatenation" + mapping: | + root.result = match "foobar" { + "foo" + "bar" => "matched concat", + _ => "no match", + } + output: {"result": "matched concat"} + + # --- Subsequent cases not evaluated after match --- + + - name: "cases after match are not evaluated" + mapping: | + root.result = match "a" { + "a" => "found", + throw("should not evaluate") => "never", + _ => "default", + } + output: {"result": "found"} + + # --- Boolean case values are errors --- + # In V1, boolean literal patterns (true/false) with a subject are classified as literals at parse + # time and compared via ICompare against the subject. They are NOT compile or runtime errors. + # `match "hello" { true => ... }` compares "hello" == true (returns false) and falls through. 
+ + - name: "boolean literal true as case is compile error" + mapping: | + root.result = match "hello" { + true => "yes", + _ => "no", + } + output: {"result": "no"} # FIXME-v1: verify — V1 compares "hello" == true, falls through to _ + + - name: "boolean literal false as case is compile error" + mapping: | + root.result = match 42 { + false => "no", + _ => "yes", + } + output: {"result": "yes"} # FIXME-v1: verify — V1 compares 42 == false, falls through to _ + + - name: "dynamic case evaluating to boolean is runtime error" + skip: "V1 rebinds `this` inside match arms to the subject value, so `this.score` inside the arm references the subject not the original root; the intended semantic is not expressible without a let binding" + + - name: "dynamic boolean case error is catchable" + skip: "V1 does not treat dynamic-boolean-case as an error, and .catch lambda-with-error-object syntax is V2-specific" + + # --- Expression evaluated once --- + + - name: "matched expression evaluated once (side effects)" + mapping: | + let counter = 0 + let counter = $counter + 1 + root.result = match $counter { + 1 => "one", + 2 => "two", + _ => "other", + } + output: {"result": "one"} + + # --- Match with braced case bodies --- + + - name: "match case with braced body containing variables" + skip: "V1 match arms are single expressions, not statement-list blocks" + + # --- Wildcard is only catch-all (no else keyword) --- + + - name: "wildcard catches all unmatched values" + mapping: | + root.result = match 999 { + 1 => "one", + 2 => "two", + _ => "catch all", + } + output: {"result": "catch all"} + + # --- Non-exhaustive match produces void --- + # V1 match with no matching case returns null, not a void sentinel. 
+ + - name: "non-exhaustive match produces void — assignment skipped" + mapping: | + root.sound = "default" + root.sound = match "bird" { + "cat" => "meow", + "dog" => "woof", + } + output: {"sound": "default"} diff --git a/internal/bloblang2/migrator/v1spec/tests/control_flow/match_void.yaml b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_void.yaml new file mode 100644 index 000000000..2b6dae1ec --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/control_flow/match_void.yaml @@ -0,0 +1,137 @@ +description: > + Non-exhaustive match producing void — V1 has no void sentinel; a non-matching + match returns null. Most "skipped assignment" and "void error" tests therefore + behave differently. + +tests: + # --- Void in output assignment: skipped (V1: writes null) --- + + - name: "non-exhaustive equality match produces void — field absent" + input: {} + mapping: | + root.sound = match "bird" { + "cat" => "meow", + "dog" => "woof", + } + output: {} + + - name: "void from equality match preserves prior value" + mapping: | + root.sound = "chirp" + root.sound = match "bird" { + "cat" => "meow", + "dog" => "woof", + } + output: {"sound": "chirp"} + + - name: "non-exhaustive boolean match produces void — field absent" + input: {} + mapping: | + root.x = match { + false => "nope", + false => "also nope", + } + output: {} + + - name: "non-exhaustive match-as produces void — field absent" + skip: "V1 has no match-as syntax" + + # --- Void in variable declaration: runtime error (V1: assigns null) --- + + - name: "void from equality match in variable declaration is error" + mapping: | + let x = match "nope" { + "a" => 1, + "b" => 2, + } + root.result = $x + error: "variable 'x' undefined" + + - name: "void from boolean match in variable declaration is error" + mapping: | + let x = match { + false => 1, + } + root.result = $x + error: "variable 'x' undefined" + + - name: "void from match-as in variable declaration is error" + skip: "V1 has no match-as syntax" 
+ + # --- Void in variable reassignment: skipped (V1: overwrites with null) --- + + - name: "void from match skips variable reassignment" + mapping: | + let x = "original" + let x = match "nope" { + "a" => "found", + } + root.result = $x + output: {"result": "original"} + + # --- Void in collection literal: error (V1: null element) --- + + - name: "void from match in array literal is error" + mapping: | + root.arr = [1, match "x" { "y" => 2 }, 3] + output: {"arr": [1, 3]} # V1 omits void elements from the array + + - name: "void from match in object literal is error" + mapping: | + root.obj = {"key": match "x" { "y" => "val" }} + output: {"obj": {}} # V1 omits keys whose value is void + + # --- Void as function/map argument: error --- + + - name: "void from match as map argument is error" + skip: "V1 named maps do not take function-style parameters" + + # --- Void rescued with .or() --- + + - name: "or rescues void from non-exhaustive equality match" + mapping: | + root.result = (match "bird" { "cat" => "meow" }).or("unknown") + output: {"result": "unknown"} + + - name: "or rescues void from non-exhaustive boolean match" + mapping: | + root.result = (match { false => "nope" }).or("default") + output: {"result": "default"} + + - name: "or does not trigger when match produces a value" + mapping: | + root.result = (match "cat" { "cat" => "meow", }).or("unknown") + output: {"result": "meow"} + + - name: "or rescues void for variable declaration" + mapping: | + let x = (match "bird" { "cat" => "meow" }).or("unknown") + root.result = $x + output: {"result": "unknown"} + + # --- Wildcard prevents void --- + + - name: "wildcard ensures match is exhaustive" + mapping: | + root.result = match "anything" { + "a" => 1, + _ => 0, + } + output: {"result": 0} + + # --- Void in expression context: error (V1: null + 10 is a type error) --- + + - name: "void from match in addition is error" + mapping: | + root.result = (match "x" { "y" => 1 }) + 10 + error: "cannot add types 
nothing" + + # --- .catch() does not rescue void --- + # V1 .catch() catches errors, not nulls. Non-matching match returns null (not error), so .catch + # does not trigger and the null value passes through. + + - name: "catch does not trigger on void from match" + mapping: | + root.x = "prior" + root.x = (match "x" { "y" => 1 }).catch(0) + output: {"x": "prior"} # .catch does not trigger on void; the void propagates and the assignment is skipped, preserving prior diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/deeply_nested.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/deeply_nested.yaml new file mode 100644 index 000000000..0513aecd8 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/deeply_nested.yaml @@ -0,0 +1,108 @@ +description: "Edge cases: deeply nested objects, arrays, and expression chains" + +tests: + # --- Deep object nesting --- + + - name: "access deeply nested object field" + mapping: | + root = this.a.b.c.d.e + input: {"a": {"b": {"c": {"d": {"e": "deep"}}}}} + output: "deep" + + - name: "assign deeply nested output path" + mapping: | + root.a.b.c.d.e = "deep" + output: {"a": {"b": {"c": {"d": {"e": "deep"}}}}} + + - name: "deeply nested object with mixed types" + mapping: | + root.a.b.c = {"x": [1, 2, {"y": true}]} + output: {"a": {"b": {"c": {"x": [1, 2, {"y": true}]}}}} + + - name: "multiple deeply nested assignments" + mapping: | + root.a.b.c = 1 + root.a.b.d = 2 + root.a.e = 3 + output: {"a": {"b": {"c": 1, "d": 2}, "e": 3}} + + # --- Deep array nesting --- + + - name: "access deeply nested array element" + mapping: | + root = this.arr.0.0.0 + input: {"arr": [[[42]]]} + output: 42 + + - name: "deeply nested array literal" + mapping: | + root = [[[[1, 2], [3, 4]], [[5, 6]]]] + output: [[[[1, 2], [3, 4]], [[5, 6]]]] + + - name: "mixed deep nesting — object in array in object" + mapping: | + root = this.data.0.items.1.value + input: {"data": [{"items": [{"value": "a"}, {"value": "b"}]}]} + output: 
"b" + + - name: "long method chain on string" + mapping: | + root = " Hello World ".trim().lowercase().replace_all("hello", "hi").uppercase() + output: "HI WORLD" + + - name: "long method chain on array" + # V1 `.reverse()` is a string method only — arrays use `.sort(left > right)` for descending order. + mapping: | + root = [3, 1, 4, 1, 5, 9, 2, 6].unique().sort(left > right) + output: [9, 6, 5, 4, 3, 2, 1] + + - name: "chained map calls" + mapping: | + map inc { root = this + 1 } + map double { root = this * 2 } + root = 1.apply("inc").apply("double").apply("inc").apply("double") + output: 10 + + - name: "nested ternary-style if expressions" + mapping: | + let x = 5 + root = if $x > 10 { "big" } else { if $x > 3 { "medium" } else { "small" } } + output: "medium" + + - name: "nested match expressions" + mapping: | + let x = "b" + root = match $x { + "a" => match 1 { 1 => "a1", _ => "a?" }, + "b" => match 2 { 1 => "b1", 2 => "b2", _ => "b?" }, + _ => "other", + } + output: "b2" + + # --- Deep object construction --- + + - name: "object literal with nested objects and arrays" + mapping: | + root = { + "users": [ + {"name": "Alice", "tags": ["admin", "user"]}, + {"name": "Bob", "tags": ["user"]}, + ], + "meta": {"count": 2, "active": true}, + } + output: {"users": [{"name": "Alice", "tags": ["admin", "user"]}, {"name": "Bob", "tags": ["user"]}], "meta": {"count": 2, "active": true}} + + # --- Deep null-safe access --- + + - name: "null-safe chain through missing nested fields" + mapping: | + root = this.a.b.c.d + input: {"a": null} + output: null + # FIXME-v1: verify — V1 path access through null returns null (§12.5), no ?. needed. 
+ + - name: "null-safe chain where middle is present" + mapping: | + root = this.a.b.c + input: {"a": {"b": {"c": 42}}} + output: 42 diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/empty_collections.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/empty_collections.yaml new file mode 100644 index 000000000..1792ae084 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/empty_collections.yaml @@ -0,0 +1,117 @@ +description: "Edge cases: empty array and empty object with all applicable methods" + +tests: + # --- Empty array methods --- + + - name: "empty array sort returns empty array" + mapping: | + root = [].sort() + output: [] + + - name: "empty array sum returns 0" + # V1 `.sum()` always returns a float64 (even for empty input). + mapping: | + root = [].sum() + output: 0.0 + + - name: "empty array min is error" + mapping: | + root = [].min() + error: "empty" + # FIXME-v1: verify — V1 .min() on empty array may return null rather than error. + + - name: "empty array max is error" + mapping: | + root = [].max() + error: "empty" + # FIXME-v1: verify — V1 .max() on empty array may return null rather than error. + + - name: "empty array fold returns initial value" + mapping: | + root = [].fold(42, tally -> tally.tally + tally.value) + output: 42 + # FIXME-v1: verify — V1 .fold(init, lambda) lambda receives {"tally": ..., "value": ...}. + + - name: "empty array fold returns initial string" + mapping: | + root = [].fold("start", tally -> tally.tally + tally.value) + output: "start" + # FIXME-v1: verify — V1 .fold lambda receives {"tally": ..., "value": ...}. 
+ + - name: "empty array length is 0" + mapping: | + root = [].length() + output: 0 + + - name: "empty array contains returns false" + mapping: | + root = [].contains(1) + output: false + + - name: "empty array unique returns empty array" + mapping: | + root = [].unique() + output: [] + + - name: "empty array filter returns empty array" + mapping: | + root = [].filter(x -> x > 0) + output: [] + + - name: "empty array map returns empty array" + mapping: | + root = [].map_each(x -> x * 2) + output: [] + # FIXME-v1: verify — V1 uses .map_each() rather than .map(). + + - name: "empty array reverse returns empty array" + # V1 `.reverse()` is a string-only method — calling it on an array is a runtime error. + mapping: | + root = [].reverse() + error: "expected string value" + + # --- Empty object methods --- + + - name: "empty object keys returns empty array" + mapping: | + root = {}.keys() + output: [] + + - name: "empty object values returns empty array" + mapping: | + root = {}.values() + output: [] + + - name: "empty object length is 0" + mapping: | + root = {}.length() + output: 0 + + - name: "empty object type is object" + mapping: | + root = {}.type() + output: "object" + + - name: "empty array type is array" + mapping: | + root = [].type() + output: "array" + + # --- Single element edge cases --- + + - name: "single element array sort returns same" + mapping: | + root = [42].sort() + output: [42] + + - name: "single element array min returns element" + # V1 `.min()` always returns a float64. + mapping: | + root = [42].min() + output: 42.0 + + - name: "single element array max returns element" + # V1 `.max()` always returns a float64. 
+ mapping: | + root = [42].max() + output: 42.0 diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/infinity.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/infinity.yaml new file mode 100644 index 000000000..859b59035 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/infinity.yaml @@ -0,0 +1,145 @@ +description: "Edge cases: Infinity comparisons, arithmetic, equality, bool conversion" + +tests: + # --- Infinity comparisons --- + + - name: "Infinity > any finite number" + mapping: | + root = this.inf > 999999999999.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "Infinity >= any finite number" + mapping: | + root = this.inf >= 0.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "-Infinity < any finite number" + mapping: | + root = this.ninf < -999999999999.0 + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + skip: "V2-only: V1 has no -Infinity literal and no tagged numeric-type inputs." + + - name: "-Infinity <= any finite number" + mapping: | + root = this.ninf <= 0.0 + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + skip: "V2-only: V1 has no -Infinity literal and no tagged numeric-type inputs." + + - name: "finite < Infinity" + mapping: | + root = 1000000.0 < this.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "finite > -Infinity" + mapping: | + root = -1000000.0 > this.ninf + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + skip: "V2-only: V1 has no -Infinity literal and no tagged numeric-type inputs." 
+ + # --- Infinity equality --- + + - name: "Infinity == Infinity is true" + mapping: | + root = this.inf == this.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "-Infinity == -Infinity is true" + mapping: | + root = this.ninf == this.ninf + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + skip: "V2-only: V1 has no -Infinity literal and no tagged numeric-type inputs." + + - name: "Infinity != -Infinity is true" + mapping: | + root = this.inf != this.ninf + input: {"inf": {_type: "float64", value: "Infinity"}, "ninf": {_type: "float64", value: "-Infinity"}} + output: true + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "Infinity > -Infinity" + mapping: | + root = this.inf > this.ninf + input: {"inf": {_type: "float64", value: "Infinity"}, "ninf": {_type: "float64", value: "-Infinity"}} + output: true + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + # --- Infinity arithmetic --- + + - name: "Infinity + Infinity = Infinity" + mapping: | + root = this.inf + this.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "Infinity"} + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "Infinity - Infinity = NaN" + mapping: | + root = this.inf - this.inf + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "NaN"} + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "Infinity * 2 = Infinity" + mapping: | + root = this.inf * 2.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "Infinity"} + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." 
+ + - name: "Infinity * -1 = -Infinity" + mapping: | + root = this.inf * -1.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "-Infinity"} + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "-Infinity + -Infinity = -Infinity" + mapping: | + root = this.ninf + this.ninf + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: {_type: "float64", value: "-Infinity"} + skip: "V2-only: V1 has no -Infinity literal and no tagged numeric-type inputs." + + - name: "Infinity * 0.0 = NaN" + mapping: | + root = this.inf * 0.0 + input: {"inf": {_type: "float64", value: "Infinity"}} + output: {_type: "float64", value: "NaN"} + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + # --- Infinity bool conversion --- + + - name: "Infinity.bool() is true" + mapping: | + root = this.inf.bool() + input: {"inf": {_type: "float64", value: "Infinity"}} + output: true + skip: "V2-only: V1 has no Infinity literal and no tagged numeric-type inputs." + + - name: "-Infinity.bool() is true" + mapping: | + root = this.ninf.bool() + input: {"ninf": {_type: "float64", value: "-Infinity"}} + output: true + skip: "V2-only: V1 has no -Infinity literal and no tagged numeric-type inputs." + + # --- Infinity type --- + + - name: "Infinity type is float64" + mapping: | + root = this.inf.type() + input: {"inf": {_type: "float64", value: "Infinity"}} + output: "float64" + skip: "V2-only: V1 .type() reports 'number' for all numeric values, not 'float64'." 
diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow.yaml new file mode 100644 index 000000000..8357b8bcd --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow.yaml @@ -0,0 +1,121 @@ +description: "Edge cases: integer overflow for int32, int64, uint32, uint64 across add, sub, mul" + +tests: + # --- int64 overflow --- + + - name: "int64 max + 1 overflows" + mapping: | + root = 9223372036854775807 + 1 + error: "overflow" + skip: "V1-divergence: V1 silently wraps int64 overflow (§14.26) instead of erroring." + + - name: "int64 min literal is compile error" + mapping: | + root = -9223372036854775808 - 1 + compile_error: "exceeds" + skip: "V1-divergence: V1 parses -9223372036854775808 as unary-minus of a too-large literal; behaviour differs." + + - name: "int64 min - 1 overflows" + mapping: | + root = (-9223372036854775807 - 1) - 1 + error: "overflow" + skip: "V1-divergence: V1 silently wraps int64 overflow (§14.26) instead of erroring." + + - name: "int64 max * 2 overflows" + mapping: | + root = 9223372036854775807 * 2 + error: "overflow" + skip: "V1-divergence: V1 silently wraps int64 overflow (§14.26) instead of erroring." + + - name: "int64 large positive multiplication overflows" + mapping: | + root = 4611686018427387904 * 3 + error: "overflow" + skip: "V1-divergence: V1 silently wraps int64 overflow (§14.26) instead of erroring." 
+ + - name: "int64 max value is representable" + mapping: | + root = 9223372036854775807 + output: 9223372036854775807 + + - name: "int64 min value via arithmetic is representable" + mapping: | + root = -9223372036854775807 - 1 + output: -9223372036854775808 + + # --- int32 overflow via conversion --- + + - name: "int32 max + 1 via conversion overflows" + mapping: | + root = 2147483648.int32() + error: "too large" + + - name: "int32 min - 1 via conversion overflows" + mapping: | + root = (-2147483649).int32() + error: "too small" + + - name: "int32 max is representable" + mapping: | + root = 2147483647.int32() + output: {_type: "int32", value: "2147483647"} + + - name: "int32 min is representable" + mapping: | + root = (-2147483648).int32() + output: {_type: "int32", value: "-2147483648"} + + # --- uint32 overflow via conversion --- + + - name: "uint32 max + 1 via conversion overflows" + mapping: | + root = 4294967296.uint32() + error: "too large" + + - name: "uint32 negative via conversion overflows" + mapping: | + root = (-1).uint32() + error: "negative and cannot be cast" + + - name: "uint32 max is representable" + mapping: | + root = 4294967295.uint32() + output: {_type: "uint32", value: "4294967295"} + + # --- uint64 overflow --- + + - name: "uint64 max + 1 from string overflows" + mapping: | + root = "18446744073709551616".uint64() + error: "value out of range" + + - name: "uint64 negative is error" + mapping: | + root = (-1).uint64() + error: "negative and cannot be cast" + + - name: "uint64 max from string is representable" + mapping: | + root = "18446744073709551615".uint64() + output: {_type: "uint64", value: "18446744073709551615"} + + - name: "uint64 max as bare literal is compile error" + mapping: | + root = 18446744073709551615.uint64() + compile_error: "value out of range" + + # --- Overflow at boundary --- + + - name: "int64 max minus 1 plus 2 overflows" + mapping: | + let v = 9223372036854775806 + root = $v + 2 + error: "overflow" + skip: 
"V1-divergence: V1 silently wraps int64 overflow (§14.26) instead of erroring." + + - name: "int64 min plus 1 minus 2 overflows" + mapping: | + let v = -9223372036854775807 + root = $v - 2 + error: "overflow" + skip: "V1-divergence: V1 silently wraps int64 overflow (§14.26) instead of erroring." diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow_ops.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow_ops.yaml new file mode 100644 index 000000000..8b1fb4cc2 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/integer_overflow_ops.yaml @@ -0,0 +1,134 @@ +description: > + Integer overflow across all operations and integer types — addition, + subtraction, multiplication for int32, int64, uint32, uint64. Also + tests that overflow is always a runtime error, never wraps. + +tests: + # --- int64 overflow: subtraction --- + + - name: "int64 min minus 1 via subtraction overflows" + mapping: | + let min = -9223372036854775807 - 1 + root = $min - 1 + error: "overflow" + skip: "V1-divergence: V1 wraps int64 overflow silently (no error) — quirk §14.26." + + - name: "int64 large negative subtraction overflows" + mapping: | + root = (-9223372036854775807) - 9223372036854775807 + error: "overflow" + skip: "V1-divergence: V1 wraps int64 overflow silently (no error) — quirk §14.26." 
+ + # --- int32 overflow: arithmetic --- + + - name: "int32 max + 1 overflows" + mapping: | + root = 2147483647.int32() + 1.int32() + error: "overflow" + skip: "V1-divergence: arithmetic operators do not handle int32/uint32/int16/int8/uint16/uint8 values" + + - name: "int32 min - 1 overflows" + mapping: | + root = (-2147483648).int32() - 1.int32() + error: "overflow" + skip: "V1-divergence: int64 intermediate prevents overflow detection on int32 arithmetic" + + - name: "int32 max * 2 overflows" + mapping: | + root = 2147483647.int32() * 2.int32() + error: "overflow" + skip: "V1-divergence: int64 intermediate prevents overflow detection on int32 arithmetic" + + - name: "int32 large negative multiplication overflows" + mapping: | + root = (-2147483648).int32() * (-1).int32() + error: "overflow" + skip: "V1-divergence: int64 intermediate prevents overflow detection on int32 arithmetic" + + # --- uint32 overflow --- + + - name: "uint32 max + 1 overflows" + mapping: | + root = 4294967295.uint32() + 1.uint32() + error: "overflow" + skip: "V1-divergence: arithmetic operators do not handle int32/uint32/int16/int8/uint16/uint8 values" + + - name: "uint32 zero minus 1 overflows" + mapping: | + root = 0.uint32() - 1.uint32() + error: "overflow" + skip: "V1-divergence: int64 intermediate prevents overflow detection on uint32 arithmetic" + + - name: "uint32 max * 2 overflows" + mapping: | + root = 4294967295.uint32() * 2.uint32() + error: "overflow" + skip: "V1-divergence: int64 intermediate prevents overflow detection on uint32 arithmetic" + + # --- uint64 overflow --- + + - name: "uint64 max + 1 overflows" + mapping: | + root = "18446744073709551615".uint64() + 1.uint64() + error: "overflow" + skip: "V2-only: V1 has no .uint64() distinct numeric-width conversion, and silent overflow (§14.26)." 
+ + - name: "uint64 zero minus 1 overflows" + mapping: | + root = 0.uint64() - 1.uint64() + error: "overflow" + skip: "V1-divergence: V1 wraps uint64 overflow silently (no error) — quirk §14.26." + + - name: "uint64 max * 2 overflows" + mapping: | + root = "18446744073709551615".uint64() * 2.uint64() + error: "overflow" + skip: "V1-divergence: V1 wraps uint64 overflow silently (no error) — quirk §14.26." + + # --- int64 multiplication near boundaries --- + + - name: "int64 max / 2 * 3 overflows" + mapping: | + let half = 4611686018427387903 + root = $half * 3 + error: "overflow" + skip: "V1-divergence: V1 wraps int64 overflow silently (no error) — quirk §14.26." + + - name: "int64 negative overflow from multiplication" + mapping: | + root = 9223372036854775807 * (-2) + error: "overflow" + skip: "V1-divergence: V1 wraps int64 overflow silently (no error) — quirk §14.26." + + # --- Non-overflow boundary values work --- + + - name: "int64 max - 1 + 1 is exactly max" + mapping: | + root = 9223372036854775806 + 1 + output: 9223372036854775807 + + - name: "int32 max representable after subtraction" + mapping: | + root = (2147483647.int32() - 1.int32()) + 1.int32() + output: {_type: "int32", value: "2147483647"} + skip: "V2-only: V1 has no int32 distinct numeric width." 
+ + - name: "uint64 large value arithmetic within range" + mapping: | + root = "18446744073709551614".uint64() + 1.uint64() + output: {_type: "uint64", value: "18446744073709551615"} + + - name: "uint32 max minus 1 plus 1 is exactly max" + mapping: | + root = (4294967295.uint32() - 1.uint32()) + 1.uint32() + output: {_type: "uint32", value: "4294967295"} + skip: "V1-divergence: uint32 arithmetic collapses to int64, loses uint32 type identity" + + # --- Overflow in modulo (abs(min) overflows for signed) --- + + - name: "int64 min modulo -1 overflows" + mapping: | + let min = -9223372036854775807 - 1 + root = $min % (-1) + error: "overflow" + skip: "V1-divergence: V1 silently wraps int64 overflow (modulo of int64-min by -1 does not error)." diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/interpreter_reuse.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/interpreter_reuse.yaml new file mode 100644 index 000000000..9d9d7005d --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/interpreter_reuse.yaml @@ -0,0 +1,39 @@ +description: > + Interpreter reuse correctness — the same compiled mapping must produce + independent results when executed multiple times with different inputs. + Verifies no state leakage between executions. 
+ +tests: + - name: "second execution sees fresh input" + input: {"x": 2} + mapping: | + root.doubled = this.x * 2 + output: {"doubled": 4} + + - name: "variables do not persist between executions" + mapping: | + let x = 42 + root.v = $x + output: {"v": 42} + + - name: "output starts empty on each execution" + mapping: | + root.fresh = true + output: {"fresh": true} + + - name: "map local variables are independent per execution" + mapping: | + map counter { + let local = this * 10 + root = $local + 1 + } + root.v = 3.apply("counter") + output: {"v": 31} + + - name: "recursive map frames do not leak between executions" + mapping: | + map factorial { + root = if this <= 1 { 1 } else { this * (this - 1).apply("factorial") } + } + root.v = 5.apply("factorial") + output: {"v": 120} diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/nan_behavior.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/nan_behavior.yaml new file mode 100644 index 000000000..8283bc998 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/nan_behavior.yaml @@ -0,0 +1,140 @@ +description: "Edge cases: NaN equality, comparison, arithmetic, sort ordering, unique dedup, bool error" + +tests: + # --- NaN equality --- + + - name: "NaN == NaN is false" + mapping: | + root = this.n == this.n + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "NaN != NaN is true" + mapping: | + root = this.n != this.n + input: {"n": {_type: "float64", value: "NaN"}} + output: true + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "NaN == 0.0 is false" + mapping: | + root = this.n == 0.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." 
+ + - name: "NaN != 0.0 is true" + mapping: | + root = this.n != 0.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: true + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + # --- NaN comparison --- + + - name: "NaN < 1.0 is false" + mapping: | + root = this.n < 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "NaN > 1.0 is false" + mapping: | + root = this.n > 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "NaN <= 1.0 is false" + mapping: | + root = this.n <= 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "NaN >= 1.0 is false" + mapping: | + root = this.n >= 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "1.0 < NaN is false" + mapping: | + root = 1.0 < this.n + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "1.0 > NaN is false" + mapping: | + root = 1.0 > this.n + input: {"n": {_type: "float64", value: "NaN"}} + output: false + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + # --- NaN arithmetic --- + + - name: "NaN + 1.0 is NaN" + mapping: | + root = this.n + 1.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: {_type: "float64", value: "NaN"} + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + - name: "NaN * 0.0 is NaN" + mapping: | + root = this.n * 0.0 + input: {"n": {_type: "float64", value: "NaN"}} + output: {_type: "float64", value: "NaN"} + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." 
+ + - name: "NaN - NaN is NaN" + mapping: | + root = this.n - this.n + input: {"n": {_type: "float64", value: "NaN"}} + output: {_type: "float64", value: "NaN"} + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + # --- NaN in sort (total ordering: NaN after all values) --- + + - name: "sort ordering with NaN values" + mapping: | + root = this.arr.sort() + cases: + - name: "NaN after all finite values" + input: {"arr": [3.0, {_type: "float64", value: "NaN"}, 1.0, 2.0]} + output: [1.0, 2.0, 3.0, {_type: "float64", value: "NaN"}] + - name: "multiple NaN values kept at end" + input: {"arr": [{_type: "float64", value: "NaN"}, 2.0, {_type: "float64", value: "NaN"}, 1.0]} + output: [1.0, 2.0, {_type: "float64", value: "NaN"}, {_type: "float64", value: "NaN"}] + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + # --- NaN in unique (NaN treated as equal) --- + + - name: "unique treats multiple NaN as equal — keeps first" + mapping: | + root = this.arr.unique() + input: {"arr": [{_type: "float64", value: "NaN"}, 1.0, {_type: "float64", value: "NaN"}, 2.0]} + output: [{_type: "float64", value: "NaN"}, 1.0, 2.0] + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + # --- NaN bool conversion --- + + - name: "NaN.bool() is error" + mapping: | + root = this.n.bool() + input: {"n": {_type: "float64", value: "NaN"}} + error: "NaN" + skip: "V2-only: V1 has no NaN literal and no tagged numeric-type inputs." + + # --- NaN type --- + + - name: "NaN type is float64" + mapping: | + root = this.n.type() + input: {"n": {_type: "float64", value: "NaN"}} + output: "float64" + skip: "V2-only: V1 has no NaN literal, no tagged inputs, and .type() returns 'number' not 'float64'." 
diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/precision_loss.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/precision_loss.yaml new file mode 100644 index 000000000..1db5b0074 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/precision_loss.yaml @@ -0,0 +1,89 @@ +description: "Edge cases: precision loss when promoting large int64 to float64, explicit conversion unchecked" + +tests: + # --- int64 > 2^53 + float arithmetic errors --- + + - name: "int64 just above 2^53 plus float is error" + mapping: | + root = 9007199254740993 + 1.0 + error: "exact" + skip: "V1-divergence: V1 silently promotes int64 to float64 for mixed arithmetic (no exactness check)." + + - name: "int64 at 2^53 plus float is ok" + mapping: | + root = 9007199254740992 + 1.0 + output: 9007199254740993.0 + + - name: "int64 well above 2^53 plus float is error" + mapping: | + root = 9223372036854775807 + 0.0 + error: "exact" + skip: "V1-divergence: V1 silently promotes int64 to float64 (no exactness check)." + + - name: "large int64 minus float is error" + mapping: | + root = 9007199254740993 - 1.0 + error: "exact" + skip: "V1-divergence: V1 silently promotes int64 to float64 (no exactness check)." + + - name: "large int64 times float is error" + mapping: | + root = 9007199254740993 * 2.0 + error: "exact" + skip: "V1-divergence: V1 silently promotes int64 to float64 (no exactness check)." + + - name: "negative large int64 plus float is error" + mapping: | + root = -9007199254740993 + 1.0 + error: "exact" + skip: "V1-divergence: V1 silently promotes int64 to float64 (no exactness check)." 
+ + # --- Small int64 with float is fine --- + + - name: "small int64 plus float is ok" + mapping: | + root = 42 + 1.5 + output: 43.5 + + - name: "int64 at 2^53 minus 1 plus float is ok" + mapping: | + root = 9007199254740991 + 1.0 + output: 9007199254740992.0 + + # --- Explicit conversion (.float64()) is unchecked --- + + - name: "explicit float64 conversion of large int64 is unchecked" + mapping: | + root = 9007199254740993.float64() + output: 9007199254740992.0 + skip: "V2-only: V1 has no .float64() method (only the generic .number() coercer)." + + - name: "explicit float64 conversion of int64 max is unchecked" + mapping: | + root = 9223372036854775807.float64().type() + output: "float64" + skip: "V2-only: V1 has no .float64() method and .type() returns 'number'." + + # --- uint64 > int64 max + int is error --- + + - name: "uint64 above int64 max plus int is error" + mapping: | + let big = "18446744073709551615".uint64() + root = $big + 1 + error: "uint64 value exceeds int64 range" + skip: "V1-divergence: uint64 + int silently wraps (produces 0) instead of erroring on exceeds int64 range" + + # --- Boundary: exactly 2^53 --- + + - name: "int64 exactly 2^53 float64 roundtrip is exact" + mapping: | + let v = 9007199254740992 + root = $v + 0.5 + output: 9007199254740992.5 + + - name: "int64 exactly 2^53 plus 1 float operation errors" + mapping: | + let v = 9007199254740993 + root = $v + 0.5 + error: "exact" + skip: "V1-divergence: V1 silently promotes int64 to float64 (no exactness check)." 
diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/string_codepoints.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/string_codepoints.yaml new file mode 100644 index 000000000..673f99fa8 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/string_codepoints.yaml @@ -0,0 +1,158 @@ +description: > + String codepoint operations — indexing returns codepoints (int64), + .char() converts back, .length() counts codepoints, .split("") splits + by codepoint, and .reverse() reverses by codepoint. + +tests: + # --- String indexing returns codepoint values --- + + - name: "ASCII character index returns codepoint" + mapping: | + root.v = "hello"[0] + output: {"v": 104} + skip: "V2-only: V1 has no bracket-indexing syntax (§14.10); strings are not indexable by int in V1." + + - name: "second ASCII character" + mapping: | + root.v = "hello"[1] + output: {"v": 101} + skip: "V2-only: V1 has no bracket-indexing syntax and no codepoint indexing on strings." + + - name: "space codepoint" + mapping: | + root.v = "a b"[1] + output: {"v": 32} + skip: "V2-only: V1 has no bracket-indexing syntax and no codepoint indexing on strings." + + - name: "digit character codepoint" + mapping: | + root.v = "0"[0] + output: {"v": 48} + skip: "V2-only: V1 has no bracket-indexing syntax and no codepoint indexing on strings." + + # --- .char() round-trip --- + + - name: "char converts codepoint back to string" + mapping: | + root.v = 104.char() + output: {"v": "h"} + skip: "V2-only: V1 has no .char() method." + + - name: "char round-trip from indexing" + mapping: | + root.v = "hello"[0].char() + output: {"v": "h"} + skip: "V2-only: V1 has no bracket-indexing and no .char() method." + + - name: "char with emoji codepoint" + mapping: | + root.v = 128075.char() + output: {"v": "\U0001F44B"} + skip: "V2-only: V1 has no .char() method." 
+ + # --- .length() counts codepoints --- + + - name: "ASCII string length" + mapping: | + root.v = "hello".length() + output: {"v": 5} + + - name: "empty string length" + mapping: | + root.v = "".length() + output: {"v": 0} + + - name: "unicode string length counts codepoints" + # V1 `.length()` on a string counts bytes. "café" is 5 bytes (é = 2 bytes). + mapping: | + root.v = "café".length() + output: {"v": 5} + + # --- .split("") splits by codepoint --- + + - name: "split empty delimiter splits by codepoint" + mapping: | + root.v = "abc".split("") + output: {"v": ["a", "b", "c"]} + # FIXME-v1: verify — V1 .split("") behaviour with empty separator is not guaranteed to match. + + - name: "split empty delimiter on empty string" + mapping: | + root.v = "".split("") + output: {"v": []} + # FIXME-v1: verify. + + - name: "split on normal delimiter" + mapping: | + root.v = "a,b,c".split(",") + output: {"v": ["a", "b", "c"]} + + - name: "split on multi-char delimiter" + mapping: | + root.v = "a::b::c".split("::") + output: {"v": ["a", "b", "c"]} + + # --- .reverse() reverses by codepoint --- + + - name: "reverse ASCII string" + mapping: | + root.v = "hello".reverse() + output: {"v": "olleh"} + # FIXME-v1: verify — V1 .reverse() on strings may not be implemented or may behave differently. + + - name: "reverse empty string" + mapping: | + root.v = "".reverse() + output: {"v": ""} + # FIXME-v1: verify — V1 .reverse() on strings may not be implemented. + + - name: "reverse single character" + mapping: | + root.v = "a".reverse() + output: {"v": "a"} + # FIXME-v1: verify — V1 .reverse() on strings may not be implemented. + + # --- Negative indexing --- + + - name: "negative index -1 is last codepoint" + mapping: | + root.v = "hello"[-1] + output: {"v": 111} + skip: "V2-only: V1 has no bracket-indexing (§14.10) and no negative numeric path segments (§14.11)." 
+ + - name: "negative index -1 char round-trip" + mapping: | + root.v = "hello"[-1].char() + output: {"v": "o"} + skip: "V2-only: V1 has no bracket-indexing, no .char() method, and no negative path indexing." + + # --- Slice on strings --- + + - name: "string slice basic" + mapping: | + root.v = "hello world".slice(0, 5) + output: {"v": "hello"} + + - name: "string slice from middle" + mapping: | + root.v = "hello world".slice(6, 11) + output: {"v": "world"} + + - name: "string slice clamped to length" + mapping: | + root.v = "hi".slice(0, 100) + output: {"v": "hi"} + # FIXME-v1: verify — V1 .slice() out-of-range behaviour (clamp vs error) differs by version. + + - name: "string slice with negative start" + mapping: | + root.v = "hello".slice(-3, 5) + output: {"v": "llo"} + # FIXME-v1: verify — V1 .slice() negative-index behaviour differs by version. + + - name: "string slice empty range" + # V1 `.slice(low, high)` rejects `low >= high` at compile time (the check is a literal-bounds check + # performed when both bounds are literal integers). + mapping: | + root.v = "hello".slice(3, 3) + compile_error: "lower slice bound" diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/unicode.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/unicode.yaml new file mode 100644 index 000000000..7103b58bb --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/unicode.yaml @@ -0,0 +1,114 @@ +description: "Edge cases: multi-codepoint emoji, combining characters, no normalization" + +tests: + # --- Multi-codepoint emoji --- + + - name: "skin-tone emoji has length 2" + # V1 `.length()` counts bytes. Each of these emoji is 4 bytes in UTF-8 → total 8. + mapping: | + root = "\U0001F44B\U0001F3FD".length() + output: 8 + + - name: "simple emoji has length 1" + # V1 `.length()` counts bytes. U+1F600 is 4 bytes in UTF-8. 
+ mapping: | + root = "\U0001F600".length() + output: 4 + + - name: "flag emoji has length 2 (two regional indicators)" + # V1 `.length()` counts bytes. Two regional indicators are 4 bytes each → 8 bytes. + mapping: | + root = "\U0001F1FA\U0001F1F8".length() + output: 8 + + - name: "family emoji (ZWJ sequence) has multiple codepoints" + # V1 `.length()` counts bytes. Three 4-byte emoji + two 3-byte ZWJ joiners = 18 bytes. + mapping: | + root = "\U0001F468‍\U0001F469‍\U0001F467".length() + output: 18 + + # --- Codepoint indexing on emoji --- + + - name: "index first codepoint of multi-codepoint emoji" + mapping: | + root = "\U0001F44B\U0001F3FD"[0] + output: 128075 + skip: "V2-only: V1 has no bracket-indexing (§14.10) and no codepoint indexing on strings." + + - name: "index second codepoint of multi-codepoint emoji" + mapping: | + root = "\U0001F44B\U0001F3FD"[1] + output: 127997 + skip: "V2-only: V1 has no bracket-indexing (§14.10) and no codepoint indexing on strings." + + # --- Combining characters --- + + - name: "precomposed e-acute has length 1" + # V1 `.length()` counts bytes. Precomposed "é" (U+00E9) is 2 bytes in UTF-8. + mapping: | + root = "é".length() + output: 2 + + - name: "decomposed e plus combining acute has length 2" + # V1 .length() counts bytes: 'e' (1 byte) + U+0301 combining acute (2 bytes) = 3 bytes. + mapping: | + root = "é".length() + output: 3 + + - name: "precomposed and decomposed are not equal" + mapping: | + root = "é" == "é" + output: false + + - name: "precomposed and decomposed have different lengths" + mapping: | + let a = "é".length() + let b = "é".length() + root = $a == $b + output: false + # FIXME-v1: verify — depends on V1 .length() codepoint vs byte semantics. 
+ + # --- String comparison is codepoint-by-codepoint --- + + - name: "string comparison is codepoint-based" + mapping: | + root = "á" < "é" + output: true + + # --- Unicode in object keys --- + + - name: "unicode string as object key" + mapping: | + root = {"é": "accent"} + output: {"é": "accent"} + + - name: "emoji as object key" + mapping: | + root = {"\U0001F600": "smile"} + output: {"😀": "smile"} + + # --- Unicode string methods --- + + - name: "contains with unicode" + mapping: | + root = "café".contains("é") + output: true + + - name: "uppercase with unicode" + mapping: | + root = "café".uppercase() + output: "CAFÉ" + + # --- Mixed ASCII and multi-byte --- + + - name: "length of mixed ASCII and multi-byte string" + # V1 `.length()` counts bytes: 'a' (1) + U+1F600 (4) + 'b' (1) = 6. + mapping: | + root = "a\U0001F600b".length() + output: 6 + + - name: "index past emoji in mixed string" + mapping: | + root = "a\U0001F600b"[2] + output: 98 + skip: "V2-only: V1 has no bracket-indexing (§14.10) and no codepoint indexing on strings." 
diff --git a/internal/bloblang2/migrator/v1spec/tests/edge_cases/whitespace_newlines.yaml b/internal/bloblang2/migrator/v1spec/tests/edge_cases/whitespace_newlines.yaml new file mode 100644 index 000000000..391091381 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/edge_cases/whitespace_newlines.yaml @@ -0,0 +1,144 @@ +description: "Edge cases: whitespace handling, raw string newlines, multi-line mappings" + +tests: + # --- Whitespace in expressions --- + + - name: "extra spaces around operators" + mapping: | + root = 1 + 2 + output: 3 + + - name: "no spaces around operators" + mapping: | + root = 1+2 + output: 3 + + - name: "tabs in expressions" + mapping: "root\t=\t1\t+\t2" + output: 3 + + - name: "spaces in array literal" + mapping: | + root = [ 1 , 2 , 3 ] + output: [1, 2, 3] + + - name: "spaces in object literal" + mapping: | + root = { "a" : 1 , "b" : 2 } + output: {"a": 1, "b": 2} + + # --- Multi-line mappings --- + + - name: "multi-line array literal" + mapping: | + root = [ + 1, + 2, + 3, + ] + output: [1, 2, 3] + + - name: "multi-line object literal" + mapping: | + root = { + "a": 1, + "b": 2, + "c": 3, + } + output: {"a": 1, "b": 2, "c": 3} + + - name: "multi-line method chain" + # V1 parser rejects a newline immediately before `.` (§2.1) — break *after* the dot instead. + # Note: V1 `.reverse()` is string-only, so use `.sort(left > right)` for descending order. + mapping: | + root = [3, 1, 2]. + sort(). + sort(left > right) + output: [3, 2, 1] + + - name: "multi-line if expression" + mapping: | + root = if true { + "yes" + } else { + "no" + } + output: "yes" + + - name: "multi-line match expression" + mapping: | + root = match 2 { + 1 => "one", + 2 => "two", + _ => "other", + } + output: "two" + + # --- Raw strings preserve newlines --- + + - name: "raw string preserves literal newline" + # YAML `|` block scalars strip the common leading indentation before the bloblang source sees + # the string. 
After YAML parse the mapping is `root = """line1\nline2"""` — so V1 sees a raw + # string with a single literal newline between `line1` and `line2` (no leading spaces). + mapping: | + root = """line1 + line2""" + output: "line1\nline2" + + - name: "raw string preserves multiple newlines" + # YAML strips the common leading indentation — the bloblang source sees `"""a\n\nb"""`. + mapping: | + root = """a + + b""" + output: "a\n\nb" + + - name: "raw string preserves tabs" + mapping: | + root = """col1 col2""" + output: "col1\tcol2" + + - name: "raw string does not process escape sequences" + mapping: | + root = """hello\nworld""" + output: "hello\\nworld" + + # --- Escaped newlines in regular strings --- + + - name: "escaped newline in regular string" + mapping: | + root = "line1\nline2" + output: "line1\nline2" + + - name: "escaped tab in regular string" + mapping: | + root = "col1\tcol2" + output: "col1\tcol2" + + # --- Blank lines between statements --- + + - name: "blank lines between statements are ignored" + mapping: | + root.a = 1 + + root.b = 2 + + root.c = 3 + output: {"a": 1, "b": 2, "c": 3} + + # --- Leading/trailing whitespace in mapping --- + + - name: "mapping with leading blank lines" + mapping: | + + root = 42 + output: 42 + + - name: "multiple assignments separated by blank lines" + mapping: | + let x = 10 + + let y = 20 + + root = $x + $y + output: 30 diff --git a/internal/bloblang2/migrator/v1spec/tests/error_handling/catch.yaml b/internal/bloblang2/migrator/v1spec/tests/error_handling/catch.yaml new file mode 100644 index 000000000..143823de0 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/error_handling/catch.yaml @@ -0,0 +1,177 @@ +description: ".catch() — intercepts errors, error object access, passthrough for non-errors, chaining, scope" + +tests: + # --- Basic catch usage --- + + - name: "catch intercepts division by zero" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).catch(err -> -1) + output: {"result": -1} # 
V1 literal `5 / 0` fails at compile time; use a runtime operand so .catch can run + + - name: "catch intercepts throw" + mapping: | + root.result = throw("boom").catch(err -> "caught") + output: {"result": "caught"} + + - name: "catch intercepts out of bounds" + mapping: | + let arr = [1, 2, 3] + root.result = $arr.index(10).catch(err -> 0) + output: {"result": 0} + + - name: "catch intercepts type mismatch" + input: {"s": "nope"} + mapping: | + root.result = (5 + this.s).catch(err -> 0) + output: {"result": 0} # V1 literal `5 + "nope"` fails at compile time; use a runtime operand so .catch can run + + # --- Error object access --- + # In V1 `err` is the error string itself (not an object with .what). + # V1 error messages are also typically prefixed with things like + # "failed assignment (line N): ..." — tests match on substring. + + - name: "error object has what field with message" + mapping: | + root.result = throw("something broke").catch(err -> err) + output: {"result": "something broke"} # FIXME-v1: verify — V1 err is a string; full message may include a prefix + + - name: "error what field from division by zero" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).catch(err -> err) + output: {"result": "field `this.zero`: attempted to divide by zero"} # V1 err is a string; literal 5/0 fails at compile — use runtime operand + + - name: "error what field used in concatenation" + mapping: | + root.result = throw("oops").catch(err -> "error: " + err) + output: {"result": "error: oops"} # FIXME-v1: verify — V1 err may include a prefix before "oops" + + # --- Directly on input/output/variable (no intermediate field access) --- + + - name: "catch on input directly when error" + input: null + mapping: | + root = this.not_null().catch(err -> "was null") + output: "was null" + + - name: "catch on variable directly when no error" + mapping: | + let v = "hello" + root = $v.catch(err -> "caught") + output: "hello" + + - name: "catch on input directly when 
no error" + input: "hello" + mapping: | + root = this.catch(err -> "fallback") + output: "hello" + + # --- No-error passthrough --- + + - name: "catch returns value unchanged when no error" + mapping: | + root.result = "hello".catch(err -> "fallback") + output: {"result": "hello"} + + - name: "catch returns int unchanged when no error" + mapping: | + root.result = 42.catch(err -> 0) + output: {"result": 42} + + - name: "catch returns null unchanged when no error" + mapping: | + root.result = null.catch(err -> "fallback") + output: {"result": null} + + - name: "catch lambda not invoked when no error" + mapping: | + root.result = "ok".catch(err -> throw("should not run")) + output: {"result": "ok"} + + # --- Void passes through catch unchanged --- + # V1 does not have a distinct "void" value — an if-without-else that + # doesn't match yields null, which `.catch(...)` passes through. + + - name: "void passes through catch — assignment skipped" + mapping: | + root.x = "prior" + root.x = (if false { 1 }).catch(err -> 99) + output: {"x": "prior"} # FIXME-v1: verify — V1 if-without-else yields null; assigning null to root.x leaves the prior value or assigns null depending on mapping semantics + skip: "V1 has no 'void' sentinel — if-without-else yields null, which .catch() passes through but then the assignment writes null rather than being skipped" + + - name: "void passes through catch then subsequent method errors" + mapping: | + root.result = (if false { 1 }).catch(err -> 0).string().catch(err -> "caught void method") + output: {"result": "caught void method"} + skip: "V1 has no void; (if false { 1 }) yields null, .catch passes it through, null.string() returns 'null' rather than erroring" + + # --- Deleted passes through catch unchanged --- + + - name: "deleted passes through catch — field removed" + mapping: | + root.x = "prior" + root.x = deleted().catch(err -> "rescued") + output: {} # FIXME-v1: verify — V1 .catch passes deleted() through; assigning deleted() to 
root.x removes the field + + # --- Parentheses define catch scope --- + + - name: "catch scoped to parenthesized expression" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).string().catch(err -> "0") + output: {"result": "0"} # V1 literal `5 / 0` fails at compile time; use a runtime operand so .catch can run + + - name: "catch scoped to inner parens catches inner error only" + input: {"zero": 0} + mapping: | + root.result = ((5 / this.zero).catch(err -> 42)).string() + output: {"result": "42"} # V1 literal `5 / 0` fails at compile time; use a runtime operand + + - name: "catch on method chain catches entire chain" + mapping: | + root.result = "hello".number().string().catch(err -> -1) + output: {"result": -1} # V1 core env has no `abs` or `int64` methods (those live in impl/pure); use .number() for the erroring step + + # --- Chained catches --- + + - name: "first catch handles error, second catch not invoked" + mapping: | + root.result = throw("x").catch(err -> "first").catch(err -> "second") + output: {"result": "first"} + + - name: "first catch re-throws, second catch handles" + mapping: | + root.result = throw("x").catch(err -> throw("re-thrown")).catch(err -> "final") + output: {"result": "final"} + + - name: "chained catch with error in first handler" + input: {"zero": 0} + mapping: | + root.result = throw("x").catch(err -> 5 / this.zero).catch(err -> "safe") + output: {"result": "safe"} # V1 literal `5 / 0` fails at compile; use a runtime operand so the first handler errors at runtime + + # --- Handler returning deleted --- + + - name: "catch handler returns deleted — field removed" + mapping: | + root.x = "prior" + root.x = throw("err").catch(err -> deleted()) + output: {} + + # --- Handler returning void --- + + - name: "catch handler returns void — assignment skipped" + mapping: | + root.x = "prior" + root.x = throw("err").catch(err -> if false { "nope" }) + output: {"x": "prior"} + skip: "V1 has no void; if-without-else yields null, 
which is assigned to root.x rather than skipping" + + # --- Catch after method that errors --- + + - name: "catch after string method on non-stringable" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).string().catch(err -> "zero div") + output: {"result": "zero div"} # V1 literal `5 / 0` fails at compile; use a runtime operand diff --git a/internal/bloblang2/migrator/v1spec/tests/error_handling/not_null.yaml b/internal/bloblang2/migrator/v1spec/tests/error_handling/not_null.yaml new file mode 100644 index 000000000..7d9da0854 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/error_handling/not_null.yaml @@ -0,0 +1,146 @@ +description: ".not_null() — returns value if not null, throws error on null, optional custom message" + +tests: + # --- Passthrough for non-null values --- + + - name: "not_null passes through string" + mapping: | + root.result = "hello".not_null() + output: {"result": "hello"} + + - name: "not_null passes through integer" + mapping: | + root.result = 42.not_null() + output: {"result": 42} + + - name: "not_null passes through zero" + mapping: | + root.result = 0.not_null() + output: {"result": 0} + + - name: "not_null passes through false" + mapping: | + root.result = false.not_null() + output: {"result": false} + + - name: "not_null passes through empty string" + mapping: | + root.result = "".not_null() + output: {"result": ""} + + - name: "not_null passes through empty array" + mapping: | + root.result = [].not_null() + output: {"result": []} + + - name: "not_null passes through empty object" + mapping: | + root.result = {}.not_null() + output: {"result": {}} + + # --- Error on null with default message --- + # V1 .not_null() error message is "value is null" (not "unexpected null value") + + - name: "not_null on null produces default error" + mapping: | + root.result = null.not_null() + error: "value is null" # FIXME-v1: verify — V1 uses "value is null" (possibly prefixed with "failed assignment (line N): ") + + - 
name: "not_null on null input field produces default error" + input: {} + mapping: | + root.result = this.missing_field.not_null() + error: "value is null" # FIXME-v1: verify + + # --- Error on null with custom message --- + # V1 .not_null() takes no arguments — there is no custom message form. + + - name: "not_null with custom message on null" + mapping: | + root.result = null.not_null("name required") + skip: "V1 .not_null() does not accept a custom message argument" + + - name: "not_null with custom message on missing input field" + input: {} + mapping: | + root.result = this.email.not_null("email is required") + skip: "V1 .not_null() does not accept a custom message argument" + + - name: "not_null custom message ignored when value present" + mapping: | + root.result = "exists".not_null("should not appear") + skip: "V1 .not_null() does not accept a custom message argument" + + # --- Caught by catch --- + + - name: "not_null error caught by catch — default message" + mapping: | + root.result = null.not_null().catch(err -> "was null") + output: {"result": "was null"} + + - name: "not_null error caught by catch — custom message accessible" + mapping: | + root.result = null.not_null("name required").catch(err -> err.what) + skip: "V1 .not_null() does not accept a custom message argument; V1 err is a string, not an object with .what" + + - name: "not_null error caught by catch with fallback value" + input: {} + mapping: | + root.result = this.name.not_null().catch(err -> "anonymous") + output: {"result": "anonymous"} # Note: V1 has no custom message form; dropping the argument + + # --- Chained with or --- + + - name: "or rescues null before not_null is reached" + input: {} + mapping: | + root.result = this.name.or("default").not_null() + output: {"result": "default"} + + - name: "not_null then or — error propagates through or" + mapping: | + root.result = null.not_null().or("default") + output: {"result": "default"} # FIXME-v1: verify — V1 .or() catches errors too, 
so the not_null error is caught + + # --- In postfix chain --- + + - name: "not_null in chain with subsequent method" + mapping: | + root.result = "hello".not_null().uppercase() + output: {"result": "HELLO"} + + - name: "not_null error skips subsequent methods" + mapping: | + root.result = null.not_null().uppercase() + error: "value is null" # FIXME-v1: verify + + - name: "not_null on result of field access" + input: {"user": {"name": "alice"}} + mapping: | + root.result = this.user.name.not_null() + output: {"result": "alice"} + + # --- Used in conditional patterns --- + + - name: "not_null in match arm with null check" + mapping: | + root.result = match this.val { + null => throw("got null"), + _ => this.not_null(), + } + cases: + - name: "non-null passes through" + input: {"val": "hello"} + output: {"result": "hello"} + - name: "null triggers throw" + input: {"val": null} + error: "got null" + # Note: V1 `match x { ... }` rebinds `this` to `x` inside arm bodies — `this.val` + # would refer to a .val on the matched scalar. Use `this` to reference the + # matched value itself (equivalent to V2's `match x as v` binding). 
+ + - name: "not_null with catch as validation pattern" + input: {"name": null} + mapping: | + root.result = this.name.not_null().catch(err -> "unknown") + output: {"result": "unknown"} # Note: V1 has no custom-message form for .not_null() diff --git a/internal/bloblang2/migrator/v1spec/tests/error_handling/or.yaml b/internal/bloblang2/migrator/v1spec/tests/error_handling/or.yaml new file mode 100644 index 000000000..c9f1a5c95 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/error_handling/or.yaml @@ -0,0 +1,172 @@ +description: ".or() — rescues null, void, and deleted; does NOT catch errors; short-circuit evaluation" + +tests: + # --- Rescues null --- + + - name: "or rescues null with string default" + mapping: | + root.result = null.or("default") + output: {"result": "default"} + + - name: "or rescues null with integer default" + mapping: | + root.result = null.or(42) + output: {"result": 42} + + - name: "or rescues null from missing input field" + input: {} + mapping: | + root.result = this.name.or("anonymous") + output: {"result": "anonymous"} + + - name: "or rescues null with null (returns null)" + mapping: | + root.result = null.or(null) + output: {"result": null} + + # --- Rescues void --- + + - name: "or rescues void from if-without-else" + mapping: | + root.result = (if false { "hello" }).or("default") + output: {"result": "default"} # V1: if-without-else yields null, .or rescues null + + - name: "or rescues void from non-exhaustive match" + mapping: | + root.result = (match "bird" { "cat" => "meow" }).or("unknown") + output: {"result": "unknown"} # V1: non-matching match yields null, .or rescues + + # --- Rescues deleted --- + + - name: "or rescues deleted with string" + mapping: | + root.result = deleted().or("fallback") + output: {"result": "fallback"} + + - name: "or rescues deleted with integer" + mapping: | + root.result = deleted().or(0) + output: {"result": 0} + + # --- Directly on input/output/variable (no intermediate field access) 
--- + + - name: "or on input directly" + mapping: | + root = this.or("fallback") + cases: + - name: "null input returns fallback" + input: null + output: "fallback" + - name: "present input returned unchanged" + input: "hello" + output: "hello" + + - name: "or on variable directly" + mapping: | + let v = null + root = $v.or("default") + output: "default" + + - name: "or on variable directly when present" + mapping: | + let v = "value" + root = $v.or("default") + output: "value" + + # --- Non-null/void/deleted: returns value unchanged --- + + - name: "or returns string unchanged" + mapping: | + root.result = "hello".or("default") + output: {"result": "hello"} + + - name: "or returns zero unchanged (not null)" + mapping: | + root.result = 0.or(42) + output: {"result": 0} + + - name: "or returns false unchanged (not null)" + mapping: | + root.result = false.or(true) + output: {"result": false} + + - name: "or returns empty string unchanged" + mapping: | + root.result = "".or("default") + output: {"result": ""} + + - name: "or returns empty array unchanged" + mapping: | + root.result = [].or([1, 2, 3]) + output: {"result": []} + + # --- Short-circuit: argument not evaluated when value present --- + + - name: "or short-circuits on non-null (throw not evaluated)" + mapping: | + root.result = "hello".or(throw("should not run")) + output: {"result": "hello"} + + - name: "or short-circuits on zero (throw not evaluated)" + mapping: | + root.result = 0.or(throw("should not run")) + output: {"result": 0} + + - name: "or short-circuits on false (throw not evaluated)" + mapping: | + root.result = false.or(throw("should not run")) + output: {"result": false} + + # --- Does NOT catch errors --- + # V2 .or() does NOT catch errors. V1 .or() DOES catch errors + # (see migrator/bloblang_v1_spec.md §12.2 — V1 .or() triggers on error OR null). + # These tests are preserved with adjusted expectations to reflect V1 semantics. 
+ + - name: "or does not catch division by zero" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).or("default") + output: {"result": "default"} # V1 literal `5 / 0` fails at compile; use runtime operand. V1 .or() catches errors too, so fallback is used + + - name: "or does not catch throw" + mapping: | + root.result = throw("boom").or("default") + output: {"result": "default"} # FIXME-v1: verify — V1 .or() catches throw errors + + - name: "or does not catch type mismatch" + input: {"s": "hello"} + mapping: | + root.result = (5 + this.s).or(0) + output: {"result": 0} # V1 literal `5 + "hello"` fails at compile; use runtime operand. V1 .or() catches the type mismatch error + + - name: "or does not catch out of bounds" + mapping: | + let arr = [1, 2] + root.result = $arr.index(5).or(0) + output: {"result": 0} # FIXME-v1: verify — V1 .or() catches the out-of-bounds error + + # --- Composing .or() and .catch() --- + + - name: "catch then or — catch handles error, or not needed" + mapping: | + root.result = throw("x").catch(err -> "caught").or("default") + output: {"result": "caught"} + + - name: "catch then or — catch returns null, or rescues" + mapping: | + root.result = throw("x").catch(err -> null).or("rescued") + output: {"result": "rescued"} + + - name: "or then catch — or does not catch error, catch does" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).or("ignored").catch(err -> "caught") + output: {"result": "ignored"} # V1 literal `5 / 0` fails at compile; use runtime operand. 
V1 .or() already catches the error, so the .catch is never needed and result is "ignored" + + # --- or can itself return deleted --- + + - name: "or returns deleted — field removed" + mapping: | + root.x = "prior" + root.x = null.or(deleted()) + output: {} diff --git a/internal/bloblang2/migrator/v1spec/tests/error_handling/or_catch_composition.yaml b/internal/bloblang2/migrator/v1spec/tests/error_handling/or_catch_composition.yaml new file mode 100644 index 000000000..5645c402b --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/error_handling/or_catch_composition.yaml @@ -0,0 +1,129 @@ +description: > + Composing .or() and .catch() — order matters, short-circuit semantics, + null-safe operator interactions, and complex rescue chains. + +tests: + # --- .catch() first, .or() second --- + + - name: "catch handles error, or not needed" + mapping: | + root.v = throw("x").catch(err -> "caught").or("default") + output: {"v": "caught"} + + - name: "catch returns null, or rescues null" + mapping: | + root.v = throw("x").catch(err -> null).or("default") + output: {"v": "default"} + + - name: "catch returns value, or passes through" + mapping: | + root.v = throw("x").catch(err -> 42).or(0) + output: {"v": 42} + + # --- .or() first, .catch() second --- + # Note: V1 .or() catches errors too, so the .catch() below never fires. + + - name: "or does not handle error, catch does" + input: {"zero": 0} + mapping: | + root.v = (5 / this.zero).or("ignored").catch(err -> "caught") + output: {"v": "ignored"} # V1 literal `5 / 0` fails at compile; use runtime operand. 
V1 .or() catches the divide-by-zero, so value is "ignored" + + - name: "or rescues null, catch not needed" + mapping: | + root.v = null.or("default").catch(err -> "error") + output: {"v": "default"} + + # --- .or() default can itself error, caught by .catch() --- + + - name: "or default errors, caught by subsequent catch" + input: {"zero": 0} + mapping: | + root.v = null.or(5 / this.zero).catch(err -> "safe") + output: {"v": "safe"} # V1 literal `5 / 0` fails at compile; use runtime operand + + # --- Chained rescue patterns --- + # V1 has no null-safe `?.` operator (§6.2 of the V1 spec). Path access into + # null naturally yields null in V1, so `.or(...)` alone is the idiom. + + - name: "null-safe then or for nested field" + mapping: | + root.v = this.user.name.or("anonymous") + cases: + - name: "null user returns fallback" + input: {"user": null} + output: {"v": "anonymous"} # FIXME-v1: verify — V1 path access into null yields null; .or rescues + - name: "present user returns name" + input: {"user": {"name": "Alice"}} + output: {"v": "Alice"} + + - name: "null-safe chain to null then catch for method error" + mapping: | + let v = null + root.v = $v.trim().or("empty") + output: {"v": "empty"} # FIXME-v1: verify — V1 .trim() on null errors; V1 .or() catches errors (unlike V2) so fallback is used + + # --- Directly on input/variable with chaining --- + + - name: "or on input then method chain" + input: null + mapping: | + root.v = this.or("hello").uppercase() + output: {"v": "HELLO"} + + - name: "catch on input then or" + input: null + mapping: | + root.v = this.not_null().catch(err -> null).or("rescued") + output: {"v": "rescued"} + + - name: "or on variable then catch" + input: {"zero": 0} + mapping: | + let v = null + root.v = $v.or(5 / this.zero).catch(err -> "safe") + output: {"v": "safe"} # V1 literal `5 / 0` fails at compile; use runtime operand + + # --- .or() with deleted --- + + - name: "or rescues deleted from conditional" + mapping: | + root.v = (if false 
{ deleted() } else { deleted() }).or("rescued") + output: {"v": "rescued"} + + # --- Error in or argument is only evaluated when needed --- + + - name: "or short-circuits — error in argument not evaluated" + mapping: | + root.v = "present".or(throw("boom")) + output: {"v": "present"} + + - name: "or evaluates argument on null — error propagates" + mapping: | + root.v = null.or(throw("boom")) + error: "boom" + + # --- Triple chain: or, catch, or --- + + - name: "triple rescue chain" + input: {"zero": 0} + mapping: | + root.v = null.or(5 / this.zero).catch(err -> null).or("final") + output: {"v": "final"} # V1 literal `5 / 0` fails at compile; use runtime operand + + # --- not_null with catch --- + + - name: "not_null error caught by catch" + mapping: | + root.v = null.not_null().catch(err -> "was null") + output: {"v": "was null"} + + - name: "not_null passes through non-null" + mapping: | + root.v = "hello".not_null() + output: {"v": "hello"} + + - name: "not_null error message contains context" + mapping: | + root.v = null.not_null().catch(err -> err) + output: {"v": "null literal: value is null"} # V1 err is a string (no .what); message is prefixed with the source expression context diff --git a/internal/bloblang2/migrator/v1spec/tests/error_handling/propagation.yaml b/internal/bloblang2/migrator/v1spec/tests/error_handling/propagation.yaml new file mode 100644 index 000000000..de1b12596 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/error_handling/propagation.yaml @@ -0,0 +1,162 @@ +description: "Error propagation through expressions, postfix chains, and multiple error sources" + +tests: + # --- Errors propagate through arithmetic --- + + - name: "error in left operand propagates through addition" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero) + 1 + error: "attempted to divide by zero" # V1 literal 5/0 fails at compile; use runtime operand + + - name: "error in right operand propagates through addition" + input: {"zero": 0} + 
mapping: | + root.result = 1 + (5 / this.zero) + error: "attempted to divide by zero" + + - name: "error in left operand propagates through multiplication" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero) * 3 + error: "attempted to divide by zero" + + - name: "error propagates through nested arithmetic" + input: {"zero": 0} + mapping: | + root.result = (1 + (5 / this.zero)) * 2 + error: "attempted to divide by zero" + + # --- Errors propagate through comparison and logical operators --- + + - name: "error propagates through equality check" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero) == 0 + error: "attempted to divide by zero" + + - name: "error propagates through comparison" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero) > 0 + error: "attempted to divide by zero" + + - name: "error propagates through logical and" + input: {"zero": 0} + mapping: | + root.result = true && ((5 / this.zero) > 0) + error: "attempted to divide by zero" + + - name: "error propagates through negation" + mapping: | + root.result = !throw("bad") + error: "bad" + + # --- Postfix chain: subsequent operations skipped --- + + - name: "error in method receiver skips subsequent method" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).string() + error: "attempted to divide by zero" + + - name: "error skips multiple chained methods" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero).string().uppercase().length() + error: "attempted to divide by zero" + + - name: "error from method skips subsequent methods" + mapping: | + root.result = "hello".number().string() + error: "strconv.ParseFloat" # V1 core env has no abs/int64 methods; use .number() which errors on non-numeric string + + - name: "error in field access skips subsequent field" + input: {} + mapping: | + root.result = throw("no data").name.length() + error: "no data" + + - name: "error in index access skips subsequent operations" + mapping: | + let arr = 
[1, 2] + root.result = $arr.index(5).string() + error: "out of bounds" + + # --- Error propagation through string concatenation --- + + - name: "error in string concatenation propagates" + input: {"zero": 0} + mapping: | + root.result = "value is: " + (5 / this.zero).string() + error: "attempted to divide by zero" + + # --- Error propagates into variable assignment --- + + - name: "error assigned to variable propagates on use" + input: {"zero": 0} + mapping: | + let x = 5 / this.zero + root.result = $x + error: "attempted to divide by zero" + + # --- Error propagates through collection literal --- + + - name: "error in array element propagates" + input: {"zero": 0} + mapping: | + root.result = [1, 5 / this.zero, 3] + error: "attempted to divide by zero" + + - name: "error in object value propagates" + input: {"zero": 0} + mapping: | + root.result = {"a": 5 / this.zero} + error: "attempted to divide by zero" + + # --- Error propagates through if condition --- + + - name: "error in if condition propagates" + input: {"zero": 0} + mapping: | + root.result = if (5 / this.zero) > 0 { "positive" } else { "negative" } + error: "attempted to divide by zero" + + # --- Error propagates through match subject --- + + - name: "error in match subject propagates" + input: {"zero": 0} + mapping: | + root.result = match (5 / this.zero) { + 0 => "zero", + _ => "other", + } + error: "attempted to divide by zero" + + # --- Error propagates through map arguments --- + # V1 named-map invocation is via `.apply("name")`, not function-style `double(x)`; + # the equivalent pattern is `(5 / this.zero).apply("double")` which propagates errors the + # same way. 
+ + - name: "error in map argument propagates" + input: {"zero": 0} + mapping: | + map double { root = this * 2 } + root.result = (5 / this.zero).apply("double") + error: "attempted to divide by zero" # V1 map bodies assign to root; applied with .apply("name") + + # --- Type mismatch errors propagate --- + + - name: "type mismatch error propagates through chain" + input: {"s": "hello"} + mapping: | + root.result = (5 + this.s).string() + error: "cannot add" # V1 literal `5 + "hello"` fails at compile; use runtime operand so the type mismatch errors at runtime + + # --- Multiple error sources: first error wins --- + + - name: "left operand error reported when both sides error" + input: {"zero": 0} + mapping: | + root.result = (5 / this.zero) + (3 / this.zero) + error: "attempted to divide by zero" diff --git a/internal/bloblang2/migrator/v1spec/tests/error_handling/throw.yaml b/internal/bloblang2/migrator/v1spec/tests/error_handling/throw.yaml new file mode 100644 index 000000000..67d20d040 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/error_handling/throw.yaml @@ -0,0 +1,135 @@ +description: "throw() — produces catchable errors, compile errors for bad args, conditional throw" + +tests: + # --- Basic throw --- + + - name: "throw with string message produces error" + mapping: | + root.result = throw("something went wrong") + error: "something went wrong" + + - name: "throw with empty string" + mapping: | + root.result = throw("") + error: "" # FIXME-v1: verify — V1 error may include line/target prefix like "failed assignment (line 1): " + + - name: "throw error propagates to output" + mapping: | + root.result = throw("halt") + error: "halt" + + # --- Caught by catch --- + + - name: "throw caught by catch returns fallback" + mapping: | + root.result = throw("x").catch(err -> "fallback") + output: {"result": "fallback"} + + - name: "throw caught by catch — error message accessible" + mapping: | + root.result = throw("details here").catch(err -> err) + 
output: {"result": "details here"} # FIXME-v1: verify — V1 err is the error string directly (no .what), and may include "failed assignment" prefix + + - name: "throw in chain caught by catch at end" + mapping: | + root.result = throw("fail").string().catch(err -> "ok") + output: {"result": "ok"} + + # --- Uncaught throw halts mapping --- + + - name: "uncaught throw halts mapping — subsequent assignment not reached" + mapping: | + root.a = throw("halt") + root.b = "should not appear" + error: "halt" + + # --- Compile errors for bad arguments --- + + - name: "throw with zero args is compile error" + mapping: | + root.result = throw() + compile_error: "missing parameter: why" # V1 names the throw arg `why`; errors reference that name, not the function name + + - name: "throw with integer literal is compile error" + mapping: | + root.result = throw(42) + compile_error: "field why: wrong argument type, expected string, got number" # V1 ParamString rejects non-string literal + + - name: "throw with boolean literal is compile error" + mapping: | + root.result = throw(true) + compile_error: "field why: wrong argument type, expected string, got bool" + + - name: "throw with null literal is compile error" + mapping: | + root.result = throw(null) + compile_error: "field why: wrong argument type, expected string, got null" + + - name: "throw with two arguments is compile error" + mapping: | + root.result = throw("a", "b") + compile_error: "wrong number of arguments, expected 1, got 2" + + # --- Non-string dynamic expression is runtime error --- + + - name: "throw with dynamic int expression is runtime error" + mapping: | + let x = 42 + root.result = throw($x) + error: "throw" # FIXME-v1: verify — V1 may reject at parse time if ParamString refuses non-string query + + - name: "throw with dynamic bool expression is runtime error" + mapping: | + let x = true + root.result = throw($x) + error: "throw" # FIXME-v1: verify + + # --- Dynamic string expression works --- + + - name: 
"throw with dynamic string expression" + mapping: | + let msg = "dynamic error" + root.result = throw($msg) + error: "dynamic error" + + - name: "throw with dynamic string concatenation" + mapping: | + let code = 404 + root.result = throw("error code: " + $code.string()) + error: "error code: 404" + + # --- Conditional throw --- + + - name: "conditional throw in if — condition true" + mapping: | + let x = -1 + root.result = if $x < 0 { throw("negative value") } else { $x } + error: "negative value" + + - name: "conditional throw in if — condition false" + mapping: | + let x = 5 + root.result = if $x < 0 { throw("negative value") } else { $x } + output: {"result": 5} + + - name: "conditional throw in statement assignment" + mapping: | + let valid = false + if !$valid { root.err = throw("invalid") } + error: "invalid" + + - name: "conditional throw in match arm" + mapping: | + let status = "error" + root.result = match $status { + "ok" => "success", + "error" => throw("status was error"), + _ => "unknown", + } + error: "status was error" + + - name: "conditional throw caught by catch" + mapping: | + let x = -1 + root.result = (if $x < 0 { throw("negative") } else { $x }).catch(err -> 0) + output: {"result": 0} diff --git a/internal/bloblang2/migrator/v1spec/tests/imports/basic_import.yaml b/internal/bloblang2/migrator/v1spec/tests/imports/basic_import.yaml new file mode 100644 index 000000000..56cb1e213 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/imports/basic_import.yaml @@ -0,0 +1,144 @@ +description: "Imports: basic file imports, calling imported maps, passing values via the receiver" + +# V1 imports bring map definitions into a flat global namespace. There is no `as ` +# form and no `ns::map` qualified reference — the importing file sees the imported maps +# by their bare name via `.apply("name")`. Multi-argument maps are expressed by passing +# an object receiver and reading fields off `this` inside the map body. 
+ +files: + "helpers.blobl": | + map double { + root = this * 2 + } + map greet { + root = "hello " + this + } + map add { + root = this.a + this.b + } + map constant { + root = 42 + } + +tests: + # --- Basic imported-map calls --- + + - name: "import and call zero-param map" + mapping: | + import "helpers.blobl" + root.v = {}.apply("constant") + output: {"v": 42} + + - name: "import and call single-param map" + mapping: | + import "helpers.blobl" + root.v = 21.apply("double") + output: {"v": 42} + + - name: "import and call two-param map" + mapping: | + import "helpers.blobl" + root.v = {"a": 3, "b": 7}.apply("add") + output: {"v": 10} + + - name: "import and call map with string arg" + mapping: | + import "helpers.blobl" + root.v = "world".apply("greet") + output: {"v": "hello world"} + + - name: "call imported map multiple times" + mapping: | + import "helpers.blobl" + root.a = 5.apply("double") + root.b = 10.apply("double") + output: {"a": 10, "b": 20} + + - name: "call multiple imported maps" + mapping: | + import "helpers.blobl" + root.sum = {"a": 3.apply("double"), "b": {}.apply("constant")}.apply("add") + output: {"sum": 48} + + # --- Import with local maps --- + + - name: "imported maps coexist with local maps" + mapping: | + import "helpers.blobl" + map triple { root = this * 3 } + root.v = 5.apply("double") + 2.apply("triple") + output: {"v": 16} + + - name: "local map can call imported map" + mapping: | + import "helpers.blobl" + map quad { root = this.apply("double").apply("double") } + root.v = 3.apply("quad") + output: {"v": 12} + + # --- Import with input data --- + + - name: "imported map processes input data" + mapping: | + import "helpers.blobl" + root.v = this.x.apply("double") + input: {"x": 7} + output: {"v": 14} + + # --- Error cases --- + + - name: "calling non-existent map in namespace is error" + mapping: | + import "helpers.blobl" + root.v = 1.apply("nonexistent") + error: "map nonexistent was not found" # V1 resolves .apply() targets at 
runtime, not compile time + + - name: "file not found is error" + mapping: | + import "missing.blobl" + root.v = 1.apply("foo") + compile_error: "missing" + + - name: "statements in imported file are compile error" + skip: "V1 silently accepts top-level statements in imported files; `let x = 42` at file top level is imported but does not establish the binding for map bodies that reference $x at runtime, producing a runtime 'variable undefined' error rather than a compile error" + + - name: "calling imported map without namespace is compile error" + skip: "V1 has no namespaces — imported maps are called by bare name via .apply(\"name\"); this test does not translate" + + # --- Qualified map references in higher-order methods --- + # V1 maps are not first-class values — you cannot pass `double` as an argument to .map(). + # The idiom is to wrap the call in a lambda that invokes .apply(). + + - name: "qualified map reference in .map()" + mapping: | + import "helpers.blobl" + root.v = [1, 2, 3].map_each(n -> n.apply("double")) + output: {"v": [2, 4, 6]} + + - name: "qualified map reference in .filter()" + files: + "predicates.blobl": | + map is_positive { root = this > 0 } + mapping: | + import "predicates.blobl" + root.v = [-1, 2, -3, 4].filter(n -> n.apply("is_positive")) + output: {"v": [2, 4]} + + - name: "qualified map reference in .sort_by()" + files: + "keys.blobl": | + map get_name { root = this.name } + mapping: | + import "keys.blobl" + let items = [{"name": "Charlie"}, {"name": "Alice"}, {"name": "Bob"}] + root.v = $items.sort_by(item -> item.apply("get_name")).map_each(x -> x.name) + output: {"v": ["Alice", "Bob", "Charlie"]} + + - name: "qualified reference to non-existent namespace is compile error" + skip: "V1 has no namespaces — there is no ns::map syntax to fail to resolve" + + - name: "qualified reference to non-existent map is compile error" + mapping: | + import "helpers.blobl" + root.v = [1, 2].map_each(n -> n.apply("nonexistent")) + error: "map 
nonexistent was not found" # V1 resolves .apply() targets at runtime, not compile time diff --git a/internal/bloblang2/migrator/v1spec/tests/imports/circular_import.yaml b/internal/bloblang2/migrator/v1spec/tests/imports/circular_import.yaml new file mode 100644 index 000000000..f6a556adc --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/imports/circular_import.yaml @@ -0,0 +1,54 @@ +description: "Imports: circular import detection — compile-time error" + +# V1 does not have a dedicated "circular import" detector. The importer recursively +# parses each imported file (`importParser` in `internal/bloblang/parser/mapping_parser.go`) +# and reports map-name collisions as parse errors. A genuine cycle (A imports B imports A) +# would either recurse until an OS/Go stack overflow, or surface as a duplicate-name +# collision, depending on the order in which maps are registered. +# +# The V2 semantic of a clean, named "circular import" error is not reproducible in V1 in +# a portable way, so most tests here are skipped. Where the V2 test happens to correspond +# to a V1 duplicate-map-name situation we translate it; otherwise we skip. + +files: + "leaf.blobl": | + map leaf_fn { root = this * 2 } + + "mid.blobl": | + import "leaf.blobl" + map mid_fn { root = this.apply("leaf_fn") + 1 } + +tests: + # --- Direct circular import (A -> B -> A) --- + + - name: "direct circular import is compile error" + skip: "V1 has no circular-import detector; the importer recurses until stack overflow or a duplicate-map-name collision. Behaviour is not a stable 'circular' error." + + - name: "circular import detected from other entry point" + skip: "V1 has no circular-import detector (see previous skip)." + + # --- Transitive circular import (A -> B -> C -> A) --- + + - name: "transitive circular import is compile error" + skip: "V1 has no circular-import detector; transitive cycles produce undefined behaviour." 
+ + - name: "transitive circular from middle entry point" + skip: "V1 has no circular-import detector (see previous skip)." + + # --- Self-import --- + + - name: "self-import is compile error" + skip: "V1 has no circular-import detector; a file that imports itself recurses until overflow or name collision." + + # --- Main file importing itself --- + + - name: "main file importing file that imports back is circular" + skip: "V1 has no circular-import detector (see previous skips)." + + # --- Non-circular is fine (control test) --- + + - name: "non-circular chain compiles successfully" + mapping: | + import "mid.blobl" + root.v = 5.apply("mid_fn") + output: {"v": 11} diff --git a/internal/bloblang2/migrator/v1spec/tests/imports/duplicate_namespace.yaml b/internal/bloblang2/migrator/v1spec/tests/imports/duplicate_namespace.yaml new file mode 100644 index 000000000..b2001b821 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/imports/duplicate_namespace.yaml @@ -0,0 +1,89 @@ +description: "Imports: duplicate map-name collisions across imports — compile-time error" + +# V1 has no namespace / `as ` mechanism. Imported maps are pulled into a single +# flat map-name table; a name collision is a parse error: +# "map name collisions from import '': [name1, name2, ...]" +# (internal/bloblang/parser/mapping_parser.go:241). The V2 notion of "duplicate namespace" +# does not apply. Where the original V2 test describes a pure namespace-aliasing concern +# that V1 cannot express, we skip it. Where the test's intent matches a V1 map-name +# collision, we translate it. + +files: + "helpers.blobl": | + map double { root = this * 2 } + + "utils.blobl": | + map triple { root = this * 3 } + + "more_helpers.blobl": | + map quad { root = this * 4 } + + "helpers_conflict.blobl": | + # Redefines `double` — importing both this and helpers.blobl triggers a collision. 
+ map double { root = this * 20 } + + "dup_self.blobl": | + map dup_self_fn { root = this } + +tests: + # --- Same namespace name for different files --- + + - name: "duplicate namespace from two different files is compile error" + skip: "V1 has no namespace aliases (no `as `); there is no analogue to assigning two files the same alias." + + - name: "duplicate namespace with three imports" + skip: "V1 has no namespace aliases (see previous skip)." + + # --- Same file imported twice with same namespace --- + # In V1 there is no namespace; importing the same file twice re-registers the maps and + # triggers the map-name-collision error, which is the closest analogue. + + - name: "same file imported twice with same namespace is compile error" + mapping: | + import "helpers.blobl" + import "helpers.blobl" + root.v = 5.apply("double") + compile_error: "collision" # FIXME-v1: verify — V1 reports "map name collisions from import" on the second import + + # --- Same file imported twice with different namespaces is ok --- + + - name: "same file imported with different namespaces is valid" + skip: "V1 has no namespace aliases; importing the same file twice is always a map-name collision, never a valid 'different namespace' situation." + + # --- Different files with distinct namespaces is ok --- + # In V1 the distinctness comes from map names being distinct, not from aliases. + + - name: "different files with distinct namespaces is valid" + mapping: | + import "helpers.blobl" + import "utils.blobl" + root.a = 5.apply("double") + root.b = 5.apply("triple") + output: {"a": 10, "b": 15} + + # --- Duplicate namespace in nested import --- + # V1 equivalent: an imported file pulls in two other files that share a map name. 
+ + - name: "duplicate namespace within imported file is compile error" + files: + "bad_imports.blobl": | + import "helpers.blobl" + import "helpers_conflict.blobl" + map wrapper { root = this.apply("double") } + mapping: | + import "bad_imports.blobl" + root.v = 5.apply("wrapper") + compile_error: "collision" # FIXME-v1: verify — map-name collision surfaces when bad_imports.blobl itself is parsed + + # --- Namespace shadows local map name (still valid — different resolution) --- + # V1 has no namespaces — the imported map is called by the same bare name as any local + # map, so a "namespace name matching a local map name" situation doesn't exist. We can + # however test that an imported map and a differently-named local map coexist. + + - name: "namespace name can differ from local map names" + mapping: | + import "helpers.blobl" + map local_double { root = this * 2 } + root.a = 5.apply("double") + root.b = 5.apply("local_double") + output: {"a": 10, "b": 10} diff --git a/internal/bloblang2/migrator/v1spec/tests/imports/nested_import.yaml b/internal/bloblang2/migrator/v1spec/tests/imports/nested_import.yaml new file mode 100644 index 000000000..b051e9284 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/imports/nested_import.yaml @@ -0,0 +1,106 @@ +description: "Imports: nested import chains (A imports B, B imports C)" + +# V1 imports are flat — every transitively-imported map is pulled into the importing +# file's global map table, so there is no separate "transitive namespace". Multi-argument +# V2 maps (`map f(a, b)`) are encoded as `map f { root = this.a + this.b }` with an object +# receiver at the call site. 
+ +files: + "math_core.blobl": | + map square { root = this * this } + map inc { root = this + 1 } + + "math_utils.blobl": | + import "math_core.blobl" + map square_plus_one { root = this.apply("square").apply("inc") } + map double_square { root = this.apply("square") * 2 } + + "app_helpers.blobl": | + import "math_utils.blobl" + map transform { root = this.apply("square_plus_one") + this.apply("inc") } + +tests: + # --- Two-level chain --- + + - name: "import file that imports another file" + mapping: | + import "math_utils.blobl" + root.v = 5.apply("square_plus_one") + output: {"v": 26} + + - name: "nested import calls inner map through wrapper" + mapping: | + import "math_utils.blobl" + root.v = 3.apply("double_square") + output: {"v": 18} + + # --- Three-level chain --- + + - name: "three-level import chain" + mapping: | + import "app_helpers.blobl" + root.v = 4.apply("transform") + output: {"v": 22} + + # --- Diamond import (A imports B and C, B imports C) --- + # V1 does not treat diamond imports as an error in themselves, but the transitive + # `math_core` maps (`square`, `inc`) from `math_utils` clash with the explicit + # `math_core` import at the top. Expect a map-name collision. + + - name: "diamond import — same file imported by two paths" + skip: "V1 imports share a flat map-name table; importing math_core explicitly after math_utils (which already pulls it in) raises a map-name collision — the V2 diamond scenario does not translate cleanly." + + # --- Nested import with multiple maps --- + + - name: "call multiple maps from nested import" + mapping: | + import "math_utils.blobl" + root.a = 2.apply("square_plus_one") + root.b = 2.apply("double_square") + output: {"a": 5, "b": 8} + + # --- Cannot access transitive namespace --- + # V1's flat import table is the opposite of V2 here: transitively-imported maps ARE + # accessible from the top file by their bare name. This test's intent does not + # translate. 
+ + - name: "cannot access transitively imported namespace" + skip: "V1 imports are flat — transitively-imported maps are visible at the top level by bare name. The V2 restriction does not exist." + + # --- Nested import with local maps --- + + - name: "local map wraps nested-imported map" + mapping: | + import "math_utils.blobl" + map process { root = this.apply("square_plus_one") * 2 } + root.v = 3.apply("process") + output: {"v": 20} + + # --- Deep chain with input --- + + - name: "nested import processes input data" + mapping: | + import "math_utils.blobl" + root.v = this.n.apply("square_plus_one") + input: {"n": 6} + output: {"v": 37} + + # --- Error: non-existent map in nested import --- + + - name: "non-existent map in nested namespace is compile error" + mapping: | + import "math_utils.blobl" + root.v = 1.apply("nonexistent") + error: "map nonexistent was not found" # V1 resolves .apply() targets at runtime, not compile time + + # --- Error: nested file not found --- + + - name: "nested import with missing file is error" + files: + "bad_chain.blobl": | + import "nonexistent.blobl" + map foo { root = this.apply("bar") } + mapping: | + import "bad_chain.blobl" + root.v = 1.apply("foo") + compile_error: "nonexistent" diff --git a/internal/bloblang2/migrator/v1spec/tests/input_output/conditional_deletion.yaml b/internal/bloblang2/migrator/v1spec/tests/input_output/conditional_deletion.yaml new file mode 100644 index 000000000..9f7c9346a --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/input_output/conditional_deletion.yaml @@ -0,0 +1,172 @@ +description: > + Conditional deletion — using if/match to conditionally delete fields, + array elements, and metadata keys. Also tests deletion in iterator + lambdas and the interaction between deleted() and void in assignments. 
+ +tests: + # --- Conditional field deletion via if --- + + - name: "if-true deletes field, if-false preserves via void" + mapping: | + root.a = "keep" + root.b = "drop" + root.b = if true { deleted() } + output: {"a": "keep"} + + - name: "if-false produces void — field preserved" + mapping: | + root.a = "keep" + root.a = if false { deleted() } + output: {"a": "keep"} + + - name: "if-else conditional field deletion" + mapping: | + root.a = 1 + root.b = 2 + root.b = if this.remove_b { deleted() } else { root.b } + cases: + - name: "deletes field when true" + input: {"remove_b": true} + output: {"a": 1} + - name: "keeps field when false" + input: {"remove_b": false} + output: {"a": 1, "b": 2} + + # --- Conditional field deletion via match --- + + - name: "match deletes field on matching case" + mapping: | + root.a = "keep" + root.b = "conditional" + root.b = match "remove" { + "remove" => deleted(), + _ => root.b, + } + output: {"a": "keep"} + + - name: "match preserves field on non-matching case" + mapping: | + root.a = "keep" + root.b = "conditional" + root.b = match "keep" { + "remove" => deleted(), + _ => root.b, + } + output: {"a": "keep", "b": "conditional"} + + - name: "match without wildcard — void preserves field" + # FIXME-v1: verify — V1 match with no matching case yields null (not void), + # so field may be overwritten to null rather than preserved. 
+ mapping: | + root.status = "active" + root.status = match "nope" { + "remove" => deleted(), + } + output: {"status": "active"} + + # --- Conditional deletion in array literals --- + + - name: "if-expression conditionally omits array element" + mapping: | + let include = false + root.v = [1, if $include { 2 } else { deleted() }, 3] + output: {"v": [1, 3]} + + - name: "if-expression includes array element when true" + mapping: | + let include = true + root.v = [1, if $include { 2 } else { deleted() }, 3] + output: {"v": [1, 2, 3]} + + - name: "match conditionally omits array element" + mapping: | + let mode = "sparse" + root.v = [ + 1, + match $mode { "full" => 2, _ => deleted() }, + 3, + ] + output: {"v": [1, 3]} + + # --- Conditional deletion in object literals --- + + - name: "if-expression conditionally omits object field" + mapping: | + let include_debug = false + root.v = { + "name": "Alice", + "debug": if $include_debug { "trace-123" } else { deleted() }, + } + output: {"v": {"name": "Alice"}} + + - name: "if-expression includes object field when true" + mapping: | + let include_debug = true + root.v = { + "name": "Alice", + "debug": if $include_debug { "trace-123" } else { deleted() }, + } + output: {"v": {"name": "Alice", "debug": "trace-123"}} + + # --- Deletion in .map() iterator --- + + - name: "map lambda returning deleted omits element" + # V2 .map becomes V1 .map_each + mapping: | + root.v = [1, 2, 3, 4, 5].map_each(x -> if x % 2 == 0 { deleted() } else { x }) + output: {"v": [1, 3, 5]} + + - name: "map lambda deleting all elements produces empty array" + mapping: | + root.v = [1, 2, 3].map_each(x -> deleted()) + output: {"v": []} + + - name: "map lambda conditionally transforms or deletes" + mapping: | + root.v = [10, -5, 20, -3, 15].map_each(x -> if x > 0 { x * 2 } else { deleted() }) + output: {"v": [20, 40, 30]} + + # --- Array element deletion with negative indices --- + + - name: "delete last array element with negative index" + skip: "V1 has 
no bracket indexing on paths and no variable reassignment; cannot express $arr[-1] = deleted()" + + - name: "delete second-to-last array element" + skip: "V1 has no bracket indexing on paths and no variable reassignment; cannot express $arr[-2] = deleted()" + + - name: "delete first element via negative index" + skip: "V1 has no bracket indexing on paths and no variable reassignment; cannot express $arr[-3] = deleted()" + + # --- Deletion of nested variable fields --- + + - name: "delete deeply nested variable field" + skip: "V1 has no variable reassignment (let redefines whole var); cannot express $data.a.b.c = deleted()" + + - name: "delete all fields of nested object" + skip: "V1 has no variable reassignment; cannot express piecewise $data.inner.x = deleted()" + + # --- Conditional metadata deletion --- + + - name: "conditionally delete metadata key" + input: {} + input_metadata: {"source": "kafka", "debug": "true"} + mapping: | + meta = @ + let remove_debug = true + if $remove_debug { + meta debug = deleted() + } + output: {} + output_metadata: {"source": "kafka"} + + - name: "conditionally keep metadata key" + input: {} + input_metadata: {"source": "kafka", "debug": "true"} + mapping: | + meta = @ + let remove_debug = false + if $remove_debug { + meta debug = deleted() + } + output: {} + output_metadata: {"source": "kafka", "debug": "true"} diff --git a/internal/bloblang2/migrator/v1spec/tests/input_output/deletion.yaml b/internal/bloblang2/migrator/v1spec/tests/input_output/deletion.yaml new file mode 100644 index 000000000..6dd258b43 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/input_output/deletion.yaml @@ -0,0 +1,209 @@ +description: "Deletion: root = deleted() drops message, field deletion, array element deletion, deleted in literals, nested deletion, operations on deleted error" + +tests: + # --- root = deleted() drops entire message --- + + - name: "output = deleted() drops message" + mapping: | + root = deleted() + deleted: true + + - 
name: "output = deleted() discards prior assignments" + mapping: | + root.name = "Alice" + root.age = 30 + root = deleted() + deleted: true + + - name: "output = deleted() stops execution immediately" + # V1 does NOT short-circuit after `root = deleted()`. The subsequent + # statement attempts to set a field on the delete sentinel and produces + # a runtime error. + mapping: | + root = deleted() + root.should_not_exist = "never reached" + error: "non-object type" + + - name: "output = deleted() after root assignment still drops" + mapping: | + root = {"complex": "structure"} + root = deleted() + deleted: true + + # --- Field deletion --- + + - name: "delete output field" + mapping: | + root.a = 1 + root.b = 2 + root.c = 3 + root.b = deleted() + output: {"a": 1, "c": 3} + + - name: "delete nested output field" + mapping: | + root.user.name = "Alice" + root.user.age = 30 + root.user.age = deleted() + output: {"user": {"name": "Alice"}} + + - name: "delete non-existent output field is no-op" + mapping: | + root.a = 1 + root.missing = deleted() + output: {"a": 1} + + - name: "delete deeply nested field" + mapping: | + root.a.b.c = "deep" + root.a.b.d = "also deep" + root.a.b.c = deleted() + output: {"a": {"b": {"d": "also deep"}}} + + # --- Array element deletion --- + + - name: "delete array element shifts remaining" + # V1 path segments use `.N` not `[N]` + mapping: | + root.items = [10, 20, 30, 40] + root.items.1 = deleted() + output: {"items": [10, 30, 40]} + + - name: "delete first array element" + mapping: | + root.items = [10, 20, 30] + root.items.0 = deleted() + output: {"items": [20, 30]} + + - name: "delete last array element" + mapping: | + root.items = [10, 20, 30] + root.items.2 = deleted() + output: {"items": [10, 20]} + + # --- Variable deletion errors and behavior --- + + - name: "assign deleted() to variable is runtime error" + # In V1, `let x = deleted()` explicitly deletes the variable (spec §7.2); + # it is not a runtime error. 
The expectation flips. + mapping: | + let var = deleted() + # FIXME-v1: verify — V1 silently deletes the binding; no error produced. + output: null + no_output_check: true + + - name: "delete field from variable object" + skip: "V1 has no variable reassignment; cannot express $obj.b = deleted() piecewise" + + - name: "delete element from variable array shifts remaining" + skip: "V1 has no variable reassignment and no bracket index on paths; cannot express $arr[0] = deleted()" + + # --- Deleted in array literals --- + + - name: "deleted() in array literal omits element" + # FIXME-v1: verify — V1 may not omit deleted() inside literal array constructors; + # sentinel handling inside literals is stricter than in map_each. + mapping: | + root.v = [1, deleted(), 3] + output: {"v": [1, 3]} + + - name: "multiple deleted() in array literal" + # FIXME-v1: verify + mapping: | + root.v = [deleted(), 1, deleted(), 2, deleted()] + output: {"v": [1, 2]} + + - name: "all deleted() in array literal produces empty array" + # FIXME-v1: verify + mapping: | + root.v = [deleted(), deleted(), deleted()] + output: {"v": []} + + - name: "deleted() at beginning of array literal" + # FIXME-v1: verify + mapping: | + root.v = [deleted(), "a", "b"] + output: {"v": ["a", "b"]} + + - name: "deleted() at end of array literal" + # FIXME-v1: verify + mapping: | + root.v = ["a", "b", deleted()] + output: {"v": ["a", "b"]} + + # --- Deleted in object literals --- + + - name: "deleted() value in object literal omits field" + # FIXME-v1: verify — object literal semantics for deleted() may differ in V1. 
+ mapping: | + root.v = {"a": 1, "b": deleted(), "c": 3} + output: {"v": {"a": 1, "c": 3}} + + - name: "all deleted() values in object literal produces empty object" + # FIXME-v1: verify + mapping: | + root.v = {"a": deleted(), "b": deleted()} + output: {"v": {}} + + - name: "deleted() in nested object literal" + # FIXME-v1: verify + mapping: | + root.v = {"outer": {"keep": 1, "drop": deleted()}} + output: {"v": {"outer": {"keep": 1}}} + + # --- Deletion propagates at each level independently --- + + - name: "deleted in nested array within object" + # FIXME-v1: verify + mapping: | + root.v = {"items": [1, deleted(), 3]} + output: {"v": {"items": [1, 3]}} + + - name: "deleted in object within array" + # FIXME-v1: verify + mapping: | + root.v = [{"a": 1, "b": deleted()}, {"c": 3}] + output: {"v": [{"a": 1}, {"c": 3}]} + + # --- Operations on deleted() are errors --- + + - name: "arithmetic on deleted() is error" + # V1 reports this as a compile-time type error ("cannot add types delete"). + mapping: | + root.v = deleted() + 1 + compile_error: "cannot add types delete" + + - name: "comparison on deleted() is error" + # V1 tolerates comparison with deleted(): `==` treats the sentinel as + # null-like and returns false rather than erroring. + mapping: | + root.v = deleted() == null + output: {"v": false} + + - name: "method call on deleted() is error" + # V1 coerces deleted() to null inside a method call; .string() returns "null". + mapping: | + root.v = deleted().string() + output: {"v": "null"} + + - name: "field access on deleted() is error" + # V1 coerces deleted() to null for field access; result is null. + mapping: | + root.v = deleted().field + output: {"v": null} + + # --- .or() and .catch() rescue deleted --- + + - name: "or rescues deleted()" + # V1 .or() treats deleted() as null-like and replaces with fallback (§12.2). 
+ mapping: | + root.v = deleted().or("fallback") + output: {"v": "fallback"} + + - name: "catch passes through deleted()" + # V1 .catch() preserves sentinels (spec §9.4/§12.2), so deleted() survives + # and removes the `v` field — leaving {} as the result. + # V1 .catch() signature takes an expression, not a lambda; adjust. + mapping: | + root.v = deleted().catch("caught") + output: {} diff --git a/internal/bloblang2/migrator/v1spec/tests/input_output/dynamic_metadata.yaml b/internal/bloblang2/migrator/v1spec/tests/input_output/dynamic_metadata.yaml new file mode 100644 index 000000000..752cadd64 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/input_output/dynamic_metadata.yaml @@ -0,0 +1,109 @@ +description: > + Dynamic metadata access — computed keys for reading and writing metadata, + metadata in expressions, and metadata COW with variables. + +tests: + # --- Dynamic metadata write with variable key --- + + - name: "write metadata with variable key" + # V1 has no `meta(expr) = value` form; dynamic keys require a wholesale + # `meta = { ... }` assignment with a dynamic key object literal. + input: {} + mapping: | + let key = "source" + meta = {($key): "kafka"} + output: {} + output_metadata: {"source": "kafka"} + + - name: "write multiple metadata keys from loop data" + input: {} + mapping: | + let keys = ["env", "region"] + let vals = ["prod", "us-east"] + meta = {($keys.0): $vals.0, ($keys.1): $vals.1} + output: {} + output_metadata: {"env": "prod", "region": "us-east"} + + # --- Dynamic metadata read with variable key --- + + - name: "read metadata with variable key" + input: {} + input_metadata: {"source": "kafka", "topic": "events"} + mapping: | + let key = "topic" + root.v = meta($key) + output: {"v": "events"} + # V1 carries input metadata forward by default. + output_metadata: {"source": "kafka", "topic": "events"} + + - name: "read missing metadata with variable key returns null" + # V1 `meta("missing")` errors; `.or(null)` rescues to null. 
+ input: {} + input_metadata: {"source": "kafka"} + mapping: | + let key = "missing" + root.v = meta($key).or(null) + output: {"v": null} + output_metadata: {"source": "kafka"} + + # --- Dynamic metadata delete --- + + - name: "delete metadata key with variable" + # V1 has no computed-key meta target. To delete by dynamic key, rebuild + # the whole meta object without it via without(). + input: {} + input_metadata: {"keep": "yes", "drop": "no"} + mapping: | + let remove_key = "drop" + meta = @.without($remove_key) + output: {} + output_metadata: {"keep": "yes"} + + # --- Metadata values used in expressions --- + + - name: "metadata value in arithmetic" + input: {} + input_metadata: {"count": 5} + mapping: | + root.doubled = @count * 2 + output: {"doubled": 10} + output_metadata: {"count": 5} + + - name: "metadata value in string concatenation" + input: {} + input_metadata: {"prefix": "hello"} + mapping: | + root.greeting = @prefix + " world" + output: {"greeting": "hello world"} + output_metadata: {"prefix": "hello"} + + - name: "metadata value as condition" + input: {} + input_metadata: {"debug": true} + mapping: | + root.level = if @debug { "trace" } else { "info" } + output: {"level": "trace"} + output_metadata: {"debug": true} + + # --- Metadata round-trip through variable --- + + - name: "metadata to variable to output metadata" + input: {} + input_metadata: {"trace_id": "abc-123"} + mapping: | + let trace = @trace_id + meta trace_id = $trace + output: {} + output_metadata: {"trace_id": "abc-123"} + + # --- Metadata from computed expression key --- + + - name: "write metadata with expression key" + # V1 cannot write meta with a computed key; emulate via wholesale + # `meta = { ... }` with dynamic key in object literal. 
+ input: {} + mapping: | + let prefix = "x" + meta = {($prefix + "_header"): "value"} + output: {} + output_metadata: {"x_header": "value"} diff --git a/internal/bloblang2/migrator/v1spec/tests/input_output/input_access.yaml b/internal/bloblang2/migrator/v1spec/tests/input_output/input_access.yaml new file mode 100644 index 000000000..f368e01a1 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/input_output/input_access.yaml @@ -0,0 +1,168 @@ +description: "Input access: reading fields, metadata, various input types, immutability guarantees, non-existent fields return null" + +tests: + # --- Basic input field access --- + + - name: "access top-level input field" + input: {"name": "Alice"} + mapping: | + root.v = this.name + output: {"v": "Alice"} + + - name: "access nested input field" + input: {"user": {"name": "Bob", "age": 30}} + mapping: | + root.name = this.user.name + root.age = this.user.age + output: {"name": "Bob", "age": 30} + + - name: "access deeply nested input field" + input: {"a": {"b": {"c": {"d": "deep"}}}} + mapping: | + root.v = this.a.b.c.d + output: {"v": "deep"} + + - name: "access input array element by index" + # V1 has no bracket indexing — use path segment `.N` for non-negative indices. 
+ input: [10, 20, 30] + mapping: | + root.first = this.0 + root.second = this.1 + root.third = this.2 + output: {"first": 10, "second": 20, "third": 30} + + - name: "access nested array in input object" + input: {"items": ["a", "b", "c"]} + mapping: | + root.v = this.items.1 + output: {"v": "b"} + + - name: "access object inside input array" + input: [{"name": "Alice"}, {"name": "Bob"}] + mapping: | + root.v = this.1.name + output: {"v": "Bob"} + + # --- Input types --- + + - name: "input passthrough for various types" + mapping: | + root.v = this + cases: + - name: "string" + input: "hello world" + output: {"v": "hello world"} + - name: "number" + input: 42 + output: {"v": 42} + - name: "float" + input: 3.14 + output: {"v": 3.14} + - name: "boolean" + input: true + output: {"v": true} + - name: "null" + input: null + output: {"v": null} + - name: "empty object" + input: {} + output: {"v": {}} + - name: "empty array" + input: [] + output: {"v": []} + - name: "defaults to null when not specified" + output: {"v": null} + + # --- Non-existent fields return null --- + + - name: "non-existent top-level field returns null" + input: {"name": "Alice"} + mapping: | + root.v = this.missing + output: {"v": null} + + - name: "non-existent nested field returns null" + input: {"user": {}} + mapping: | + root.v = this.user.name + output: {"v": null} + + - name: "deep path through null intermediate is error" + # V1 null-safe path access yields null rather than erroring. + input: {"a": 1} + mapping: | + root.v = this.x.y.z + output: {"v": null} + + - name: "out of bounds array index is error" + # V1 path indexing (`this.N`) on a shorter array yields null rather than + # erroring; the erroring bounds behaviour lives on `.index(N)`. 
+ input: [1, 2, 3] + mapping: | + root.v = this.10 + output: {"v": null} + + # --- Input metadata access --- + + - name: "read single input metadata key" + input: {"data": 1} + input_metadata: {"source": "kafka"} + mapping: | + root.v = @source + output: {"v": "kafka"} + output_metadata: {"source": "kafka"} + + - name: "read all input metadata as object" + input: {"data": 1} + input_metadata: {"source": "kafka", "topic": "events"} + mapping: | + root.v = @ + output: {"v": {"source": "kafka", "topic": "events"}} + output_metadata: {"source": "kafka", "topic": "events"} + + - name: "undefined metadata key returns null" + # V1 `@missing` returns null, whereas `meta("missing")` errors. The @-form + # gives null-default semantics. + input: {"data": 1} + input_metadata: {"source": "kafka"} + mapping: | + root.v = @missing + output: {"v": null} + output_metadata: {"source": "kafka"} + + - name: "metadata with no input_metadata is empty object" + input: {"data": 1} + mapping: | + root.v = @ + output: {"v": {}} + + - name: "metadata value can be any type" + input: {} + input_metadata: {"count": 42, "active": true, "tags": ["a", "b"]} + mapping: | + root.count = @count + root.active = @active + root.tags = @tags + output: {"count": 42, "active": true, "tags": ["a", "b"]} + output_metadata: {"count": 42, "active": true, "tags": ["a", "b"]} + + - name: "nested metadata path access" + input: {} + input_metadata: {"routing": {"region": "us-west", "zone": "a"}} + mapping: | + root.v = @routing.region + output: {"v": "us-west"} + output_metadata: {"routing": {"region": "us-west", "zone": "a"}} + + # --- Input immutability --- + + - name: "input is not modified by output assignment from input" + input: {"name": "Alice", "age": 30} + mapping: | + root = this + root.name = "Bob" + root.original = this.name + output: {"name": "Bob", "age": 30, "original": "Alice"} + + - name: "input array is not modified by variable mutation" + skip: "V1 has no variable reassignment ($copy[0] = 99 is 
not expressible)" diff --git a/internal/bloblang2/migrator/v1spec/tests/input_output/metadata.yaml b/internal/bloblang2/migrator/v1spec/tests/input_output/metadata.yaml new file mode 100644 index 000000000..490eafbc8 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/input_output/metadata.yaml @@ -0,0 +1,215 @@ +description: "Metadata: read/write/delete metadata keys, clear all, copy from input, nested paths, type restrictions, input metadata access" + +tests: + # --- Write metadata --- + + - name: "write single metadata key" + input: {} + mapping: | + meta source = "kafka" + output: {} + output_metadata: {"source": "kafka"} + + - name: "write multiple metadata keys" + input: {} + mapping: | + meta source = "kafka" + meta topic = "events" + meta partition = 3 + output: {} + output_metadata: {"source": "kafka", "topic": "events", "partition": 3} + + - name: "metadata value can be any type" + input: {} + mapping: | + meta str = "hello" + meta num = 42 + meta flag = true + meta arr = [1, 2, 3] + meta obj = {"nested": "value"} + meta nothing = null + output: {} + output_metadata: {"str": "hello", "num": 42, "flag": true, "arr": [1, 2, 3], "obj": {"nested": "value"}, "nothing": null} + + - name: "overwrite metadata key" + input: {} + mapping: | + meta key = "first" + meta key = "second" + output: {} + output_metadata: {"key": "second"} + + # --- Nested metadata paths with auto-creation --- + + - name: "nested metadata path auto-creates intermediate objects" + # V1 `meta ` targets a single key (flat); it does not support dotted + # auto-creation. Express via an object value instead. 
+ input: {} + mapping: | + meta routing = {"region": "us-west"} + output: {} + output_metadata: {"routing": {"region": "us-west"}} + + - name: "deeply nested metadata path" + input: {} + mapping: | + meta a = {"b": {"c": "deep"}} + output: {} + output_metadata: {"a": {"b": {"c": "deep"}}} + + - name: "sibling nested metadata paths" + # V1 cannot write sub-keys of a meta value piecewise; the whole object + # must be assigned in one statement. + input: {} + mapping: | + meta routing = {"region": "us-west", "zone": "a"} + output: {} + output_metadata: {"routing": {"region": "us-west", "zone": "a"}} + + # --- Delete metadata key --- + + - name: "delete metadata key with deleted()" + input: {} + mapping: | + meta keep = "yes" + meta remove = "no" + meta remove = deleted() + output: {} + output_metadata: {"keep": "yes"} + + - name: "delete non-existent metadata key is no-op" + input: {} + mapping: | + meta key = "value" + meta missing = deleted() + output: {} + output_metadata: {"key": "value"} + + # --- Clear all metadata --- + + - name: "clear all metadata with empty object" + # V1: `meta = {}` replaces the whole metadata object. 
+ input: {} + mapping: | + meta a = 1 + meta b = 2 + meta = {} + output: {} + output_metadata: {} + + - name: "clear then set new metadata" + input: {} + mapping: | + meta old = "stale" + meta = {} + meta fresh = "new" + output: {} + output_metadata: {"fresh": "new"} + + # --- Copy all metadata from input --- + + - name: "copy all metadata from input" + input: {} + input_metadata: {"source": "kafka", "topic": "events"} + mapping: | + meta = @ + output: {} + output_metadata: {"source": "kafka", "topic": "events"} + + - name: "copy metadata from input then add more" + input: {} + input_metadata: {"source": "kafka"} + mapping: | + meta = @ + meta extra = "added" + output: {} + output_metadata: {"source": "kafka", "extra": "added"} + + - name: "copy metadata from input then overwrite key" + input: {} + input_metadata: {"source": "kafka"} + mapping: | + meta = @ + meta source = "http" + output: {} + output_metadata: {"source": "http"} + + - name: "copy metadata from input is COW" + # V1 does not implement COW between `@` (input metadata snapshot) and + # `meta` (output metadata). `@` reads from the *same* underlying metadata + # map that `meta` writes to, so after `meta key = "modified"` the read of + # `@key` reflects the mutation. + input: {} + input_metadata: {"key": "original"} + mapping: | + meta = @ + meta key = "modified" + root.input_meta = @key + output: {"input_meta": "modified"} + output_metadata: {"key": "modified"} + + # --- Type restrictions --- + + - name: "output@ = deleted() is error" + # V1: `meta = deleted()` clears all metadata; it is NOT an error. + input: {} + mapping: | + meta = deleted() + output: {} + output_metadata: {} + + - name: "output@ = string is error" + # V1: setting root meta to a non-object type errors with + # "setting root meta object requires object value". 
+ mapping: | + meta = "not an object" + error: "object value" + + - name: "output@ = integer is error" + mapping: | + meta = 42 + error: "object value" + + - name: "output@ = array is error" + mapping: | + meta = [1, 2, 3] + error: "object value" + + - name: "output@ = boolean is error" + mapping: | + meta = true + error: "object value" + + # --- Read input metadata --- + + - name: "read input metadata key" + input: {} + input_metadata: {"source": "kafka"} + mapping: | + root.v = @source + output: {"v": "kafka"} + output_metadata: {"source": "kafka"} + + - name: "read all input metadata" + input: {} + input_metadata: {"a": 1, "b": 2} + mapping: | + root.v = @ + output: {"v": {"a": 1, "b": 2}} + output_metadata: {"a": 1, "b": 2} + + - name: "undefined input metadata key returns null" + input: {} + input_metadata: {} + mapping: | + root.v = @missing + output: {"v": null} + output_metadata: {} + + - name: "read nested input metadata value" + input: {} + input_metadata: {"config": {"timeout": 30}} + mapping: | + root.v = @config.timeout + output: {"v": 30} + output_metadata: {"config": {"timeout": 30}} diff --git a/internal/bloblang2/migrator/v1spec/tests/input_output/output_assignment.yaml b/internal/bloblang2/migrator/v1spec/tests/input_output/output_assignment.yaml new file mode 100644 index 000000000..21d5570c1 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/input_output/output_assignment.yaml @@ -0,0 +1,171 @@ +description: "Output assignment: building output incrementally, auto-creation of intermediate objects/arrays, collision errors, gap filling, sequential references" + +tests: + # --- Basic incremental building --- + + - name: "assign single field to output" + mapping: | + root.name = "Alice" + output: {"name": "Alice"} + + - name: "assign multiple fields to output" + mapping: | + root.name = "Alice" + root.age = 30 + root.active = true + output: {"name": "Alice", "age": 30, "active": true} + + - name: "output starts as empty object" + # V1 
`root` is unset (null) before any assignment. `root.v = root` reads + # `root` as null and the assignment resolves to void, leaving root + # untouched — the mapping produces no root assignment so the input + # passes through unchanged. + input: {} + mapping: | + root.v = root + output: {} + + # --- Auto-creation of intermediate objects --- + + - name: "auto-create nested object" + mapping: | + root.user.name = "Alice" + output: {"user": {"name": "Alice"}} + + - name: "auto-create deeply nested object" + mapping: | + root.user.address.city = "London" + output: {"user": {"address": {"city": "London"}}} + + - name: "auto-create very deeply nested object" + mapping: | + root.a.b.c.d.e = 42 + output: {"a": {"b": {"c": {"d": {"e": 42}}}}} + + - name: "auto-create and add sibling fields" + mapping: | + root.user.name = "Alice" + root.user.age = 30 + output: {"user": {"name": "Alice", "age": 30}} + + # --- Auto-creation with array index --- + + - name: "auto-create array with index zero" + # V1 path auto-creation is always OBJECT-valued, even for numeric segments. + # `root.items.0 = "x"` on unset `items` creates `{"items": {"0": "x"}}`, + # not an array. To get an array you must first assign an array literal. + mapping: | + root.items.0 = "first" + output: {"items": {"0": "first"}} + + - name: "auto-create array with object elements" + # As above: auto-created intermediates are objects keyed by the literal + # segment text (here "0"). + mapping: | + root.items.0.name = "first" + output: {"items": {"0": {"name": "first"}}} + + - name: "auto-create array then add more elements" + # V1 auto-creates an object (not array) when the base is unset. 
+ mapping: | + root.items.0 = "a" + root.items.1 = "b" + root.items.2 = "c" + output: {"items": {"0": "a", "1": "b", "2": "c"}} + + # --- Dynamic index: string creates object, int creates array --- + + - name: "dynamic string index creates object field" + skip: "V1 has no dynamic bracket indexing in assignment targets (§6.4 forbids root.(expr) / root[expr])" + + - name: "dynamic int index creates array element" + skip: "V1 has no dynamic bracket indexing in assignment targets" + + # --- Array gap filling --- + + - name: "gap filling with null" + # V1 has no array gap-filling — the intermediate is an object keyed by "2". + mapping: | + root.items.2 = "x" + output: {"items": {"2": "x"}} + + - name: "gap filling after existing elements" + # V1 builds an object keyed by numeric strings; no gap filling. + mapping: | + root.items.0 = "a" + root.items.3 = "d" + output: {"items": {"0": "a", "3": "d"}} + + - name: "gap filling with zero-based first element then gap" + mapping: | + root.arr.0 = 10 + root.arr.5 = 50 + output: {"arr": {"0": 10, "5": 50}} + + # --- Sequential references to earlier output --- + + - name: "reference previously assigned output field" + mapping: | + root.x = 10 + root.y = root.x + 5 + output: {"x": 10, "y": 15} + + - name: "reference nested output field" + mapping: | + root.user.name = "Alice" + root.greeting = "Hello, " + root.user.name + output: {"user": {"name": "Alice"}, "greeting": "Hello, Alice"} + + - name: "reference output array element" + # V1 auto-creates an object (not array) for numeric path segments. + mapping: | + root.items.0 = 100 + root.items.1 = root.items.0 * 2 + output: {"items": {"0": 100, "1": 200}} + + - name: "overwrite previously assigned field" + mapping: | + root.status = "pending" + root.status = "done" + output: {"status": "done"} + + # --- Collision errors --- + + - name: "collision: field access on string value" + # V1 errors with "unable to set target path ... non-object type". 
+ mapping: | + root.user = "Alice" + root.user.name = "Alice" + error: "non-object type" + + - name: "collision: field access on integer value" + mapping: | + root.count = 42 + root.count.value = 42 + error: "non-object type" + + - name: "collision: field access on boolean value" + mapping: | + root.flag = true + root.flag.sub = false + error: "non-object type" + + - name: "collision: index access on string value" + # V1 has no bracket indexing; `.0` on a string value errors the same way + # as field access on a non-object. + mapping: | + root.data = "hello" + root.data.0 = "H" + error: "non-object type" + + # --- Mixed nesting --- + + - name: "build complex nested structure incrementally" + # V1 auto-creates nested objects keyed by numeric strings, not arrays. + mapping: | + root.users.0.name = "Alice" + root.users.0.age = 30 + root.users.1.name = "Bob" + root.users.1.age = 25 + root.count = 2 + output: {"users": {"0": {"name": "Alice", "age": 30}, "1": {"name": "Bob", "age": 25}}, "count": 2} diff --git a/internal/bloblang2/migrator/v1spec/tests/input_output/output_root.yaml b/internal/bloblang2/migrator/v1spec/tests/input_output/output_root.yaml new file mode 100644 index 000000000..a827d0fde --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/input_output/output_root.yaml @@ -0,0 +1,149 @@ +description: "Output root assignment: root = expr replaces entire output, any type, COW from input, continued building after root assignment" + +tests: + # --- Root assignment replaces entire output --- + + - name: "root assignment replaces empty output with string" + mapping: | + root = "hello" + output: "hello" + + - name: "root assignment replaces empty output with integer" + mapping: | + root = 42 + output: 42 + + - name: "root assignment replaces empty output with float" + mapping: | + root = 3.14 + output: 3.14 + + - name: "root assignment replaces empty output with boolean" + mapping: | + root = true + output: true + + - name: "root assignment replaces empty 
output with null" + mapping: | + root = null + output: null + + - name: "root assignment replaces empty output with array" + mapping: | + root = [1, 2, 3] + output: [1, 2, 3] + + - name: "root assignment replaces empty output with object" + mapping: | + root = {"name": "Alice", "age": 30} + output: {"name": "Alice", "age": 30} + + # --- Previous assignments discarded --- + + - name: "root assignment discards previous field assignments" + mapping: | + root.name = "Alice" + root.age = 30 + root = "replaced" + output: "replaced" + + - name: "root assignment discards complex previous structure" + mapping: | + root.user.name = "Alice" + root.user.address.city = "London" + root.items.0 = "a" + root = 99 + output: 99 + + - name: "multiple root assignments keep last one" + mapping: | + root = "first" + root = "second" + root = "third" + output: "third" + + # --- COW from input --- + + - name: "output = input copies various types" + mapping: | + root = this + cases: + - name: "object" + input: {"name": "Alice", "age": 30} + output: {"name": "Alice", "age": 30} + - name: "array" + input: [1, 2, 3] + output: [1, 2, 3] + - name: "string" + input: "hello" + output: "hello" + - name: "null" + input: null + output: null + + - name: "output = input is logical copy with COW" + input: {"name": "Alice", "score": 100} + mapping: | + root = this + root.name = "Bob" + root.original = this.name + output: {"name": "Bob", "score": 100, "original": "Alice"} + + # --- Continued building after root assignment --- + + - name: "continue building after root assignment to object" + mapping: | + root = {"name": "Alice"} + root.age = 30 + output: {"name": "Alice", "age": 30} + + - name: "continue building nested after root assignment" + mapping: | + root = {} + root.user.name = "Alice" + root.user.age = 30 + output: {"user": {"name": "Alice", "age": 30}} + + - name: "root assign from input then extend" + input: {"name": "Alice"} + mapping: | + root = this + root.greeting = "Hello, " + this.name + 
output: {"name": "Alice", "greeting": "Hello, Alice"} + + - name: "root assign object then overwrite field" + mapping: | + root = {"status": "pending", "count": 0} + root.status = "done" + root.count = 1 + output: {"status": "done", "count": 1} + + # --- Root assignment to non-object then field access is error --- + + - name: "field assignment after root assign to string is error" + # FIXME-v1: verify — V1 may silently overwrite rather than error. + mapping: | + root = "hello" + root.field = "x" + error: "field" + + - name: "field assignment after root assign to integer is error" + # FIXME-v1: verify + mapping: | + root = 42 + root.field = "x" + error: "field" + + - name: "field assignment after root assign to array is error" + # FIXME-v1: verify + mapping: | + root = [1, 2, 3] + root.field = "x" + error: "field" + + - name: "index assignment after root assign to array works" + # V1 uses `.N` path segments, not `[N]` brackets. + mapping: | + root = [10, 20, 30] + root.1 = 99 + output: [10, 99, 30] diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/basic.yaml b/internal/bloblang2/migrator/v1spec/tests/lambdas/basic.yaml new file mode 100644 index 000000000..a43d89bb9 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/basic.yaml @@ -0,0 +1,128 @@ +description: "Lambda expressions — single/multi param, block bodies, nested expressions, higher-order methods" + +tests: + # --- Single parameter lambdas --- + + - name: "single param lambda in map" + mapping: | + root.result = [1, 2, 3].map_each(x -> x * 2) + output: {"result": [2, 4, 6]} + + - name: "single param lambda in filter" + mapping: | + root.result = [1, 2, 3, 4, 5].filter(x -> x > 3) + output: {"result": [4, 5]} + + - name: "single param lambda in sort_by" + mapping: | + root.result = [{"n": 3}, {"n": 1}, {"n": 2}].sort_by(x -> x.n) + output: {"result": [{"n": 1}, {"n": 2}, {"n": 3}]} + + - name: "single param lambda with string method" + mapping: | + root.result = ["hello", 
"world"].map_each(s -> s.uppercase()) + output: {"result": ["HELLO", "WORLD"]} + + - name: "single param lambda accessing nested fields" + mapping: | + let items = [{"price": 10, "qty": 2}, {"price": 5, "qty": 4}] + root.totals = $items.map_each(item -> item.price * item.qty) + output: {"totals": [20, 20]} + + # --- Multi parameter lambdas --- + # V1 fold lambda takes a single parameter which is an object with `tally` and `value` fields. + + - name: "two param lambda in fold" + mapping: | + root.sum = [1, 2, 3, 4].fold(0, item -> item.tally + item.value) + output: {"sum": 10} + + - name: "two param lambda in map_entries" + # V1 has no map_entries — .map_each on an object gives {key,value} entries; lambda returns the new value. + # The V2 semantics (returning {"key":..., "value":...} to set both) have no direct V1 equivalent, + # but we can simulate via fold into an object. + mapping: | + root.result = {"a": 1, "b": 2}.key_values().fold({}, item -> item.tally.merge({item.value.key.uppercase(): item.value.value * 10})) + output: {"result": {"A": 10, "B": 20}} + + - name: "two param lambda in filter_entries" + mapping: | + root.result = {"a": 1, "b": 5, "c": 3}.filter(item -> item.value > 2) + output: {"result": {"b": 5, "c": 3}} + + - name: "fold with string accumulator" + mapping: | + root.result = ["a", "b", "c"].fold("", item -> item.tally + item.value) + output: {"result": "abc"} + + # --- Block body lambdas --- + + - name: "block body with variable declarations" + skip: "V1 lambda bodies are single expressions — no block form with inline let statements" + + - name: "block body with multiple variables" + skip: "V1 lambda bodies are single expressions — no block form with inline let statements" + + - name: "block body must end with expression" + skip: "V1 has no block-body lambda form — this V2-only compile error does not apply" + + - name: "block body with conditional expression" + skip: "V1 lambda bodies are single expressions — no block form with inline let 
statements" + + # --- Nested lambdas --- + + - name: "nested lambda — map inside map" + mapping: | + let matrix = [[1, 2], [3, 4]] + root.result = $matrix.map_each(row -> row.map_each(x -> x * 10)) + output: {"result": [[10, 20], [30, 40]]} + + - name: "filter inside map" + mapping: | + let groups = [[1, 2, 3], [4, 5, 6]] + root.result = $groups.map_each(g -> g.filter(x -> x % 2 == 0)) + output: {"result": [[2], [4, 6]]} + + # --- Passing map names to higher-order methods --- + + - name: "pass map name directly to .map()" + skip: "V1 map definitions take no parameters and are invoked via .apply('name'); cannot pass map name as a value to .map_each" + + - name: "pass map name directly to .filter()" + skip: "V1 map definitions take no parameters and are invoked via .apply('name'); cannot pass map name as a value to .filter" + + - name: "pass map name directly to .sort_by()" + skip: "V1 map definitions take no parameters and are invoked via .apply('name'); cannot pass map name as a value to .sort_by" + + # --- Lambda is not a value --- + # V1 lambdas ARE first-class query expressions (§8.5), but assigning them to root or let-binding + # them produces behaviour that isn't a clean parse error — flag for review. + + - name: "cannot store lambda in variable" + # V1 lambdas ARE query expressions and parse in a `let` RHS, but the RHS is evaluated eagerly + # against the current `this`. With input `{}`, `x` resolves against `this` (not an unbound param) + # and the body executes — `null * 2` is a runtime type error. + mapping: | + let fn = x -> x * 2 + root.ok = true + error: "cannot multiply types" + + - name: "cannot assign lambda to output" + # Same as above: `root.fn = x -> x * 2` eagerly evaluates `x * 2` (where `x` is a field read + # from `this`) and errors at runtime. 
+ mapping: | + root.fn = x -> x * 2 + error: "cannot multiply types" + + # --- Parameter is read-only --- + + - name: "lambda parameter cannot be assigned to" + skip: "V1 lambda bodies are single expressions — there is no syntax for assigning to a lambda parameter" + + # --- map_values with single param lambda --- + + - name: "map_values with lambda" + # V1 has no map_values, but .map_each on an object returns the (transformed) value, matching map_values semantics exactly. + mapping: | + root.result = {"x": 1, "y": 2}.map_each(item -> item.value + 100) + output: {"result": {"x": 101, "y": 102}} diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/complex_iterators.yaml b/internal/bloblang2/migrator/v1spec/tests/lambdas/complex_iterators.yaml new file mode 100644 index 000000000..5812b7862 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/complex_iterators.yaml @@ -0,0 +1,111 @@ +description: > + Complex iterator patterns — nested iterator chains, lambdas with control + flow, and iterators operating on map call results. + +tests: + # --- Nested iterator chains --- + + - name: "map then filter chain" + # V1 parser rejects a newline immediately before `.` (§2.1). Break *after* the dot instead. + mapping: | + root.v = [1, 2, 3, 4, 5]. + map_each(x -> x * x). + filter(x -> x > 5) + output: {"v": [9, 16, 25]} + + - name: "filter then map then sort" + mapping: | + root.v = [5, 1, 4, 2, 3]. + filter(x -> x > 2). + map_each(x -> x * 10). + sort() + output: {"v": [30, 40, 50]} + + - name: "nested map produces 2D array" + mapping: | + root.v = [1, 2].map_each(x -> [10, 20].map_each(y -> x + y)) + output: {"v": [[11, 21], [12, 22]]} + + - name: "flat map via map + flatten" + mapping: | + root.v = [[1, 2], [3, 4]].map_each(arr -> arr.map_each(x -> x * 10)).flatten() + output: {"v": [10, 20, 30, 40]} + + # --- Lambda with control flow --- + + - name: "map with if expression in lambda" + # V1 lambda bodies must start on the same line as `->` (§2.1). 
+ mapping: | + root.v = [1, 2, 3, 4].map_each(x -> if x > 2 { "big" } else { "small" }) + output: {"v": ["small", "small", "big", "big"]} + + - name: "filter with match expression in lambda" + # V1 lambda bodies must start on the same line as `->`. Keep the match on one line. + mapping: | + root.v = ["apple", "banana", "avocado", "cherry"].filter(s -> match s.slice(0, 1) { "a" => true, _ => false }) + output: {"v": ["apple", "avocado"]} + + - name: "map with block body and local variables" + skip: "V1 lambda bodies are single expressions — no block form with inline let statements" + + # --- Iterators on map call results --- + # V1 maps take no parameters; the receiver is `this` inside the map. + + - name: "map call result piped to iterator" + mapping: | + map get_items { root = this.items } + root.v = this.apply("get_items").map_each(x -> x * 2) + input: {"items": [1, 2, 3]} + output: {"v": [2, 4, 6]} + + - name: "chained map calls with iterators" + mapping: | + map extract { root = this.values } + map sum { root = this.fold(0, item -> item.tally + item.value) } + root.v = this.apply("extract").apply("sum") + input: {"values": [10, 20, 30]} + output: {"v": 60} + + # --- Fold with complex accumulator --- + + - name: "fold concatenates strings" + # V1 lambda bodies must start on the same line as `->` — collapse the if onto one line. + mapping: | + root.v = ["a", "b", "c"].fold("", item -> if item.tally == "" { item.value } else { item.tally + "," + item.value }) + output: {"v": "a,b,c"} + + - name: "fold builds object from array" + # V1 has no block-body lambdas with inline let statements; use .merge() to build the object inline. + mapping: | + root.v = [ + {"k": "a", "v": 1}, + {"k": "b", "v": 2} + ].fold({}, item -> item.tally.merge({(item.value.k): item.value.v})) + output: {"v": {"a": 1, "b": 2}} + + - name: "fold with string accumulator" + # V1 lambda bodies must start on the same line as `->`. 
+ mapping: | + root.v = ["hello", "world"].fold("", item -> if item.tally == "" { item.value } else { item.tally + " " + item.value }) + output: {"v": "hello world"} + + # --- any/all with complex predicates --- + + - name: "any with method chain in predicate" + mapping: | + let items = [ + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 17} + ] + root.has_minor = $items.any(p -> p.age < 18) + output: {"has_minor": true} + + - name: "all with outer variable in predicate" + mapping: | + let min_age = 18 + let items = [ + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 30} + ] + root.all_adult = $items.all(p -> p.age >= $min_age) + output: {"all_adult": true} diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/defaults.yaml b/internal/bloblang2/migrator/v1spec/tests/lambdas/defaults.yaml new file mode 100644 index 000000000..0395d5ad9 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/defaults.yaml @@ -0,0 +1,56 @@ +description: "Default parameter values in lambda expressions" + +tests: + # V1 lambdas have a single named parameter (or `_`) and no default-value syntax. + # Every V2-only default-parameter feature is untranslatable. 
+ + # --- Basic defaults --- + + - name: "single param with default — value provided" + skip: "V1 lambdas do not support default parameter values" + + - name: "lambda default integer literal" + skip: "V1 lambdas do not support default parameter values" + + - name: "lambda default string literal" + skip: "V1 lambdas do not support default parameter values" + + - name: "lambda default boolean literal" + skip: "V1 lambdas do not support default parameter values" + + - name: "lambda default null literal" + skip: "V1 lambdas do not support default parameter values" + + - name: "lambda default float literal" + skip: "V1 lambdas do not support default parameter values" + + # --- Positional omission --- + + - name: "trailing default params omitted" + skip: "V1 lambdas do not support default parameter values" + + # --- Defaults must come after required --- + + - name: "default before required is compile error" + skip: "V1 lambdas do not support default parameter values — this V2-only compile error does not apply" + + # --- Default values must be literals --- + + - name: "default value expression is compile error" + skip: "V1 lambdas do not support default parameter values — this V2-only compile error does not apply" + + - name: "default value variable reference is compile error" + skip: "V1 lambdas do not support default parameter values — this V2-only compile error does not apply" + + - name: "default value function call is compile error" + skip: "V1 lambdas do not support default parameter values — this V2-only compile error does not apply" + + # --- Multiple defaults --- + + - name: "multiple default params all using defaults" + skip: "V1 lambdas do not support default parameter values" + + # --- Discard cannot have default --- + + - name: "discard param with default is compile error" + skip: "V1 lambdas do not support default parameter values — this V2-only compile error does not apply" diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/discard_params.yaml 
b/internal/bloblang2/migrator/v1spec/tests/lambdas/discard_params.yaml new file mode 100644 index 000000000..48523779c --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/discard_params.yaml @@ -0,0 +1,95 @@ +description: "Discard parameters (_) in lambda expressions" + +tests: + # V1 lambdas take a single named parameter (or `_`). Multi-param lambdas like `(k, v)` or `(acc, x)` + # are a V2-only syntax — V1 supplies multi-argument info via a single object (e.g. fold gives + # {tally, value}; map_each on objects gives {key, value}). So "discard key/value" mostly collapses + # into "ignore a field of that object" which doesn't need `_` at all. + + # --- Basic discard --- + + - name: "discard key in map_entries" + # In V1, .map_each on an object returns just the new value; the key is preserved. + # To change the key, we'd rebuild via fold. The V2 semantics here set key="x" and value=v*2 + # which produces a single-entry object since all keys collapse to "x". + mapping: | + root.result = {"a": 1}.key_values().fold({}, item -> item.tally.merge({"x": item.value.value * 2})) + output: {"result": {"x": 2}} + + - name: "discard value in map_entries" + mapping: | + root.result = {"a": 1, "b": 2}.key_values().fold({}, item -> item.tally.merge({(item.value.key.uppercase()): 0})) + output: {"result": {"A": 0, "B": 0}} + + - name: "discard accumulator in fold" + mapping: | + root.result = [10, 20, 30].fold(0, item -> item.value) + output: {"result": 30} + + - name: "discard element in fold" + mapping: | + root.result = [10, 20, 30].fold(0, item -> item.tally + 1) + output: {"result": 3} + + - name: "discard in filter_entries — use value only" + mapping: | + root.result = {"a": 1, "b": 5, "c": 3}.filter(item -> item.value > 2) + output: {"result": {"b": 5, "c": 3}} + + - name: "discard in filter_entries — use key only" + mapping: | + root.result = {"aa": 1, "b": 5, "cc": 3}.filter(item -> item.key.length() > 1) + output: {"result": {"aa": 1, "cc": 3}} + + # --- 
Multiple discards --- + + - name: "both params discarded returns constant" + # V1: `.merge()` on a key that already exists does NOT overwrite — it appends into an array. + # So repeatedly merging `{"z": 99}` yields `{"z": [99, 99]}` after two iterations. + mapping: | + root.result = {"a": 1, "b": 2}.key_values().fold({}, item -> item.tally.merge({"z": 99})) + output: {"result": {"z": [99, 99]}} + + - name: "both params discarded in fold" + mapping: | + root.result = [1, 2, 3].fold(0, _ -> 42) + output: {"result": 42} + + # --- Referencing _ in body is compile error --- + + - name: "referencing discarded param is compile error" + skip: "V1 multi-param lambdas use a single item object (item.key, item.value) — there is no per-param discard/reference concept to probe" + + - name: "referencing _ when both discarded is compile error" + # V1: `_` as a lambda param binds no name, but referencing `_` in the body is not a compile error — + # the parser treats the bare `_` as a path segment against `this`, so it resolves to a (missing) field. + # No error at compile OR run time; the body just returns null. + mapping: | + root.result = [1, 2].fold(0, _ -> _) + output: {"result": null} + + - name: "referencing _ in single param discard" + # V1: `_` binds no name, so `_ * 2` attempts `this._ * 2` — `this._` is null and the runtime + # arithmetic errors. Not a compile error. 
+ mapping: | + root.result = [1, 2].map_each(_ -> _ * 2) + error: "cannot multiply types" + + # --- Discard with non-discard params --- + + - name: "discard first keep second in two-param lambda" + mapping: | + root.result = [10, 20, 30].fold("start", item -> item.value.string()) + output: {"result": "30"} + + - name: "keep first discard second in two-param lambda" + mapping: | + root.result = [10, 20, 30].fold(0, item -> item.tally + 100) + output: {"result": 300} + + # --- Discard in map (single param) --- + + - name: "discard single param in map returns constant array" + mapping: | + root.result = [1, 2, 3].map_each(_ -> "x") + output: {"result": ["x", "x", "x"]} diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/fold_patterns.yaml b/internal/bloblang2/migrator/v1spec/tests/lambdas/fold_patterns.yaml new file mode 100644 index 000000000..e78fd7804 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/fold_patterns.yaml @@ -0,0 +1,87 @@ +description: > + Fold patterns that stress variable slot management — building objects + from arrays, nested folds, fold calling maps, fold with conditional + accumulation, and fold combined with other iterators. + +tests: + # --- Build object from key-value pairs --- + + - name: "fold builds object from pairs array" + # V1 fold lambda takes a single item param with .tally and .value; block-body lambdas do not exist, + # so we build via .merge() inline. 
+ mapping: | + root.v = [ + {"k": "name", "v": "Alice"}, + {"k": "age", "v": 30} + ].fold({}, item -> item.tally.merge({(item.value.k): item.value.v})) + output: {"v": {"name": "Alice", "age": 30}} + + - name: "fold builds object with computed keys" + mapping: | + root.v = ["x", "y", "z"].enumerated().fold({}, item -> item.tally.merge({("key_" + item.value.value): item.value.index + 1})) + output: {"v": {"key_x": 1, "key_y": 2, "key_z": 3}} + + # --- Fold with conditional accumulation --- + + - name: "fold conditionally adds to accumulator" + # V1 lambda bodies must start on the same line as `->`. + mapping: | + root.v = [1, -2, 3, -4, 5].fold([], item -> if item.value > 0 { item.tally.append(item.value) } else { item.tally }) + output: {"v": [1, 3, 5]} + + - name: "fold with match in body" + skip: "V1 lambda bodies are single expressions — this V2 test uses an inline let statement inside the fold body (block body) which has no V1 equivalent" + + # --- Nested fold --- + + - name: "nested fold — outer builds rows, inner sums columns" + mapping: | + let matrix = [[1, 2, 3], [4, 5, 6]] + root.v = $matrix.fold([], row -> row.tally.append(row.value.fold(0, x -> x.tally + x.value))) + output: {"v": [6, 15]} + + - name: "fold inside map inside fold" + # V2 uses block bodies with let; V1 must use inline .merge(). Each iteration produces a map of + # one new key -> sum of its items, merged into the accumulator. + mapping: | + let groups = [ + {"name": "a", "items": [1, 2]}, + {"name": "b", "items": [3, 4, 5]} + ] + root.v = $groups.fold({}, g -> g.tally.merge({(g.value.name): g.value.items.fold(0, x -> x.tally + x.value)})) + output: {"v": {"a": 3, "b": 12}} + + # --- Fold calling user maps --- + + - name: "fold body calls user map" + # V1 maps take no parameters — the value is `this` inside the map body, invoked via .apply. 
+ mapping: | + map transform { root = this * this } + root.v = [1, 2, 3, 4].fold(0, item -> item.tally + item.value.apply("transform")) + output: {"v": 30} + + - name: "fold body calls user map that returns object" + # V2 passes two args to a map; V1 maps take only `this`. Reshape the map to take an object + # with both fields via the receiver. V1 lambda bodies must start on the same line as `->`, + # and method chains must not be broken immediately before `.`. + mapping: | + map make_entry { root = {"key": this.key, "value": this.val} } + root.v = ["a", "b", "c"].enumerated().fold({}, item -> item.tally.merge({(item.value.value): item.value.index})) + output: {"v": {"a": 0, "b": 1, "c": 2}} + + # --- Fold preserves accumulator across iterations --- + + - name: "fold accumulator carries forward correctly" + # V1: no block-body lambdas with assignment statements — rebuild with .merge() each iteration. + mapping: | + root.v = [10, 20, 30].fold({"sum": 0, "count": 0}, item -> { + "sum": item.tally.sum + item.value, + "count": item.tally.count + 1 + }) + output: {"v": {"sum": 60, "count": 3}} + + - name: "fold with string accumulator and separator" + # V1 lambda bodies must start on the same line as `->`. + mapping: | + root.v = ["a", "b", "c"].fold("", item -> if item.tally == "" { item.value } else { item.tally + ", " + item.value }) + output: {"v": "a, b, c"} diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/outer_capture.yaml b/internal/bloblang2/migrator/v1spec/tests/lambdas/outer_capture.yaml new file mode 100644 index 000000000..616a80591 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/outer_capture.yaml @@ -0,0 +1,91 @@ +description: > + Lambda capture of outer variables — lambdas reading variables from enclosing + scopes, including nested lambdas and lambdas inside maps. 
+ +tests: + # --- Simple outer variable capture --- + + - name: "lambda reads single outer variable" + mapping: | + let factor = 10 + root.v = [1, 2, 3].map_each(x -> x * $factor) + output: {"v": [10, 20, 30]} + + - name: "lambda reads multiple outer variables" + mapping: | + let offset = 100 + let scale = 3 + root.v = [1, 2].map_each(x -> x * $scale + $offset) + output: {"v": [103, 106]} + + - name: "lambda reads outer variable in filter condition" + mapping: | + let threshold = 5 + root.v = [1, 3, 7, 9, 2].filter(x -> x > $threshold) + output: {"v": [7, 9]} + + - name: "lambda reads outer variable in fold" + mapping: | + let base = 100 + root.v = [1, 2, 3].fold($base, item -> item.tally + item.value) + output: {"v": 106} + + # --- Nested lambda capture --- + + - name: "inner lambda captures outer lambda parameter" + mapping: | + root.v = [1, 2].map_each(x -> [10, 20].map_each(y -> x * 100 + y)) + output: {"v": [[110, 120], [210, 220]]} + + - name: "inner lambda captures top-level variable through outer lambda" + mapping: | + let prefix = "item" + root.v = [1, 2].map_each(x -> $prefix + "_" + x.string()) + output: {"v": ["item_1", "item_2"]} + + - name: "triple nested lambda captures all levels" + mapping: | + let base = 1000 + root.v = [1].map_each(a -> [2].map_each(b -> [3].map_each(c -> $base + a * 100 + b * 10 + c))) + output: {"v": [[[1123]]]} + + # --- Capture with shadowing --- + + - name: "lambda parameter does not modify outer variable" + mapping: | + let x = "outer" + let result = ["inner"].map_each(x -> x + "_mapped") + root.outer = $x + root.mapped = $result + output: + outer: "outer" + mapped: ["inner_mapped"] + + - name: "lambda local variable does not modify outer variable" + # V1 has no block-body lambdas and `let` inside lambda body is not a thing. + # Additionally, V1 `let` has no block scope — an inline let would leak anyway. 
+ skip: "V1 lambda bodies are single expressions — no inline let statements; V1 let has no block scope regardless" + + # --- Capture across iterator chains --- + + - name: "chained iterators both capture outer variable" + # V1 parser rejects a newline immediately before `.` — break *after* the dot instead. + mapping: | + let min = 2 + let scale = 10 + root.v = [1, 2, 3, 4, 5]. + filter(x -> x >= $min). + map_each(x -> x * $scale) + output: {"v": [20, 30, 40, 50]} + + - name: "sort_by with outer variable in key function" + # V1 has no bracket indexing: `x[$field]` is a parse error (§6.2). Use .get($field) instead. + mapping: | + let field = "priority" + let items = [ + {"name": "c", "priority": 3}, + {"name": "a", "priority": 1}, + {"name": "b", "priority": 2} + ] + root.v = $items.sort_by(x -> x.get($field)).map_each(x -> x.name) + output: {"v": ["a", "b", "c"]} diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/purity.yaml b/internal/bloblang2/migrator/v1spec/tests/lambdas/purity.yaml new file mode 100644 index 000000000..8f6e21529 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/purity.yaml @@ -0,0 +1,112 @@ +description: "Lambda purity — no output assignment, variable shadowing, context inheritance" + +tests: + # --- Cannot assign to output from lambda --- + # V1 lambda bodies are single expressions — assignments inside lambda bodies are simply not + # representable, so the V2-specific "compile_error: output" cases have no V1 equivalent. 
+ + - name: "lambda cannot assign to output field" + skip: "V1 lambda bodies are single expressions — assignments inside lambda bodies are unrepresentable" + + - name: "lambda cannot assign to output root" + skip: "V1 lambda bodies are single expressions — assignments inside lambda bodies are unrepresentable" + + - name: "lambda cannot assign to output metadata" + skip: "V1 lambda bodies are single expressions — assignments inside lambda bodies are unrepresentable" + + - name: "nested lambda cannot assign to output" + skip: "V1 lambda bodies are single expressions — assignments inside lambda bodies are unrepresentable" + + # --- Variable shadowing (not mutation) in expression context --- + # V1 has NO block scope for let variables (§7.2), and lambda bodies are single expressions + # with no `let` inside. All V2 "shadow" tests exhibit fundamentally different semantics. + + - name: "lambda shadows outer variable" + skip: "V1 lambda bodies cannot contain let statements (single-expression bodies)" + + - name: "lambda shadows — outer unchanged after map" + skip: "V1 lambda bodies cannot contain let statements; V1 also has no block scope for let" + + - name: "fold lambda shadows outer variable" + skip: "V1 lambda bodies cannot contain let statements (single-expression bodies)" + + - name: "block body variable does not leak" + skip: "V1 lambda bodies are single expressions — no block form" + + # --- Context inheritance: top-level lambda reads input and output --- + # V1 has no `input` / `output` keywords. At the top level, `this` is the input payload + # and `root` is the evolving output. Lambdas pop the context (§6.5) but top-level variables + # and metadata remain accessible. + + - name: "top-level lambda reads input" + input: {"multiplier": 10} + mapping: | + root.result = [1, 2, 3].map_each(x -> x * this.multiplier) + output: {"result": [10, 20, 30]} + # Note: `this` inside a lambda body refers to the OUTER `this` (§6.5), which at top level + # is the input payload. 
So `this.multiplier` reads input.multiplier correctly. + + - name: "top-level lambda reads output" + # V1 `root` inside a lambda body refers to the partial output so far. This works. + mapping: | + root.base = 100 + root.result = [1, 2, 3].map_each(x -> x + root.base) + output: {"base": 100, "result": [101, 102, 103]} + + - name: "top-level lambda reads input metadata" + # V1 preserves input metadata through to the output by default — declare the expected output metadata + # so the runner doesn't treat the carried-over `scale` key as an unexpected entry. + input: null + input_metadata: {"scale": "5"} + mapping: | + root.result = [1, 2].map_each(x -> x.string() + @scale) + output: {"result": ["15", "25"]} + output_metadata: {"scale": "5"} + + - name: "top-level lambda reads outer variable" + mapping: | + let factor = 3 + root.result = [10, 20].map_each(x -> x * $factor) + output: {"result": [30, 60]} + + # --- Lambda inside map cannot access input --- + # V1 named maps: inside the body, `this` is the receiver (the value passed to .apply), + # NOT the original input. So `this` inside a lambda inside a map body is the OUTER `this` + # of the map, which is the receiver, not the original input payload. + + - name: "lambda inside map cannot read input" + # V1 analogue: inside the map body, `this` is the map's receiver, not the original input. + # So if caller did `.apply("transform")` on [1,2], inside the map `this` is [1,2] and + # `this.value` is undefined. This is a runtime error — V1 wording is about multiplying null. + input: {"value": 42} + mapping: | + map transform { root = this.map_each(x -> x * this.value) } + root.result = [1, 2].apply("transform") + error: "cannot multiply types" + + - name: "lambda inside map cannot read output" + # V1: inside a map, `root` is a FRESH value, not the outer root. + # The outer root is inaccessible. Reading root.base inside the map gets the fresh map-local root's base (null). 
+ mapping: | + root.base = 10 + map transform { root = this.map_each(x -> x + root.base) } + root.result = [1, 2].apply("transform") + error: "cannot add types" + + - name: "lambda inside map can read map parameter" + # V1 maps have no parameters — the single input is `this`. Reshape to pass an object. + mapping: | + map scale { root = this.items.map_each(x -> x * this.factor) } + root.result = {"items": [1, 2, 3], "factor": 5}.apply("scale") + output: {"result": [5, 10, 15]} + + - name: "lambda inside map can read map local variable" + # V1 maps have their variable env reset on entry; a let set inside the map is visible + # to lambdas within that map body. + mapping: | + map transform { + let offset = 100 + root = this.map_each(x -> x + $offset) + } + root.result = [1, 2, 3].apply("transform") + output: {"result": [101, 102, 103]} diff --git a/internal/bloblang2/migrator/v1spec/tests/lambdas/return_values.yaml b/internal/bloblang2/migrator/v1spec/tests/lambdas/return_values.yaml new file mode 100644 index 000000000..7b734c837 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/lambdas/return_values.yaml @@ -0,0 +1,173 @@ +description: "Lambda return values — void errors, deleted() omission, filter boolean requirement, catch handler semantics" + +tests: + # --- Void in .map() is error --- + # V1 does not have a "void" concept; `if` expressions without else yield `nothing()`, not `null`. + # .map_each treats a `nothing()` return as "leave the element unchanged" (it does not error). + + - name: "map lambda returning void is error" + # V1: an if-without-else returns `nothing()` (not null). `.map_each` treats a `nothing()` + # return as "leave the element unchanged", so the input array is returned verbatim. + mapping: | + root.result = [1, 2, 3].map_each(x -> if x > 10 { x }) + output: {"result": [1, 2, 3]} + + - name: "map lambda void from match is error" + # V1: a non-exhaustive match yields `nothing()` (not null). 
`.map_each` treats `nothing()` + # as "leave the element unchanged", so the input array comes back verbatim. + mapping: | + root.result = [1, 2].map_each(x -> match x { 99 => "found" }) + output: {"result": [1, 2]} + + # --- Void in .filter() is error --- + + - name: "filter lambda returning void is error" + mapping: | + root.result = [1, 2, 3].filter(x -> if x > 10 { true }) + output: {"result": []} + # FIXME-v1: verify — V1 filter requires boolean true to keep an element; null is not true, so all are filtered out. No error is raised. + + # --- Filter requires boolean --- + # V1: "if this query resolves to any value other than a boolean `true` the element will be removed" — no error. + + - name: "filter lambda returning non-boolean is error" + mapping: | + root.result = [1, 2, 3].filter(x -> x * 2) + output: {"result": []} + # FIXME-v1: verify — V1 filter is tolerant: non-true values just cause the element to be dropped, no error + + - name: "filter lambda returning string is error" + mapping: | + root.result = [1, 2, 3].filter(x -> "yes") + output: {"result": []} + # FIXME-v1: verify — V1 filter drops elements for non-true (no error on string) + + - name: "filter lambda returning null is error" + mapping: | + root.result = [1, 2, 3].filter(x -> null) + output: {"result": []} + # FIXME-v1: verify — V1 filter drops elements for non-true (no error on null) + + - name: "filter lambda returning boolean works" + mapping: | + root.result = [1, 2, 3, 4].filter(x -> x % 2 == 0) + output: {"result": [2, 4]} + + # --- deleted() in .map() omits element --- + + - name: "map lambda returning deleted omits element" + mapping: | + root.result = [1, -2, 3, -4].map_each(x -> if x > 0 { x } else { deleted() }) + output: {"result": [1, 3]} + + - name: "map lambda all deleted returns empty array" + mapping: | + root.result = [1, 2, 3].map_each(x -> deleted()) + output: {"result": []} + + - name: "map lambda deleted from match" + mapping: | + root.result = ["a", "bb", "c", 
"dd"].map_each(s -> match { + s.length() > 1 => s.uppercase(), + _ => deleted() + }) + output: {"result": ["BB", "DD"]} + + # --- deleted() in .map_values() omits entry --- + # V1 has no .map_values; use .map_each on the object (which yields {key,value} entries) + # and return deleted() to drop that entry. + + - name: "map_values deleted omits entry" + mapping: | + root.result = {"a": 1, "b": -2, "c": 3}.map_each(item -> if item.value > 0 { item.value } else { deleted() }) + output: {"result": {"a": 1, "c": 3}} + + # --- .catch() handler returning deleted --- + # V1 .catch can take a lambda `err -> ...` where err is the error message string (not an object with .what). + + - name: "catch handler returning deleted removes field" + mapping: | + root.value = "not a number".number().catch(err -> deleted()) + output: {} + + - name: "catch handler returning deleted with prior value" + mapping: | + root.value = "prior" + root.value = "not a number".number().catch(err -> deleted()) + output: {} + + # --- .catch() handler returning void --- + # V1 has no void; `if false { 0 }` returns null. Assigning null is not skipped like V2's nothing(). + # To skip an assignment in V1 use `nothing()` explicitly. + + - name: "catch handler returning void skips assignment" + # V1: if-without-else yields `nothing()`, not null. A `nothing()` RHS causes the entire + # assignment to be skipped, so the prior value is preserved. + mapping: | + root.value = "prior" + root.value = "not a number".number().catch(err -> if false { 0 }) + output: {"value": "prior"} + + - name: "catch handler void on fresh field leaves it absent" + # V1: the assignment is skipped (RHS is `nothing()`), so no `root` is written. + # With no input and no root assignment, the mapping preserves the (null) input. 
+ mapping: | + root.value = "not a number".number().catch(err -> if false { 0 }) + output: null + + # --- .catch() handler normal value --- + + - name: "catch handler provides fallback value" + mapping: | + root.value = "not a number".number().catch(err -> -1) + output: {"value": -1} + # V1 numbers are not typed (no int64 wrapper); -1 is just -1 + + - name: "catch handler can access error message" + # V1: the catch lambda's param is the error MESSAGE STRING (not an object). + mapping: | + root.msg = throw("custom error").catch(err -> err) + output: {"msg": "custom error"} + # FIXME-v1: verify — V1 err is the bare string; may include path prefix like "failed assignment: custom error" + + # --- Void in .fold() is error --- + + - name: "fold lambda returning void is error" + # V1: a `nothing()` return (from if-without-else) propagates through fold — the whole + # `root.result = ...` assignment is skipped. With no other root writes and null input, + # the mapping preserves the input (null). + mapping: | + root.result = [1, 2, 3].fold(0, item -> if item.value > 10 { item.tally + item.value }) + output: null + + # --- Void in .map_values() is error --- + + - name: "map_values lambda returning void is error" + # V1: an if-without-else yields `nothing()`, and `.map_each` on an object treats a + # `nothing()` return as "leave the entry unchanged". So the value survives. + mapping: | + root.result = {"a": 1}.map_each(item -> if item.value > 10 { item.value }) + output: {"result": {"a": 1}} + + # --- Void in .sort_by() is error --- + + - name: "sort_by lambda returning void is error" + mapping: | + root.result = [1, 2].sort_by(x -> if x > 10 { x }) + error: "expected number or string value" # FIXME-v1: verify — V1 sort_by requires strings or numbers; null may error + + # --- .catch() passes through on success --- + + - name: "catch not triggered on success" + # V1 `.number()` on a string returns a float64 (not int64), so expect 42.0 here. 
+ mapping: | + root.value = "42".number().catch(err -> -1) + output: {"value": 42.0} + + # --- deleted() in filter is error (not boolean) --- + + - name: "filter lambda returning deleted is error" + mapping: | + root.result = [1, 2, 3].filter(x -> deleted()) + output: {"result": []} + # FIXME-v1: verify — V1 filter drops elements for non-true; deleted() is sentinel, not bool true, so all dropped (no error) diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/basic.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/basic.yaml new file mode 100644 index 000000000..0e5398e84 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/basic.yaml @@ -0,0 +1,183 @@ +description: "Maps: zero/single/multi parameter definitions, hoisting, return values, duplicate name errors" + +tests: + # --- Zero parameters --- + # V1 maps have no parameter list; the receiver `this` is passed via .apply(). + # For zero-argument-style maps we just ignore `this` inside the body. + + - name: "zero parameter map returns literal object" + mapping: | + map headers { + root = {"content_type": "json"} + } + root.h = {}.apply("headers") + output: {"h": {"content_type": "json"}} + + - name: "zero parameter map returns literal string" + mapping: | + map greeting { + root = "hello world" + } + root.msg = {}.apply("greeting") + output: {"msg": "hello world"} + + - name: "zero parameter map returns literal integer" + mapping: | + map answer { + root = 42 + } + root.v = {}.apply("answer") + output: {"v": 42} + + - name: "zero parameter map called multiple times" + mapping: | + map tag { + root = "v1" + } + root.a = {}.apply("tag") + root.b = {}.apply("tag") + output: {"a": "v1", "b": "v1"} + + # --- Single parameter --- + # V1 style: the single argument is `this` inside the map body. 
+ + - name: "single parameter map" + mapping: | + map double { + root = this * 2 + } + root.v = 5.apply("double") + output: {"v": 10} + + - name: "single parameter map with string" + mapping: | + map shout { + root = this.uppercase() + } + root.v = "hello".apply("shout") + output: {"v": "HELLO"} + + - name: "single parameter map with object access" + mapping: | + map get_name { + root = this.name + } + root.v = {"name": "Alice", "age": 30}.apply("get_name") + output: {"v": "Alice"} + + # --- Multiple parameters --- + # V1 has no parameter list; pass an object as the receiver. + + - name: "two parameter map" + mapping: | + map add { + root = this.a + this.b + } + root.v = {"a": 3, "b": 7}.apply("add") + output: {"v": 10} + + - name: "three parameter map" + mapping: | + map calc { + root = this.x + this.y * this.z + } + root.v = {"x": 1, "y": 2, "z": 3}.apply("calc") + output: {"v": 7} + + - name: "map with variables in body" + mapping: | + map total { + let tax = this.subtotal * this.tax_rate + root = this.subtotal + $tax + } + root.v = {"subtotal": 100, "tax_rate": 0.1}.apply("total") + output: {"v": 110.0} + + - name: "map with multiple variables in body" + mapping: | + map combine { + let ab = this.a + this.b + let abc = $ab + this.c + root = $abc + } + root.v = {"a": "hello", "b": " ", "c": "world"}.apply("combine") + output: {"v": "hello world"} + + # --- Return value semantics --- + + - name: "map returns last expression value" + mapping: | + map last_wins { + let unused = this + 1 + root = this * 10 + } + root.v = 5.apply("last_wins") + output: {"v": 50} + + - name: "map returns array" + mapping: | + map wrap { + root = [this, this, this] + } + root.v = 7.apply("wrap") + output: {"v": [7, 7, 7]} + + - name: "map returns null" + mapping: | + map nothing_map { + root = null + } + root.v = 42.apply("nothing_map") + output: {"v": null} + + - name: "map returns boolean" + mapping: | + map is_positive { + root = this > 0 + } + root.v = 5.apply("is_positive") + 
output: {"v": true} + + # --- Hoisting --- + # V1 hoists map names — map can be .apply()'d before its definition appears in the file. + + - name: "map called before declaration" + mapping: | + root.v = 21.apply("double") + map double { + root = this * 2 + } + output: {"v": 42} + + - name: "map called before declaration with multiple maps" + mapping: | + root.v = {"a": 2.apply("triple"), "b": 1}.apply("add") + map triple { root = this * 3 } + map add { root = this.a + this.b } + output: {"v": 7} + + # --- Duplicate map name --- + + - name: "duplicate map name in same file is compile error" + # V1 wording is "map name collision". + mapping: | + map foo { root = this } + map foo { root = this * 2 } + root.v = 1.apply("foo") + compile_error: "map name collision" + + # --- Parameter is read-only --- + + - name: "assigning to parameter is compile error" + skip: "V1 maps have no named parameters — the receiver is `this`, which is never an assignment target" + + # --- Arity errors --- + + - name: "too few positional arguments is error" + skip: "V1 .apply() takes a single receiver value and no argument list — there is no arity check" + + - name: "too many positional arguments is error" + skip: "V1 .apply() takes a single receiver value and no argument list — there is no arity check" + + - name: "zero args to parameterised map is error" + skip: "V1 .apply() always takes a receiver — the zero-arg case is expressed by ignoring `this` in the body" diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/defaults.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/defaults.yaml new file mode 100644 index 000000000..35559b64e --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/defaults.yaml @@ -0,0 +1,167 @@ +description: "Default parameter values: literal defaults, positional/named omission, non-literal default errors, dynamic defaults pattern" + +# V1 maps have no parameter list and therefore no default values. 
+# The idiomatic V1 equivalent is to pass an object and use `.or(default)` on each field inside the map body. +# Most of these tests probe V2-only default-value syntax and are skipped. + +tests: + # --- Basic default values --- + + - name: "single default parameter omitted uses default" + mapping: | + map greet { + root = this.greeting.or("Hello") + ", " + this.name + } + root.v = {"name": "Alice"}.apply("greet") + output: {"v": "Hello, Alice"} + + - name: "single default parameter provided overrides default" + mapping: | + map greet { + root = this.greeting.or("Hello") + ", " + this.name + } + root.v = {"name": "Alice", "greeting": "Hi"}.apply("greet") + output: {"v": "Hi, Alice"} + + - name: "multiple default parameters all omitted" + # V1 `.round()` takes no arguments (rounds to nearest whole number). For decimal formatting, + # build a format string and use `.format()` instead. + mapping: | + map fmt { + let currency = this.currency.or("USD") + let decimals = this.decimals.or(2) + let fmt_str = "%." + $decimals.string() + "f" + root = $currency + " " + $fmt_str.format(this.amount) + } + root.v = {"amount": 99.99}.apply("fmt") + output: {"v": "USD 99.99"} + + - name: "multiple defaults first provided second omitted" + mapping: | + map fmt { + let currency = this.currency.or("USD") + let decimals = this.decimals.or(2) + let fmt_str = "%." + $decimals.string() + "f" + root = $currency + " " + $fmt_str.format(this.amount) + } + root.v = {"amount": 99.99, "currency": "EUR"}.apply("fmt") + output: {"v": "EUR 99.99"} + + - name: "multiple defaults all provided" + # With `%.0f`, Go's fmt produces "100" (no trailing .0), so the expected string lacks the `.0`. + mapping: | + map fmt { + let currency = this.currency.or("USD") + let decimals = this.decimals.or(2) + let fmt_str = "%." 
+ $decimals.string() + "f" + root = $currency + " " + $fmt_str.format(this.amount) + } + root.v = {"amount": 99.999, "currency": "EUR", "decimals": 0}.apply("fmt") + output: {"v": "EUR 100"} + + # --- Default value types --- + + - name: "default integer literal" + mapping: | + map inc { + root = this.x + this.step.or(1) + } + root.v = {"x": 10}.apply("inc") + output: {"v": 11} + + - name: "default string literal" + mapping: | + map tag { + root = this.prefix.or("tag") + ":" + this.value + } + root.v = {"value": "foo"}.apply("tag") + output: {"v": "tag:foo"} + + - name: "default true literal" + mapping: | + map check { + let strict = this.strict.or(true) + root = if $strict { this.x > 0 } else { this.x >= 0 } + } + root.v = {"x": 0}.apply("check") + output: {"v": false} + + - name: "default false literal" + mapping: | + map check { + let lenient = this.lenient.or(false) + root = if $lenient { true } else { this.x > 0 } + } + root.v = {"x": -1}.apply("check") + output: {"v": false} + + - name: "default null literal" + mapping: | + map maybe { + # V1: .or(fallback) replaces null or missing. `this.x` being null yields null; + # to apply a real default the caller passes a value directly or we use the provided fallback. + root = this.x.or(this.fallback) + } + root.v = {"x": null, "fallback": "backup"}.apply("maybe") + output: {"v": "backup"} + + # --- Named arguments with defaults --- + # V1 has no named-argument call form for maps. 
+ + - name: "named args skip middle optional parameter" + skip: "V1 maps have no parameter list — there are no named arguments at the call site" + + - name: "named args provide only required" + skip: "V1 maps have no parameter list — there are no named arguments at the call site" + + - name: "named args override all defaults" + skip: "V1 maps have no parameter list — there are no named arguments at the call site" + + # --- Non-literal defaults are compile errors --- + # V1 has no default-value syntax, so these compile-error cases don't translate. + + - name: "expression as default value is compile error" + skip: "V1 maps have no default-value syntax — the non-literal-default restriction has no V1 analogue" + + - name: "function call as default value is compile error" + skip: "V1 maps have no default-value syntax — the non-literal-default restriction has no V1 analogue" + + - name: "variable reference as default value is compile error" + skip: "V1 maps have no default-value syntax — the non-literal-default restriction has no V1 analogue" + + - name: "parameter reference as default value is compile error" + skip: "V1 maps have no default-value syntax — the non-literal-default restriction has no V1 analogue" + + # --- Default before required is compile error --- + + - name: "default parameter before required parameter is compile error" + skip: "V1 maps have no parameter list — required/optional ordering has no V1 analogue" + + # --- Dynamic defaults pattern --- + + - name: "dynamic default with null and or" + mapping: | + map connect { + let p = this.port.or(if this.host.has_prefix("https") { 443 } else { 80 }) + root = this.host + ":" + $p.string() + } + root.v = {"host": "https://example.com"}.apply("connect") + output: {"v": "https://example.com:443"} + + - name: "dynamic default overridden by caller" + mapping: | + map connect { + let p = this.port.or(if this.host.has_prefix("https") { 443 } else { 80 }) + root = this.host + ":" + $p.string() + } + root.v = 
{"host": "http://example.com", "port": 8080}.apply("connect") + output: {"v": "http://example.com:8080"} + + - name: "dynamic default with http fallback" + mapping: | + map connect { + let p = this.port.or(if this.host.has_prefix("https") { 443 } else { 80 }) + root = this.host + ":" + $p.string() + } + root.v = {"host": "http://example.com"}.apply("connect") + output: {"v": "http://example.com:80"} diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/discard_params.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/discard_params.yaml new file mode 100644 index 000000000..e22ddb868 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/discard_params.yaml @@ -0,0 +1,68 @@ +description: "Discard parameters (_): ignoring arguments, referencing _ error, multiple _, no defaults, named call restriction" + +# V1 maps take a single receiver (`this`) rather than a parameter list. There are no positional +# arguments to discard, no `_` placeholder, and no named-argument call form. All discard-parameter +# semantics are V2-only and cannot be ported. 
+ +tests: + # --- Basic discard --- + + - name: "single discard parameter ignores first arg" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + - name: "discard last parameter" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + - name: "discard middle parameter" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + # --- Multiple discards --- + + - name: "multiple discard parameters" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + - name: "all parameters discarded except one" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + - name: "all parameters discarded" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + # --- Referencing _ is compile error --- + + - name: "referencing _ in body is compile error" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + - name: "referencing _ as method target in body is compile error" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + - name: "referencing _ in variable declaration is compile error" + skip: "V1 maps have no parameter list — there is no `_` discard placeholder" + + # --- Discard with defaults is compile error --- + + - name: "discard parameter with default value is compile error" + skip: "V1 maps have no parameter list, defaults, or `_` — the combination is not representable" + + - name: "discard after default parameters is compile error" + skip: "V1 maps have no parameter list, defaults, or `_` — the combination is not representable" + + # --- Named call to map with discard is compile error --- + + - name: "named call to map with discard parameter is compile error" + skip: "V1 maps are invoked via .apply(receiver) — there is no named-argument call form" + + - name: "named call to map with multiple discards is compile error" + skip: "V1 maps are invoked via 
.apply(receiver) — there is no named-argument call form" + + # --- Positional call still works --- + + - name: "positional call to map with discard works normally" + skip: "V1 maps have no parameter list — the V2 positional-with-discard call form is not representable" + + # --- Arity still enforced with discards --- + + - name: "too few args with discard is error" + skip: "V1 maps have no parameter list — arity is not checked at the call site" + + - name: "too many args with discard is error" + skip: "V1 maps have no parameter list — arity is not checked at the call site" diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/higher_order.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/higher_order.yaml new file mode 100644 index 000000000..38bd2b50b --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/higher_order.yaml @@ -0,0 +1,125 @@ +description: "Higher-order maps: maps as arguments to .map()/.filter()/.sort_by(), store-in-variable error, bare-name error" + +# V1 has no first-class map references — a named map cannot be passed by bare identifier to +# .map_each()/.filter()/.sort_by(). The idiomatic V1 equivalent is an explicit lambda that +# calls .apply("name") on the element: `.map_each(v -> v.apply("name"))`. +# The V2 bare-map-name forms (`.map(double)`, etc.) are not representable in V1, so tests +# that rely on that syntax are skipped. 
+ +tests: + # --- Maps as argument to .map() --- + + - name: "map name passed to .map() method" + mapping: | + map double { + root = this * 2 + } + root.v = [1, 2, 3].map_each(v -> v.apply("double")) + output: {"v": [2, 4, 6]} + + - name: "map name passed to .map() is same as lambda" + mapping: | + map inc { + root = this + 1 + } + root.direct = [10, 20].map_each(v -> v.apply("inc")) + root.lambda = [10, 20].map_each(v -> v.apply("inc")) + output: {"direct": [11, 21], "lambda": [11, 21]} + + - name: "map with string transformation passed to .map()" + mapping: | + map shout { + root = this.uppercase() + } + root.v = ["hello", "world"].map_each(v -> v.apply("shout")) + output: {"v": ["HELLO", "WORLD"]} + + - name: "map returning object passed to .map()" + mapping: | + map wrap { + root = {"value": this} + } + root.v = [1, 2].map_each(v -> v.apply("wrap")) + output: {"v": [{"value": 1}, {"value": 2}]} + + - name: "map passed to .map() from input" + input: {"items": [5, 10, 15]} + mapping: | + map halve { + root = this / 2 + } + root.v = this.items.map_each(v -> v.apply("halve")) + output: {"v": [2.5, 5.0, 7.5]} + + # --- Maps as argument to .filter() --- + + - name: "map name passed to .filter() method" + mapping: | + map is_positive { + root = this > 0 + } + root.v = [-1, 0, 1, 2, -3].filter(v -> v.apply("is_positive")) + output: {"v": [1, 2]} + + - name: "map name passed to .filter() with strings" + mapping: | + map is_long { + root = this.length() > 3 + } + root.v = ["hi", "hello", "yo", "world"].filter(v -> v.apply("is_long")) + output: {"v": ["hello", "world"]} + + # --- Maps as argument to .sort_by() --- + + - name: "map name passed to .sort_by() method" + skip: "V1 has no .sort_by() method; use .sort(left, right -> left.x < right.x) with inline field access instead" + + # --- Chaining higher-order calls --- + + - name: "chained .filter() and .map() with map names" + mapping: | + map is_even { + root = this % 2 == 0 + } + map double { + root = this * 2 + } + 
root.v = [1, 2, 3, 4, 5].filter(v -> v.apply("is_even")).map_each(v -> v.apply("double")) + output: {"v": [4, 8]} + + # --- Map defined after use (hoisting with higher-order) --- + + - name: "map name used in .map() before declaration" + mapping: | + root.v = [1, 2, 3].map_each(v -> v.apply("triple")) + map triple { + root = this * 3 + } + output: {"v": [3, 6, 9]} + + # --- Cannot store map reference in variable --- + + - name: "storing map name in variable is compile error" + skip: "V1 has no first-class map references — a bare map name is not an expression, so this restriction has no V1 analogue" + + - name: "storing map name in variable without calling is compile error" + skip: "V1 has no first-class map references — a bare map name is not an expression" + + # --- Cannot use bare map name as expression --- + + - name: "bare map name assigned to output is compile error" + skip: "V1 has no first-class map references — a bare map name is not an expression" + + - name: "bare map name in array literal is compile error" + skip: "V1 has no first-class map references — a bare map name is not an expression" + + - name: "bare map name in object value is compile error" + skip: "V1 has no first-class map references — a bare map name is not an expression" + + # --- Only single-param maps work with higher-order (arity mismatch) --- + + - name: "multi-param map passed to .map() is error" + skip: "V1 maps have no parameter list — there is no arity check when a map is invoked via .apply()" + + - name: "zero-param map passed to .map() is error" + skip: "V1 maps have no parameter list — there is no arity check when a map is invoked via .apply()" diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/isolation.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/isolation.yaml new file mode 100644 index 000000000..5f1a6ad88 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/isolation.yaml @@ -0,0 +1,141 @@ +description: "Map isolation: no access to input, output, 
or top-level $variables from map body; lambdas inside maps also isolated" + +# V1 map isolation rules differ substantially from V2: +# - V1 has no `input` keyword — the top-level message is `this`, which inside a map body +# is rebound to the receiver. The outer message is simply inaccessible from inside a map. +# - V1 has no `output` keyword — writes go to `root`, and inside a map `root` is a fresh +# value scoped to the map. The outer `root` is not accessible from inside a map. +# - V1 variables ARE cleared at `.apply()` boundaries — a top-level `$global` is NOT visible +# inside the map body; accessing it yields unset (`null` on read, error if depended on). +# - V1 does not flag access-to-outer as a compile error; it silently sees a reset env. +# These V2 compile-error cases therefore have no V1 analogue and are skipped. + +tests: + # --- Cannot access input --- + + - name: "accessing input in map body is compile error" + skip: "V1 has no `input` keyword — `this` is rebound to the receiver inside the map; the outer message is simply not accessible" + + - name: "accessing input directly in map body is compile error" + skip: "V1 has no `input` keyword — `this` is rebound to the receiver inside the map" + + # --- Cannot access output --- + + - name: "accessing output in map body is compile error" + skip: "V1 has no `output` keyword — `root` inside the map is a fresh value scoped to the map; the outer `root` is not accessible" + + - name: "assigning to output in map body is compile error" + skip: "V1 has no `output` keyword — writes inside the map go to the map's own fresh `root`" + + # --- Cannot access top-level variables --- + # V1 clears the $var environment on .apply() entry, so a top-level $global is not visible inside the map. + # V1 does not raise a compile-time undeclared-variable error for this — reading an unset variable + # is a runtime condition. These are adjusted to runtime behaviour (map evaluates $global as unset). 
+ + - name: "accessing top-level variable in map body is compile error" + # V1's error wording reports the bare name (without the `$` sigil). + mapping: | + let global = 100 + map bad { + root = $global + this + } + root.v = 1.apply("bad") + error: "variable 'global' undefined" + + - name: "top-level variable with same name not accessible in map" + mapping: | + let x = 99 + map get { + root = $x + } + root.v = 1.apply("get") + error: "variable 'x' undefined" + + # --- Local variables inside map are fine --- + + - name: "local variables declared inside map work" + mapping: | + map compute { + let doubled = this * 2 + root = $doubled + 1 + } + root.v = 5.apply("compute") + output: {"v": 11} + + - name: "multiple local variables inside map work" + mapping: | + map transform { + let sum = this.a + this.b + let product = this.a * this.b + root = {"sum": $sum, "product": $product} + } + root = {"a": 3, "b": 4}.apply("transform") + output: {"sum": 7, "product": 12} + + # --- Parameters are accessible --- + + - name: "parameters are accessible by name" + mapping: | + map echo { + root = this + } + root.v = "test".apply("echo") + output: {"v": "test"} + + - name: "all parameters accessible in multi-param map" + mapping: | + map triple { + root = [this.a, this.b, this.c] + } + root.v = {"a": 1, "b": 2, "c": 3}.apply("triple") + output: {"v": [1, 2, 3]} + + # --- Lambdas inside maps are also isolated --- + + - name: "lambda inside map cannot access input" + skip: "V1 has no `input` keyword — inside a lambda nested in a map, `this` is the outer (map's) receiver, not the message" + + - name: "lambda inside map cannot access output" + skip: "V1 has no `output` keyword — writes go to `root`, which inside a map is the map-scoped fresh value" + + - name: "lambda inside map cannot access top-level variable" + # V1's error wording reports the bare name (without the `$` sigil). 
+ mapping: | + let factor = 10 + map scale { + root = this.map_each(x -> x * $factor) + } + root.v = [1, 2, 3].apply("scale") + error: "variable 'factor' undefined" + + - name: "lambda inside map can access map parameters" + mapping: | + map scale { + # `this` is captured in a var because lambda bodies pop the context stack — + # inside `x -> ...` the bare `this` reverts to the outer context (the map's receiver's parent). + let factor = this.factor + root = this.items.map_each(x -> x * $factor) + } + root.v = {"items": [1, 2, 3], "factor": 10}.apply("scale") + output: {"v": [10, 20, 30]} + + - name: "lambda inside map can access map local variables" + mapping: | + map process { + let prefix = "item" + root = this.map_each(x -> $prefix + ":" + x) + } + root.v = ["a", "b"].apply("process") + output: {"v": ["item:a", "item:b"]} + + # --- Passing external context as parameter works --- + + - name: "pass input data as parameter to map" + input: {"items": [1, 2, 3], "multiplier": 10} + mapping: | + map scale { + let multiplier = this.multiplier + root = this.items.map_each(x -> x * $multiplier) + } + root.v = this.apply("scale") + output: {"v": [10, 20, 30]} diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/named_args.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/named_args.yaml new file mode 100644 index 000000000..45e2b9c81 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/named_args.yaml @@ -0,0 +1,79 @@ +description: "Named arguments: syntax, order independence, mixing error, duplicate error, unknown arg error" + +# V1 maps are invoked with `.apply("name")`, which takes a single receiver value and no +# argument list. There is no named-argument call form, so every V2 test here probes a +# feature that has no V1 analogue. All entries are skipped. 
+ +tests: + # --- Basic named argument syntax --- + + - name: "single named argument" + skip: "V1 maps have no parameter list — .apply() takes a single receiver, not named args" + + - name: "two named arguments" + skip: "V1 maps have no parameter list — .apply() takes a single receiver, not named args" + + - name: "three named arguments" + skip: "V1 maps have no parameter list — .apply() takes a single receiver, not named args" + + # --- Order independence --- + + - name: "named arguments in reverse order" + skip: "V1 maps have no parameter list — .apply() takes a single receiver, not named args" + + - name: "named arguments in arbitrary order" + skip: "V1 maps have no parameter list — .apply() takes a single receiver, not named args" + + - name: "named arguments shuffled with three params" + skip: "V1 maps have no parameter list — .apply() takes a single receiver, not named args" + + # --- Named with defaults --- + + - name: "named args with some defaults omitted" + skip: "V1 maps have no parameter list, no defaults, and no named-argument call form" + + - name: "named args override specific default" + skip: "V1 maps have no parameter list, no defaults, and no named-argument call form" + + - name: "named args override all defaults" + skip: "V1 maps have no parameter list, no defaults, and no named-argument call form" + + # --- Mixing positional and named is compile error --- + + - name: "mixing positional and named arguments is compile error" + skip: "V1 maps have no parameter list — the mixing restriction has no V1 analogue" + + - name: "mixing named then positional is compile error" + skip: "V1 maps have no parameter list — the mixing restriction has no V1 analogue" + + # --- Duplicate named argument is compile error --- + + - name: "duplicate named argument is compile error" + skip: "V1 maps have no parameter list — the duplicate-name restriction has no V1 analogue" + + - name: "duplicate named argument second param is compile error" + skip: "V1 maps have no 
parameter list — the duplicate-name restriction has no V1 analogue" + + # --- Unknown named argument is error --- + + - name: "unknown named argument is error" + skip: "V1 maps have no parameter list — the unknown-name restriction has no V1 analogue" + + - name: "all unknown named arguments is error" + skip: "V1 maps have no parameter list — the unknown-name restriction has no V1 analogue" + + # --- Missing required named argument is error --- + + - name: "missing required named argument is error" + skip: "V1 maps have no parameter list — arity is not checked at the call site" + + - name: "missing all required named arguments is error" + skip: "V1 maps have no parameter list — arity is not checked at the call site" + + # --- Named arguments with expressions --- + + - name: "named argument values can be expressions" + skip: "V1 maps have no parameter list — there are no named arguments at the call site" + + - name: "named argument values from input" + skip: "V1 maps have no parameter list — there are no named arguments at the call site" diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/parameter_shadowing.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/parameter_shadowing.yaml new file mode 100644 index 000000000..4aeeaffdc --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/parameter_shadowing.yaml @@ -0,0 +1,66 @@ +description: > + Parameter shadowing — parameter names shadow map names within the map + body. The parameter always wins. Lambda parameters can also shadow. + +# V1 maps have no named parameters — the receiver is `this`. There is no parameter-versus-map-name +# shadowing to test. Lambda-parameter shadowing does still apply in V1, so those cases are translated. 
+ +tests: + # --- Parameter shadows map name --- + + - name: "parameter with same name as another map shadows it" + skip: "V1 maps have no named parameters — there is no parameter-versus-map-name shadowing" + + - name: "shadowed map name not callable in map body" + skip: "V1 maps have no named parameters — there is no parameter-versus-map-name shadowing" + + - name: "non-shadowed maps still callable" + skip: "V1 maps have no named parameters — there is no parameter-versus-map-name shadowing" + + - name: "parameter shadows map but map callable outside" + skip: "V1 maps have no named parameters — there is no parameter-versus-map-name shadowing" + + # --- Lambda parameter shadows outer variable --- + + - name: "lambda parameter shadows outer variable" + mapping: | + let x = 100 + root.v = [1, 2, 3].map_each(x -> x * 2) + output: {"v": [2, 4, 6]} + + - name: "lambda parameter shadows map parameter" + mapping: | + map process { + # The V2 test passes 999 as `x`; the equivalent in V1 is to ignore `this`. + root = [1, 2, 3].map_each(x -> x * 10) + } + root.v = 999.apply("process") + output: {"v": [10, 20, 30]} + + - name: "nested lambda parameters shadow each level" + # V1 rejects named-parameter collisions between nested lambdas in the same chain + # ("would shadow a parent context") — so the V2 "inner x shadows outer x" form + # is a V1 compile error. Rename the inner lambda parameter to y. 
+ mapping: | + root.v = [1, 2].map_each(x -> [10, 20].map_each(y -> y * 100)) + output: {"v": [[1000, 2000], [1000, 2000]]} + + # --- Parameter shadow does not affect caller --- + + - name: "shadowing does not leak to caller" + skip: "V1 maps have no named parameters — there is no parameter-versus-map-name shadowing" + + # --- Discard parameter does not shadow --- + + - name: "discard parameter does not shadow anything" + skip: "V1 maps have no parameter list and no `_` discard placeholder" + + # --- Multiple parameters, one shadows --- + + - name: "one parameter shadows map, other does not" + skip: "V1 maps have no named parameters — there is no parameter-versus-map-name shadowing" + + # --- Parameter read-only --- + + - name: "parameter is read-only in map body" + skip: "V1 maps have no named parameters — `this` (the receiver) is not an assignment target anywhere" diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/recursion.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/recursion.yaml new file mode 100644 index 000000000..8c330b607 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/recursion.yaml @@ -0,0 +1,158 @@ +description: "Recursion: self-recursion, mutual recursion, depth limits, uncatchable recursion errors" + +# V1 maps recurse via `.apply("name")` inside the map body. The receiver carries +# the "parameter" value, so where V2 passes (n-1), V1 passes `(this - 1)` or an object literal +# capturing multiple named fields. Environment.WithMaxMapRecursion(n) bounds depth. 
+ +tests: + # --- Self recursion --- + + - name: "simple self recursion with base case" + mapping: | + map factorial { + root = if this <= 1 { 1 } else { this * (this - 1).apply("factorial") } + } + root.v = 5.apply("factorial") + output: {"v": 120} + + - name: "self recursion base case returns immediately" + mapping: | + map factorial { + root = if this <= 1 { 1 } else { this * (this - 1).apply("factorial") } + } + root.v = 0.apply("factorial") + output: {"v": 1} + + - name: "recursive countdown to build array" + # V1's `.concat()` lives in `internal/impl/pure` and isn't loaded in the migrator's bare + # environment. Use `.merge()` instead — it joins two arrays into one. + mapping: | + map countdown { + root = if this <= 0 { [] } else { [this].merge((this - 1).apply("countdown")) } + } + root.v = 5.apply("countdown") + output: {"v": [5, 4, 3, 2, 1]} + + - name: "recursive sum of array" + # V2 indexes with items[idx]; V1 uses .index(idx). + mapping: | + map sum_list { + root = if this.idx >= this.items.length() { + 0 + } else { + this.items.index(this.idx) + {"items": this.items, "idx": this.idx + 1}.apply("sum_list") + } + } + root.v = {"items": [10, 20, 30], "idx": 0}.apply("sum_list") + output: {"v": 60} + + - name: "recursive string repeat" + mapping: | + map repeat { + root = if this.n <= 0 { "" } else { this.s + {"s": this.s, "n": this.n - 1}.apply("repeat") } + } + root.v = {"s": "ab", "n": 3}.apply("repeat") + output: {"v": "ababab"} + + # --- Mutual recursion --- + + - name: "mutual recursion is_even and is_odd" + mapping: | + map is_even { + root = if this == 0 { true } else { (this - 1).apply("is_odd") } + } + map is_odd { + root = if this == 0 { false } else { (this - 1).apply("is_even") } + } + root.even = 4.apply("is_even") + root.odd = 3.apply("is_odd") + output: {"even": true, "odd": true} + + - name: "mutual recursion with larger values" + mapping: | + map is_even { + root = if this == 0 { true } else { (this - 1).apply("is_odd") } + } + map is_odd 
{ + root = if this == 0 { false } else { (this - 1).apply("is_even") } + } + root.v = 100.apply("is_even") + output: {"v": true} + + - name: "mutual recursion called before declaration" + mapping: | + root.v = 5.apply("ping") + map ping { + root = if this <= 0 { "done" } else { (this - 1).apply("pong") } + } + map pong { + root = if this <= 0 { "done" } else { (this - 1).apply("ping") } + } + output: {"v": "done"} + + # --- Recursion depth support (at least 1000) --- + + - name: "recursion supports 1000 levels deep" + mapping: | + map deep { + root = if this <= 0 { 0 } else { 1 + (this - 1).apply("deep") } + } + root.v = 1000.apply("deep") + output: {"v": 1000} # FIXME-v1: verify — V1 default max-recursion depth is environment-dependent; host may need WithMaxMapRecursion(>=1000) + + # --- Recursion depth limit exceeded --- + + - name: "exceeding recursion limit is runtime error" + mapping: | + map infinite { + root = 1 + this.apply("infinite") + } + root.v = 0.apply("infinite") + error: "recursion" # FIXME-v1: verify exact error substring + + - name: "mutual recursion exceeding limit is runtime error" + mapping: | + map ping { root = this.apply("pong") } + map pong { root = this.apply("ping") } + root.v = 0.apply("ping") + error: "recursion" # FIXME-v1: verify exact error substring + + # --- Recursion limit error can be caught in V1 --- + # NOTE: V1 .catch takes an expression fallback, not a lambda. The V2 test uses `err -> "recovered"` + # which is not the V1 form. V1 has no way to destructure the caught error into a lambda arg directly. + + - name: "recursion limit error can be caught with catch" + # V1 treats the recursion-limit error as a normal runtime error, so `.catch()` does recover. + # (This diverges from V2, which makes recursion limits uncatchable.)
+ mapping: | + map infinite { + root = 1 + this.apply("infinite") + } + root.v = 0.apply("infinite").catch("recovered") + output: {"v": "recovered"} + + - name: "recursion limit in nested call can be caught" + # V1 `.catch()` catches the recursion error just like any other runtime error. + mapping: | + map infinite { + root = this.apply("infinite") + } + map wrapper { + root = this.apply("infinite").catch("safe") + } + root.v = 0.apply("wrapper") + output: {"v": "safe"} + + # --- Recursion with match expression --- + + - name: "recursive map using match" + mapping: | + map fib { + root = match { + this <= 0 => 0, + this == 1 => 1, + _ => (this - 1).apply("fib") + (this - 2).apply("fib"), + } + } + root.v = 10.apply("fib") + output: {"v": 55} diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/recursion_advanced.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/recursion_advanced.yaml new file mode 100644 index 000000000..2233d829d --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/recursion_advanced.yaml @@ -0,0 +1,173 @@ +description: > + Advanced recursion patterns — mutual recursion with accumulator, + recursion through match, recursive data structure processing, + recursion with lambdas, and recursion limit interaction with + map composition. + +# V1 maps recurse via `.apply("name")`. Multi-argument recursion is encoded by +# passing an object receiver with named fields. Some V2 patterns (fold with a two-step +# `(acc, item) ->` lambda, object-literal-with-computed-key assignments inside a lambda, +# `.iter()` / `.map_values()`) have V1 analogues that differ in spelling. 
+ +tests: + # --- Mutual recursion with accumulator --- + + - name: "mutual recursion with accumulator" + mapping: | + map collatz_steps { + root = if this.n == 1 { + this.count + } else { + this.apply("collatz_next") + } + } + map collatz_next { + root = if this.n % 2 == 0 { + {"n": this.n / 2, "count": this.count + 1}.apply("collatz_steps") + } else { + {"n": this.n * 3 + 1, "count": this.count + 1}.apply("collatz_steps") + } + } + root.steps = {"n": 6, "count": 0}.apply("collatz_steps") + output: {"steps": 8} # FIXME-v1: verify — V1 integer division of `this.n / 2` produces a float (V1 `/` is always float); "n": this.n / 2 may yield 3.0 not 3, which could affect comparisons + + # --- Recursive data structure traversal --- + + - name: "recursive tree depth calculation" + # V1 does not allow `let` inside the `{ ... }` block of an `if`-as-expression (the braces + # after `=` parse as an expression block, not a statement block). Inline the recursive calls + # instead. + mapping: | + map depth { + root = if this == null { 0 } else if this.left.apply("depth") > this.right.apply("depth") { 1 + this.left.apply("depth") } else { 1 + this.right.apply("depth") } + } + let tree = { + "val": 1, + "left": {"val": 2, "left": {"val": 4, "left": null, "right": null}, "right": null}, + "right": {"val": 3, "left": null, "right": null} + } + root.d = $tree.apply("depth") + output: {"d": 3} + + - name: "recursive tree node count" + mapping: | + map count_nodes { + root = if this == null { 0 } else { + 1 + this.left.apply("count_nodes") + this.right.apply("count_nodes") + } + } + let tree = { + "val": 1, + "left": {"val": 2, "left": null, "right": null}, + "right": {"val": 3, "left": null, "right": {"val": 4, "left": null, "right": null}} + } + root.count = $tree.apply("count_nodes") + output: {"count": 4} + + # --- Recursion through match --- + + - name: "recursive map using match with multiple arms" + mapping: | + map gcd { + root = match { + this.b == 0 => this.a, + _ => {"a": 
this.b, "b": this.a % this.b}.apply("gcd"), + } + } + root.v = {"a": 48, "b": 18}.apply("gcd") + output: {"v": 6} + + - name: "recursive map using equality match" + mapping: | + map describe { + root = match this { + 0 => "zero", + 1 => "one", + _ => (this - 1).apply("describe") + "+", + } + } + root.v = 3.apply("describe") + output: {"v": "one++"} + + # --- Recursion with iterators --- + + - name: "recursive map called from within lambda" + # V1 fold is NOT curried — its lambda takes a single item parameter which is an object + # `{tally, value}`. Lambda body must start on the same line as `->`. + mapping: | + map sum_nested { + root = this.fold(0, item -> item.tally + (if item.value.type() == "array" { item.value.apply("sum_nested") } else { item.value })) + } + root.v = [1, [2, 3], [4, [5, 6]]].apply("sum_nested") + output: {"v": 21} + + # --- Recursion with local variables --- + + - name: "recursive map with local variables per frame" + mapping: | + map sum_tree { + let val = this.value + let left = if this.left != null { this.left.apply("sum_tree") } else { 0 } + let right = if this.right != null { this.right.apply("sum_tree") } else { 0 } + root = $val + $left + $right + } + let tree = { + "value": 10, + "left": {"value": 20, "left": null, "right": null}, + "right": {"value": 30, "left": {"value": 5, "left": null, "right": null}, "right": null} + } + root.total = $tree.apply("sum_tree") + output: {"total": 65} + + - name: "recursive flatten with fold and dynamic keys" + skip: "V2-only patterns combined here: .iter() yields {key,value} pairs in V1 (so available), but the V2 test uses imperative `$acc[$key] = ...` mutation inside a lambda body, which V1 does not support — V1 lambda bodies are single expressions and there is no bracket-indexing assignment. 
A full V1 rewrite would require .merge() and is a substantive behaviour change, not a mechanical translation" + + # --- Mutual recursion depth limit --- + + - name: "three-way mutual recursion works within depth limit" + mapping: | + map a { root = if this <= 0 { "done" } else { (this - 1).apply("b") } } + map b { root = if this <= 0 { "done" } else { (this - 1).apply("c") } } + map c { root = if this <= 0 { "done" } else { (this - 1).apply("a") } } + root.v = 9.apply("a") + output: {"v": "done"} + + - name: "three-way mutual recursion exceeds limit" + mapping: | + map a { root = this.apply("b") } + map b { root = this.apply("c") } + map c { root = this.apply("a") } + root.v = 0.apply("a") + error: "recursion" # FIXME-v1: verify exact error substring + + # --- Recursion limit can be caught in V1, even deeply nested --- + + - name: "recursion limit from mutual recursion is catchable" + # V1 `.catch()` DOES catch a recursion-limit error (unlike V2). + mapping: | + map a { root = this.apply("b") } + map b { root = this.apply("a") } + root.v = 0.apply("a").catch("safe") + output: {"v": "safe"} + + - name: "recursion limit in lambda context is catchable" + # V1 `.catch()` catches the recursion error from inside a lambda as well.
+ mapping: | + map infinite { root = this.apply("infinite") } + root.v = [1].map_each(x -> x.apply("infinite")).catch("safe") + output: {"v": "safe"} + + # --- Recursion called before declaration (hoisting) --- + + - name: "map hoisting allows forward reference" + mapping: | + root.v = 5.apply("double") + map double { root = this * 2 } + output: {"v": 10} + + - name: "mutual recursion with both maps declared after use" + mapping: | + root.v = 6.apply("a") + map a { root = if this <= 0 { true } else { (this - 1).apply("b") } } + map b { root = if this <= 0 { false } else { (this - 1).apply("a") } } + output: {"v": true} diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/recursive_with_iterators.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/recursive_with_iterators.yaml new file mode 100644 index 000000000..e6b5fa61d --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/recursive_with_iterators.yaml @@ -0,0 +1,106 @@ +description: > + Recursive maps combined with iterators — recursive map called from + within fold, map, and filter lambdas. These stress variable stack + frame isolation during recursion. + +# V1 .fold is not curried: its lambda takes one parameter, an object `{tally, value}`. V2 uses a single +# two-parameter (acc, item) -> ... lambda. V1 has no .map_values; on objects, .map_each receives +# `{key, value}` entries and returns the new value (the key is preserved). Objects are folded via .key_values(). +# .iter() yields {key,value} entries in both dialects. + +tests: + # --- Recursive map called from map lambda --- + + - name: "recursive deep_values extracts all leaf values" + # V1 fold is NOT curried — its lambda takes a single item parameter `{tally, value}`. + # `.concat()` is not registered in the migrator's bare env; use `.merge()` instead. + # Lambda bodies must start on the same line as `->`.
+ mapping: | + map deep_values { + root = this.key_values().fold([], item -> if item.value.value.type() == "object" { item.tally.merge(item.value.value.apply("deep_values")) } else { item.tally.merge([item.value.value]) }) + } + root.v = {"a": 1, "b": {"c": 2, "d": 3}}.apply("deep_values").sort() + output: {"v": [1, 2, 3]} + + - name: "recursive deep_values with three levels" + mapping: | + map deep_values { + root = this.key_values().fold([], item -> if item.value.value.type() == "object" { item.tally.merge(item.value.value.apply("deep_values")) } else { item.tally.merge([item.value.value]) }) + } + root.v = {"a": {"b": {"c": 1}}, "d": 2}.apply("deep_values").sort() + output: {"v": [1, 2]} + + # --- Recursive map with map_values --- + + - name: "recursive double_all doubles nested values" + # V1 has no `.map_values` — for objects, `.map_each(item -> ...)` receives `{key, value}` + # entries and returns the new value (the key is preserved). + mapping: | + map double_all { + root = if this.type() == "object" { + this.map_each(item -> item.value.apply("double_all")) + } else if this.type() == "array" { + this.map_each(v -> v.apply("double_all")) + } else if this.type() == "number" { + this * 2 + } else { + this + } + } + root = {"a": 1, "b": [2, 3], "c": {"d": 4}}.apply("double_all") + output: {"a": 2, "b": [4, 6], "c": {"d": 8}} + + # --- Recursive count with filter --- + + - name: "recursive count_strings counts string leaves" + # V1 fold is NOT curried — use a single `item -> ...` lambda where item is `{tally, value}`. 
+ mapping: | + map count_strings { + root = if this.type() == "string" { + 1 + } else if this.type() == "object" { + this.key_values().fold(0, item -> item.tally + item.value.value.apply("count_strings")) + } else if this.type() == "array" { + this.fold(0, item -> item.tally + item.value.apply("count_strings")) + } else { + 0 + } + } + root.v = {"name": "Alice", "age": 30, "tags": ["admin", "user"], "meta": {"role": "lead"}}.apply("count_strings") + output: {"v": 4} + + # --- Recursive with local variables in fold --- + + - name: "recursive sum_nested with accumulator variable" + # V1 fold is NOT curried — use a single `item -> ...` lambda. + mapping: | + map sum_nested { + root = if this.type() == "array" { + this.fold(0, item -> item.tally + item.value.apply("sum_nested")) + } else { + this + } + } + root.v = [1, [2, [3, 4]], 5].apply("sum_nested") + output: {"v": 15} + + # --- Recursive map returning modified copy --- + + - name: "recursive redact replaces strings with ***" + # V1 has no `.map_values` — use `.map_each(item -> ...)` on objects. Also: `match { ... }` + # rebinds `this` inside each arm to the match value, so `this.type()` as the match head makes + # `this` become a string inside arms. Use `if/else if` chains instead. 
+ mapping: | + map redact { + root = if this.type() == "string" { + "***" + } else if this.type() == "object" { + this.map_each(item -> item.value.apply("redact")) + } else if this.type() == "array" { + this.map_each(v -> v.apply("redact")) + } else { + this + } + } + root = {"name": "Alice", "age": 30, "contacts": [{"email": "a@b.com"}]}.apply("redact") + output: {"name": "***", "age": 30, "contacts": [{"email": "***"}]} diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/transitive_calls.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/transitive_calls.yaml new file mode 100644 index 000000000..3ebebf0be --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/transitive_calls.yaml @@ -0,0 +1,97 @@ +description: > + Transitive map calls — maps calling other maps, maps called from lambdas, + and maps with complex parameter passing patterns. + +# V1 maps have no parameter list; multi-argument calls are encoded by passing an object +# receiver. Maps call other maps the same way they recurse: `receiver.apply("name")`. 
+ +tests: + # --- Map calling another map --- + + - name: "map A calls map B" + mapping: | + map double { root = this * 2 } + map quad { root = this.apply("double").apply("double") } + root.v = 3.apply("quad") + output: {"v": 12} + + - name: "three-level transitive call" + mapping: | + map inc { root = this + 1 } + map double_inc { root = this.apply("inc") * 2 } + map process { root = this.apply("double_inc") + 100 } + root.v = 5.apply("process") + output: {"v": 112} + + - name: "map passes its parameter to another map" + mapping: | + map format_map { root = "[" + this + "]" } + map wrap { root = this.apply("format_map") } + root.v = "hello".apply("wrap") + output: {"v": "[hello]"} + + # --- Map called from lambda --- + + - name: "top-level lambda calls a map" + mapping: | + map double { root = this * 2 } + root.v = [1, 2, 3].map_each(n -> n.apply("double")) + output: {"v": [2, 4, 6]} + + - name: "lambda calls map that calls another map" + mapping: | + map inc { root = this + 1 } + map double_inc { root = this.apply("inc") * 2 } + root.v = [1, 2, 3].map_each(n -> n.apply("double_inc")) + output: {"v": [4, 6, 8]} + + - name: "filter lambda calls a map" + mapping: | + map is_big { root = this > 5 } + root.v = [1, 3, 7, 9, 2].filter(n -> n.apply("is_big")) + output: {"v": [7, 9]} + + # --- Map returning complex values --- + + - name: "map returns object used in lambda" + mapping: | + map make_pair { + root = {"key": this.k, "value": this.v} + } + root.v = ["a", "b"].map_each(s -> {"k": s, "v": s.length()}.apply("make_pair")) + output: + v: + - key: "a" + value: 1 + - key: "b" + value: 1 + + # --- Map with local variables calling another map --- + + - name: "map with locals calls another map" + mapping: | + map add { root = this.a + this.b } + map process { + let doubled = this * 2 + root = {"a": $doubled, "b": 10}.apply("add") + } + root.v = 5.apply("process") + output: {"v": 20} + + # --- Frame isolation under transitive calls --- + # V1 clears the $var environment 
on every .apply() entry, so the inner $local is a fresh + # binding and does not clobber the outer frame's $local after control returns. + + - name: "transitive calls do not share local variables" + mapping: | + map inner { + let local = this + 100 + root = $local + } + map outer { + let local = this + 1 + let result = $local.apply("inner") + root = $local * 1000 + $result + } + root.v = 5.apply("outer") + output: {"v": 6106} diff --git a/internal/bloblang2/migrator/v1spec/tests/maps/void_returns.yaml b/internal/bloblang2/migrator/v1spec/tests/maps/void_returns.yaml new file mode 100644 index 000000000..97c00da68 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/maps/void_returns.yaml @@ -0,0 +1,117 @@ +description: > + Map void returns — when a map body's final expression produces void + (from if-without-else or match-without-wildcard), the map call produces + void. Void propagates to the calling context with normal semantics. + +# V1 has no `void` type. The closest sentinels are: +# * `null` — produced by an if-without-else whose condition is false, or by a non-exhaustive +# match. This IS a value and assigning it overwrites the target. +# * `nothing()` — an explicit sentinel that silently skips the enclosing assignment. +# * `deleted()` — a sentinel that removes the field/message entirely. +# V2's "void" behaviour (skip assignment, rescued by .or(), errors in variable binding / collection +# literals) doesn't map 1:1 to any V1 construct. The V1 analogue is to explicitly produce +# `nothing()` inside the map body, but the rescue/error semantics still differ (V1 `nothing()` +# is replaced by .or() fallbacks just like null, but it has no "error in var decl" behaviour). +# Tests here either adjust to V1 null / nothing() semantics or skip. + +tests: + # --- Map returns void from if-without-else --- + + - name: "map returns void — output assignment skipped" + # V1: if-without-else yields null, and assigning null overwrites. 
To get V2's "skip" + # behaviour we must explicitly return nothing() on the else path. + mapping: | + map maybe { + root = if this > 10 { this * 2 } else { nothing() } + } + root.v = "prior" + root.v = 5.apply("maybe") + output: {"v": "prior"} # FIXME-v1: verify — V1 assignment of nothing() is silently skipped (mapping/statement.go), so root.v retains "prior" + + - name: "map returns value when condition true" + mapping: | + map maybe { + root = if this > 10 { this * 2 } else { nothing() } + } + root.v = 20.apply("maybe") + output: {"v": 40} + + - name: "map returns void — variable declaration errors" + # V1: `let x = nothing()` deletes the variable rather than erroring (§7.2). No direct analogue. + skip: "V1 has no `void` type — `let x = nothing()` deletes $x rather than raising a void-error" + + - name: "map returns void — variable reassignment skipped" + # V1: `let x = nothing()` DELETES $x rather than leaving it unchanged. Behaviour differs. + skip: "V1 `let x = nothing()` deletes the variable (§7.2) rather than leaving it unchanged as V2's void would" + + # --- Map returns void from non-exhaustive match --- + + - name: "map returns void from match — assignment skipped" + # V1: a non-exhaustive match yields null, not void. Assigning null overwrites. + # To emulate V2 behaviour, add an explicit _ => nothing() arm. + mapping: | + map classify { + root = match this { + "a" => "alpha", + "b" => "beta", + _ => nothing(), + } + } + root.v = "default" + root.v = "c".apply("classify") + output: {"v": "default"} + + - name: "map returns value from match when case matches" + mapping: | + map classify { + root = match this { + "a" => "alpha", + "b" => "beta", + _ => nothing(), + } + } + root.v = "a".apply("classify") + output: {"v": "alpha"} + + # --- Void from map in collection literal errors --- + + - name: "void from map in array literal is error" + # V1: nothing() inside an array literal is a runtime error — nothing() cannot be used + # as an operand/value. 
But `deleted()` in an array literal drops the slot. Behaviour differs. + skip: "V1 has no `void` type — nothing() inside an array literal behaves differently from V2 void (and deleted() drops the element rather than erroring)" + + - name: "void from map in object literal is error" + skip: "V1 has no `void` type — nothing() / null inside an object literal does not produce a V2-style void error" + + # --- Void from map rescued with .or() --- + + - name: "void from map rescued with or" + # V1: nothing() is treated as null by .or() and replaced with the fallback (§12.2). + mapping: | + map maybe { + root = if this > 10 { this * 2 } else { nothing() } + } + root.v = 5.apply("maybe").or(0) + output: {"v": 0} # FIXME-v1: verify — V1 .or() replaces nothing() / null with fallback + + - name: "non-void from map not rescued by or" + mapping: | + map maybe { + root = if this > 10 { this * 2 } else { nothing() } + } + root.v = 20.apply("maybe").or(0) + output: {"v": 40} + + # --- Void from map as argument to another map errors --- + + - name: "void from map as argument to another map is error" + # V1: `nothing()` as an operand to `*` raises a "cannot multiply types nothing and number" error. 
+ mapping: | + map maybe { + root = if this > 10 { this } else { nothing() } + } + map double { + root = this * 2 + } + root.v = 5.apply("maybe").apply("double") + error: "cannot multiply types nothing" diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/arithmetic.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/arithmetic.yaml new file mode 100644 index 000000000..c32498e82 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/arithmetic.yaml @@ -0,0 +1,205 @@ +description: "Arithmetic operators (+, -, *, /, %) with all type combos, division/modulo by zero, null errors" + +tests: + # --- Addition --- + + - name: "add int64 + int64" + mapping: | + root.result = 5 + 3 + output: {"result": 8} + + - name: "add float64 + float64" + mapping: | + root.result = 2.5 + 3.5 + output: {"result": 6.0} + + - name: "add int32 + int32 stays int32" + mapping: | + root.result = 5.int32() + 3.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "add uint64 + uint64 stays uint64" + mapping: | + root.result = 10.uint64() + 20.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "add float32 + float32 stays float32" + mapping: | + root.result = 1.5.float32() + 2.5.float32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Subtraction --- + + - name: "subtract int64 - int64" + mapping: | + root.result = 10 - 3 + output: {"result": 7} + + - name: "subtract resulting in negative int64" + mapping: | + root.result = 3 - 10 + output: {"result": -7} + + # --- Multiplication --- + + - name: "multiply int64 * int64" + mapping: | + root.result = 6 * 7 + output: {"result": 42} + + - name: "multiply float64 * float64" + mapping: | + root.result = 2.5 * 4.0 + output: {"result": 10.0} + + # --- Division (always produces float) --- + + - name: "divide int64 / int64 produces float64" + mapping: | + root.result = 7 / 2 + output: {"result": 3.5} + + - name: "divide 
int64 / int64 exact still produces float64" + mapping: | + root.result = 10 / 2 + output: {"result": 5.0} + + - name: "divide float32 / float32 produces float32" + mapping: | + root.result = 10.0.float32() / 4.0.float32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "divide int32 / int32 produces float64 not int" + mapping: | + root.result = 7.int32() / 2.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "divide negative int64" + mapping: | + root.result = -7 / 2 + output: {"result": -3.5} + + - name: "chained division is left-associative" + mapping: | + root.result = 20 / 4 / 2 + output: {"result": 2.5} + + # --- Modulo (V1 requires integer operands only) --- + + - name: "modulo int64 % int64 produces int64" + mapping: | + root.result = 7 % 2 + output: {"result": 1} + + - name: "modulo float64 % float64 (fmod)" + mapping: | + root.result = 7.5 % 2.0 + output: {"result": 1} # V1: silently truncates floats to int64 before mod (7 % 2 = 1) + + - name: "modulo int64 % float64 promotes to float64" + mapping: | + root.result = 7 % 2.0 + output: {"result": 1} # V1: silently truncates float RHS to int64 (7 % 2 = 1) + + - name: "modulo negative dividend (truncated division remainder)" + mapping: | + root.result = -7 % 2 + output: {"result": -1} + + - name: "modulo negative float dividend (fmod semantics)" + mapping: | + root.result = -7.5 % 2.0 + output: {"result": -1} # V1: silently truncates floats to int64 (-7 % 2 = -1) + + # --- Division by zero --- + + - name: "division by zero int64" + mapping: | + root.result = this.a / this.b + input: {"a": 7, "b": 0} + error: "divide by zero" # V1 constant-folds literal `/ 0` at compile time, so uses runtime operands here + + - name: "division by zero float64 (no Infinity)" + mapping: | + root.result = this.a / this.b + input: {"a": 7.0, "b": 0.0} + error: "divide by zero" # V1 constant-folds literal `/ 0.0` at compile time, so uses runtime operands here + + # 
--- Modulo by zero --- + + - name: "modulo by zero int64" + mapping: | + root.result = this.a % this.b + input: {"a": 7, "b": 0} + error: "divide by zero" # V1 constant-folds literal `% 0` at compile time, so uses runtime operands here + + - name: "modulo by zero float64" + mapping: | + root.result = this.a % this.b + input: {"a": 7.0, "b": 0.0} + error: "divide by zero" # V1 truncates floats to int, then treats as int % 0 → divide by zero + + # --- Integer overflow --- + + - name: "int64 addition overflow" + mapping: | + root.result = 9223372036854775807 + 1 + output: {"result": -9223372036854775808} # V1: silent Go int64 wrap, no error + + - name: "int64 multiplication overflow" + mapping: | + root.result = 9223372036854775807 * 2 + output: {"result": -2} # V1: silent Go int64 wrap, no error + + - name: "int32 overflow" + mapping: | + root.result = 2147483647.int32() + 1.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 no overflow where int64 would" + mapping: | + root.result = 9223372036854775807.uint64() + 1.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Null in arithmetic --- + + - name: "null + int64 error" + mapping: | + root.result = null + 5 + compile_error: "cannot add types null" # V1 constant-folds literal null + N at compile time + + - name: "null * int64 error" + mapping: | + root.result = null * 5 + compile_error: "cannot multiply types null" # V1 constant-folds literal null * N at compile time + + # --- Non-numeric operands --- + + - name: "int64 + string error" + mapping: | + root.result = 5 + "3" + compile_error: "cannot add types number" # V1 constant-folds literal number + string at compile time + + - name: "int64 * bool error" + mapping: | + root.result = 5 * true + compile_error: "cannot multiply types number" # V1 constant-folds literal number * bool at compile time + + - name: "string - string error" + mapping: | + root.result = "hello" - "world" + 
compile_error: "cannot subtract types string" # V1 constant-folds literal string - string at compile time + + # --- NaN and Infinity arithmetic --- + + - name: "special float addition" + mapping: | + root.result = this.val + 1.0 + cases: + - name: "NaN + float64 produces NaN" + input: {val: {_type: "float64", value: "NaN"}} + output: {"result": {_type: "float64", value: "NaN"}} + - name: "Infinity + float64 stays Infinity" + input: {val: {_type: "float64", value: "Infinity"}} + output: {"result": {_type: "float64", value: "Infinity"}} + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/comparison.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/comparison.yaml new file mode 100644 index 000000000..44c938347 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/comparison.yaml @@ -0,0 +1,210 @@ +description: "Comparison operators (>, >=, <, <=) for numeric, string, timestamp, bytes; cross-family errors, null errors, NaN" + +tests: + # --- Numeric comparisons (int64) --- + + - name: "int64 less than true" + mapping: | + root.result = 3 < 5 + output: {"result": true} + + - name: "int64 less than false" + mapping: | + root.result = 5 < 3 + output: {"result": false} + + - name: "int64 greater than" + mapping: | + root.result = 10 > 3 + output: {"result": true} + + - name: "int64 greater than or equal (equal case)" + mapping: | + root.result = 5 >= 5 + output: {"result": true} + + - name: "int64 less than or equal (less case)" + mapping: | + root.result = 4 <= 5 + output: {"result": true} + + - name: "int64 less than or equal false" + mapping: | + root.result = 6 <= 5 + output: {"result": false} + + # --- Numeric with promotion --- + + - name: "int32 < int64 with promotion" + mapping: | + root.result = 3.int32() < 5 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int64 > float64 with promotion" + mapping: | + root.result = 5 > 
4.5 + output: {"result": true} + + - name: "float32 <= float64 with promotion" + mapping: | + root.result = 3.0.float32() <= 3.0 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 >= int32 with promotion to int64" + mapping: | + root.result = 10.uint32() >= 10.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- String comparisons (lexicographic by codepoint) --- + + - name: "string less than lexicographic" + mapping: | + root.result = "apple" < "banana" + output: {"result": true} + + - name: "string equal not less than" + mapping: | + root.result = "hello" <= "hello" + output: {"result": true} + + - name: "string prefix is less than longer string" + mapping: | + root.result = "abc" < "abcd" + output: {"result": true} + + - name: "string uppercase A < lowercase a (codepoint ordering)" + mapping: | + root.result = "A" < "a" + output: {"result": true} + + # --- Bytes comparisons (lexicographic by byte) --- + + - name: "bytes less than" + mapping: | + root.result = "abc".bytes() < "abd".bytes() + output: {"result": true} # FIXME-v1: verify — V1 may coerce bytes to string for comparison + + - name: "bytes greater than" + mapping: | + root.result = "xyz".bytes() > "abc".bytes() + output: {"result": true} # FIXME-v1: verify + + - name: "bytes prefix is less than longer bytes" + mapping: | + root.result = "ab".bytes() < "abc".bytes() + output: {"result": true} # FIXME-v1: verify + + # --- Timestamp comparisons --- + + - name: "earlier timestamp less than later" + input: + a: {_type: "timestamp", value: "2024-01-01T00:00:00Z"} + b: {_type: "timestamp", value: "2024-06-15T00:00:00Z"} + mapping: | + root.result = this.a < this.b + skip: "V2-only: typed timestamp (_type/value) inputs not modelled in V1" + + - name: "same timestamp greater than or equal" + input: + a: {_type: "timestamp", value: "2024-01-01T00:00:00Z"} + mapping: | + root.result = this.a >= this.a + skip: "V2-only: typed timestamp 
(_type/value) inputs not modelled in V1" + + # --- Cross-family comparisons: error (not just false) --- + + - name: "int64 < string is error" + mapping: | + root.result = 5 < "hello" + compile_error: "cannot compare types number" # V1 constant-folds literal number < string at compile time + + - name: "string > int64 is error" + mapping: | + root.result = "hello" > 5 + compile_error: "cannot compare types string" # V1 constant-folds literal string > number at compile time + + - name: "int64 < bool is error" + mapping: | + root.result = 5 < true + compile_error: "cannot compare types number" # V1 constant-folds literal number < bool at compile time + + - name: "string < bytes is error" + mapping: | + root.result = "hello" < "hello".bytes() + output: {"result": false} # FIXME-v1: verify — V1 likely coerces bytes→string, giving "hello" < "hello" = false + + - name: "timestamp < int64 is error" + input: + a: {_type: "timestamp", value: "2024-01-01T00:00:00Z"} + mapping: | + root.result = this.a < 5 + skip: "V2-only: typed timestamp (_type/value) inputs not modelled in V1" + + - name: "bool < bool is error (not comparable)" + mapping: | + root.result = true < false + compile_error: "cannot compare types bool" # V1 constant-folds literal bool < bool at compile time + + - name: "array < array is error (not comparable)" + mapping: | + root.result = [1, 2] < [3, 4] + compile_error: "cannot compare types array" # V1 constant-folds literal array < array at compile time + + # --- Null in comparison: error --- + + - name: "null < int64 is error" + mapping: | + root.result = null < 5 + compile_error: "cannot compare types null" # V1 constant-folds literal null < number at compile time + + - name: "int64 > null is error" + mapping: | + root.result = 5 > null + compile_error: "cannot compare types number" # V1 constant-folds literal number > null at compile time + + - name: "null <= null is error" + mapping: | + root.result = null <= null + compile_error: "cannot compare types 
null" # V1 constant-folds literal null <= null at compile time + + # --- NaN comparisons: all return false --- + + - name: "NaN < float64 is false" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + root.result = this.val < 1.0 + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + - name: "NaN > float64 is false" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + root.result = this.val > 1.0 + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + - name: "NaN >= NaN is false" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + root.result = this.val >= this.val + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + # --- Infinity comparisons --- + + - name: "Infinity > float64" + input: {val: {_type: "float64", value: "Infinity"}} + mapping: | + root.result = this.val > 999999.0 + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + - name: "negative Infinity < float64" + input: {val: {_type: "float64", value: "-Infinity"}} + mapping: | + root.result = this.val < -999999.0 + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + # --- Non-associative chaining is compile error --- + + - name: "chained comparison is compile error" + mapping: | + root.result = 1 < 2 < 3 + # V1 constant-folds `1 < 2` → `true` at compile time, then rejects `true < 3` at compile time. + compile_error: "cannot compare types bool" diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/division_modulo.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/division_modulo.yaml new file mode 100644 index 000000000..afdcf80d1 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/division_modulo.yaml @@ -0,0 +1,123 @@ +description: > + Division and modulo semantics — division always produces float, + modulo uses truncated division (fmod) semantics, division by zero + errors, and negative zero handling. 
+ +tests: + # --- Division always produces float --- + + - name: "integer / integer produces float" + mapping: | + root.v = 7 / 2 + output: {"v": 3.5} + + - name: "even integer division produces float" + mapping: | + root.v = 6 / 2 + output: {"v": 3.0} + + - name: "large integer division produces float" + mapping: | + root.v = 1000000 / 3 + output: {"v": 333333.33333333331} + + - name: "float / float produces float" + mapping: | + root.v = 7.5 / 2.5 + output: {"v": 3.0} + + - name: "integer / float produces float" + mapping: | + root.v = 10 / 2.5 + output: {"v": 4.0} + + - name: "float / integer produces float" + mapping: | + root.v = 7.5 / 3 + output: {"v": 2.5} + + # --- Division by zero --- + + - name: "integer division by zero is error" + mapping: | + root.v = this.a / this.b + input: {"a": 5, "b": 0} + error: "divide by zero" # V1 constant-folds literal `/ 0` at compile time, so use runtime operands + + - name: "float division by zero is error" + mapping: | + root.v = this.a / this.b + input: {"a": 5.0, "b": 0.0} + error: "divide by zero" # V1 constant-folds literal `/ 0.0` at compile time, so use runtime operands + + - name: "zero divided by zero is error" + mapping: | + root.v = this.a / this.b + input: {"a": 0, "b": 0} + error: "divide by zero" # V1 constant-folds literal `0 / 0` at compile time, so use runtime operands + + # --- Modulo (V1 requires integer operands) --- + + - name: "integer modulo" + mapping: | + root.v = 7 % 3 + output: {"v": 1} + + - name: "integer modulo evenly divisible" + mapping: | + root.v = 9 % 3 + output: {"v": 0} + + - name: "negative dividend modulo (truncated division)" + mapping: | + root.v = -7 % 3 + output: {"v": -1} + + - name: "negative divisor modulo" + mapping: | + root.v = 7 % -3 + output: {"v": 1} + + - name: "both negative modulo" + mapping: | + root.v = -7 % -3 + output: {"v": -1} + + - name: "float modulo (fmod semantics)" + mapping: | + root.v = 7.5 % 2.5 + output: {"v": 1} # V1: silently truncates float operands 
to int64 (7 % 2 = 1) + + - name: "float modulo with remainder" + mapping: | + root.v = 7.0 % 2.5 + output: {"v": 1} # V1: silently truncates float operands to int64 (7 % 2 = 1) + + - name: "negative float modulo" + mapping: | + root.v = -7.5 % 2.5 + output: {"v": -1} # V1: silently truncates float operands to int64 (-7 % 2 = -1) + + - name: "modulo by zero is error" + mapping: | + root.v = this.a % this.b + input: {"a": 5, "b": 0} + error: "divide by zero" # V1 constant-folds literal `% 0` at compile time, so use runtime operands + + - name: "float modulo by zero is error" + mapping: | + root.v = this.a % this.b + input: {"a": 5.0, "b": 0.0} + error: "divide by zero" # V1 truncates floats to int, then divide-by-zero at runtime + + # --- Negative zero --- + + - name: "negative zero equals positive zero" + mapping: | + root.v = -0.0 == 0.0 + output: {"v": true} + + - name: "negative zero is not less than positive zero" + mapping: | + root.v = -0.0 < 0.0 + output: {"v": false} diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/equality.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/equality.yaml new file mode 100644 index 000000000..d89270dd4 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/equality.yaml @@ -0,0 +1,216 @@ +description: "Equality operators (==, !=) for all type combos, NaN, -0, cross-family, collections" + +tests: + # --- Numeric equality (same type) --- + + - name: "int64 == int64 true" + mapping: | + root.result = 5 == 5 + output: {"result": true} + + - name: "int64 != int64 true" + mapping: | + root.result = 5 != 6 + output: {"result": true} + + # --- Numeric equality with promotion --- + + - name: "int64 == float64 with promotion true" + mapping: | + root.result = 5 == 5.0 + output: {"result": true} + + - name: "int64 == float64 with promotion false" + mapping: | + root.result = 5 == 5.5 + output: {"result": false} + + - name: "int32 == int64 with promotion true" + mapping: | + root.result = 5.int32() 
== 5 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 == int64 with promotion true" + mapping: | + root.result = 42.uint32() == 42 + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- String equality --- + + - name: "string == string true" + mapping: | + root.result = "hello" == "hello" + output: {"result": true} + + - name: "empty string == empty string" + mapping: | + root.result = "" == "" + output: {"result": true} + + # --- Boolean equality --- + + - name: "true == true" + mapping: | + root.result = true == true + output: {"result": true} + + - name: "true == false" + mapping: | + root.result = true == false + output: {"result": false} + + # --- Null equality --- + + - name: "null == null true" + mapping: | + root.result = null == null + output: {"result": true} + + - name: "null != null false" + mapping: | + root.result = null != null + output: {"result": false} + + # --- Cross-family: always false, NOT error --- + + - name: "int64 == string is false (not error)" + mapping: | + root.result = 5 == "5" + output: {"result": false} + + - name: "int64 != string is true" + mapping: | + root.result = 5 != "5" + output: {"result": true} + + - name: "bool == int64 is false" + mapping: | + root.result = true == 1 + output: {"result": true} # V1 quirk: `bool == number` is asymmetric. LHS bool coerces RHS number→bool, so `true == 1` is true. 
+ + - name: "null == int64 is false" + mapping: | + root.result = null == 0 + output: {"result": false} + + - name: "string == bytes is false" + mapping: | + root.result = "hello" == "hello".bytes() + output: {"result": true} # FIXME-v1: verify — V1 may treat bytes and string as equal after coercion + + # --- NaN equality --- + + - name: "self-equality for special floats" + mapping: | + root.result = this.val == this.val + cases: + - name: "NaN == NaN is false" + input: {val: {_type: "float64", value: "NaN"}} + output: {"result": false} + - name: "Infinity == Infinity is true" + input: {val: {_type: "float64", value: "Infinity"}} + output: {"result": true} + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + - name: "NaN != NaN is true" + input: {val: {_type: "float64", value: "NaN"}} + mapping: | + root.result = this.val != this.val + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + # --- Negative zero --- + + - name: "-0.0 == 0.0 is true" + input: {val: {_type: "float64", value: "-0.0"}} + mapping: | + root.result = this.val == 0.0 + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + - name: "Infinity != negative Infinity" + input: + a: {_type: "float64", value: "Infinity"} + b: {_type: "float64", value: "-Infinity"} + mapping: | + root.result = this.a != this.b + skip: "V2-only: typed numeric (_type/value) inputs not modelled in V1" + + # --- Bytes equality --- + + - name: "bytes == bytes true" + mapping: | + root.result = "hello".bytes() == "hello".bytes() + output: {"result": true} + + # --- Timestamp equality --- + + - name: "same timestamp == true" + input: + a: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + b: {_type: "timestamp", value: "2024-01-15T10:30:00Z"} + mapping: | + root.result = this.a == this.b + skip: "V2-only: typed timestamp (_type/value) inputs not modelled in V1" + + - name: "different timestamp != true" + input: + a: {_type: "timestamp", value: 
"2024-01-15T10:30:00Z"} + b: {_type: "timestamp", value: "2024-06-15T10:30:00Z"} + mapping: | + root.result = this.a != this.b + skip: "V2-only: typed timestamp (_type/value) inputs not modelled in V1" + + # --- Array equality --- + + - name: "array == array same order true" + mapping: | + root.result = [1, 2, 3] == [1, 2, 3] + output: {"result": true} + + - name: "array == array different order false (order matters)" + mapping: | + root.result = [1, 2, 3] == [3, 2, 1] + output: {"result": false} + + - name: "array == array different length false" + mapping: | + root.result = [1, 2] == [1, 2, 3] + output: {"result": false} + + - name: "empty array == empty array true" + mapping: | + root.result = [] == [] + output: {"result": true} + + # --- Object equality (order-independent) --- + + - name: "object == object same keys true" + mapping: | + root.result = {"a": 1, "b": 2} == {"a": 1, "b": 2} + output: {"result": true} + + - name: "object == object different key order still true" + mapping: | + root.result = {"a": 1, "b": 2} == {"b": 2, "a": 1} + output: {"result": true} + + - name: "object == object different values false" + mapping: | + root.result = {"a": 1, "b": 2} == {"a": 1, "b": 3} + output: {"result": false} + + - name: "empty object == empty object true" + mapping: | + root.result = {} == {} + output: {"result": true} + + - name: "nested object deep equality" + mapping: | + root.result = {"a": {"x": 1}} == {"a": {"x": 1}} + output: {"result": true} + + # --- Non-associative chaining is compile error --- + + - name: "chained equality is compile error" + mapping: | + root.result = 1 == 1 == true + output: {"result": true} # V1 parses `(1 == 1) == true` → `true == true` → true diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/logical.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/logical.yaml new file mode 100644 index 000000000..984a619e5 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/logical.yaml @@ -0,0 +1,266 
@@ +description: "Logical operators (&&, ||, !) — boolean requirement, short-circuit evaluation, associativity" + +tests: + # --- Basic AND --- + - name: "and_true_true" + mapping: | + root = true && true + output: true + + - name: "and_true_false" + mapping: | + root = true && false + output: false + + - name: "and_false_true" + mapping: | + root = false && true + output: false + + - name: "and_false_false" + mapping: | + root = false && false + output: false + + # --- Basic OR --- + - name: "or_true_true" + mapping: | + root = true || true + output: true + + - name: "or_true_false" + mapping: | + root = true || false + output: true + + - name: "or_false_true" + mapping: | + root = false || true + output: true + + - name: "or_false_false" + mapping: | + root = false || false + output: false + + # --- Basic NOT --- + - name: "not_true" + mapping: | + root = !true + output: false + + - name: "not_false" + mapping: | + root = !false + output: true + + - name: "double_negation" + mapping: | + root = !(!true) + output: true # V1 note: `!!true` is a parse error; the `!` operator needs parens around the negated expression to double-negate. + + # --- Short-circuit AND --- + - name: "and_short_circuit_false_lhs" + mapping: | + root = false && throw("should not be evaluated") + output: false + + - name: "and_no_short_circuit_true_lhs" + mapping: | + root = true && throw("evaluated") + error: "evaluated" + + # --- Short-circuit OR --- + - name: "or_short_circuit_true_lhs" + mapping: | + root = true || throw("should not be evaluated") + output: true + + - name: "or_no_short_circuit_false_lhs" + mapping: | + root = false || throw("evaluated") + error: "evaluated" + + # --- Non-boolean operands: AND --- + - name: "and_int_lhs" + mapping: | + root = 5 && true + output: true # V1 quirk: `&&` coerces numbers to bool (non-zero → true); no type error. 
+ + - name: "and_int_rhs" + mapping: | + root = true && 1 + output: true # V1 quirk: `&&` coerces numbers to bool (non-zero → true); no type error. + + - name: "and_string_lhs" + mapping: | + root = "yes" && true + error: "bool" # FIXME-v1: verify + + - name: "and_null_lhs" + mapping: | + root = null && true + error: "bool" # FIXME-v1: verify + + - name: "and_null_rhs" + mapping: | + root = true && null + error: "bool" # FIXME-v1: verify + + - name: "and_array_lhs" + mapping: | + root = [1, 2] && true + error: "bool" # FIXME-v1: verify + + - name: "and_object_lhs" + mapping: | + root = {"a": 1} && true + error: "bool" # FIXME-v1: verify + + # --- Non-boolean operands: OR --- + - name: "or_int_lhs" + mapping: | + root = 0 || false + output: false # V1 quirk: `||` coerces numbers to bool (0 → false); no type error. + + - name: "or_string_rhs" + mapping: | + root = false || "fallback" + error: "bool" # FIXME-v1: verify + + - name: "or_float_lhs" + mapping: | + root = 1.0 || true + output: true # V1 quirk: `||` coerces numbers to bool (non-zero → true); no type error. 
+ + # --- Non-boolean operands: NOT --- + - name: "not_int" + mapping: | + root = !5 + error: "bool" # FIXME-v1: verify + + - name: "not_string" + mapping: | + root = !"hello" + error: "bool" # FIXME-v1: verify + + - name: "not_null" + mapping: | + root = !null + error: "bool" # FIXME-v1: verify + + - name: "not_zero" + mapping: | + root = !0 + error: "bool" # FIXME-v1: verify + + - name: "not_empty_string" + mapping: | + root = !"" + error: "bool" # FIXME-v1: verify + + # --- Short-circuit skips type check on unevaluated side --- + - name: "and_short_circuit_skips_type_check" + mapping: | + root = false && "not a bool" + output: false + + - name: "or_short_circuit_skips_type_check" + mapping: | + root = true || 42 + output: true + + # --- Left-associativity of AND --- + - name: "and_left_associative" + mapping: | + root = true && false && true + output: false + + - name: "and_chain_all_true" + mapping: | + root = true && true && true + output: true + + # --- Left-associativity of OR --- + - name: "or_left_associative" + mapping: | + root = false || false || true + output: true + + - name: "or_chain_all_false" + mapping: | + root = false || false || false + output: false + + # --- V1 WARNING: && and || are at the SAME precedence level, resolved left-to-right --- + # This differs from V2 (and most languages). `a || b && c` in V1 is `(a || b) && c`. 
+ - name: "and_binds_tighter_than_or" + mapping: | + root = true || false && false + output: false # V1: (true || false) && false = true && false = false + + - name: "and_binds_tighter_than_or_reversed" + mapping: | + root = false && false || true + output: true # V1: (false && false) || true = false || true = true + + - name: "parens_override_and_or_precedence" + mapping: | + root = true || false && false + output: false # V1: same as "and_binds_tighter_than_or" — (true || false) && false + + - name: "parens_force_or_first" + mapping: | + root = (true || false) && false + output: false + + # --- Combinations --- + - name: "not_with_and" + mapping: | + root = !false && true + output: true + + - name: "not_with_or" + mapping: | + root = !true || false + output: false + + - name: "complex_logical_expression" + mapping: | + root = (true || false) && !(false && true) + output: true + + - name: "not_binds_tighter_than_and" + mapping: | + root = !false && !false + output: true + + # --- Short-circuit preserves left-to-right with associativity --- + - name: "and_chain_short_circuits_at_first_false" + mapping: | + root = true && false && throw("should not reach") + output: false + + - name: "or_chain_short_circuits_at_first_true" + mapping: | + root = false || true || throw("should not reach") + output: true + + # --- Using input values --- + - name: "and_with_input_booleans" + mapping: | + root = this.a && this.b + input: {"a": true, "b": false} + output: false + + - name: "or_with_input_booleans" + mapping: | + root = this.a || this.b + input: {"a": false, "b": true} + output: true + + - name: "not_with_input_boolean" + mapping: | + root = !this.flag + input: {"flag": true} + output: false diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion.yaml new file mode 100644 index 000000000..80e0d95e6 --- /dev/null +++ 
b/internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion.yaml @@ -0,0 +1,189 @@ +description: "Numeric promotion rules: same type, width promotion, signed/unsigned, int/float, checked errors" + +tests: + # --- Same type: no promotion --- + + - name: "int64 + int64 stays int64" + mapping: | + root.result = 10 + 20 + output: {"result": 30} + + - name: "float64 + float64 stays float64" + mapping: | + root.result = 1.5 + 2.5 + output: {"result": 4.0} + + - name: "int32 + int32 stays int32" + mapping: | + root.result = 10.int32() + 20.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 + uint32 stays uint32" + mapping: | + root.result = 10.uint32() + 20.uint32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 + uint64 stays uint64" + mapping: | + root.result = 10.uint64() + 20.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "float32 + float32 stays float32" + mapping: | + root.result = 1.5.float32() + 2.5.float32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Same signedness, different width: promote to wider --- + + - name: "int32 + int64 promotes to int64" + mapping: | + root.result = 5.int32() + 10 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int64 + int32 promotes to int64" + mapping: | + root.result = 10 + 5.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 + uint64 promotes to uint64" + mapping: | + root.result = 5.uint32() + 10.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "float32 + float64 promotes to float64" + mapping: | + root.result = 1.5.float32() + 2.5 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "float64 + float32 promotes to float64" + mapping: | + root.result = 2.5 + 1.5.float32() + skip: "V2-only: specific numeric width not 
distinguishable in V1" + + # --- Signed + unsigned integer: promote to int64 --- + + - name: "int32 + uint32 promotes to int64" + mapping: | + root.result = 5.int32() + 10.uint32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int64 + uint32 promotes to int64" + mapping: | + root.result = 100 + 50.uint32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int32 + uint64 promotes to int64 when value fits" + mapping: | + root.result = 5.int32() + 10.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int64 + uint64 promotes to int64 when value fits" + mapping: | + root.result = 5 + 10.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 exceeding int64 max causes error" + mapping: | + root.result = 5 + "9999999999999999999".uint64() + skip: "V2-only: uint64 overflow semantics not checked in V1" + + - name: "uint64 at int64 max boundary succeeds" + mapping: | + root.result = 0 + 9223372036854775807.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 just above int64 max causes error" + mapping: | + root.result = 0 + "9223372036854775808".uint64() + skip: "V2-only: uint64 overflow semantics not checked in V1" + + # --- Any integer + any float: promote to float64 --- + + - name: "int64 + float64 promotes to float64" + mapping: | + root.result = 5 + 3.0 + output: {"result": 8.0} + + - name: "float64 + int64 promotes to float64" + mapping: | + root.result = 3.0 + 5 + output: {"result": 8.0} + + - name: "int32 + float64 promotes to float64" + mapping: | + root.result = 5.int32() + 3.0 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 + float64 promotes to float64 when safe" + mapping: | + root.result = 100.uint64() + 1.5 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int32 + float32 promotes to float64" + 
mapping: | + root.result = 5.int32() + 3.0.float32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 + float32 promotes to float64" + mapping: | + root.result = 10.uint32() + 2.5.float32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Checked promotion: integer magnitude > 2^53 to float --- + + - name: "int64 exceeding float64 exact range causes error" + mapping: | + root.result = 9007199254740993 + 1.0 + # V1: silent float degradation, no safety check. 9007199254740993 converts to float64 as 9007199254740992 + # (precision loss above 2^53); +1.0 rounds back to 9007199254740992. + output: {"result": 9007199254740992.0} + + - name: "int64 at float64 exact limit succeeds" + mapping: | + root.result = 9007199254740992 + 1.0 + output: {"result": 9007199254740992.0} # V1: exact value 2^53 is representable, +1.0 rounds back to 2^53. + + - name: "negative int64 exceeding float64 exact range causes error" + mapping: | + root.result = -9007199254740993 + 1.0 + # V1: silent float degradation. -9007199254740993 converts to float64 as -9007199254740992, +1.0 = -9007199254740991. 
+ output: {"result": -9007199254740991.0} + + # --- Promotion with comparison operators --- + + - name: "int32 < int64 promotes to int64 for comparison" + mapping: | + root.result = 5.int32() < 10 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int64 >= float64 promotes to float64 for comparison" + mapping: | + root.result = 5 >= 5.0 + output: {"result": true} + + - name: "uint32 > int32 promotes to int64 for comparison" + mapping: | + root.result = 10.uint32() > 5.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Promotion with equality --- + + - name: "int64 == float64 promotes to float64" + mapping: | + root.result = 5 == 5.0 + output: {"result": true} + + - name: "int32 == int64 promotes to int64" + mapping: | + root.result = 5.int32() == 5 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int32 == float64 promotes to float64" + mapping: | + root.result = 5.int32() == 5.0 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 == int64 promotes to int64" + mapping: | + root.result = 42.uint32() == 42 + skip: "V2-only: specific numeric width not distinguishable in V1" diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion_edge.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion_edge.yaml new file mode 100644 index 000000000..e93239659 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/numeric_promotion_edge.yaml @@ -0,0 +1,179 @@ +description: > + Numeric promotion edge cases — uint64 boundary values with signed ops, + large integers with float, promotion in subtraction/multiplication/modulo, + and cross-type comparison edge cases. 
+ +tests: + # --- uint64 boundary with signed: 2^63-1 is max safe value --- + + - name: "uint64 above int64 max in subtraction errors" + mapping: | + root.result = 0 - "9223372036854775808".uint64() + skip: "V2-only: uint64 overflow semantics not checked in V1" + + - name: "uint64 value 1 in mixed arithmetic works" + mapping: | + root.result = 10 - 3.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 zero in mixed arithmetic works" + mapping: | + root.result = 5 + 0.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int32 + uint64 exceeding int64 max errors" + mapping: | + root.result = 1.int32() + "9223372036854775808".uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Large integers with float: 2^53 boundary --- + + - name: "int64 at 2^53 with float succeeds" + mapping: | + root.result = 9007199254740992 + 0.0 + output: {"result": 9007199254740992.0} + + - name: "int64 at 2^53+1 with float errors" + mapping: | + root.result = 9007199254740993 + 0.0 + output: {"result": 9007199254740992.0} # V1: silently degrades to float64; no safety check + + - name: "negative int64 at -2^53 with float succeeds" + mapping: | + root.result = -9007199254740992 + 0.0 + output: {"result": -9007199254740992.0} + + - name: "negative int64 at -(2^53+1) with float errors" + mapping: | + root.result = -9007199254740993 + 0.0 + output: {"result": -9007199254740992.0} # V1: silently degrades to float64; no safety check + + - name: "int64 * float64 checks promotion" + mapping: | + root.result = 9007199254740993 * 1.0 + output: {"result": 9007199254740992.0} # V1: silent float64 precision loss + + - name: "int64 at safe range with float multiplication" + mapping: | + root.result = 1000000 * 1.5 + output: {"result": 1500000.0} + + # --- Promotion in subtraction --- + + - name: "int32 - uint32 promotes to int64" + mapping: | + root.result = 100.int32() - 30.uint32() + skip: 
"V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 - int32 promotes to int64" + mapping: | + root.result = 100.uint32() - 30.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "float64 - int64 promotes to float64" + mapping: | + root.result = 10.5 - 3 + output: {"result": 7.5} + + - name: "int64 - float64 promotes to float64" + mapping: | + root.result = 10 - 3.5 + output: {"result": 6.5} + + # --- Promotion in multiplication --- + + - name: "int32 * int64 promotes to int64" + mapping: | + root.result = 5.int32() * 10 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int32 * float64 promotes to float64" + mapping: | + root.result = 5.int32() * 2.5 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint32 * int64 promotes to int64" + mapping: | + root.result = 5.uint32() * 10 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 * uint64 stays uint64" + mapping: | + root.result = 5.uint64() * 10.uint64() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Promotion in modulo --- + + - name: "int32 % int64 promotes to int64" + mapping: | + root.result = 7.int32() % 3 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "int64 % float64 promotes to float64" + mapping: | + root.result = 7 % 2.5 + output: {"result": 1} # V1 quirk: `%` silently truncates float operands to int64 (7 % 2 = 1), no promotion to float + + - name: "uint32 % int32 promotes to int64" + mapping: | + root.result = 7.uint32() % 3.int32() + skip: "V2-only: specific numeric width not distinguishable in V1" + + # --- Promotion in comparison edge cases --- + + - name: "uint64 at int64 max compared with int64" + mapping: | + root.result = 9223372036854775807.uint64() == 9223372036854775807 + skip: "V2-only: specific numeric width not distinguishable in V1" + + - name: "uint64 above int64 max in 
comparison errors" + mapping: | + root.result = "9223372036854775808".uint64() > 0 + skip: "V2-only: uint64 overflow semantics not checked in V1" + + - name: "int64 at float64 limit compared with float" + mapping: | + root.result = 9007199254740992 == 9007199254740992.0 + output: {"result": true} + + - name: "int64 above float64 limit compared with float errors" + mapping: | + root.result = 9007199254740993 == 9007199254740993.0 + output: {"result": true} # V1: comparison via degradation, silent precision loss still equates + + # --- Cross-family comparison always false --- + + - name: "int vs string comparison is false" + mapping: | + root.result = 5 == "5" + output: {"result": false} + + - name: "float vs string comparison is false" + mapping: | + root.result = 5.0 == "5.0" + output: {"result": false} + + - name: "bool vs int comparison is false" + mapping: | + root.result = true == 1 + output: {"result": true} # V1 quirk: asymmetric equality. LHS bool coerces RHS number→bool, so `true == 1` is true. Compare to `1 == true` which is false. 
+ + - name: "null vs empty string comparison is false" + mapping: | + root.result = null == "" + output: {"result": false} + + - name: "null vs zero comparison is false" + mapping: | + root.result = null == 0 + output: {"result": false} + + - name: "null vs false comparison is false" + mapping: | + root.result = null == false + output: {"result": false} + + - name: "array vs string comparison is false" + mapping: | + root.result = [] == "" + output: {"result": false} diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/precedence.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/precedence.yaml new file mode 100644 index 000000000..cd8feb31f --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/precedence.yaml @@ -0,0 +1,265 @@ +description: "Operator precedence, associativity, non-associative parse errors, unary minus + method call trap" + +tests: + # --- Multiplicative before additive --- + - name: "multiply_before_add" + mapping: | + root = 2 + 3 * 4 + output: 14 + + - name: "divide_before_subtract" + mapping: | + root = 10 - 6 / 2 + output: 7.0 + + - name: "modulo_before_add" + mapping: | + root = 10 + 7 % 3 + output: 11 + + - name: "parens_override_mult_add" + mapping: | + root = (2 + 3) * 4 + output: 20 + + # --- Additive before comparison --- + - name: "add_before_greater_than" + mapping: | + root = 3 + 2 > 4 + output: true + + - name: "add_before_less_than" + mapping: | + root = 1 + 1 < 3 + output: true + + - name: "subtract_before_greater_equal" + mapping: | + root = 10 - 5 >= 5 + output: true + + - name: "subtract_before_less_equal" + mapping: | + root = 10 - 5 <= 4 + output: false + + # --- Comparison before equality --- + - name: "comparison_before_equality" + mapping: | + root = 3 > 2 == true + output: true + + - name: "comparison_before_inequality" + mapping: | + root = 3 < 2 != true + output: true + + # --- Equality before logical AND --- + - name: "equality_before_and" + mapping: | + root = 1 == 1 && 2 == 2 + 
output: true + + - name: "inequality_before_and" + mapping: | + root = 1 != 2 && 3 != 4 + output: true + + # --- V1 WARNING: && and || share the same precedence, resolved left-to-right --- + # V1: `true || false && false` parses as `(true || false) && false` = false, NOT `true || (false && false)` = true. + - name: "and_before_or" + mapping: | + root = true || false && false + output: false # V1: (true || false) && false + + - name: "and_before_or_reversed" + mapping: | + root = false && false || true + output: true # V1: (false && false) || true + + - name: "parens_override_and_or" + mapping: | + root = (true || false) && false + output: false + + # --- Unary minus precedence --- + - name: "unary_minus_before_multiply" + mapping: | + root = -3 * 2 + output: -6 + + - name: "unary_minus_before_add" + mapping: | + root = -3 + 5 + output: 2 + + - name: "unary_not_before_and" + mapping: | + root = !false && true + output: true + + - name: "unary_not_before_or" + mapping: | + root = !true || true + output: true + + # --- Method calls bind tighter than unary minus --- + - name: "unary_minus_method_call_trap" + mapping: | + root = -10.string() + error: "types" # FIXME-v1: verify — V1: `-` on a string value is a type error + + - name: "unary_minus_method_call_with_parens" + mapping: | + root = (-10).string() + output: "-10" + + - name: "unary_minus_method_chain_trap" + mapping: | + root = -5.float64() + skip: "V2-only: .float64() method does not exist in V1 (use .number() instead)" + + # --- Field access / indexing binds tightest --- + - name: "field_access_before_arithmetic" + mapping: | + root = this.a + this.b * this.c + input: {"a": 2, "b": 3, "c": 4} + output: 14 + + - name: "indexing_before_arithmetic" + mapping: | + root = this.items.0 + this.items.1 + input: {"items": [10, 20]} + output: 30 + + - name: "method_call_before_arithmetic" + mapping: | + root = this.text.length() + 1 + input: {"text": "hello"} + output: 6 + + - name: "method_call_before_comparison" + 
mapping: | + root = this.text.length() > 3 + input: {"text": "hello"} + output: true + + # --- Left-associativity of arithmetic --- + - name: "subtraction_left_associative" + mapping: | + root = 10 - 5 - 2 + output: 3 + + - name: "division_left_associative" + mapping: | + root = 20 / 4 / 2 + output: 2.5 + + - name: "modulo_left_associative" + mapping: | + root = 17 % 10 % 5 + output: 2 + + - name: "addition_left_associative" + mapping: | + root = 1 + 2 + 3 + output: 6 + + - name: "multiplication_left_associative" + mapping: | + root = 2 * 3 * 4 + output: 24 + + # --- Non-associative: V1 parses chained comparisons, then rejects them with a type error (at compile time here, because literal operands are constant-folded) --- + - name: "chained_less_than" + mapping: | + root = 1 < 2 < 3 + # V1 constant-folds `1 < 2` → `true` at compile time, then rejects `true < 3` at compile time. + compile_error: "cannot compare types bool" + + - name: "chained_greater_than" + mapping: | + root = 3 > 2 > 1 + # V1 constant-folds `3 > 2` → `true` at compile time, then rejects `true > 1` at compile time. + compile_error: "cannot compare types bool" + + - name: "chained_less_equal" + mapping: | + root = 1 <= 2 <= 3 + compile_error: "cannot compare types bool" + + - name: "chained_greater_equal" + mapping: | + root = 3 >= 2 >= 1 + compile_error: "cannot compare types bool" + + - name: "chained_mixed_comparison" + mapping: | + root = 1 < 2 > 0 + compile_error: "cannot compare types bool" + + # --- Non-associative: equality chaining parses and runs in V1 --- + - name: "chained_equality" + mapping: | + root = 1 == 1 == true + output: true # V1: `(1==1)==true` → `true==true` → true + + - name: "chained_inequality" + mapping: | + root = 1 != 2 != 3 + # V1: `(1 != 2) != 3` → `true != 3`. Bool != number is asymmetric: LHS bool coerces RHS→bool (3→true), + # so `true != true` → false. 
+ output: false + + # --- Correct way to express range checks --- + - name: "range_check_with_and" + mapping: | + root = 1 < 2 && 2 < 3 + output: true + + - name: "equality_chain_with_and" + mapping: | + root = 1 == 1 && 2 == 2 + output: true + + # --- Complex precedence combinations --- + - name: "full_precedence_chain" + mapping: | + root = 2 + 3 * 4 > 10 == true && !false + output: true + + - name: "arithmetic_in_comparison_in_logical" + mapping: | + root = 1 + 2 >= 3 && 4 * 2 <= 8 || false + output: true + + - name: "nested_parens_override_everything" + mapping: | + root = ((2 + 3) * (4 - 1)) > ((10 / 2) + 1) + output: true + + - name: "unary_minus_in_complex_expression" + mapping: | + root = -2 * 3 + 4 + output: -2 + + - name: "double_unary_minus" + mapping: | + root = - -5 + output: 5 + + - name: "unary_minus_with_parens_and_method" + mapping: | + root = (-3 * 2).string() + output: "-6" + + # --- Logical left-associativity --- + - name: "and_left_associativity_with_short_circuit" + mapping: | + root = true && false && throw("should not reach") + output: false + + - name: "or_left_associativity_with_short_circuit" + mapping: | + root = false || true || throw("should not reach") + output: true diff --git a/internal/bloblang2/migrator/v1spec/tests/operators/string_concat.yaml b/internal/bloblang2/migrator/v1spec/tests/operators/string_concat.yaml new file mode 100644 index 000000000..ad6a31a33 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/operators/string_concat.yaml @@ -0,0 +1,201 @@ +description: "String and bytes concatenation with + operator — same-type concat, cross-type errors, explicit conversion" + +tests: + # --- String + String --- + - name: "concat_two_strings" + mapping: | + root = "hello" + " world" + output: "hello world" + + - name: "concat_empty_left" + mapping: | + root = "" + "world" + output: "world" + + - name: "concat_empty_right" + mapping: | + root = "hello" + "" + output: "hello" + + - name: "concat_both_empty" + mapping: | 
+ root = "" + "" + output: "" + + - name: "concat_multiple_strings" + mapping: | + root = "a" + "b" + "c" + output: "abc" + + - name: "concat_unicode_strings" + mapping: | + root = "café" + " ☕" + output: "café ☕" + + - name: "concat_emoji_strings" + mapping: | + root = "hello " + "😀" + output: "hello 😀" + + - name: "concat_with_escape_sequences" + mapping: | + root = "line1\n" + "line2" + output: "line1\nline2" + + - name: "concat_from_input" + mapping: | + root = this.first + " " + this.last + input: {"first": "John", "last": "Doe"} + output: "John Doe" + + # --- Bytes + Bytes (V1: coerces both to string and concatenates; result is string, not bytes) --- + - name: "concat_two_bytes" + mapping: | + root = "hello".bytes() + " world".bytes() + output: "hello world" # V1: bytes+bytes → string concatenation + + - name: "concat_empty_bytes_left" + mapping: | + root = "".bytes() + "world".bytes() + output: "world" # V1: bytes+bytes → string + + - name: "concat_empty_bytes_right" + mapping: | + root = "hello".bytes() + "".bytes() + output: "hello" # V1: bytes+bytes → string + + - name: "concat_both_empty_bytes" + mapping: | + root = "".bytes() + "".bytes() + output: "" # V1: bytes+bytes → string + + - name: "concat_multiple_bytes" + mapping: | + root = "a".bytes() + "b".bytes() + "c".bytes() + output: "abc" # V1: bytes+bytes → string + + # --- Cross-type: V1 string + number is error (same as V2) --- + - name: "string_plus_int" + mapping: | + root = "count: " + 5 + compile_error: "cannot add types string" # V1 constant-folds literal string + number at compile time + + - name: "int_plus_string" + mapping: | + root = 5 + " items" + compile_error: "cannot add types number" # V1 constant-folds literal number + string at compile time + + - name: "string_plus_float" + mapping: | + root = "value: " + 3.14 + compile_error: "cannot add types string" # V1 constant-folds literal string + number at compile time + + - name: "float_plus_string" + mapping: | + root = 3.14 + " meters" + 
compile_error: "cannot add types number" # V1 constant-folds literal number + string at compile time + + # --- Cross-type: V1 string + bytes works (both coerced to string) --- + - name: "string_plus_bytes" + mapping: | + root = "hello" + "world".bytes() + output: "helloworld" # V1: bytes operand → string concatenation + + - name: "bytes_plus_string" + mapping: | + root = "hello".bytes() + "world" + output: "helloworld" # V1: bytes operand → string concatenation + + # --- Cross-type errors: string + bool --- + - name: "string_plus_bool" + mapping: | + root = "flag: " + true + compile_error: "cannot add types string" # V1 constant-folds literal string + bool at compile time + + - name: "bool_plus_string" + mapping: | + root = true + " is set" + compile_error: "cannot add types bool" # V1 constant-folds literal bool + string at compile time + + # --- Cross-type errors: string + null --- + - name: "string_plus_null" + mapping: | + root = "value: " + null + compile_error: "cannot add types string" # V1 constant-folds literal string + null at compile time + + - name: "null_plus_string" + mapping: | + root = null + "value" + compile_error: "cannot add types null" # V1 constant-folds literal null + string at compile time + + # --- Cross-type errors: string + array --- + - name: "string_plus_array" + mapping: | + root = "items: " + [1, 2, 3] + compile_error: "cannot add types string" # V1 constant-folds literal string + array at compile time + + # --- Cross-type errors: string + object --- + - name: "string_plus_object" + mapping: | + root = "data: " + {"a": 1} + compile_error: "cannot add types string" # V1 constant-folds literal string + object at compile time + + # --- Cross-type errors: bytes + number --- + - name: "bytes_plus_int" + mapping: | + root = "data".bytes() + 42 + error: "types" # FIXME-v1: verify + + - name: "int_plus_bytes" + mapping: | + root = 42 + "data".bytes() + error: "types" # FIXME-v1: verify + + # --- Cross-type errors: bytes + bool --- + - name: 
"bytes_plus_bool" + mapping: | + root = "data".bytes() + true + error: "types" # FIXME-v1: verify + + # --- Cross-type errors: bytes + null --- + - name: "bytes_plus_null" + mapping: | + root = "data".bytes() + null + error: "types" # FIXME-v1: verify + + # --- Explicit conversion with .string() --- + - name: "int_to_string_concat" + mapping: | + root = 5.string() + "3" + output: "53" + + - name: "string_concat_int_rhs_converted" + mapping: | + root = "count: " + 42.string() + output: "count: 42" + + - name: "float_to_string_concat" + mapping: | + root = 3.14.string() + " meters" + output: "3.14 meters" + + - name: "bool_to_string_concat" + mapping: | + root = "flag is " + true.string() + output: "flag is true" + + - name: "null_to_string_concat" + mapping: | + root = "value is " + null.string() + output: "value is null" + + - name: "multiple_conversions" + mapping: | + root = "sum: " + 2.string() + " + " + 3.string() + " = " + 5.string() + output: "sum: 2 + 3 = 5" + + # --- Left-associativity of string concat --- + - name: "concat_left_associative" + mapping: | + root = "a" + "b" + "c" + "d" + output: "abcd" diff --git a/internal/bloblang2/migrator/v1spec/tests/optimizations/constant_folding.yaml b/internal/bloblang2/migrator/v1spec/tests/optimizations/constant_folding.yaml new file mode 100644 index 000000000..b187c78d9 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/optimizations/constant_folding.yaml @@ -0,0 +1,240 @@ +description: "Constant folding: literal-only expressions evaluated at compile time" + +tests: + # --- Integer arithmetic --- + + - name: "fold integer addition" + mapping: | + root.v = 2 + 3 + output: {"v": 5} + + - name: "fold integer subtraction" + mapping: | + root.v = 10 - 4 + output: {"v": 6} + + - name: "fold integer multiplication" + mapping: | + root.v = 3 * 7 + output: {"v": 21} + + - name: "fold integer modulo" + mapping: | + root.v = 17 % 5 + output: {"v": 2} + + - name: "fold chained integer arithmetic" + mapping: | + 
root.v = 2 + 3 * 4 + output: {"v": 14} + + - name: "fold left-associative subtraction" + mapping: | + root.v = 10 - 3 - 2 + output: {"v": 5} + + - name: "fold negative result" + mapping: | + root.v = 3 - 10 + output: {"v": -7} + + - name: "fold zero" + mapping: | + root.v = 5 - 5 + output: {"v": 0} + + # --- Integer overflow not folded (runtime error) --- + + - name: "integer overflow not folded" + mapping: | + root.v = 9223372036854775807 + 1 + output: {"v": -9223372036854775808} # FIXME-v1: verify — V1 integer overflow wraps silently (§14.26), no error + # V2-only: V2 detects overflow as error; V1 wraps per Go int64 semantics + + - name: "integer multiplication overflow not folded" + mapping: | + root.v = 9223372036854775807 * 2 + output: {"v": -2} # FIXME-v1: verify — V1 wraps silently + # V2-only: V2 detects overflow as error; V1 wraps per Go int64 semantics + + # --- Float arithmetic --- + + - name: "fold float addition" + mapping: | + root.v = 1.5 + 2.5 + output: {"v": 4.0} + + - name: "fold float subtraction" + mapping: | + root.v = 10.0 - 3.5 + output: {"v": 6.5} + + - name: "fold float multiplication" + mapping: | + root.v = 2.0 * 3.5 + output: {"v": 7.0} + + - name: "fold float division" + mapping: | + root.v = 10.0 / 4.0 + output: {"v": 2.5} + + - name: "fold float modulo" + mapping: | + root.v = 7.5 % 2.0 + skip: "V1 modulo requires integer operands (§5.3) — float % float is a type error" + + - name: "fold mixed int and float addition" + mapping: | + root.v = 5 + 3.0 + output: {"v": 8.0} + + - name: "division by zero not folded" + input: {"d": 0.0} + mapping: | + root.v = 10.0 / this.d + error: "divide by zero" + # V1 folds literal 10.0 / 0.0 at parse-time (compile error); use runtime divisor to exercise runtime path + + # --- Large int + float precision loss not folded --- + + - name: "large int plus float not folded due to precision" + mapping: | + root.v = 9007199254740993 + 1.0 + output: {"v": 9007199254740992.0} # V1 coerces int to float64, 
9007199254740993 is not representable as float64 so becomes 9007199254740992, then +1.0 = 9007199254740992 (precision already lost) + # V2-only: V2 flags precision loss as error; V1 silently coerces + + - name: "safe int plus float is folded" + mapping: | + root.v = 100 + 1.5 + output: {"v": 101.5} + + # --- String concatenation --- + + - name: "fold string concatenation" + mapping: | + root.v = "hello" + " " + "world" + output: {"v": "hello world"} + + - name: "fold empty string concatenation" + mapping: | + root.v = "" + "abc" + output: {"v": "abc"} + + - name: "fold raw string concatenation" + mapping: | + root.v = """raw""" + " normal" + output: {"v": "raw normal"} + + # --- Boolean logic --- + + - name: "fold true AND true" + mapping: | + root.v = true && true + output: {"v": true} + + - name: "fold true AND false" + mapping: | + root.v = true && false + output: {"v": false} + + - name: "fold false OR true" + mapping: | + root.v = false || true + output: {"v": true} + + - name: "fold false OR false" + mapping: | + root.v = false || false + output: {"v": false} + + - name: "fold boolean equality" + mapping: | + root.v = true == true + output: {"v": true} + + - name: "fold boolean inequality" + mapping: | + root.v = true != false + output: {"v": true} + + # --- Unary --- + + - name: "fold unary minus on int" + mapping: | + root.v = -42 + output: {"v": -42} + + - name: "fold unary minus on float" + mapping: | + root.v = -3.14 + output: {"v": -3.14} + + - name: "fold unary not on true" + mapping: | + root.v = !true + output: {"v": false} + + - name: "fold unary not on false" + mapping: | + root.v = !false + output: {"v": true} + + # --- Equality of same-type literals --- + + - name: "fold string equality" + mapping: | + root.v = "abc" == "abc" + output: {"v": true} + + - name: "fold string inequality" + mapping: | + root.v = "abc" != "xyz" + output: {"v": true} + + - name: "fold integer equality" + mapping: | + root.v = 42 == 42 + output: {"v": true} + + - name: 
"fold integer inequality" + mapping: | + root.v = 42 != 43 + output: {"v": true} + + - name: "fold null equality" + mapping: | + root.v = null == null + output: {"v": true} + + # --- Cross-type equality --- + + - name: "fold cross-type equality int vs string" + mapping: | + root.v = 42 == "42" + output: {"v": false} + + - name: "fold cross-type inequality bool vs null" + mapping: | + root.v = true != null + output: {"v": true} + + # --- Non-foldable expressions preserved --- + + - name: "runtime value prevents folding" + input: {"x": 5} + mapping: | + root.v = this.x + 3 + output: {"v": 8} + + - name: "one literal one runtime not folded" + input: {"x": 10} + mapping: | + root.v = 2 * this.x + output: {"v": 20} + + - name: "variable prevents folding" + mapping: | + let x = 5 + root.v = $x + 3 + output: {"v": 8} diff --git a/internal/bloblang2/migrator/v1spec/tests/optimizations/dead_code_elimination.yaml b/internal/bloblang2/migrator/v1spec/tests/optimizations/dead_code_elimination.yaml new file mode 100644 index 000000000..eb9d2115c --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/optimizations/dead_code_elimination.yaml @@ -0,0 +1,150 @@ +description: "Dead code elimination: unreachable if/match branches with literal boolean conditions are pruned" + +tests: + # --- If expression: literal true --- + + - name: "if true returns then branch" + mapping: | + root.v = if true { "yes" } else { "no" } + output: {"v": "yes"} + + - name: "if true with complex else (else not evaluated)" + mapping: | + root.v = if true { 42 } else { throw("should not reach") } + output: {"v": 42} + + # --- If expression: literal false --- + + - name: "if false returns else branch" + mapping: | + root.v = if false { "yes" } else { "no" } + output: {"v": "no"} + + - name: "if false with complex then (then not evaluated)" + mapping: | + root.v = if false { throw("should not reach") } else { 42 } + output: {"v": 42} + + - name: "if false without else produces void" + mapping: | + 
root.before = "exists" + root.v = if false { "value" } + output: {"before": "exists"} # V1 elides the field entirely when the if-expression has no else branch and condition is false + + # --- Else-if chains --- + + - name: "else-if chain with literal true in first branch" + mapping: | + root.v = if true { "first" } else if true { "second" } else { "third" } + output: {"v": "first"} + + - name: "else-if chain with false then true" + mapping: | + root.v = if false { "first" } else if true { "second" } else { "third" } + output: {"v": "second"} + + - name: "else-if chain all false reaches else" + mapping: | + root.v = if false { "first" } else if false { "second" } else { "third" } + output: {"v": "third"} + + - name: "false branch with throw is eliminated" + mapping: | + root.v = if false { throw("dead") } else if true { "alive" } else { throw("also dead") } + output: {"v": "alive"} + + # --- If statement: literal true --- + + - name: "if true statement executes body" + mapping: | + if true { + root.v = "yes" + } + output: {"v": "yes"} + + - name: "if true statement eliminates else" + mapping: | + if true { + root.v = "yes" + } else { + root.v = "no" + } + output: {"v": "yes"} + + # --- If statement: literal false --- + + - name: "if false statement skips body" + mapping: | + root.before = "exists" + if false { + root.v = "yes" + } + output: {"before": "exists"} + + - name: "if false statement executes else" + mapping: | + if false { + root.v = "yes" + } else { + root.v = "no" + } + output: {"v": "no"} + + # --- Non-boolean literal conditions preserved for runtime error --- + + - name: "string condition is runtime error not eliminated" + mapping: | + root.v = if "hello" { 1 } else { 2 } + error: "bool" + + - name: "integer condition is runtime error not eliminated" + mapping: | + root.v = if 42 { 1 } else { 2 } + error: "bool" + + - name: "null condition takes else branch not eliminated" + mapping: | + root.v = if null { 1 } else { 2 } + output: {"v": 2} # V1 treats 
null as falsy in if-conditions (unlike string/int which error) + + - name: "string condition in statement is runtime error" + mapping: | + if 42 { + root.v = "yes" + } + error: "bool" + + # --- Dynamic conditions not affected --- + + - name: "dynamic condition works normally" + mapping: | + root.v = if this.flag { "yes" } else { "no" } + cases: + - name: "true branch" + input: {"flag": true} + output: {"v": "yes"} + - name: "false branch" + input: {"flag": false} + output: {"v": "no"} + + # --- Folded condition feeds into DCE --- + + - name: "folded true condition triggers DCE" + mapping: | + root.v = if !false { "yes" } else { throw("dead") } + output: {"v": "yes"} + + - name: "folded false condition triggers DCE" + mapping: | + root.v = if !true { throw("dead") } else { "no" } + output: {"v": "no"} + + - name: "folded boolean expression triggers DCE" + mapping: | + root.v = if true && true { "yes" } else { throw("dead") } + output: {"v": "yes"} + + - name: "folded false boolean expression triggers DCE" + mapping: | + root.v = if true && false { throw("dead") } else { "no" } + output: {"v": "no"} diff --git a/internal/bloblang2/migrator/v1spec/tests/optimizations/path_collapse.yaml b/internal/bloblang2/migrator/v1spec/tests/optimizations/path_collapse.yaml new file mode 100644 index 000000000..45e8e6f37 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/optimizations/path_collapse.yaml @@ -0,0 +1,171 @@ +description: "PathExpr collapse: chains of field access, indexing, and method calls are collapsed into flat path expressions" + +tests: + # --- Input field chains --- + + - name: "input single field" + input: {"x": 1} + mapping: | + root.v = this.x + output: {"v": 1} + + - name: "input two-level field chain" + input: {"a": {"b": 2}} + mapping: | + root.v = this.a.b + output: {"v": 2} + + - name: "input deep field chain" + input: {"a": {"b": {"c": {"d": {"e": 42}}}}} + mapping: | + root.v = this.a.b.c.d.e + output: {"v": 42} + + - name: "input field chain 
with index" + input: {"items": [10, 20, 30]} + mapping: | + root.v = this.items.index(1) + output: {"v": 20} + + - name: "input field chain with nested index" + input: {"data": {"items": [{"name": "Alice"}, {"name": "Bob"}]}} + mapping: | + root.v = this.data.items.index(1).name + output: {"v": "Bob"} + + - name: "input field chain with method call" + input: {"name": "Alice"} + mapping: | + root.v = this.name.uppercase() + output: {"v": "ALICE"} + + - name: "input field chain with method and further field access" + input: {"data": [3, 1, 2]} + mapping: | + root.v = this.data.sort().index(0) + output: {"v": 1} + + - name: "input field chain with multiple methods" + input: {"text": " Hello World "} + mapping: | + root.v = this.text.trim().lowercase() + output: {"v": "hello world"} + + # --- Null-safe chains --- + # V1 has no `?.` null-safe operator. Path access into null already yields null (§12.5), + # but method calls on null typically error. These tests don't translate cleanly. + + - name: "null-safe field chain short-circuits" + input: {"user": null} + mapping: | + root.v = this.user?.name?.trim() + skip: "V1 has no null-safe `?.` operator; closest equivalent would be nested .or(null) or try/catch" + + - name: "null-safe index chain short-circuits" + input: {"items": null} + mapping: | + root.v = this.items?.index(0)?.name + skip: "V1 has no null-safe `?.` operator" + + - name: "null-safe method chain short-circuits" + input: {"value": null} + mapping: | + root.v = this.value?.uppercase() + skip: "V1 has no null-safe `?.` operator; method on null would error" + + - name: "mixed null-safe and regular access" + input: {"user": {"address": null}} + mapping: | + root.v = this.user.address.city + output: {"v": null} # V1 path access into null yields null without needing `?.` (§12.5) + + # --- Output field chains --- + + - name: "output field chain read after write" + mapping: | + root.user = {"name": "Alice", "age": 30} + root.v = root.user.name + output: {"user": 
{"name": "Alice", "age": 30}, "v": "Alice"} + + - name: "output field chain with method" + mapping: | + root.data = {"name": "Alice"} + root.v = root.data.keys().length() + output: {"data": {"name": "Alice"}, "v": 1} + + # --- Variable field chains --- + + - name: "variable field chain" + mapping: | + let user = {"name": "Alice", "address": {"city": "London"}} + root.v = $user.address.city + output: {"v": "London"} + + - name: "variable field chain with index" + mapping: | + let items = [10, 20, 30] + root.v = $items.index(2) + output: {"v": 30} + + - name: "variable field chain with method" + mapping: | + let name = "hello" + root.v = $name.uppercase() + output: {"v": "HELLO"} + + # --- Input metadata chains --- + + - name: "input metadata field chain" + input_metadata: {"routing": {"region": "us-west"}} + mapping: | + root.v = @routing.region + output: {"v": "us-west"} + output_metadata: {"routing": {"region": "us-west"}} # V1 preserves input metadata on the message part when not modified + + - name: "input metadata with method" + input_metadata: {"topic": "events"} + mapping: | + root.v = @topic.uppercase() + output: {"v": "EVENTS"} + output_metadata: {"topic": "events"} # V1 preserves input metadata when not modified + + # --- Mixed chain with dynamic index --- + + - name: "chain with dynamic string index" + input: {"data": {"x": 1, "y": 2}} + mapping: | + let key = "y" + root.v = this.data.get($key) + output: {"v": 2} # FIXME-v1: verify — V1 has no `[expr]` indexing; `.get(path)` is the closest equivalent + + - name: "chain with negative index" + input: {"items": [1, 2, 3]} + mapping: | + root.v = this.items.index(-1) + output: {"v": 3} + + # --- Chain with lambda method --- + + - name: "chain with filter method" + input: {"items": [1, -2, 3, -4]} + mapping: | + root.v = this.items.filter(x -> x > 0) + output: {"v": [1, 3]} + + - name: "chain with map and further access" + input: {"items": [1, 2, 3]} + mapping: | + root.v = this.items.map_each(x -> x * 
2).length() + output: {"v": 3} + + # --- Non-collapsible roots are preserved --- + + - name: "method call on literal is not collapsed" + mapping: | + root.v = "hello".uppercase() + output: {"v": "HELLO"} + + - name: "method call on function result is not collapsed" + mapping: | + root.v = uuid_v4().length() + output: {"v": 36} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/any_all_methods.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/any_all_methods.yaml new file mode 100644 index 000000000..544d6fec1 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/any_all_methods.yaml @@ -0,0 +1,98 @@ +description: > + .any() and .all() methods — short-circuit semantics (required), + empty array behavior, and boolean return enforcement. + +tests: + # --- .any() basic --- + + - name: "any returns true when one element matches" + mapping: | + root.v = [1, 2, 3, 4, 5].any(x -> x > 4) + output: {"v": true} + + - name: "any returns false when no element matches" + mapping: | + root.v = [1, 2, 3].any(x -> x > 10) + output: {"v": false} + + - name: "any on empty array returns false" + mapping: | + root.v = [].any(x -> true) + output: {"v": false} + + - name: "any returns true on first match" + mapping: | + root.v = [true, true, true].any(x -> x) + output: {"v": true} + + # --- .all() basic --- + + - name: "all returns true when all match" + mapping: | + root.v = [2, 4, 6].all(x -> x % 2 == 0) + output: {"v": true} + + - name: "all returns false when one doesn't match" + mapping: | + root.v = [2, 3, 6].all(x -> x % 2 == 0) + output: {"v": false} + + - name: "all on empty array returns true" + skip: "V1 .all() on empty array returns false (unlike V2 which returns true by vacuous truth)" + + # --- Short-circuit required --- + + - name: "any short-circuits on first true (throw not reached)" + mapping: | + root.v = [1, 2, 3].any(x -> if x == 1 { true } else { throw("boom") }) + output: {"v": true} + + - name: "all short-circuits on first false (throw 
not reached)" + mapping: | + root.v = [1, 2, 3].all(x -> if x == 1 { false } else { throw("boom") }) + output: {"v": false} + + # --- Complex predicates --- + + - name: "any with object field access" + mapping: | + let items = [ + {"status": "active"}, + {"status": "pending"}, + {"status": "inactive"}, + ] + root.v = $items.any(item -> item.status == "pending") + output: {"v": true} + + - name: "all with method chain in predicate" + mapping: | + root.v = ["hello", "world", "test"].all(s -> s.length() >= 4) + output: {"v": true} + + - name: "all fails with short string" + mapping: | + root.v = ["hello", "hi", "test"].all(s -> s.length() >= 4) + output: {"v": false} + + # --- any and all with outer captures --- + + - name: "any with outer variable in predicate" + mapping: | + let min = 10 + root.v = [5, 15, 25].any(x -> x > $min) + output: {"v": true} + + - name: "all with outer variable in predicate" + mapping: | + let min = 0 + root.v = [5, 15, 25].all(x -> x > $min) + output: {"v": true} + + # --- Combined any/all --- + + - name: "any and all on same array" + mapping: | + let nums = [2, 4, 6, 8] + root.any_odd = $nums.any(x -> x % 2 != 0) + root.all_even = $nums.all(x -> x % 2 == 0) + output: {"any_odd": false, "all_even": true} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/array_modify.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/array_modify.yaml new file mode 100644 index 000000000..02f239bb1 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/array_modify.yaml @@ -0,0 +1,158 @@ +description: "Array modify methods — append, concat, without_index, join, collect" + +tests: + # --- append --- + + - name: "append to array" + mapping: | + root.result = [1, 2, 3].append(4) + output: {"result": [1, 2, 3, 4]} + + - name: "append to empty array" + mapping: | + root.result = [].append("first") + output: {"result": ["first"]} + + - name: "append null" + mapping: | + root.result = [1, 2].append(null) + output: {"result": [1, 2, 
null]} + + - name: "append array as single element" + mapping: | + root.result = [1, 2].append([3, 4]) + output: {"result": [1, 2, [3, 4]]} + + - name: "append does not modify original" + mapping: | + let arr = [1, 2] + let new = $arr.append(3) + root.original = $arr + root.new = $new + output: {"original": [1, 2], "new": [1, 2, 3]} + + - name: "append bool" + mapping: | + root.result = [1, "two"].append(true) + output: {"result": [1, "two", true]} + + # --- concat (V1: use .merge() which concatenates arrays) --- + + - name: "concat two arrays" + mapping: | + root.result = [1, 2].merge([3, 4]) + output: {"result": [1, 2, 3, 4]} + + - name: "concat with empty array" + mapping: | + root.result = [1, 2].merge([]) + output: {"result": [1, 2]} + + - name: "concat empty with non-empty" + mapping: | + root.result = [].merge([1, 2]) + output: {"result": [1, 2]} + + - name: "concat two empty arrays" + mapping: | + root.result = [].merge([]) + output: {"result": []} + + - name: "concat preserves element types" + mapping: | + root.result = ["a", "b"].merge([1, 2]) + output: {"result": ["a", "b", 1, 2]} + + - name: "concat does not modify original" + mapping: | + let a = [1, 2] + let b = [3, 4] + let c = $a.merge($b) + root.a = $a + root.b = $b + root.c = $c + output: {"a": [1, 2], "b": [3, 4], "c": [1, 2, 3, 4]} + + # --- without_index --- + + - name: "without_index removes element at index" + skip: "V1 has no equivalent for without_index" + + - name: "without_index first element" + skip: "V1 has no equivalent for without_index" + + - name: "without_index last element" + skip: "V1 has no equivalent for without_index" + + - name: "without_index negative index" + skip: "V1 has no equivalent for without_index" + + - name: "without_index negative index second to last" + skip: "V1 has no equivalent for without_index" + + - name: "without_index out of bounds positive is error" + skip: "V1 has no equivalent for without_index" + + - name: "without_index out of bounds negative is 
error" + skip: "V1 has no equivalent for without_index" + + - name: "without_index single element array" + skip: "V1 has no equivalent for without_index" + + - name: "without_index does not modify original" + skip: "V1 has no equivalent for without_index" + + # --- join --- + + - name: "join strings with delimiter" + mapping: | + root.result = ["a", "b", "c"].join(", ") + output: {"result": "a, b, c"} + + - name: "join with empty delimiter" + mapping: | + root.result = ["a", "b", "c"].join("") + output: {"result": "abc"} + + - name: "join empty array" + mapping: | + root.result = [].join(", ") + output: {"result": ""} + + - name: "join single element" + mapping: | + root.result = ["hello"].join(", ") + output: {"result": "hello"} + + - name: "join non-string element is error" + mapping: | + root.result = ["a", 2, "c"].join(", ") + error: "string" + + - name: "join with newline delimiter" + mapping: | + root.result = ["line1", "line2", "line3"].join("\n") + output: {"result": "line1\nline2\nline3"} + + # --- collect --- + + - name: "collect key-value pairs into object" + skip: "V1 has no equivalent for collect" + + - name: "collect empty array" + skip: "V1 has no equivalent for collect" + + - name: "collect duplicate keys last wins" + skip: "V1 has no equivalent for collect" + + - name: "collect missing key field is error" + skip: "V1 has no equivalent for collect" + + - name: "collect missing value field is error" + skip: "V1 has no equivalent for collect" + + - name: "collect non-object element is error" + skip: "V1 has no equivalent for collect" + + - name: "collect with mixed value types" + skip: "V1 has no equivalent for collect" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/array_query.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/array_query.yaml new file mode 100644 index 000000000..5ff8b5a5f --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/array_query.yaml @@ -0,0 +1,331 @@ +description: "Array query methods — any, 
all, find, contains, index_of, sum, min, max, fold" + +tests: + # --- any --- + + - name: "any returns true when some match" + mapping: | + root.result = [1, 2, 3, 4].any(x -> x > 3) + output: {"result": true} + + - name: "any returns false when none match" + mapping: | + root.result = [1, 2, 3].any(x -> x > 10) + output: {"result": false} + + - name: "any returns false for empty array" + mapping: | + root.result = [].any(x -> x > 0) + output: {"result": false} + + - name: "any short-circuits on first true" + mapping: | + root.result = [1, 2, 3].any(x -> if x == 1 { true } else { throw("should not reach") }) + output: {"result": true} + + - name: "any non-bool return is error" + mapping: | + root.result = [1, 2].any(x -> x * 2) + error: "bool" + + - name: "any void return is error" + skip: "V1 has no 'void' concept; if-without-else returns null" + + # --- all --- + + - name: "all returns true when all match" + mapping: | + root.result = [2, 4, 6].all(x -> x % 2 == 0) + output: {"result": true} + + - name: "all returns false when some do not match" + mapping: | + root.result = [2, 3, 6].all(x -> x % 2 == 0) + output: {"result": false} + + - name: "all returns true for empty array" + skip: "V1 .all() on empty array returns false (unlike V2 which returns true by vacuous truth)" + + - name: "all short-circuits on first false" + mapping: | + root.result = [1, 2, 3].all(x -> if x == 1 { false } else { throw("should not reach") }) + output: {"result": false} + + - name: "all non-bool return is error" + mapping: | + root.result = [1, 2].all(x -> x.string()) + error: "bool" + + - name: "all void return is error" + skip: "V1 has no 'void' concept; if-without-else returns null" + + # --- find (V2's find(lambda) returns matched element; V1 find_by returns index. Compose filter+index.) 
--- + + - name: "find returns first match" + mapping: | + root.result = [1, 2, 3, 4].filter(x -> x > 2).index(0) + output: {"result": 3} + + - name: "find returns void when no match" + skip: "V2-only: find returning void sentinel; V1 filter+index returns null, different semantics" + + - name: "find on empty array returns void" + skip: "V2-only: find returning void sentinel" + + - name: "find short-circuits on first match" + skip: "V1 composition filter(...).index(0) does not short-circuit" + + - name: "find void lambda return is error" + skip: "V1 has no 'void' concept" + + - name: "find void result works with or" + mapping: | + root.result = [1, 2, 3].filter(x -> x > 10).index(0).or("none") + output: {"result": "none"} + + - name: "find void result assigned to output field is void (no assignment)" + skip: "V2-only 'void' assignment semantics; V1 would assign null" + + # --- contains (array) --- + + - name: "contains finds integer" + mapping: | + root.result = [1, 2, 3].contains(2) + output: {"result": true} + + - name: "contains does not find missing element" + mapping: | + root.result = [1, 2, 3].contains(4) + output: {"result": false} + + - name: "contains finds string" + mapping: | + root.result = ["apple", "banana"].contains("banana") + output: {"result": true} + + - name: "contains on empty array" + mapping: | + root.result = [].contains(1) + output: {"result": false} + + - name: "contains with null" + mapping: | + root.result = [1, null, 3].contains(null) + output: {"result": true} + + - name: "contains with bool" + mapping: | + root.result = [true, false].contains(false) + output: {"result": true} + + - name: "contains type mismatch returns false" + mapping: | + root.result = [1, 2, 3].contains("2") + output: {"result": false} + + # --- index_of (array) --- + # V1 'find' method returns the index (matches V2's 'index_of' semantics). 
+ + - name: "index_of finds first occurrence" + mapping: | + root.result = [10, 20, 30, 20].find(20).number() + output: {"result": 1.0} + + - name: "index_of returns -1 when not found" + mapping: | + root.result = [10, 20, 30].find(99).number() + output: {"result": -1.0} + + - name: "index_of on empty array" + mapping: | + root.result = [].find(1).number() + output: {"result": -1.0} + + - name: "index_of with string" + mapping: | + root.result = ["a", "b", "c"].find("b").number() + output: {"result": 1.0} + + - name: "index_of type mismatch returns -1" + mapping: | + root.result = [1, 2, 3].find("1").number() + output: {"result": -1.0} + + # --- sum --- + + - name: "sum of integers" + mapping: | + root.result = [1, 2, 3, 4].sum() + output: {"result": 10.0} + + - name: "sum of empty array is zero int64" + mapping: | + root.result = [].sum() + output: {"result": 0.0} + + - name: "sum of floats" + mapping: | + root.result = [1.5, 2.5, 3.0].sum() + output: {"result": 7.0} + + - name: "sum promotes int and float pairwise" + mapping: | + root.result = [1, 2.5, 3].sum() + output: {"result": 6.5} + + - name: "sum non-numeric element is error" + mapping: | + root.result = [1, "two", 3].sum() + error: "expected number" + + - name: "sum single element" + mapping: | + root.result = [42].sum() + output: {"result": 42.0} + + - name: "sum single non-numeric element is error" + mapping: | + root.result = ["hello"].sum() + error: "expected number" + + - name: "sum single bool element is error" + mapping: | + root.result = [true].sum() + error: "expected number" + + # --- min (V1 min/max coerce everything to float64 and reject non-numeric) --- + + - name: "min of integers" + mapping: | + root.result = [3, 1, 4, 1, 5].min() + output: {"result": 1.0} + + - name: "min of floats" + mapping: | + root.result = [3.14, 1.0, 2.71].min() + output: {"result": 1.0} + + - name: "min of strings" + mapping: | + root.result = ["banana", "apple", "cherry"].min() + error: "number" + + - name: "min of 
empty array is error" + mapping: | + root.result = [].min() + error: "empty" + + - name: "min cross-family is error" + mapping: | + root.result = [1, "two"].min() + error: "number" + + - name: "min single element" + mapping: | + root.result = [42].min() + output: {"result": 42.0} + + - name: "min with numeric promotion" + mapping: | + root.result = [3, 1.5, 2].min() + output: {"result": 1.5} + + - name: "min large int64 mixed with float is error" + skip: "V1 min/max always coerce to float64; no exact-representation check" + + - name: "max large int64 mixed with float is error" + skip: "V1 min/max always coerce to float64; no exact-representation check" + + - name: "min mixed numeric result is promoted type" + mapping: | + root.result = [3, 1.5, 2].min().type() + output: {"result": "number"} + + - name: "max mixed numeric returns promoted type" + mapping: | + let result = [1.5, 3, 2].max() + root.value = $result + root.type = $result.type() + output: {"value": 3.0, "type": "number"} + + # --- max --- + + - name: "max of integers" + mapping: | + root.result = [3, 1, 4, 1, 5].max() + output: {"result": 5.0} + + - name: "max of floats" + mapping: | + root.result = [3.14, 1.0, 2.71].max() + output: {"result": 3.14} + + - name: "max of strings" + mapping: | + root.result = ["banana", "apple", "cherry"].max() + error: "number" + + - name: "max of empty array is error" + mapping: | + root.result = [].max() + error: "empty" + + - name: "max cross-family is error" + mapping: | + root.result = [1, "two"].max() + error: "number" + + - name: "max single element" + mapping: | + root.result = ["only"].max() + error: "number" + + # --- fold (V1 lambda is item -> item.tally + item.value, single-arg) --- + + - name: "fold sum with integer accumulator" + mapping: | + root.result = [1, 2, 3, 4].fold(0, item -> item.tally + item.value) + output: {"result": 10} + + - name: "fold string concatenation" + mapping: | + root.result = ["a", "b", "c"].fold("", item -> item.tally + item.value) 
+ output: {"result": "abc"} + + - name: "fold on empty array returns initial value" + mapping: | + root.result = [].fold(42, item -> item.tally + item.value) + output: {"result": 42} + + - name: "fold builds array" + mapping: | + root.result = [1, 2, 3].fold([], item -> item.tally.append(item.value * 10)) + output: {"result": [10, 20, 30]} + + - name: "fold builds object" + mapping: | + let pairs = [{"k": "a", "v": 1}, {"k": "b", "v": 2}] + root.result = $pairs.fold({}, item -> item.tally.merge({(item.value.k): item.value.v})) + output: {"result": {"a": 1, "b": 2}} + + - name: "fold with product" + mapping: | + root.result = [1, 2, 3, 4].fold(1, item -> item.tally * item.value) + output: {"result": 24} + + # --- Named arguments for lambda methods --- + + - name: "fold with named args reordered" + mapping: | + root.result = [1, 2, 3].fold(query: item -> item.tally + item.value, initial: 0) + output: {"result": 6} + + - name: "filter with named fn arg" + mapping: | + root.result = [1, 2, 3, 4].filter(test: x -> x > 2) + output: {"result": [3, 4]} + + - name: "slice with named args reordered" + mapping: | + root.result = [10, 20, 30, 40, 50].slice(high: 3, low: 1) + output: {"result": [20, 30]} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/array_transform.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/array_transform.yaml new file mode 100644 index 000000000..7aec1573a --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/array_transform.yaml @@ -0,0 +1,273 @@ +description: "Array transform methods — filter, map, sort, sort_by, flatten, unique, enumerate" + +tests: + # --- filter --- + + - name: "filter keeps matching elements" + mapping: | + root.result = [1, 2, 3, 4, 5].filter(x -> x > 3) + output: {"result": [4, 5]} + + - name: "filter with no matches returns empty array" + mapping: | + root.result = [1, 2, 3].filter(x -> x > 100) + output: {"result": []} + + - name: "filter on empty array returns empty array" + mapping: | + 
root.result = [].filter(x -> x > 0) + output: {"result": []} + + - name: "filter keeps all when all match" + mapping: | + root.result = [10, 20, 30].filter(x -> x > 0) + output: {"result": [10, 20, 30]} + + - name: "filter preserves original element types" + mapping: | + root.result = ["apple", "banana", "avocado"].filter(s -> s.has_prefix("a")) + output: {"result": ["apple", "avocado"]} + + - name: "filter non-bool return is error" + mapping: | + root.result = [1, 2, 3].filter(x -> x * 2) + output: {"result": []} + + - name: "filter void return is error" + skip: "V1 has no 'void' concept; if-without-else returns null, which is non-true so filter excludes" + + - name: "filter does not modify original array" + mapping: | + let arr = [1, 2, 3, 4] + let filtered = $arr.filter(x -> x > 2) + root.original = $arr + root.filtered = $filtered + output: {"original": [1, 2, 3, 4], "filtered": [3, 4]} + + # --- map (V2's map on arrays = V1's map_each) --- + + - name: "map doubles each element" + mapping: | + root.result = [1, 2, 3].map_each(x -> x * 2) + output: {"result": [2, 4, 6]} + + - name: "map on empty array returns empty array" + mapping: | + root.result = [].map_each(x -> x + 1) + output: {"result": []} + + - name: "map can change element types" + mapping: | + root.result = [1, 2, 3].map_each(x -> x.string()) + output: {"result": ["1", "2", "3"]} + + - name: "map deleted omits element" + mapping: | + root.result = [1, 2, 3, 4].map_each(x -> if x % 2 == 0 { x * 10 } else { deleted() }) + output: {"result": [20, 40]} + + - name: "map void return is error" + skip: "V1 has no 'void' concept; if-without-else returns null" + + - name: "map does not modify original array" + mapping: | + let arr = [1, 2, 3] + let mapped = $arr.map_each(x -> x + 10) + root.original = $arr + root.mapped = $mapped + output: {"original": [1, 2, 3], "mapped": [11, 12, 13]} + + - name: "map with block body" + skip: "V1 lambdas do not support block-with-let syntax" + + # --- sort --- + + - name: 
"sort integers ascending" + mapping: | + root.result = [3, 1, 4, 1, 5].sort() + output: {"result": [1, 1, 3, 4, 5]} + + - name: "sort empty array" + mapping: | + root.result = [].sort() + output: {"result": []} + + - name: "sort single element" + mapping: | + root.result = [42].sort() + output: {"result": [42]} + + - name: "sort strings lexicographic" + mapping: | + root.result = ["banana", "apple", "cherry"].sort() + output: {"result": ["apple", "banana", "cherry"]} + + - name: "sort floats ascending" + mapping: | + root.result = [3.14, 1.0, 2.71].sort() + output: {"result": [1.0, 2.71, 3.14]} + + - name: "sort numeric promotion int and float" + mapping: | + root.result = [3, 1.5, 2].sort() + output: {"result": [1.5, 2, 3]} + + - name: "sort is stable" + mapping: | + let items = [{"k": "a", "v": 2}, {"k": "b", "v": 1}, {"k": "c", "v": 2}] + root.result = $items.sort_by(x -> x.v).map_each(x -> x.k) + output: {"result": ["b", "a", "c"]} + + - name: "sort cross-family is error" + mapping: | + root.result = [1, "two", 3].sort() + error: "sort" + + - name: "sort booleans is error" + mapping: | + root.result = [true, false, true].sort() + error: "sort" + + - name: "sort nulls is error" + mapping: | + root.result = [null, null].sort() + error: "sort" + + - name: "sort single boolean is error" + mapping: | + root.result = [true].sort() + output: {"result": [true]} + + - name: "sort single null is error" + mapping: | + root.result = [null].sort() + output: {"result": [null]} + + - name: "sort single object is error" + mapping: | + root.result = [{"a": 1}].sort() + output: {"result": [{"a": 1}]} + + - name: "sort int64 above 2^53 mixed with float is error" + skip: "V1 sort does not check exact integer representation" + + - name: "sort NaN sorts after all values" + skip: "V2-only _type NaN literal shorthand; V1 has no NaN literal" + + # --- sort_by --- + + - name: "sort_by with key function" + mapping: | + let items = [{"name": "Charlie"}, {"name": "Alice"}, {"name": "Bob"}] 
+ root.result = $items.sort_by(x -> x.name) + output: {"result": [{"name": "Alice"}, {"name": "Bob"}, {"name": "Charlie"}]} + + - name: "sort_by numeric key" + mapping: | + let items = [{"score": 80}, {"score": 95}, {"score": 70}] + root.result = $items.sort_by(x -> x.score) + output: {"result": [{"score": 70}, {"score": 80}, {"score": 95}]} + + - name: "sort_by empty array" + mapping: | + root.result = [].sort_by(x -> x) + output: {"result": []} + + - name: "sort_by cross-family keys is error" + mapping: | + let items = [{"k": 1}, {"k": "two"}] + root.result = $items.sort_by(x -> x.k) + error: "sort" + + # --- flatten --- + + - name: "flatten nested arrays one level" + mapping: | + root.result = [[1, 2], [3, 4], [5]].flatten() + output: {"result": [1, 2, 3, 4, 5]} + + - name: "flatten only one level deep" + mapping: | + root.result = [[[1, 2]], [[3]]].flatten() + output: {"result": [[1, 2], [3]]} + + - name: "flatten non-arrays kept as-is" + mapping: | + root.result = [1, [2, 3], "hello", [4]].flatten() + output: {"result": [1, 2, 3, "hello", 4]} + + - name: "flatten empty inner arrays spliced as zero elements" + mapping: | + root.result = [1, [], 2, [], 3].flatten() + output: {"result": [1, 2, 3]} + + - name: "flatten empty array" + mapping: | + root.result = [].flatten() + output: {"result": []} + + - name: "flatten array of empty arrays" + mapping: | + root.result = [[], [], []].flatten() + output: {"result": []} + + - name: "flatten mixed types with nested arrays" + mapping: | + root.result = [null, [true, false], "abc"].flatten() + output: {"result": [null, true, false, "abc"]} + + # --- unique --- + + - name: "unique removes duplicates" + mapping: | + root.result = [1, 2, 2, 3, 1, 3].unique() + output: {"result": [1, 2, 3]} + + - name: "unique keeps first occurrence" + mapping: | + root.result = [3, 1, 2, 1, 3].unique() + output: {"result": [3, 1, 2]} + + - name: "unique on empty array" + mapping: | + root.result = [].unique() + output: {"result": []} + + - 
name: "unique strings" + mapping: | + root.result = ["a", "b", "a", "c", "b"].unique() + output: {"result": ["a", "b", "c"]} + + - name: "unique with key function" + mapping: | + let items = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}, {"id": 1, "v": "c"}] + root.result = $items.unique(x -> x.id) + output: {"result": [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]} + + - name: "unique NaN values considered equal" + skip: "V2-only _type NaN literal shorthand; V1 has no NaN literal" + + - name: "unique mixed types preserved" + skip: "V1 unique only accepts string/number arrays (bool/null elements error)" + + # --- enumerate (V1: enumerated) --- + + - name: "enumerate basic" + mapping: | + root.result = ["a", "b", "c"].enumerated() + output: {"result": [{"index": 0, "value": "a"}, {"index": 1, "value": "b"}, {"index": 2, "value": "c"}]} + + - name: "enumerate empty array" + mapping: | + root.result = [].enumerated() + output: {"result": []} + + - name: "enumerate single element" + mapping: | + root.result = [42].enumerated() + output: {"result": [{"index": 0, "value": 42}]} + + - name: "enumerate preserves element types" + mapping: | + root.result = [true, null, 3.14].enumerated() + output: {"result": [{"index": 0, "value": true}, {"index": 1, "value": null}, {"index": 2, "value": 3.14}]} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/collect_method.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/collect_method.yaml new file mode 100644 index 000000000..8ae03f843 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/collect_method.yaml @@ -0,0 +1,36 @@ +description: > + .collect() method — converts array of {key, value} objects into an object. + Last value wins on duplicate keys. 
+ +tests: + # --- Basic collect --- + + - name: "collect simple key-value pairs" + skip: "V1 has no equivalent for collect" + + - name: "collect single entry" + skip: "V1 has no equivalent for collect" + + - name: "collect empty array" + skip: "V1 has no equivalent for collect" + + # --- Last value wins on duplicates --- + + - name: "collect duplicate keys — last wins" + skip: "V1 has no equivalent for collect" + + - name: "collect multiple duplicates — last wins" + skip: "V1 has no equivalent for collect" + + # --- Mixed value types --- + + - name: "collect with mixed value types" + skip: "V1 has no equivalent for collect" + + # --- Collect from transformed data --- + + - name: "enumerate then collect round-trips" + skip: "V1 has no equivalent for collect" + + - name: "map_entries to key-value then collect" + skip: "V1 has no equivalent for collect/iter" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/core_functions.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/core_functions.yaml new file mode 100644 index 000000000..8a9779caf --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/core_functions.yaml @@ -0,0 +1,210 @@ +description: "Core stdlib functions: uuid_v4, now, random_int, range, timestamp constructor, duration constants" + +tests: + # --- uuid_v4 --- + + - name: "uuid_v4 returns a string" + mapping: | + root = uuid_v4().type() + output: "string" + + - name: "uuid_v4 is non-deterministic" + mapping: | + root = uuid_v4() + no_output_check: true + output_type: "string" + + - name: "uuid_v4 returns 36 chars" + mapping: | + root = uuid_v4().length() + output: 36 + + - name: "uuid_v4 two calls differ" + mapping: | + root = uuid_v4() != uuid_v4() + output: true + + # --- now (V1 now() returns RFC3339 string, not timestamp) --- + + - name: "now returns a timestamp" + mapping: | + root = now() + no_output_check: true + output_type: "string" + + - name: "now type check" + mapping: | + root = now().type() + output: "string" + + - 
name: "now called twice may differ" + mapping: | + let a = now() + let b = now() + root = $a <= $b + output: true + + # --- random_int (V1 positional is (seed, min, max); use named args) --- + + - name: "random_int returns int64" + mapping: | + root = random_int(min: 0, max: 10).type() + output: "number" + + - name: "random_int within range" + mapping: | + let v = random_int(min: 5, max: 5) + root = $v + output: 5 + + - name: "random_int is non-deterministic" + mapping: | + root = random_int(min: 0, max: 1000000) + no_output_check: true + output_type: "int64" + + - name: "random_int min equals max returns that value" + mapping: | + root = random_int(min: 42, max: 42) + output: 42 + + - name: "random_int min greater than max is error" + mapping: | + root = random_int(min: 10, max: 5) + compile_error: "min" + + - name: "random_int named args reorder correctly" + mapping: | + root = random_int(max: 5, min: 5) + output: 5 + + - name: "random_int named args reversed still works" + mapping: | + let v = random_int(max: 10, min: 10) + root = $v + output: 10 + + - name: "random_int negative range" + skip: "V1 random_int rejects negative min at compile time (must be positive)" + + # --- range --- + + - name: "range ascending default step" + mapping: | + root = range(0, 5) + output: [0, 1, 2, 3, 4] + + - name: "range ascending explicit step" + mapping: | + root = range(0, 10, 2) + output: [0, 2, 4, 6, 8] + + - name: "range descending inferred step" + skip: "V1 range does not auto-infer negative step; must be explicit" + + - name: "range descending explicit step" + mapping: | + root = range(10, 0, -3) + output: [10, 7, 4] + + - name: "range start equals stop is empty" + mapping: | + root = range(5, 5) + compile_error: "must be <" + + - name: "range step zero is error" + mapping: | + root = range(0, 5, 0) + compile_error: "step" + + - name: "range step contradicts direction positive" + mapping: | + root = range(0, 5, -1) + compile_error: "step" + + - name: "range step 
contradicts direction negative" + mapping: | + root = range(5, 0, 1) + compile_error: "step" + + - name: "range single element" + mapping: | + root = range(0, 1) + output: [0] + + - name: "range negative values" + mapping: | + root = range(-3, 3) + output: [-3, -2, -1, 0, 1, 2] + + - name: "range result element type is int64" + mapping: | + root = range(0, 3).index(0).type() + output: "number" + + - name: "range named args reorder correctly" + mapping: | + root = range(stop: 3, start: 0) + output: [0, 1, 2] + + - name: "range named args with step" + mapping: | + root = range(stop: 10, start: 0, step: 3) + output: [0, 3, 6] + + - name: "range named args omit optional step" + mapping: | + root = range(stop: 3, start: 0) + output: [0, 1, 2] + + # --- timestamp constructor --- + + - name: "timestamp required args only" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp all positional args" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp with named args" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp with timezone" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp invalid month zero" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp invalid month thirteen" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp invalid day zero" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp invalid timezone" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp named args skip middle optional params" + skip: "V1 has no timestamp() function constructor" + + - name: "timestamp named args all required only" + skip: "V1 has no timestamp() function constructor" + + # --- duration constants --- + + - name: "second returns 1e9 nanoseconds" + skip: "V1 has no second() duration constant" + + - name: "minute returns 60e9 nanoseconds" + skip: "V1 has no minute() duration constant" + + - name: 
"hour returns 3600e9 nanoseconds" + skip: "V1 has no hour() duration constant" + + - name: "day returns 86400e9 nanoseconds" + skip: "V1 has no day() duration constant" + + - name: "duration constants are int64" + skip: "V1 has no second() duration constant" + + - name: "duration arithmetic 2 hours plus 30 minutes" + skip: "V1 has no hour()/minute() duration constants" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/encoding.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/encoding.yaml new file mode 100644 index 000000000..cfd81766e --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/encoding.yaml @@ -0,0 +1,343 @@ +description: "Encoding and parsing methods: parse_json, format_json, encode, decode" + +tests: + # --- parse_json: objects --- + + - name: "parse_json object" + mapping: | + root = "{\"name\":\"Alice\",\"age\":30}".parse_json() + output: {"name": "Alice", "age": 30.0} + + - name: "parse_json empty object" + mapping: | + root = "{}".parse_json() + output: {} + + # --- parse_json: arrays --- + + - name: "parse_json array of ints" + mapping: | + root = "[1,2,3]".parse_json() + output: [1.0, 2.0, 3.0] + + - name: "parse_json empty array" + mapping: | + root = "[]".parse_json() + output: [] + + # --- parse_json: scalars --- + + - name: "parse_json string" + mapping: | + root = "\"hello\"".parse_json() + output: "hello" + + - name: "parse_json integer is int64" + mapping: | + root = "42".parse_json().type() + output: "number" + + - name: "parse_json integer value" + mapping: | + root = "42".parse_json() + output: 42.0 + + - name: "parse_json float with decimal is float64" + mapping: | + root = "3.14".parse_json().type() + output: "number" + + - name: "parse_json float value" + mapping: | + root = "3.14".parse_json() + output: 3.14 + + - name: "parse_json exponent is float64" + mapping: | + root = "1e3".parse_json().type() + output: "number" + + - name: "parse_json exponent value" + mapping: | + root = "1e3".parse_json() + 
output: 1000.0 + + - name: "parse_json boolean true" + mapping: | + root = "true".parse_json() + output: true + + - name: "parse_json boolean false" + mapping: | + root = "false".parse_json() + output: false + + - name: "parse_json null" + mapping: | + root = "null".parse_json() + output: null + + - name: "parse_json negative integer" + mapping: | + root = "-100".parse_json() + output: -100.0 + + - name: "parse_json zero integer" + mapping: | + root = "0".parse_json() + output: 0.0 + + # --- parse_json: errors --- + + - name: "parse_json invalid json" + mapping: | + root = "not json".parse_json() + error: "parse" + + - name: "parse_json empty string" + mapping: | + root = "".parse_json() + error: "parse" + + - name: "parse_json truncated object" + mapping: | + root = "{\"name\":".parse_json() + error: "parse" + + # --- format_json (V1 default indent is 4 spaces; use no_indent:true for compact) --- + + - name: "format_json object keys sorted" + mapping: | + root = {"b": 2, "a": 1}.format_json(no_indent: true).string() + output: '{"a":1,"b":2}' + + - name: "format_json array" + mapping: | + root = [1, 2, 3].format_json(no_indent: true).string() + output: "[1,2,3]" + + - name: "format_json string" + mapping: | + root = "hello".format_json(no_indent: true).string() + output: '"hello"' + + - name: "format_json integer" + mapping: | + root = 42.format_json(no_indent: true).string() + output: "42" + + - name: "format_json boolean" + mapping: | + root = true.format_json(no_indent: true).string() + output: "true" + + - name: "format_json null" + mapping: | + root = null.format_json(no_indent: true).string() + output: "null" + + - name: "format_json empty object" + mapping: | + root = {}.format_json(no_indent: true).string() + output: "{}" + + - name: "format_json empty array" + skip: "V1 format_json on empty array returns 'null' (known V1 quirk)" + + # --- format_json: indent --- + + - name: "format_json with two-space indent" + mapping: | + root = {"a": 
1}.format_json(indent: "  ").string() + output: "{\n  \"a\": 1\n}" + + - name: "format_json with tab indent" + mapping: | + root = {"a": 1}.format_json(indent: "\t").string() + output: "{\n\t\"a\": 1\n}" + + # --- format_json: escape_html --- + + - name: "format_json escapes html by default" + mapping: | + root = {"html": "<b>hi</b>"}.format_json(no_indent: true).string() + output: "{\"html\":\"\\u003cb\\u003ehi\\u003c/b\\u003e\"}" + + - name: "format_json escape_html false" + mapping: | + root = {"html": "<b>hi</b>"}.format_json(no_indent: true, escape_html: false).string() + output: '{"html":"<b>hi</b>"}' + + # --- format_json: timestamps (V1 has no timestamp() constructor) --- + + - name: "format_json timestamp as RFC 3339" + skip: "V1 has no timestamp() function constructor" + + - name: "format_json timestamp in object uses shortest precision" + skip: "V1 has no timestamp() function constructor" + + - name: "format_json timestamp with fractional seconds in object" + skip: "V1 has no timestamp() function constructor" + + - name: "format_json timestamp in nested array" + skip: "V1 has no timestamp() function constructor" + + # --- format_json: errors --- + + - name: "format_json bytes is error" + skip: "V1 format_json accepts bytes (coerces to string)" + + - name: "format_json nested bytes is error" + skip: "V1 format_json accepts bytes (coerces to string)" + + - name: "format_json NaN is error" + skip: "V2-only _type NaN literal; V1 has no NaN literal" + + - name: "format_json Infinity is error" + skip: "V2-only _type Infinity literal; V1 has no Infinity literal" + + # --- format_json / parse_json round-trip --- + + - name: "format then parse round-trip object" + mapping: | + let obj = {"name": "Alice", "age": 30} + root = $obj.format_json(no_indent: true).parse_json() + output: {"name": "Alice", "age": 30.0} + + - name: "format then parse round-trip array" + mapping: | + let arr = [1, "two", true, null] + root = $arr.format_json(no_indent: true).parse_json() + output: [1.0, "two", 
true, null] + + # --- encode: base64 --- + + - name: "encode base64 from string" + mapping: | + root = "hello".encode("base64") + output: "aGVsbG8=" + + - name: "encode base64 from bytes" + mapping: | + root = "hello".bytes().encode("base64") + output: "aGVsbG8=" + + - name: "encode base64 empty string" + mapping: | + root = "".encode("base64") + output: "" + + # --- encode: base64url --- + + - name: "encode base64url" + mapping: | + root = "hello?world>".encode("base64url") + output: "aGVsbG8_d29ybGQ-" + + # --- encode: base64rawurl --- + + - name: "encode base64rawurl no padding" + mapping: | + root = "hello".encode("base64rawurl") + output: "aGVsbG8" + + # --- encode: hex --- + + - name: "encode hex from string" + mapping: | + root = "hello".encode("hex") + output: "68656c6c6f" + + - name: "encode hex from bytes" + mapping: | + root = "hello".bytes().encode("hex") + output: "68656c6c6f" + + - name: "encode hex empty" + mapping: | + root = "".encode("hex") + output: "" + + # --- decode: base64 --- + + - name: "decode base64 returns bytes" + mapping: | + root = "aGVsbG8=".decode("base64").type() + output: "bytes" + + - name: "decode base64 to string" + mapping: | + root = "aGVsbG8=".decode("base64").string() + output: "hello" + + # --- decode: base64url --- + + - name: "decode base64url to string" + mapping: | + root = "aGVsbG8_d29ybGQ-".decode("base64url").string() + output: "hello?world>" + + # --- decode: base64rawurl --- + + - name: "decode base64rawurl to string" + mapping: | + root = "aGVsbG8".decode("base64rawurl").string() + output: "hello" + + # --- decode: hex --- + + - name: "decode hex returns bytes" + mapping: | + root = "68656c6c6f".decode("hex").type() + output: "bytes" + + - name: "decode hex to string" + mapping: | + root = "68656c6c6f".decode("hex").string() + output: "hello" + + # --- decode: errors --- + + - name: "decode invalid base64" + mapping: | + root = "!!!".decode("base64") + error: "EOF" + + - name: "decode invalid hex" + mapping: | + 
root = "zzzz".decode("hex") + error: "invalid byte" + + # --- encode/decode round-trips --- + + - name: "base64 encode decode round-trip" + mapping: | + root = "hello world!".encode("base64").decode("base64").string() + output: "hello world!" + + - name: "hex encode decode round-trip" + mapping: | + root = "hello world!".encode("hex").decode("hex").string() + output: "hello world!" + + - name: "base64rawurl encode decode round-trip" + mapping: | + root = "test data".encode("base64rawurl").decode("base64rawurl").string() + output: "test data" + + - name: "base64url encode decode round-trip" + mapping: | + root = "test data".encode("base64url").decode("base64url").string() + output: "test data" + + # --- encode: named arg --- + + - name: "encode with named scheme arg" + mapping: | + root = "hello".encode(scheme: "hex") + output: "68656c6c6f" + + - name: "decode with named scheme arg" + mapping: | + root = "68656c6c6f".decode(scheme: "hex").string() + output: "hello" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/enumerate_method.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/enumerate_method.yaml new file mode 100644 index 000000000..a7ecda81d --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/enumerate_method.yaml @@ -0,0 +1,131 @@ +description: > + .enumerate() method — returns array of {index, value} objects. + Also tests .without_index(), .sum(), .min(), .max(), .join() basics. 
+ +tests: + # --- enumerate (V1: enumerated) --- + + - name: "enumerate basic" + mapping: | + root.v = ["a", "b", "c"].enumerated() + output: {"v": [{"index": 0, "value": "a"}, {"index": 1, "value": "b"}, {"index": 2, "value": "c"}]} + + - name: "enumerate empty array" + mapping: | + root.v = [].enumerated() + output: {"v": []} + + - name: "enumerate single element" + mapping: | + root.v = [42].enumerated() + output: {"v": [{"index": 0, "value": 42}]} + + - name: "enumerate then map to transform with index" + mapping: | + root.v = ["a", "b"].enumerated().map_each(e -> e.index.string() + ":" + e.value) + output: {"v": ["0:a", "1:b"]} + + # --- without_index --- + + - name: "without_index removes element at index" + skip: "V1 has no equivalent for without_index" + + - name: "without_index first element" + skip: "V1 has no equivalent for without_index" + + - name: "without_index last element" + skip: "V1 has no equivalent for without_index" + + - name: "without_index out of bounds is error" + skip: "V1 has no equivalent for without_index" + + # --- join --- + + - name: "join array of strings" + mapping: | + root.v = ["a", "b", "c"].join(",") + output: {"v": "a,b,c"} + + - name: "join with empty separator" + mapping: | + root.v = ["a", "b", "c"].join("") + output: {"v": "abc"} + + - name: "join empty array" + mapping: | + root.v = [].join(",") + output: {"v": ""} + + - name: "join single element" + mapping: | + root.v = ["only"].join(",") + output: {"v": "only"} + + - name: "join with non-string element is error" + mapping: | + root.v = ["a", 1, "b"].join(",") + error: "string" + + # --- sum --- + + - name: "sum of integers" + mapping: | + root.v = [1, 2, 3, 4].sum() + output: {"v": 10.0} + + - name: "sum of floats" + mapping: | + root.v = [1.5, 2.5, 3.0].sum() + output: {"v": 7.0} + + - name: "sum of empty array returns zero" + mapping: | + root.v = [].sum() + output: {"v": 0.0} + + - name: "sum of single element" + mapping: | + root.v = [42].sum() + output: {"v": 
42.0} + + # --- min / max (V1 min/max are numeric-only and return float64) --- + + - name: "min of integers" + mapping: | + root.v = [3, 1, 4, 1, 5].min() + output: {"v": 1.0} + + - name: "max of integers" + mapping: | + root.v = [3, 1, 4, 1, 5].max() + output: {"v": 5.0} + + - name: "min of strings" + mapping: | + root.v = ["banana", "apple", "cherry"].min() + error: "number" + + - name: "max of strings" + mapping: | + root.v = ["banana", "apple", "cherry"].max() + error: "number" + + - name: "min of empty array is error" + mapping: | + root.v = [].min() + error: "empty" + + - name: "max of empty array is error" + mapping: | + root.v = [].max() + error: "empty" + + - name: "min of single element" + mapping: | + root.v = [42].min() + output: {"v": 42.0} + + - name: "max of single element" + mapping: | + root.v = [42].max() + output: {"v": 42.0} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/find_method.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/find_method.yaml new file mode 100644 index 000000000..4c4dc8b5d --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/find_method.yaml @@ -0,0 +1,74 @@ +description: > + .find() method — returns first matching element or void. Short-circuits + on first match. Void result can be rescued with .or(). + +tests: + # --- Basic find (V2's find(lambda) returns matched element. 
V1 approximation: filter+index(0)) --- + + - name: "find returns first matching element" + mapping: | + root.v = [1, 2, 3, 4, 5].filter(x -> x > 3).index(0) + output: {"v": 4} + + - name: "find returns first match, not all matches" + mapping: | + root.v = [10, 20, 30, 40].filter(x -> x > 15).index(0) + output: {"v": 20} + + - name: "find with string array" + mapping: | + root.v = ["apple", "banana", "cherry"].filter(s -> s.has_prefix("b")).index(0) + output: {"v": "banana"} + + # --- find returns void when no match (V1 has no void; filter+index(0) returns null) --- + + - name: "find no match — void skips output assignment" + skip: "V2-only 'void' skips assignment; V1 would assign null" + + - name: "find no match — void errors in variable declaration" + skip: "V1 has no 'void' concept" + + - name: "find no match — void rescued with or" + mapping: | + root.v = [1, 2, 3].filter(x -> x > 100).index(0).or(-1) + output: {"v": -1} + + - name: "find no match — void in array literal errors" + skip: "V2-only 'void' error propagation semantics" + + # --- Short-circuit behavior --- + + - name: "find short-circuits after first match" + mapping: | + root.v = [1, 2, 3, 4, 5].filter(x -> x == 2).index(0) + output: {"v": 2} + + # --- find on empty array --- + + - name: "find on empty array returns void — rescued with or" + mapping: | + root.v = [].filter(x -> true).index(0).or("nothing") + output: {"v": "nothing"} + + # --- find with complex predicates --- + + - name: "find with object elements" + mapping: | + let users = [ + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 17}, + {"name": "Carol", "age": 30}, + ] + root.v = $users.filter(u -> u.age < 18).index(0) + output: {"v": {"name": "Bob", "age": 17}} + + - name: "find with method chain in predicate" + mapping: | + root.v = ["hello", "world", "hi"].filter(s -> s.length() < 4).index(0) + output: {"v": "hi"} + + - name: "find with outer variable capture" + mapping: | + let threshold = 15 + root.v = [10, 20, 30].filter(x 
-> x > $threshold).index(0) + output: {"v": 20} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/iter_chain_patterns.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/iter_chain_patterns.yaml new file mode 100644 index 000000000..c2bd78e59 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/iter_chain_patterns.yaml @@ -0,0 +1,63 @@ +description: > + Iterator chain patterns — complex combinations of .iter(), .map(), + .filter(), .fold(), .map_entries(), .map_values() with nested lambdas, + outer variable capture, and control flow inside lambda bodies. + +tests: + # --- iter + fold to rebuild object --- + + - name: "iter then fold to filter and rebuild object" + skip: "V1 has no iter() method and no block-body lambdas with let" + + - name: "iter then map then fold" + skip: "V1 has no iter() method and no block-body lambdas with let" + + # --- map_entries with complex lambda --- + + - name: "map_entries with if in lambda body" + skip: "V1 has no map_entries method" + + - name: "map_entries with block body and local variables" + skip: "V1 has no map_entries method and no block-body lambdas" + + # --- filter + map + sort chain --- + + - name: "filter then map then sort" + mapping: | + root.v = [5, -3, 8, -1, 4, -6].filter(x -> x > 0).map_each(x -> x * x).sort() + output: {"v": [16, 25, 64]} + + # --- map with outer capture + dynamic path --- + + - name: "map with outer capture building objects" + skip: "V1 has no block-body lambdas with let/dynamic-key writes" + + # --- Nested map operations --- + + - name: "map_values calling map on inner arrays" + skip: "V1 has no map_values method" + + - name: "map_values with fold on inner arrays" + skip: "V1 has no map_values method" + + # --- Combined filter_entries and map_values --- + + - name: "filter_entries then map_values chain" + skip: "V1 has no filter_entries/map_values methods" + + # --- map producing objects then collect --- + + - name: "map to key-value then collect" + skip: "V1 has no 
collect method" + + # --- Chained any/all after transform --- + + - name: "map then any" + mapping: | + root.v = [1, 2, 3, 4, 5].map_each(x -> x * x).any(sq -> sq > 20) + output: {"v": true} + + - name: "filter then all" + mapping: | + root.v = [2, 4, 6, 8, 10].filter(x -> x > 3).all(x -> x % 2 == 0) + output: {"v": true} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/method_composition.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/method_composition.yaml new file mode 100644 index 000000000..bcf3ee837 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/method_composition.yaml @@ -0,0 +1,137 @@ +description: "Cross-method composition tests: chaining stdlib methods in realistic combinations" + +tests: + # --- String + encoding chains --- + + - name: "trim then encode base64" + mapping: | + root = " hello ".trim().encode("base64") + output: "aGVsbG8=" + + - name: "split then map uppercase" + mapping: | + root = "a,b,c".split(",").map_each(v -> v.uppercase()) + output: ["A", "B", "C"] + + - name: "split then filter then join" + mapping: | + root = "a,,b,,c".split(",").filter(v -> v.length() > 0).join(",") + output: "a,b,c" + + - name: "replace_all then split then length" + mapping: | + root = "one::two::three".replace_all("::", ",").split(",").length() + output: 3 + + # --- Numeric + type chains --- + + - name: "abs then floor" + mapping: | + root = (-3.7).abs().floor() + output: 3 + + - name: "ceil then int64 conversion" + skip: "V1 has no int64() method; use .number() instead (returns int64 or float64)" + + - name: "round then format_json" + skip: "V1 round() takes no precision argument" + + - name: "abs preserves int32 type through chain" + skip: "V1 has no abs()/int32() methods; V1 has no 'int32' type" + + # --- Timestamp + format chains --- + + - name: "ts_parse then ts_format custom" + skip: "V1 ts_format uses Go layout, not strftime (use ts_strftime for %Y-%m-%d)" + + - name: "timestamp constructor then ts_add then 
ts_format" + skip: "V1 has no timestamp() function constructor, no ts_add (has ts_add_iso8601), no minute() constant" + + - name: "ts_parse then ts_unix then ts_from_unix round-trip" + skip: "V1 has no ts_from_unix method" + + - name: "timestamp subtraction then comparison" + skip: "V1 has no timestamp() function constructor or hour() constant" + + # --- JSON + object chains --- + + - name: "parse_json then keys sorted" + mapping: | + root = "{\"b\":2,\"a\":1}".parse_json().keys().sort() + output: ["a", "b"] + + - name: "object merge then format_json" + mapping: | + root = {"a": 1}.assign({"b": 2}).format_json(no_indent: true).string() + output: '{"a":1,"b":2}' + + - name: "parse_json then map values" + skip: "V1 has no map_values method" + + - name: "format_json then parse_json idempotent" + mapping: | + let obj = {"name": "Alice", "age": 30, "active": true} + root = $obj.format_json(no_indent: true).parse_json() == $obj + output: true + + # --- Array + query chains --- + + - name: "range then filter then sum" + mapping: | + root = range(1, 11).filter(v -> v % 2 == 0).sum() + output: 30.0 + + - name: "range then map then sort descending" + mapping: | + root = range(1, 4).map_each(v -> v * v).sort(item -> item.right < item.left) + output: [9, 4, 1] + + - name: "array map abs then max" + mapping: | + root = [-3, 1, -7, 4].map_each(v -> v.abs()).max() + output: 7.0 + + - name: "split then unique then sort then join" + mapping: | + root = "b,a,c,a,b".split(",").unique().sort().join(",") + output: "a,b,c" + + # --- Encoding round-trip chains --- + + - name: "string to base64 to hex round-trip" + mapping: | + root = "hello".encode("base64").decode("base64").encode("hex") + output: "68656c6c6f" + + - name: "format_json then encode base64 then decode" + mapping: | + root = {"key": "value"}.format_json(no_indent: true).encode("base64").decode("base64").string() + output: '{"key":"value"}' + + # --- Error handling in chains --- + + - name: "parse_json catch returns default" 
+ mapping: | + root = "bad json".parse_json().catch("fallback") + output: "fallback" + + - name: "ts_parse catch returns null" + skip: "migrator V1 core env lacks ts_parse" + + - name: "chained method after or" + mapping: | + root = null.or("hello").uppercase() + output: "HELLO" + + - name: "abs on parse_json result" + mapping: | + root = "\"-42\"".parse_json().number().abs() + output: 42.0 + + # --- Timestamp arithmetic composition --- + + - name: "ts_add chained multiple times" + skip: "V1 has no timestamp() function constructor or ts_add method" + + - name: "ts_unix_nano to int then back preserves nanos" + skip: "V1 has no timestamp() constructor or ts_from_unix_nano method" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/numeric_methods.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/numeric_methods.yaml new file mode 100644 index 000000000..f74ba02c2 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/numeric_methods.yaml @@ -0,0 +1,273 @@ +description: "Numeric methods: abs, floor, ceil, round (half-even)" + +tests: + # --- abs: float64 --- + + - name: "abs positive float64 is identity" + mapping: | + root = 3.14.abs() + output: 3.14 + + - name: "abs negative float64" + mapping: | + root = (-3.14).abs() + output: 3.14 + + - name: "abs zero float64" + mapping: | + root = 0.0.abs() + output: 0.0 + + - name: "abs float64 returns float64" + mapping: | + root = (-2.5).abs().type() + output: "number" + + # --- abs: int64 --- + + - name: "abs positive int64 is identity" + mapping: | + root = 42.abs() + output: 42 + + - name: "abs negative int64" + mapping: | + root = (-42).abs() + output: 42 + + - name: "abs zero int64" + mapping: | + root = 0.abs() + output: 0 + + - name: "abs int64 returns int64" + mapping: | + root = (-5).abs().type() + output: "number" + + - name: "abs int64 min value overflows" + skip: "V1-divergence: abs(int64-min) does not detect overflow; returns Go math.Abs-derived value" + + # --- abs: int32 --- + + - 
name: "abs negative int32" + skip: "V1 has no abs()/int32() methods" + + - name: "abs int32 min value overflows" + skip: "V1 has no abs()/int32() methods" + + - name: "abs int32 returns int32" + skip: "V1 has no abs()/int32() methods" + + # --- abs: uint64 identity --- + + - name: "abs uint64 is identity" + skip: "V1 has no abs()/uint64() methods" + + # --- abs: uint32 identity --- + + - name: "abs uint32 is identity" + skip: "V1 has no abs()/uint32() methods" + + # --- abs: float32 --- + + - name: "abs negative float32" + skip: "V1 has no abs()/float32() methods" + + - name: "abs float32 returns float32" + skip: "V1 has no abs()/float32() methods" + + # --- floor (V1 returns integer if fits, else float) --- + + - name: "floor positive float64" + mapping: | + root = 3.7.floor() + output: 3 + + - name: "floor negative float64" + mapping: | + root = (-3.2).floor() + output: -4 + + - name: "floor already integer float64" + mapping: | + root = 5.0.floor() + output: 5 + + - name: "floor zero" + mapping: | + root = 0.0.floor() + output: 0 + + - name: "floor negative already integer" + mapping: | + root = (-4.0).floor() + output: -4 + + - name: "floor small positive" + mapping: | + root = 0.1.floor() + output: 0 + + - name: "floor small negative" + mapping: | + root = (-0.1).floor() + output: -1 + + - name: "floor returns float64" + mapping: | + root = 3.7.floor().type() + output: "number" + + - name: "floor float32 returns float32" + skip: "V1-divergence: floor() rejects typed float32 values (expected number value, got number from method float32)" + + - name: "floor float32 value" + skip: "V1-divergence: floor() rejects typed float32 values (expected number value, got number from method float32)" + + # --- ceil --- + + - name: "ceil positive float64" + mapping: | + root = 3.2.ceil() + output: 4 + + - name: "ceil negative float64" + mapping: | + root = (-3.7).ceil() + output: -3 + + - name: "ceil already integer float64" + mapping: | + root = 5.0.ceil() + output: 5 + + - 
name: "ceil zero" + mapping: | + root = 0.0.ceil() + output: 0 + + - name: "ceil small positive" + mapping: | + root = 0.1.ceil() + output: 1 + + - name: "ceil small negative" + mapping: | + root = (-0.1).ceil() + output: 0 + + - name: "ceil returns float64" + mapping: | + root = 3.2.ceil().type() + output: "number" + + - name: "ceil float32 returns float32" + skip: "V1-divergence: ceil() rejects typed float32 values (expected number value, got number from method float32)" + + - name: "ceil float32 value" + skip: "V1-divergence: ceil() rejects typed float32 values (expected number value, got number from method float32)" + + # --- round: default (V1 rounds away from zero at .5, not half-even) --- + + - name: "round up from 3.7" + mapping: | + root = 3.7.round() + output: 4 + + - name: "round down from 3.2" + mapping: | + root = 3.2.round() + output: 3 + + - name: "round half-even 2.5 rounds to 2" + mapping: | + root = 2.5.round() + output: 3 + + - name: "round half-even 3.5 rounds to 4" + mapping: | + root = 3.5.round() + output: 4 + + - name: "round half-even 0.5 rounds to 0" + mapping: | + root = 0.5.round() + output: 1 + + - name: "round half-even 1.5 rounds to 2" + mapping: | + root = 1.5.round() + output: 2 + + - name: "round half-even 4.5 rounds to 4" + mapping: | + root = 4.5.round() + output: 5 + + - name: "round half-even negative -2.5 rounds to -2" + mapping: | + root = (-2.5).round() + output: -3 + + - name: "round half-even negative -3.5 rounds to -4" + mapping: | + root = (-3.5).round() + output: -4 + + - name: "round returns float64" + mapping: | + root = 3.7.round().type() + output: "number" + + # --- round: positive n (decimal places) — V1 round takes no precision arg --- + + - name: "round to 2 decimal places" + skip: "V1 round() takes no precision argument" + + - name: "round to 1 decimal place" + skip: "V1 round() takes no precision argument" + + - name: "round half-even at 2 decimal places" + skip: "V1 round() takes no precision argument" + + - 
name: "round half-even at 2 decimal places odd" + skip: "V1 round() takes no precision argument" + + # --- round: negative n --- + + - name: "round to nearest 10" + skip: "V1 round() takes no precision argument" + + - name: "round to nearest 100" + skip: "V1 round() takes no precision argument" + + - name: "round half-even to nearest 100" + skip: "V1 round() takes no precision argument" + + - name: "round half-even to nearest 100 odd" + skip: "V1 round() takes no precision argument" + + - name: "round to nearest 1000" + skip: "V1 round() takes no precision argument" + + - name: "round half-even to nearest 1000" + skip: "V1 round() takes no precision argument" + + # --- round: float32 --- + + - name: "round float32 returns float32" + skip: "V1-divergence: round() rejects typed float32 values (expected number value, got number from method float32)" + + - name: "round float32 value" + skip: "V1-divergence: round() rejects typed float32 values (expected number value, got number from method float32)" + + # --- round: explicit n=0 --- + + - name: "round with explicit n=0 same as default" + skip: "V1 round() takes no precision argument" + + # --- round: named arg --- + + - name: "round with named arg n" + skip: "V1 round() takes no precision argument" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/object_methods.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/object_methods.yaml new file mode 100644 index 000000000..221994db6 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/object_methods.yaml @@ -0,0 +1,220 @@ +description: "Object methods — iter, keys, values, has_key, merge, without" + +tests: + # --- iter (V1: key_values) --- + + - name: "iter produces key-value pairs" + mapping: | + let obj = {"a": 1} + root.result = $obj.key_values() + output: {"result": [{"key": "a", "value": 1}]} + + - name: "iter empty object" + mapping: | + root.result = {}.key_values() + output: {"result": []} + + - name: "iter preserves value types" + 
mapping: | + let obj = {"s": "hello", "n": 42, "b": true, "nil": null} + let entries = $obj.key_values() + root.len = $entries.length() + output: {"len": 4} + + - name: "iter entries have key and value fields" + mapping: | + let obj = {"x": 10} + let entry = $obj.key_values().index(0) + root.k = $entry.key + root.v = $entry.value + output: {"k": "x", "v": 10} + + - name: "iter result can be used with map" + mapping: | + let obj = {"a": 1, "b": 2} + root.result = $obj.key_values().map_each(e -> e.key + "=" + e.value.string()).sort() + output: {"result": ["a=1", "b=2"]} + + # --- keys --- + + - name: "keys of single-key object" + mapping: | + root.result = {"name": "Alice"}.keys() + output: {"result": ["name"]} + + - name: "keys of empty object" + mapping: | + root.result = {}.keys() + output: {"result": []} + + - name: "keys returns strings" + mapping: | + let obj = {"a": 1} + root.result = $obj.keys().index(0).type() + output: {"result": "string"} + + - name: "keys count matches object length" + mapping: | + let obj = {"a": 1, "b": 2, "c": 3} + root.result = $obj.keys().length() + output: {"result": 3} + + - name: "keys can be sorted for deterministic comparison" + mapping: | + let obj = {"c": 3, "a": 1, "b": 2} + root.result = $obj.keys().sort() + output: {"result": ["a", "b", "c"]} + + # --- values --- + + - name: "values of single-key object" + mapping: | + root.result = {"x": 42}.values() + output: {"result": [42]} + + - name: "values of empty object" + mapping: | + root.result = {}.values() + output: {"result": []} + + - name: "values count matches object length" + mapping: | + let obj = {"a": 1, "b": 2, "c": 3} + root.result = $obj.values().length() + output: {"result": 3} + + - name: "values can be sorted for deterministic comparison" + mapping: | + let obj = {"c": 3, "a": 1, "b": 2} + root.result = $obj.values().sort() + output: {"result": [1, 2, 3]} + + - name: "values preserves types" + mapping: | + let obj = {"s": "hello", "n": 42} + let vals = 
$obj.values().sort_by(v -> v.type()) + root.types = $vals.map_each(v -> v.type()) + output: {"types": ["number", "string"]} + + # --- has_key (V1: exists) --- + + - name: "has_key returns true for existing key" + mapping: | + root.result = {"name": "Alice"}.exists("name") + output: {"result": true} + + - name: "has_key returns false for missing key" + mapping: | + root.result = {"name": "Alice"}.exists("age") + output: {"result": false} + + - name: "has_key on empty object" + mapping: | + root.result = {}.exists("anything") + output: {"result": false} + + - name: "has_key with null value still returns true" + mapping: | + root.result = {"x": null}.exists("x") + output: {"result": true} + + - name: "has_key checks only top-level" + mapping: | + let obj = {"a": {"b": 1}} + root.top = $obj.exists("a") + root.nested = $obj.exists("b") + output: {"top": true, "nested": false} + + # --- merge (V2's override merge = V1's assign) --- + + - name: "merge two objects" + mapping: | + root.result = {"a": 1}.assign({"b": 2}) + output: {"result": {"a": 1, "b": 2}} + + - name: "merge other wins on conflict" + mapping: | + root.result = {"a": 1, "b": 2}.assign({"b": 99, "c": 3}) + output: {"result": {"a": 1, "b": 99, "c": 3}} + + - name: "merge with empty object" + mapping: | + root.result = {"a": 1}.assign({}) + output: {"result": {"a": 1}} + + - name: "merge empty with non-empty" + mapping: | + root.result = {}.assign({"a": 1}) + output: {"result": {"a": 1}} + + - name: "merge two empty objects" + mapping: | + root.result = {}.assign({}) + output: {"result": {}} + + - name: "merge does not modify original" + mapping: | + let a = {"x": 1} + let b = {"y": 2} + let c = $a.assign($b) + root.a = $a + root.c = $c + output: {"a": {"x": 1}, "c": {"x": 1, "y": 2}} + + - name: "merge nested objects are replaced not deep merged" + mapping: | + let a = {"config": {"host": "localhost", "port": 8080}} + let b = {"config": {"host": "remote"}} + root.result = $a.assign($b) + output: {"result": 
{"config": {"host": "remote", "port": 8080}}} + + # --- without (V1: variadic string args, not an array) --- + + - name: "without removes specified keys" + mapping: | + root.result = {"a": 1, "b": 2, "c": 3}.without("b") + output: {"result": {"a": 1, "c": 3}} + + - name: "without multiple keys" + mapping: | + root.result = {"a": 1, "b": 2, "c": 3}.without("a", "c") + output: {"result": {"b": 2}} + + - name: "without missing key is ignored" + mapping: | + root.result = {"a": 1, "b": 2}.without("c", "d") + output: {"result": {"a": 1, "b": 2}} + + - name: "without empty key list" + mapping: | + root.result = {"a": 1, "b": 2}.without() + output: {"result": {"a": 1, "b": 2}} + + - name: "without all keys" + mapping: | + root.result = {"a": 1, "b": 2}.without("a", "b") + output: {"result": {}} + + - name: "without on empty object" + mapping: | + root.result = {}.without("a") + output: {"result": {}} + + - name: "without does not modify original" + mapping: | + let obj = {"a": 1, "b": 2, "c": 3} + let new = $obj.without("b") + root.original = $obj + root.new = $new + output: {"original": {"a": 1, "b": 2, "c": 3}, "new": {"a": 1, "c": 3}} + + - name: "without mix of present and missing keys" + mapping: | + root.result = {"a": 1, "b": 2}.without("a", "z") + output: {"result": {"b": 2}} + + - name: "without non-string key in array is error" + skip: "V1 without takes variadic string args; no array argument form" + + - name: "without null key in array is error" + skip: "V1 without takes variadic string args; no array argument form" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/object_transform.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/object_transform.yaml new file mode 100644 index 000000000..2d6a77dfa --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/object_transform.yaml @@ -0,0 +1,153 @@ +description: "Object transform methods — map_values, map_keys, map_entries, filter_entries" + +tests: + # --- map_values (V1: map_each with 
item.value) --- + + - name: "map_values transforms all values" + mapping: | + root.result = {"a": 1, "b": 2, "c": 3}.map_each(item -> item.value * 10) + output: {"result": {"a": 10, "b": 20, "c": 30}} + + - name: "map_values on empty object" + mapping: | + root.result = {}.map_each(item -> item.value + 1) + output: {"result": {}} + + - name: "map_values can change value types" + mapping: | + root.result = {"a": 1, "b": 2}.map_each(item -> item.value.string()) + output: {"result": {"a": "1", "b": "2"}} + + - name: "map_values void return is error" + skip: "V1 has no 'void' concept; if-without-else returns null" + + - name: "map_values deleted omits entry" + mapping: | + root.result = {"a": 1, "b": 2, "c": 3}.map_each(item -> if item.value > 1 { item.value * 10 } else { deleted() }) + output: {"result": {"b": 20, "c": 30}} + + - name: "map_values does not modify original" + mapping: | + let obj = {"x": 1, "y": 2} + let new = $obj.map_each(item -> item.value + 100) + root.original = $obj + root.new = $new + output: {"original": {"x": 1, "y": 2}, "new": {"x": 101, "y": 102}} + + - name: "map_values with block body" + skip: "V1 lambdas do not support block-with-let syntax" + + # --- map_keys (V1: map_each_key) --- + + - name: "map_keys transforms all keys" + mapping: | + root.result = {"a": 1, "b": 2}.map_each_key(k -> k.uppercase()) + output: {"result": {"A": 1, "B": 2}} + + - name: "map_keys on empty object" + mapping: | + root.result = {}.map_each_key(k -> k + "_suffix") + output: {"result": {}} + + - name: "map_keys adds prefix" + mapping: | + root.result = {"name": "Alice", "age": 30}.map_each_key(k -> "user_" + k) + output: {"result": {"user_name": "Alice", "user_age": 30}} + + - name: "map_keys must return string" + mapping: | + root.result = {"a": 1}.map_each_key(k -> 42) + error: "string" + + - name: "map_keys void return is error" + skip: "V1 has no 'void' concept" + + - name: "map_keys deleted omits entry" + mapping: | + root.result = {"keep": 1, "drop": 2, 
"also_keep": 3}.map_each(item -> if item.key.has_prefix("drop") { deleted() } else { item.value }) + output: {"result": {"keep": 1, "also_keep": 3}} + + - name: "map_keys does not modify original" + mapping: | + let obj = {"a": 1, "b": 2} + let new = $obj.map_each_key(k -> k.uppercase()) + root.original = $obj + root.new = $new + output: {"original": {"a": 1, "b": 2}, "new": {"A": 1, "B": 2}} + + # --- map_entries (V1 has no equivalent returning {key,value}) --- + + - name: "map_entries transforms keys and values" + skip: "V1 has no map_entries method" + + - name: "map_entries on empty object" + skip: "V1 has no map_entries method" + + - name: "map_entries swap keys and values" + skip: "V1 has no map_entries method" + + - name: "map_entries deleted omits entry" + skip: "V1 has no map_entries method" + + - name: "map_entries void return is error" + skip: "V1 has no map_entries method" + + - name: "map_entries with computed keys" + skip: "V1 has no map_entries method" + + - name: "map_entries does not modify original" + skip: "V1 has no map_entries method" + + - name: "map_entries with block body" + skip: "V1 has no map_entries method" + + # --- filter_entries (V1: filter on object with item.key/item.value) --- + + - name: "filter_entries keeps matching entries" + mapping: | + root.result = {"a": 1, "b": 5, "c": 3}.filter(item -> item.value > 2) + output: {"result": {"b": 5, "c": 3}} + + - name: "filter_entries on empty object" + mapping: | + root.result = {}.filter(item -> true) + output: {"result": {}} + + - name: "filter_entries no matches" + mapping: | + root.result = {"a": 1, "b": 2}.filter(item -> item.value > 100) + output: {"result": {}} + + - name: "filter_entries all match" + mapping: | + root.result = {"a": 1, "b": 2}.filter(item -> item.value > 0) + output: {"result": {"a": 1, "b": 2}} + + - name: "filter_entries by key" + mapping: | + root.result = {"name": "Alice", "age": 30, "note": "test"}.filter(item -> item.key.has_prefix("n")) + output: {"result": 
{"name": "Alice", "note": "test"}} + + - name: "filter_entries non-bool return is error" + mapping: | + root.result = {"a": 1}.filter(item -> item.value * 2) + output: {"result": {}} + + - name: "filter_entries void return is error" + skip: "V1 has no 'void' concept" + + - name: "filter_entries does not modify original" + mapping: | + let obj = {"a": 1, "b": 2, "c": 3} + let new = $obj.filter(item -> item.value > 1) + root.original = $obj + root.new = $new + output: {"original": {"a": 1, "b": 2, "c": 3}, "new": {"b": 2, "c": 3}} + + - name: "filter_entries combined key and value condition" + mapping: | + root.result = {"x": 10, "y": 20, "z": 5}.filter(item -> item.key != "y" && item.value > 3) + output: {"result": {"x": 10, "z": 5}} + + - name: "filter_entries with block body" + skip: "V1 lambdas do not support block-with-let syntax" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/sequence_methods.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/sequence_methods.yaml new file mode 100644 index 000000000..b5b3a2c6f --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/sequence_methods.yaml @@ -0,0 +1,300 @@ +description: "Sequence methods: .length(), .contains(), .index_of(), .slice(), .reverse() for strings, arrays, and bytes" + +tests: + # --- .length() on strings (V1 is byte-count, not codepoint) --- + + - name: "string length ascii" + mapping: | + root = "hello".length() + output: 5 + + - name: "string length empty" + mapping: | + root = "".length() + output: 0 + + - name: "string length non-ascii codepoint-based" + mapping: | + root = "café".length() + output: 5 + + - name: "string length emoji single codepoint" + mapping: | + root = "\U0001F600".length() + output: 4 + + # --- .length() on arrays --- + + - name: "array length" + mapping: | + root = [1, 2, 3].length() + output: 3 + + - name: "array length empty" + mapping: | + root = [].length() + output: 0 + + - name: "array length nested" + mapping: | + root = [[1, 2], 
[3]].length() + output: 2 + + # --- .length() on bytes --- + + - name: "bytes length ascii" + mapping: | + root = "hello".bytes().length() + output: 5 + + - name: "bytes length multibyte" + mapping: | + root = "\U0001F600".bytes().length() + output: 4 + + - name: "bytes length empty" + mapping: | + root = "".bytes().length() + output: 0 + + # --- .contains() on strings --- + + - name: "string contains true" + mapping: | + root = "hello world".contains("world") + output: true + + - name: "string contains false" + mapping: | + root = "hello world".contains("xyz") + output: false + + - name: "string contains empty substring" + mapping: | + root = "hello".contains("") + output: true + + - name: "string contains full match" + mapping: | + root = "hello".contains("hello") + output: true + + - name: "string contains empty in empty" + mapping: | + root = "".contains("") + output: true + + # --- .contains() on arrays --- + + - name: "array contains int true" + mapping: | + root = [1, 2, 3].contains(2) + output: true + + - name: "array contains int false" + mapping: | + root = [1, 2, 3].contains(5) + output: false + + - name: "array contains string" + mapping: | + root = ["a", "b", "c"].contains("b") + output: true + + - name: "array contains null" + mapping: | + root = [1, null, 3].contains(null) + output: true + + - name: "array contains empty array" + mapping: | + root = [].contains(1) + output: false + + # --- .contains() on bytes --- + + - name: "bytes contains subsequence true" + mapping: | + root = "hello".bytes().contains("ll".bytes()) + output: true + + - name: "bytes contains subsequence false" + mapping: | + root = "hello".bytes().contains("xyz".bytes()) + output: false + + # --- .index_of() on strings (V1: index_of) --- + + - name: "string index_of found" + mapping: | + root = "hello world".index_of("world") + output: 6 + + - name: "string index_of not found" + mapping: | + root = "hello world".index_of("xyz") + output: -1 + + - name: "string index_of first 
occurrence" + mapping: | + root = "abcabc".index_of("abc") + output: 0 + + - name: "string index_of empty needle" + mapping: | + root = "hello".index_of("") + output: 0 + + - name: "string index_of codepoint-based" + mapping: | + root = "café!".index_of("!") + output: 5 + + # --- .index_of() on arrays (V1: find returns index) --- + + - name: "array index_of found" + mapping: | + root = [10, 20, 30].find(20).number() + output: 1.0 + + - name: "array index_of not found" + mapping: | + root = [10, 20, 30].find(99).number() + output: -1.0 + + - name: "array index_of first occurrence" + mapping: | + root = [1, 2, 1, 2].find(2).number() + output: 1.0 + + - name: "array index_of string element" + mapping: | + root = ["a", "b", "c"].find("c").number() + output: 2.0 + + # --- .index_of() on bytes --- + + - name: "bytes index_of found" + mapping: | + root = "hello".bytes().index_of("ll".bytes()) + output: 2 + + - name: "bytes index_of not found" + mapping: | + root = "hello".bytes().index_of("xyz".bytes()) + output: -1 + + # --- .slice() on strings --- + + - name: "string slice basic" + mapping: | + root = "hello world".slice(0, 5) + output: "hello" + + - name: "string slice to end" + mapping: | + root = "hello world".slice(6) + output: "world" + + - name: "string slice negative indices" + mapping: | + root = "hello world".slice(-5, -1) + output: "worl" + + - name: "string slice clamped beyond length" + mapping: | + root = "hello".slice(0, 100) + output: "hello" + + - name: "string slice empty result from inverted" + skip: "V1 slice: if low >= high (both positive) raises error" + + - name: "string slice full string" + mapping: | + root = "hello".slice(0) + output: "hello" + + # --- .slice() on arrays --- + + - name: "array slice basic" + mapping: | + root = [10, 20, 30, 40, 50].slice(1, 4) + output: [20, 30, 40] + + - name: "array slice to end" + mapping: | + root = [10, 20, 30].slice(1) + output: [20, 30] + + - name: "array slice negative indices" + mapping: | + root = [10, 
20, 30, 40, 50].slice(-3, -1) + output: [30, 40] + + - name: "array slice clamped beyond length" + mapping: | + root = [1, 2, 3].slice(0, 100) + output: [1, 2, 3] + + - name: "array slice empty from inverted" + skip: "V1 slice: if low >= high (both positive) raises error" + + - name: "array slice empty array" + mapping: | + root = [].slice(0) + output: [] + + # --- .slice() on bytes --- + + - name: "bytes slice basic" + skip: "V2-only _type bytes literal shorthand for expected output" + + - name: "bytes slice to end" + skip: "V2-only _type bytes literal shorthand for expected output" + + - name: "bytes slice negative" + skip: "V2-only _type bytes literal shorthand for expected output" + + # --- .reverse() on strings --- + + - name: "string reverse ascii" + mapping: | + root = "hello".reverse() + output: "olleh" + + - name: "string reverse empty" + mapping: | + root = "".reverse() + output: "" + + - name: "string reverse single char" + mapping: | + root = "a".reverse() + output: "a" + + - name: "string reverse palindrome" + mapping: | + root = "racecar".reverse() + output: "racecar" + + # --- .reverse() on arrays --- + + - name: "array reverse" + skip: "V1 .reverse() is string-only; no array reverse method" + + - name: "array reverse empty" + skip: "V1 .reverse() is string-only; no array reverse method" + + - name: "array reverse single element" + skip: "V1 .reverse() is string-only; no array reverse method" + + - name: "array reverse mixed types" + skip: "V1 .reverse() is string-only; no array reverse method" + + # --- .reverse() on bytes --- + + - name: "bytes reverse" + skip: "V2-only _type bytes literal shorthand for expected output" + + - name: "bytes reverse empty" + skip: "V2-only _type bytes literal shorthand for expected output" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/sort_edge_cases.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/sort_edge_cases.yaml new file mode 100644 index 000000000..43e1cdde3 --- /dev/null +++ 
b/internal/bloblang2/migrator/v1spec/tests/stdlib/sort_edge_cases.yaml @@ -0,0 +1,126 @@ +description: > + Sort edge cases — stable sort, mixed numeric families error, NaN total + ordering, sort_by with complex keys, and sort with various types. + +tests: + # --- Basic stable sort --- + + - name: "sort integers ascending" + mapping: | + root.v = [3, 1, 4, 1, 5, 9, 2, 6].sort() + output: {"v": [1, 1, 2, 3, 4, 5, 6, 9]} + + - name: "sort already sorted" + mapping: | + root.v = [1, 2, 3].sort() + output: {"v": [1, 2, 3]} + + - name: "sort reverse sorted" + mapping: | + root.v = [3, 2, 1].sort() + output: {"v": [1, 2, 3]} + + - name: "sort single element" + mapping: | + root.v = [42].sort() + output: {"v": [42]} + + - name: "sort empty array" + mapping: | + root.v = [].sort() + output: {"v": []} + + - name: "sort strings lexicographic" + mapping: | + root.v = ["banana", "apple", "cherry", "date"].sort() + output: {"v": ["apple", "banana", "cherry", "date"]} + + - name: "sort floats" + mapping: | + root.v = [3.14, 1.41, 2.72].sort() + output: {"v": [1.41, 2.72, 3.14]} + + # --- Mixed numeric types within same family --- + + - name: "sort mixed int32 and int64" + skip: "V1 has no int32() method or distinct int32 type" + + # --- Mixed type families error --- + + - name: "sort mixed strings and integers errors" + mapping: | + root.v = ["a", 1, "b"].sort() + error: "sort" + + - name: "sort mixed booleans and integers errors" + mapping: | + root.v = [true, 1, false].sort() + error: "sort" + + # --- NaN in sort (total ordering: after all numbers) --- + + - name: "sort with special float values" + skip: "V2-only _type NaN/Infinity literal shorthand; V1 has no NaN/Infinity literals" + + # --- sort_by --- + + - name: "sort_by numeric field" + mapping: | + root.v = [ + {"name": "Charlie", "age": 30}, + {"name": "Alice", "age": 25}, + {"name": "Bob", "age": 35}, + ].sort_by(x -> x.age) + output: + v: + - name: "Alice" + age: 25 + - name: "Charlie" + age: 30 + - name: "Bob" + age: 
35 + + - name: "sort_by string field" + mapping: | + root.v = [ + {"name": "Charlie"}, + {"name": "Alice"}, + {"name": "Bob"}, + ].sort_by(x -> x.name) + output: + v: + - name: "Alice" + - name: "Bob" + - name: "Charlie" + + - name: "sort_by computed key" + mapping: | + root.v = ["banana", "fig", "apple", "kiwi"].sort_by(s -> s.length()) + output: {"v": ["fig", "kiwi", "apple", "banana"]} + + - name: "sort_by with block body and local variables" + skip: "V1 lambdas do not support block-with-let syntax" + + - name: "sort_by with outer variable capture" + mapping: | + let field = "age" + root.v = [ + {"name": "B", "age": 20}, + {"name": "A", "age": 10}, + ].sort_by(x -> x.get($field)) + output: + v: + - name: "A" + age: 10 + - name: "B" + age: 20 + + # --- sort does not modify original --- + + - name: "sort returns new array" + mapping: | + let arr = [3, 1, 2] + let sorted = $arr.sort() + root.original = $arr + root.sorted = $sorted + output: {"original": [3, 1, 2], "sorted": [1, 2, 3]} diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/string_methods.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/string_methods.yaml new file mode 100644 index 000000000..8c30e315e --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/string_methods.yaml @@ -0,0 +1,267 @@ +description: "String methods: uppercase, lowercase, trim, trim_prefix, trim_suffix, has_prefix, has_suffix, split, replace_all, repeat" + +tests: + # --- .uppercase() --- + + - name: "uppercase basic" + mapping: | + root = "hello world".uppercase() + output: "HELLO WORLD" + + - name: "uppercase already uppercase" + mapping: | + root = "HELLO".uppercase() + output: "HELLO" + + - name: "uppercase empty string" + mapping: | + root = "".uppercase() + output: "" + + - name: "uppercase mixed case" + mapping: | + root = "hElLo".uppercase() + output: "HELLO" + + - name: "uppercase non-ascii" + mapping: | + root = "café".uppercase() + output: "CAFÉ" + + # --- .lowercase() --- + + - name: 
"lowercase basic" + mapping: | + root = "HELLO WORLD".lowercase() + output: "hello world" + + - name: "lowercase already lowercase" + mapping: | + root = "hello".lowercase() + output: "hello" + + - name: "lowercase empty string" + mapping: | + root = "".lowercase() + output: "" + + - name: "lowercase mixed case" + mapping: | + root = "HeLLo".lowercase() + output: "hello" + + # --- .trim() --- + + - name: "trim spaces" + mapping: | + root = " hello ".trim() + output: "hello" + + - name: "trim tabs and newlines" + mapping: | + root = "\t\nhello\n\t".trim() + output: "hello" + + - name: "trim no whitespace unchanged" + mapping: | + root = "hello".trim() + output: "hello" + + - name: "trim empty string" + mapping: | + root = "".trim() + output: "" + + - name: "trim only whitespace" + mapping: | + root = " \t\n ".trim() + output: "" + + # --- .trim_prefix() --- + + - name: "trim_prefix present" + mapping: | + root = "hello world".trim_prefix("hello ") + output: "world" + + - name: "trim_prefix absent" + mapping: | + root = "hello world".trim_prefix("xyz") + output: "hello world" + + - name: "trim_prefix empty prefix" + mapping: | + root = "hello".trim_prefix("") + output: "hello" + + - name: "trim_prefix entire string" + mapping: | + root = "hello".trim_prefix("hello") + output: "" + + - name: "trim_prefix only removes first occurrence" + mapping: | + root = "aaa".trim_prefix("a") + output: "aa" + + # --- .trim_suffix() --- + + - name: "trim_suffix present" + mapping: | + root = "hello world".trim_suffix(" world") + output: "hello" + + - name: "trim_suffix absent" + mapping: | + root = "hello world".trim_suffix("xyz") + output: "hello world" + + - name: "trim_suffix empty suffix" + mapping: | + root = "hello".trim_suffix("") + output: "hello" + + - name: "trim_suffix entire string" + mapping: | + root = "hello".trim_suffix("hello") + output: "" + + # --- .has_prefix() --- + + - name: "has_prefix true" + mapping: | + root = "hello world".has_prefix("hello") + output: 
true + + - name: "has_prefix false" + mapping: | + root = "hello world".has_prefix("world") + output: false + + - name: "has_prefix empty prefix is always true" + mapping: | + root = "hello".has_prefix("") + output: true + + - name: "has_prefix exact match" + mapping: | + root = "hello".has_prefix("hello") + output: true + + # --- .has_suffix() --- + + - name: "has_suffix true" + mapping: | + root = "hello world".has_suffix("world") + output: true + + - name: "has_suffix false" + mapping: | + root = "hello world".has_suffix("hello") + output: false + + - name: "has_suffix empty suffix is always true" + mapping: | + root = "hello".has_suffix("") + output: true + + - name: "has_suffix exact match" + mapping: | + root = "hello".has_suffix("hello") + output: true + + # --- .split() --- + + - name: "split by comma" + mapping: | + root = "a,b,c".split(",") + output: ["a", "b", "c"] + + - name: "split no delimiter found" + mapping: | + root = "hello".split(",") + output: ["hello"] + + - name: "split empty string by comma" + mapping: | + root = "".split(",") + output: [""] + + - name: "split empty string by empty string" + mapping: | + root = "".split("") + output: [] + + - name: "split by empty string produces codepoints" + mapping: | + root = "hello".split("") + output: ["h", "e", "l", "l", "o"] + + - name: "split by empty string with non-ascii" + mapping: | + root = "café".split("") + output: ["c", "a", "f", "é"] + + - name: "split with multi-char delimiter" + mapping: | + root = "one::two::three".split("::") + output: ["one", "two", "three"] + + - name: "split trailing delimiter" + mapping: | + root = "a,b,".split(",") + output: ["a", "b", ""] + + # --- .replace_all() --- + + - name: "replace_all basic" + mapping: | + root = "hello world".replace_all("world", "earth") + output: "hello earth" + + - name: "replace_all multiple occurrences" + mapping: | + root = "aabaa".replace_all("a", "x") + output: "xxbxx" + + - name: "replace_all no match" + mapping: | + root = 
"hello".replace_all("xyz", "abc") + output: "hello" + + - name: "replace_all empty old with new" + mapping: | + root = "ab".replace_all("", "-") + output: "-a-b-" + + - name: "replace_all remove substring" + mapping: | + root = "hello world".replace_all(" world", "") + output: "hello" + + # --- .repeat() --- + + - name: "repeat basic" + mapping: | + root = "ab".repeat(3) + output: "ababab" + + - name: "repeat zero times" + mapping: | + root = "hello".repeat(0) + output: "" + + - name: "repeat one time" + mapping: | + root = "hello".repeat(1) + output: "hello" + + - name: "repeat negative count is error" + mapping: | + root = "hello".repeat(-1) + compile_error: "count" + + - name: "repeat empty string" + mapping: | + root = "".repeat(5) + output: "" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/string_regex.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/string_regex.yaml new file mode 100644 index 000000000..74c35a5bf --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/string_regex.yaml @@ -0,0 +1,158 @@ +description: "String regex methods: re_match, re_find_all, re_replace_all" + +tests: + # --- .re_match() --- + + - name: "re_match simple match" + mapping: | + root = "hello world".re_match("world") + output: true + + - name: "re_match no match" + mapping: | + root = "hello world".re_match("xyz") + output: false + + - name: "re_match partial match returns true" + mapping: | + root = "foobar".re_match("oba") + output: true + + - name: "re_match anchored start" + mapping: | + root = "hello world".re_match("^hello") + output: true + + - name: "re_match anchored start no match" + mapping: | + root = "hello world".re_match("^world") + output: false + + - name: "re_match anchored end" + mapping: | + root = "hello world".re_match("world$") + output: true + + - name: "re_match full anchor" + mapping: | + root = "hello".re_match("^hello$") + output: true + + - name: "re_match digit pattern" + mapping: | + root = 
"abc123def".re_match("[0-9]+") + output: true + + - name: "re_match digit pattern no match" + mapping: | + root = "abcdef".re_match("[0-9]+") + output: false + + - name: "re_match empty pattern always matches" + mapping: | + root = "anything".re_match("") + output: true + + - name: "re_match empty string with empty pattern" + mapping: | + root = "".re_match("") + output: true + + - name: "re_match character class" + mapping: | + root = "Hello World".re_match("[A-Z][a-z]+") + output: true + + # --- .re_find_all() --- + + - name: "re_find_all basic" + mapping: | + root = "cat bat rat".re_find_all("[a-z]at") + output: ["cat", "bat", "rat"] + + - name: "re_find_all no matches" + mapping: | + root = "hello world".re_find_all("[0-9]+") + output: [] + + - name: "re_find_all digits" + mapping: | + root = "a1b22c333".re_find_all("[0-9]+") + output: ["1", "22", "333"] + + - name: "re_find_all overlapping non-overlap" + mapping: | + root = "aaa".re_find_all("aa") + output: ["aa"] + + - name: "re_find_all single char pattern" + mapping: | + root = "abcabc".re_find_all("a") + output: ["a", "a"] + + - name: "re_find_all empty pattern" + mapping: | + root = "ab".re_find_all("") + output: ["", "", ""] + + - name: "re_find_all word boundaries" + mapping: | + root = "foo bar baz".re_find_all("\\b[a-z]+\\b") + output: ["foo", "bar", "baz"] + + - name: "re_find_all capture groups return full match" + mapping: | + root = "2024-03-01".re_find_all("([0-9]{4})-([0-9]{2})-([0-9]{2})") + output: ["2024-03-01"] + + # --- .re_replace_all() --- + + - name: "re_replace_all basic" + mapping: | + root = "hello world".re_replace_all("world", "earth") + output: "hello earth" + + - name: "re_replace_all with pattern" + mapping: | + root = "abc123def456".re_replace_all("[0-9]+", "NUM") + output: "abcNUMdefNUM" + + - name: "re_replace_all no match unchanged" + mapping: | + root = "hello".re_replace_all("[0-9]+", "X") + output: "hello" + + - name: "re_replace_all backreference $0" + mapping: | + root = 
"hello world".re_replace_all("[a-z]+", "[$0]") + output: "[hello] [world]" + + - name: "re_replace_all capture group $1" + mapping: | + root = "2024-03-01".re_replace_all("([0-9]{4})-([0-9]{2})-([0-9]{2})", "$2/$3/$1") + output: "03/01/2024" + + - name: "re_replace_all named group" + mapping: | + root = "hello world".re_replace_all("(?P[a-z]+)", "${word}!") + output: "hello! world!" + + - name: "re_replace_all remove matches" + mapping: | + root = "a1b2c3".re_replace_all("[0-9]", "") + output: "abc" + + - name: "re_replace_all entire string" + mapping: | + root = "hello".re_replace_all("^.*$", "replaced") + output: "replaced" + + - name: "re_replace_all empty string match" + mapping: | + root = "ab".re_replace_all("", "-") + output: "-a-b-" + + - name: "re_replace_all special regex chars in replacement" + mapping: | + root = "hello world".re_replace_all("world", "w.o.r.l.d") + output: "hello w.o.r.l.d" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/timestamp_methods.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/timestamp_methods.yaml new file mode 100644 index 000000000..c6340d005 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/timestamp_methods.yaml @@ -0,0 +1,188 @@ +description: "Timestamp methods: ts_parse, ts_format, ts_unix*, ts_from_unix*, ts_add" + +tests: + # --- ts_parse (V1 requires explicit format arg; no default) --- + + - name: "ts_parse default format RFC 3339" + skip: "migrator V1 core env lacks ts_parse (lives in internal/impl/pure, not loaded by the test harness)" + + - name: "ts_parse with fractional seconds" + skip: "migrator V1 core env lacks ts_parse" + + - name: "ts_parse with timezone offset" + skip: "migrator V1 core env lacks ts_parse" + + - name: "ts_parse with negative offset" + skip: "migrator V1 core env lacks ts_parse" + + - name: "ts_parse custom format date only" + skip: "migrator V1 core env lacks ts_strptime/ts_format" + + - name: "ts_parse invalid string errors" + skip: "migrator V1 core 
env lacks ts_strptime" + + - name: "ts_parse returns timestamp type" + skip: "migrator V1 core env lacks ts_parse" + + - name: "ts_parse named format arg" + skip: "migrator V1 core env lacks ts_strptime/ts_format" + + # --- ts_format (V1 requires explicit layout; no default) --- + + - name: "ts_format default RFC 3339" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_format custom date only" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_format with fractional seconds" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_format whole seconds omit fraction" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_format matches string conversion" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_format named format arg" + skip: "V1 has no timestamp() function constructor" + + # --- ts_unix (V1 has ts_unix but V2 uses timestamp() constructor) --- + + - name: "ts_unix returns epoch seconds" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_unix returns int64" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_unix epoch zero" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_unix truncates sub-second" + skip: "V1 has no timestamp() function constructor" + + # --- ts_unix_milli --- + + - name: "ts_unix_milli returns epoch millis" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_unix_milli with fractional seconds" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_unix_milli returns int64" + skip: "V1 has no timestamp() function constructor" + + # --- ts_unix_micro --- + + - name: "ts_unix_micro returns epoch micros" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_unix_micro with fractional seconds" + skip: "V1 has no timestamp() function constructor" + + # --- ts_unix_nano --- + + - name: "ts_unix_nano returns epoch nanos" + skip: "V1 has no timestamp() function 
constructor" + + - name: "ts_unix_nano with full precision" + skip: "V1 has no timestamp() function constructor" + + - name: "ts_unix_nano returns int64" + skip: "V1 has no timestamp() function constructor" + + # --- ts_from_unix (V1 has no ts_from_unix methods) --- + + - name: "ts_from_unix integer seconds" + skip: "V1 has no ts_from_unix method" + + - name: "ts_from_unix float sub-second" + skip: "V1 has no ts_from_unix method" + + - name: "ts_from_unix epoch zero" + skip: "V1 has no ts_from_unix method" + + - name: "ts_from_unix returns timestamp type" + skip: "V1 has no ts_from_unix method" + + - name: "ts_from_unix non-numeric receiver is error" + skip: "V1 has no ts_from_unix method" + + - name: "ts_from_unix bool receiver is error" + skip: "V1 has no ts_from_unix method" + + # --- ts_from_unix_milli --- + + - name: "ts_from_unix_milli whole seconds" + skip: "V1 has no ts_from_unix_milli method" + + - name: "ts_from_unix_milli with millis" + skip: "V1 has no ts_from_unix_milli method" + + # --- ts_from_unix_micro --- + + - name: "ts_from_unix_micro whole seconds" + skip: "V1 has no ts_from_unix_micro method" + + - name: "ts_from_unix_micro with micros" + skip: "V1 has no ts_from_unix_micro method" + + # --- ts_from_unix_nano --- + + - name: "ts_from_unix_nano whole seconds" + skip: "V1 has no ts_from_unix_nano method" + + - name: "ts_from_unix_nano with full precision" + skip: "V1 has no ts_from_unix_nano method" + + # --- Round-trips --- + + - name: "ts_unix round-trip" + skip: "V1 has no timestamp() function constructor or ts_from_unix method" + + - name: "ts_unix_milli round-trip" + skip: "V1 has no timestamp() function constructor or ts_from_unix_milli method" + + - name: "ts_unix_micro round-trip" + skip: "V1 has no timestamp() function constructor or ts_from_unix_micro method" + + - name: "ts_unix_nano round-trip lossless" + skip: "V1 has no timestamp() function constructor or ts_from_unix_nano method" + + - name: "ts_parse ts_format round-trip" + skip: 
"migrator V1 core env lacks ts_parse/ts_format" + + # --- ts_add (V1 has ts_add_iso8601, not ts_add) --- + + - name: "ts_add one second" + skip: "V1 has no timestamp() function constructor, no ts_add (has ts_add_iso8601), no second() constant" + + - name: "ts_add one minute" + skip: "V1 has no timestamp() function constructor, no ts_add, no minute() constant" + + - name: "ts_add one hour" + skip: "V1 has no timestamp() function constructor, no ts_add, no hour() constant" + + - name: "ts_add one day" + skip: "V1 has no timestamp() function constructor, no ts_add, no day() constant" + + - name: "ts_add negative subtracts" + skip: "V1 has no timestamp() function constructor, no ts_add, no second() constant" + + - name: "ts_add negative crosses day boundary" + skip: "V1 has no timestamp() function constructor, no ts_add, no second() constant" + + - name: "ts_add multiple hours" + skip: "V1 has no timestamp() function constructor, no ts_add, no hour() constant" + + - name: "ts_add compound duration" + skip: "V1 has no timestamp() function constructor, no ts_add, no hour()/minute() constants" + + - name: "ts_add zero is identity" + skip: "V1 has no timestamp() function constructor or ts_add method" + + - name: "ts_add named arg" + skip: "V1 has no timestamp() function constructor, no ts_add, no second() constant" + + - name: "ts_add crosses leap day" + skip: "V1 has no timestamp() function constructor, no ts_add, no day() constant" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/type_conversion.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/type_conversion.yaml new file mode 100644 index 000000000..6824dcf2b --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/type_conversion.yaml @@ -0,0 +1,276 @@ +description: "Type conversion methods: .string(), .int32(), .int64(), .uint32(), .uint64(), .float32(), .float64(), .bool(), .char(), .bytes()" + +tests: + # --- .string() --- + + - name: "string from int64" + mapping: | + root = 
42.string() + output: "42" + + - name: "string from negative int64" + mapping: | + root = (-100).string() + output: "-100" + + - name: "string from float64" + mapping: | + root = 3.14.string() + output: "3.14" + + - name: "string from float64 whole number" + mapping: | + root = 5.0.string() + output: "5" + + - name: "string from negative zero" + skip: "V2-only _type negative-zero literal shorthand" + + - name: "string from float32" + mapping: | + root = 3.14.float32().string() + output: "3.14" + + - name: "string from float32 whole number" + skip: "V1-divergence: float32 values serialise as plain numbers; 5.0.float32().string() => \"5\" not \"5.0\"" + + - name: "string from nan" + skip: "V2-only _type NaN literal shorthand; V1 has no NaN literal" + + - name: "string from bool true" + mapping: | + root = true.string() + output: "true" + + - name: "string from bool false" + mapping: | + root = false.string() + output: "false" + + - name: "string from null" + mapping: | + root = null.string() + output: "null" + + - name: "string from timestamp" + skip: "V1 has no timestamp() function constructor" + + - name: "string from array compact json" + mapping: | + root = [1, 2, 3].string() + output: "[1,2,3]" + + - name: "string from object keys sorted" + mapping: | + root = {"b": 2, "a": 1}.string() + output: "{\"a\":1,\"b\":2}" + + - name: "string from string is identity" + mapping: | + root = "hello".string() + output: "hello" + + - name: "string from bytes valid utf8" + mapping: | + root = "hello".bytes().string() + output: "hello" + + # --- .int32() (V1 uses .number() which returns int64/float64) --- + + - name: "int32 from int64" + skip: "V1 has no int32() method or distinct int32 type" + + - name: "int32 from float64 truncates" + skip: "V1 has no int32() method" + + - name: "int32 from negative float truncates toward zero" + skip: "V1 has no int32() method" + + - name: "int32 from string" + skip: "V1 has no int32() method" + + - name: "int32 overflow positive" + skip: 
"V1 has no int32() method" + + - name: "int32 overflow negative" + skip: "V1 has no int32() method" + + # --- .int64() (V1 uses .number() which returns int64/float64) --- + + - name: "int64 from float64 truncates" + skip: "V1 has no int64() method; .number() preserves fractional part" + + - name: "int64 from negative float truncates toward zero" + skip: "V1 has no int64() method; .number() preserves fractional part" + + - name: "int64 from string" + mapping: | + root = "-500".number() + output: -500.0 + + - name: "int64 from invalid string" + mapping: | + root = "abc".number() + error: "ParseFloat" + + - name: "int64 identity" + mapping: | + root = 42.number() + output: 42.0 + + # --- .uint32() --- + + - name: "uint32 from int64" + skip: "V1 has no uint32() method or distinct uint32 type" + + - name: "uint32 from float truncates toward zero" + skip: "V1 has no uint32() method" + + - name: "uint32 negative is error" + skip: "V1 has no uint32() method" + + - name: "uint32 overflow" + skip: "V1 has no uint32() method" + + - name: "uint32 from string" + skip: "V1 has no uint32() method" + + # --- .uint64() --- + + - name: "uint64 from int64" + skip: "V1 has no uint64() method or distinct uint64 type" + + - name: "uint64 from float truncates toward zero" + skip: "V1 has no uint64() method" + + - name: "uint64 negative is error" + skip: "V1 has no uint64() method" + + - name: "uint64 from string" + skip: "V1 has no uint64() method" + + # --- .float32() --- + + - name: "float32 from int64" + skip: "V1-divergence: float32() returns plain number; V2 {_type: \"float32\", value: \"42.0\"} wrapper has no V1 equivalent" + + - name: "float32 from float64" + skip: "V1 has no float32() method" + + - name: "float32 from string" + skip: "V1 has no float32() method" + + - name: "float32 from invalid string" + skip: "V1 has no float32() method" + + # --- .float64() (V1 uses .number()) --- + + - name: "float64 from int64" + mapping: | + root = 42.number() + output: 42.0 + + - name: 
"float64 from string" + mapping: | + root = "3.14".number() + output: 3.14 + + - name: "float64 from bool is error" + mapping: | + root = true.number() + error: "number" + + - name: "float64 from invalid string" + mapping: | + root = "xyz".number() + error: "ParseFloat" + + # --- .bool() --- + + - name: "bool from true" + mapping: | + root = true.bool() + output: true + + - name: "bool from false" + mapping: | + root = false.bool() + output: false + + - name: "bool from string true" + mapping: | + root = "true".bool() + output: true + + - name: "bool from string false" + mapping: | + root = "false".bool() + output: false + + - name: "bool from int zero is false" + mapping: | + root = 0.bool() + output: false + + - name: "bool from int nonzero is true" + mapping: | + root = 1.bool() + output: true + + - name: "bool from negative int is true" + mapping: | + root = (-5).bool() + output: true + + - name: "bool from float zero is false" + mapping: | + root = 0.0.bool() + output: false + + - name: "bool from negative zero is false" + skip: "V2-only _type negative-zero literal shorthand" + + - name: "bool from infinity is true" + skip: "V2-only _type Infinity literal; V1 has no Infinity literal" + + - name: "bool from nan is error" + skip: "V2-only _type NaN literal; V1 has no NaN literal" + + - name: "bool from invalid string is error" + mapping: | + root = "maybe".bool() + error: "expected bool" + + # --- .char() --- + + - name: "char from ascii codepoint" + skip: "V1 has no char() method" + + - name: "char from emoji codepoint" + skip: "V1 has no char() method" + + - name: "char from non-ascii codepoint" + skip: "V1 has no char() method" + + - name: "char from zero codepoint" + skip: "V1 has no char() method" + + - name: "char from invalid codepoint" + skip: "V1 has no char() method" + + # --- .bytes() (V1 bytes() returns a []byte; equality assertion differs) --- + + - name: "bytes from string" + skip: "V2-only _type bytes literal shorthand for expected output" + + - 
name: "bytes from bytes identity" + skip: "V2-only _type bytes literal shorthand for expected output" + + - name: "bytes from int goes through string" + skip: "V2-only _type bytes literal shorthand for expected output" + + - name: "bytes from bool goes through string" + skip: "V2-only _type bytes literal shorthand for expected output" + + - name: "bytes from empty string" + skip: "V2-only _type bytes literal shorthand for expected output" diff --git a/internal/bloblang2/migrator/v1spec/tests/stdlib/unique_flatten.yaml b/internal/bloblang2/migrator/v1spec/tests/stdlib/unique_flatten.yaml new file mode 100644 index 000000000..10225b7ca --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/stdlib/unique_flatten.yaml @@ -0,0 +1,94 @@ +description: > + .unique() and .flatten() methods — deduplication and one-level flattening. + +tests: + # --- unique basic --- + + - name: "unique integers" + mapping: | + root.v = [1, 2, 3, 2, 1].unique() + output: {"v": [1, 2, 3]} + + - name: "unique strings" + mapping: | + root.v = ["a", "b", "a", "c", "b"].unique() + output: {"v": ["a", "b", "c"]} + + - name: "unique preserves first occurrence order" + mapping: | + root.v = [3, 1, 2, 1, 3, 2].unique() + output: {"v": [3, 1, 2]} + + - name: "unique on empty array" + mapping: | + root.v = [].unique() + output: {"v": []} + + - name: "unique single element" + mapping: | + root.v = [42].unique() + output: {"v": [42]} + + - name: "unique all same elements" + mapping: | + root.v = [5, 5, 5, 5].unique() + output: {"v": [5]} + + - name: "unique booleans" + skip: "V1 unique only accepts string/number arrays (boolean elements cause error)" + + - name: "unique with null values" + skip: "V1 unique only accepts string/number arrays (null elements cause error)" + + # --- unique with key function --- + + - name: "unique with key function on objects" + mapping: | + root.v = [ + {"id": 1, "name": "Alice"}, + {"id": 2, "name": "Bob"}, + {"id": 1, "name": "Alice2"}, + ].unique(x -> x.id) + output: 
{"v": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]} + + - name: "unique with key function — string length" + mapping: | + root.v = ["hi", "hey", "yo", "sup"].unique(s -> s.length()) + output: {"v": ["hi", "hey"]} + + # --- flatten basic --- + + - name: "flatten nested arrays one level" + mapping: | + root.v = [[1, 2], [3, 4], [5]].flatten() + output: {"v": [1, 2, 3, 4, 5]} + + - name: "flatten empty inner arrays" + mapping: | + root.v = [[], [1], [], [2, 3], []].flatten() + output: {"v": [1, 2, 3]} + + - name: "flatten empty outer array" + mapping: | + root.v = [].flatten() + output: {"v": []} + + - name: "flatten non-array elements kept as-is" + mapping: | + root.v = [1, [2, 3], 4, [5]].flatten() + output: {"v": [1, 2, 3, 4, 5]} + + - name: "flatten only goes one level deep" + mapping: | + root.v = [[[1, 2]], [[3]]].flatten() + output: {"v": [[1, 2], [3]]} + + - name: "flatten single nested array" + mapping: | + root.v = [[1, 2, 3]].flatten() + output: {"v": [1, 2, 3]} + + - name: "flatten preserves non-array types in mixed" + mapping: | + root.v = ["a", ["b", "c"], "d"].flatten() + output: {"v": ["a", "b", "c", "d"]} diff --git a/internal/bloblang2/migrator/v1spec/tests/types/array.yaml b/internal/bloblang2/migrator/v1spec/tests/types/array.yaml new file mode 100644 index 000000000..0a3cb49ad --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/array.yaml @@ -0,0 +1,172 @@ +description: "Array literals, indexing, nesting, trailing commas, and methods" + +tests: + # --- Literals --- + + - name: "empty array literal" + mapping: | + root.arr = [] + output: {"arr": []} + + - name: "single element array" + mapping: | + root.arr = [42] + output: {"arr": [42]} + + - name: "multi-element array" + mapping: | + root.arr = [1, 2, 3] + output: {"arr": [1, 2, 3]} + + - name: "mixed type array" + mapping: | + root.arr = [1, "two", true, null, 3.14] + output: {"arr": [1, "two", true, null, 3.14]} + + - name: "trailing comma allowed" + mapping: | + 
root.arr = [1, 2, 3,] + output: {"arr": [1, 2, 3]} + + - name: "nested arrays" + mapping: | + root.arr = [[1, 2], [3, 4]] + output: {"arr": [[1, 2], [3, 4]]} + + - name: "array with expressions" + mapping: | + let x = 10 + root.arr = [$x, $x + 1, $x * 2] + output: {"arr": [10, 11, 20]} + + - name: "deeply nested arrays" + mapping: | + root.arr = [[[1]]] + output: {"arr": [[[1]]]} + + # --- Indexing --- + + - name: "index first element" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(0) + output: {"v": 10} + + - name: "index last element positive" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(2) + output: {"v": 30} + + - name: "negative index last element" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-1) + output: {"v": 30} + + - name: "negative index second to last" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-2) + output: {"v": 20} + + - name: "negative index first element" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-3) + output: {"v": 10} + + - name: "out of bounds positive index" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(3) + # FIXME-v1: verify runtime output — V1 `.index()` returns null on OOB, not an error + skip: "V1 .index() out-of-range returns null, not an error" + + - name: "out of bounds negative index" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(-4) + skip: "V1 .index() out-of-range returns null, not an error" + + - name: "index empty array" + mapping: | + let arr = [] + root.v = $arr.index(0) + skip: "V1 .index() out-of-range returns null, not an error" + + - name: "index with float whole number accepted" + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(2.0) + # FIXME-v1: verify — V1 ParamInt64 coercion accepts whole floats + output: {"v": 30} + + - name: "index with non-whole float is error" + # V1 `.index()` coerces non-whole floats to int (1.5 → 1); it does not reject them. 
+ mapping: | + let arr = [10, 20, 30] + root.v = $arr.index(1.5) + output: {"v": 20} + + - name: "index with string is error" + # V1 rejects a string argument to `.index()` at compile time (literal-type check). + mapping: | + let arr = [10, 20, 30] + root.v = $arr.index("0") + compile_error: "expected number value" + + - name: "nested array indexing" + mapping: | + let arr = [[1, 2], [3, 4]] + root.v = $arr.index(1).index(0) + output: {"v": 3} + + # --- Length --- + + - name: "length of empty array" + mapping: | + root.len = [].length() + output: {"len": 0} + + - name: "length of non-empty array" + mapping: | + root.len = [1, 2, 3].length() + output: {"len": 3} + + # --- Type --- + + - name: "array type" + mapping: | + root.t = [1, 2, 3].type() + output: {"t": "array"} + + - name: "empty array type" + mapping: | + root.t = [].type() + output: {"t": "array"} + + # --- Array from input --- + + - name: "array from input field" + input: {"items": [10, 20, 30]} + mapping: | + root.first = this.items.index(0) + root.last = this.items.index(-1) + root.len = this.items.length() + output: {"first": 10, "last": 30, "len": 3} + + # --- Deleted in array literal --- + + - name: "deleted in array literal removes element" + mapping: | + root.arr = [1, deleted(), 3] + # FIXME-v1: verify — V1 may retain `null` rather than elide + skip: "V1 deleted() semantics inside array literals are not equivalent to V2" + + # --- Void in array literal is error --- + + - name: "void in array literal is error" + mapping: | + root.arr = [1, if false { 2 }, 3] + skip: "V2-only: V1 has no 'void' concept; if-without-else yields null, not a void error" diff --git a/internal/bloblang2/migrator/v1spec/tests/types/bool_null.yaml b/internal/bloblang2/migrator/v1spec/tests/types/bool_null.yaml new file mode 100644 index 000000000..19e116644 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/bool_null.yaml @@ -0,0 +1,255 @@ +description: "Boolean and null literals, equality, type checking" + 
+tests: + # --- Boolean literals --- + + - name: "true literal" + mapping: | + root = true + output: true + + - name: "false literal" + mapping: | + root = false + output: false + + - name: "true type" + mapping: | + root = true.type() + output: "bool" + + - name: "false type" + mapping: | + root = false.type() + output: "bool" + + # --- Boolean equality --- + + - name: "true equals true" + mapping: | + root = true == true + output: true + + - name: "false equals false" + mapping: | + root = false == false + output: true + + - name: "true not equals false" + mapping: | + root = true != false + output: true + + - name: "true equals false is false" + mapping: | + root = true == false + output: false + + # --- Boolean logical operators --- + + - name: "logical and true true" + mapping: | + root = true && true + output: true + + - name: "logical and true false" + mapping: | + root = true && false + output: false + + - name: "logical or false true" + mapping: | + root = false || true + output: true + + - name: "logical or false false" + mapping: | + root = false || false + output: false + + - name: "logical not true" + mapping: | + root = !true + output: false + + - name: "logical not false" + mapping: | + root = !false + output: true + + # --- Boolean cross-type equality --- + + - name: "bool equals int is false" + # V1 `==` coerces bool to int (true=1, false=0) when comparing to a number — `true == 1` is TRUE. 
+ mapping: | + root = true == 1 + output: true + + - name: "bool equals string is false" + mapping: | + root = true == "true" + output: false + + - name: "bool equals null is false" + mapping: | + root = false == null + output: false + + # --- Boolean conversions --- + + - name: "bool from string true" + mapping: | + root = "true".bool() + output: true + + - name: "bool from string false" + mapping: | + root = "false".bool() + output: false + + - name: "bool from int64 nonzero" + mapping: | + root = 1.bool() + output: true + + - name: "bool from int64 zero" + mapping: | + root = 0.bool() + output: false + + - name: "bool from float64 nonzero" + mapping: | + root = 3.14.bool() + output: true + + - name: "bool from float64 zero" + mapping: | + root = 0.0.bool() + output: false + + - name: "bool to string true" + mapping: | + root = true.string() + output: "true" + + - name: "bool to string false" + mapping: | + root = false.string() + output: "false" + + - name: "bool to int64 is error" + mapping: | + root = true.number() + # FIXME-v1: verify — V1 .number() on bool is a type error + error: "expected number" + + # --- Boolean errors with non-boolean operators --- + + - name: "bool arithmetic is error" + # V1 literal-operand arithmetic type checks happen at compile time. + mapping: | + root = true + false + compile_error: "cannot add" + + - name: "bool comparison is error" + # V1 literal-operand comparison type checks happen at compile time. + mapping: | + root = true > false + compile_error: "cannot compare" + + - name: "logical and with non-bool is error" + # V1 `&&` is lenient — it short-circuits on the boolean-coerced truthiness of the LHS without + # type-checking the RHS. `true && 1` evaluates to `true`, no error. 
+ mapping: | + root = true && 1 + output: true + + - name: "logical or with non-bool is error" + mapping: | + root = false || "yes" + # FIXME-v1: verify error substring + error: "bool" + + - name: "logical not with non-bool is error" + mapping: | + root = !42 + # FIXME-v1: verify error substring + error: "bool" + + # --- Null literal --- + + - name: "null literal" + mapping: | + root = null + output: null + + - name: "null type" + mapping: | + root = null.type() + output: "null" + + # --- Null equality --- + + - name: "null equals null" + mapping: | + root = null == null + output: true + + - name: "null not equals null" + mapping: | + root = null != null + output: false + + - name: "null equals zero is false" + mapping: | + root = null == 0 + output: false + + - name: "null equals empty string is false" + mapping: | + root = null == "" + output: false + + - name: "null equals false is false" + mapping: | + root = null == false + output: false + + # --- Null errors in operations --- + + - name: "null arithmetic is error" + # V1 literal-operand arithmetic type checks happen at compile time. + mapping: | + root = null + 5 + compile_error: "cannot add" + + - name: "null comparison is error" + # V1 literal-operand comparison type checks happen at compile time. 
+ mapping: | + root = null > 5 + compile_error: "cannot compare" + + - name: "null method call is error" + mapping: | + root = null.uppercase() + # FIXME-v1: verify error substring + error: "null" + + # --- Null string conversion --- + + - name: "null to string" + mapping: | + root = null.string() + output: "null" + + # --- Null with .or() --- + + - name: "null or default" + mapping: | + root = null.or("default") + output: "default" + + - name: "non-null or default returns value" + mapping: | + root = "hello".or("default") + output: "hello" diff --git a/internal/bloblang2/migrator/v1spec/tests/types/bytes.yaml b/internal/bloblang2/migrator/v1spec/tests/types/bytes.yaml new file mode 100644 index 000000000..08d02bf6c --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/bytes.yaml @@ -0,0 +1,274 @@ +description: "Bytes type: creation from strings, byte-level operations, encoding" + +tests: + # --- Bytes creation --- + + - name: "bytes from string" + mapping: | + root = "hello".bytes() + # FIXME-v1: verify — V1 emits bytes as base64-encoded string when JSON-marshalled; `_type` marker is V2-specific + skip: "V2-only output marker {_type: bytes, value: ...} — V1 renders bytes differently" + + - name: "bytes type check" + mapping: | + root = "hello".bytes().type() + output: "bytes" + + - name: "bytes from empty string" + mapping: | + root = "".bytes() + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes from bytes is unchanged" + mapping: | + root = "hello".bytes().bytes() + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes from integer goes through string" + mapping: | + root = 42.bytes() + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes from bool goes through string" + mapping: | + root = true.bytes() + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes from null goes through string" + mapping: | + root = null.bytes() + skip: "V2-only output marker 
{_type: bytes, value: ...}" + + # --- Bytes length (byte-based) --- + + - name: "bytes length ascii" + mapping: | + root = "hello".bytes().length() + output: 5 + + - name: "bytes length empty" + mapping: | + root = "".bytes().length() + output: 0 + + - name: "bytes length multibyte utf8" + mapping: | + root = "\u{1F44B}".bytes().length() + # FIXME-v1: V1 does not support \u{...} brace unicode escapes in double-quoted strings — only \uXXXX 4-hex form + skip: "V1 does not support \\u{...} brace-form unicode escapes" + + - name: "bytes length non-ascii two byte" + mapping: | + root = "é".bytes().length() + output: 2 + + # --- Bytes indexing (byte-based, returns int64) --- + + - name: "bytes index first byte" + mapping: | + root = "hello".bytes().index(0) + # FIXME-v1: verify — V1 .index() on bytes returns the byte value + output: 104 + + - name: "bytes index last byte" + mapping: | + root = "hello".bytes().index(4) + output: 111 + + - name: "bytes negative index" + mapping: | + root = "hello".bytes().index(-1) + output: 111 + + - name: "bytes index out of bounds" + mapping: | + root = "hello".bytes().index(5) + skip: "V1 .index() out-of-range returns null, not an error" + + - name: "bytes negative index out of bounds" + mapping: | + root = "hello".bytes().index(-6) + skip: "V1 .index() out-of-range returns null, not an error" + + - name: "bytes index returns byte value 0-255" + mapping: | + root = "é".bytes().index(0) + output: 195 + + # --- Bytes to string --- + + - name: "bytes to string utf8" + mapping: | + root = "hello".bytes().string() + output: "hello" + + - name: "bytes to string empty" + mapping: | + root = "".bytes().string() + output: "" + + # --- Bytes concatenation --- + + - name: "bytes concatenation" + mapping: | + root = "hello".bytes() + " world".bytes() + # FIXME-v1: V1 `+` on bytes coerces to string concatenation (returns a string, not bytes) + skip: "V1 bytes + bytes yields a string, not bytes" + + - name: "bytes concat with empty" + mapping: | + 
root = "hello".bytes() + "".bytes() + skip: "V1 bytes + bytes yields a string, not bytes" + + - name: "bytes plus string is error" + mapping: | + root = "hello".bytes() + " world" + # FIXME-v1: V1 coerces bytes+string to string concat (no error). Semantics differ. + skip: "V1 allows bytes+string via implicit string coercion; no error raised" + + - name: "string plus bytes is error" + mapping: | + root = "hello" + " world".bytes() + skip: "V1 allows string+bytes via implicit string coercion; no error raised" + + # --- Bytes slicing --- + + - name: "bytes slice basic" + mapping: | + root = "hello".bytes().slice(0, 3) + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes slice to end" + mapping: | + root = "hello".bytes().slice(3) + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes slice negative" + mapping: | + root = "hello".bytes().slice(-3, -1) + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes slice clamped" + mapping: | + root = "hello".bytes().slice(0, 100) + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes slice empty result" + mapping: | + root = "hello".bytes().slice(3, 1) + skip: "V2-only output marker {_type: bytes, value: ...}" + + # --- Bytes reverse --- + + - name: "bytes reverse" + mapping: | + root = "hello".bytes().reverse() + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "bytes reverse empty" + mapping: | + root = "".bytes().reverse() + skip: "V2-only output marker {_type: bytes, value: ...}" + + # --- Bytes contains --- + + - name: "bytes contains subsequence true" + mapping: | + root = "hello".bytes().contains("ll".bytes()) + # FIXME-v1: verify — V1 .contains on string/bytes accepts string arg, bytes-arg behavior unclear + output: true + + - name: "bytes contains subsequence false" + mapping: | + root = "hello".bytes().contains("xyz".bytes()) + output: false + + # --- Bytes index_of --- + + - name: "bytes index_of found" + 
mapping: | + root = "hello".bytes().index_of("ll".bytes()) + # FIXME-v1: verify — V1 index_of on bytes arg may coerce + output: 2 + + - name: "bytes index_of not found" + mapping: | + root = "hello".bytes().index_of("xyz".bytes()) + output: -1 + + # --- Bytes encoding --- + + - name: "bytes encode base64" + mapping: | + root = "hello".bytes().encode("base64") + output: "aGVsbG8=" + + - name: "bytes encode hex" + mapping: | + root = "hello".bytes().encode("hex") + output: "68656c6c6f" + + - name: "string encode base64" + mapping: | + root = "hello".encode("base64") + output: "aGVsbG8=" + + # --- Bytes decoding --- + + - name: "decode base64 to bytes" + mapping: | + root = "aGVsbG8=".decode("base64") + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "decode hex to bytes" + mapping: | + root = "68656c6c6f".decode("hex") + skip: "V2-only output marker {_type: bytes, value: ...}" + + - name: "decode base64 then string" + mapping: | + root = "aGVsbG8=".decode("base64").string() + output: "hello" + + - name: "decode invalid base64 is error" + # V1 surfaces the underlying encoding package's error text — "illegal base64 data" here. + mapping: | + root = "not-valid-base64!!!".decode("base64") + error: "illegal base64 data" + + - name: "decode invalid hex is error" + # V1 surfaces encoding/hex's own error: "invalid byte". 
+ mapping: | + root = "zzzz".decode("hex") + error: "invalid byte" + + # --- Bytes equality --- + + - name: "bytes equal same content" + mapping: | + root = "hello".bytes() == "hello".bytes() + output: true + + - name: "bytes not equal different content" + mapping: | + root = "hello".bytes() == "world".bytes() + output: false + + - name: "bytes not equal to string cross type" + mapping: | + root = "hello".bytes() == "hello" + # FIXME-v1: V1 `==` may compare bytes to string by value-coercion and return true + skip: "V1 bytes vs string equality semantics differ (may return true via coercion)" + + # --- Bytes comparison --- + + - name: "bytes less than lexicographic" + mapping: | + root = "abc".bytes() < "abd".bytes() + # FIXME-v1: V1 ordering on bytes — via string coercion should work lexicographically + output: true + + - name: "bytes greater than lexicographic" + mapping: | + root = "b".bytes() > "a".bytes() + output: true diff --git a/internal/bloblang2/migrator/v1spec/tests/types/floats.yaml b/internal/bloblang2/migrator/v1spec/tests/types/floats.yaml new file mode 100644 index 000000000..2db949ab4 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/floats.yaml @@ -0,0 +1,260 @@ +description: "Float types: float32, float64, NaN, Infinity, negative zero" + +tests: + # --- float64 literals (default) --- + + - name: "float literal is float64" + mapping: | + root = 3.14.type() + # V1 .type() reports "number" for any numeric value, not "float64" + skip: "V2-only: V1 .type() returns 'number' for all numerics, no float64/int64 distinction" + + - name: "float64 zero" + mapping: | + root = 0.0 + output: 0.0 + + - name: "float64 negative" + mapping: | + root = -3.14 + output: -3.14 + + - name: "float64 small decimal" + mapping: | + root = 0.001 + output: 0.001 + + - name: "float64 large value" + mapping: | + root = 1000000.5 + output: 1000000.5 + + # --- float32 conversions --- + + - name: "float32 from float64" + mapping: | + root = 3.14.number() + skip: 
"V2-only: V1 has no float32 type" + + - name: "float32 type check" + mapping: | + root = 3.14.number().type() + skip: "V2-only: V1 has no float32 type" + + - name: "float32 from string" + mapping: | + root = "3.14".number() + skip: "V2-only: V1 has no float32 type (string.number is available but returns float64)" + + - name: "float32 from integer" + mapping: | + root = 42.number() + skip: "V2-only: V1 has no float32 type" + + - name: "float32 zero" + mapping: | + root = 0.0.number() + skip: "V2-only: V1 has no float32 type" + + - name: "float32 negative" + mapping: | + root = (-2.5).number() + skip: "V2-only: V1 has no float32 type" + + # --- float64 conversions --- + + - name: "float64 from string" + mapping: | + root = "3.14".number() + output: 3.14 + + - name: "float64 from integer" + mapping: | + root = 42.number() + # FIXME-v1: V1 .number() on an int returns the same int (no int→float promotion) + skip: "V1 .number() preserves int vs float distinction, no forced promotion" + + - name: "float64 from bool is error" + mapping: | + root = true.number() + # FIXME-v1: verify error substring + error: "expected number" + + - name: "float64 from invalid string" + # V1 surfaces strconv's own error text; "invalid syntax" is stable. 
+ mapping: | + root = "not_a_number".number() + error: "invalid syntax" + + # --- float64 arithmetic --- + + - name: "float64 addition" + mapping: | + root = 1.5 + 2.5 + output: 4.0 + + - name: "float64 subtraction" + mapping: | + root = 5.0 - 3.5 + output: 1.5 + + - name: "float64 multiplication" + mapping: | + root = 2.5 * 4.0 + output: 10.0 + + - name: "float64 division" + mapping: | + root = 10.0 / 3.0 + output: 3.3333333333333335 + + - name: "float64 modulo" + mapping: | + root = 7.5 % 2.0 + # V1 modulo requires integer operands — non-integer is a type error + skip: "V1 %: operands must be integers; non-integer operands are a type error" + + # --- Division by zero --- + + - name: "float division by zero is error" + # V1 detects literal division-by-zero at compile time. + mapping: | + root = 7.0 / 0.0 + compile_error: "divide by zero" + + - name: "integer division by zero is error" + # V1 detects literal division-by-zero at compile time. + mapping: | + root = 7 / 0 + compile_error: "divide by zero" + + # --- NaN behavior --- + + - name: "nan not equal to itself" + mapping: | + let nan = this.nan + root = $nan == $nan + input: {nan: {_type: "float64", value: "NaN"}} + skip: "V2-only: V1 test harness cannot inject raw NaN via {_type} markers" + + - name: "nan not equal to nan explicit" + mapping: | + let nan = this.nan + root = $nan != $nan + input: {nan: {_type: "float64", value: "NaN"}} + skip: "V2-only: V1 test harness cannot inject raw NaN via {_type} markers" + + - name: "nan less than any is false" + mapping: | + root = this.nan < 1.0 + input: {nan: {_type: "float64", value: "NaN"}} + skip: "V2-only: V1 test harness cannot inject raw NaN via {_type} markers" + + - name: "nan greater than any is false" + mapping: | + root = this.nan > 1.0 + input: {nan: {_type: "float64", value: "NaN"}} + skip: "V2-only: V1 test harness cannot inject raw NaN via {_type} markers" + + - name: "nan arithmetic produces nan" + mapping: | + root = this.nan + 1.0 + input: {nan: 
{_type: "float64", value: "NaN"}} + skip: "V2-only: V1 test harness cannot inject raw NaN via {_type} markers" + + - name: "nan type is float64" + mapping: | + root = this.nan.type() + input: {nan: {_type: "float64", value: "NaN"}} + skip: "V2-only: V1 .type() returns 'number', not 'float64'" + + - name: "nan bool conversion is error" + mapping: | + root = this.nan.bool() + input: {nan: {_type: "float64", value: "NaN"}} + skip: "V2-only: V1 test harness cannot inject raw NaN via {_type} markers" + + # --- Infinity behavior --- + + - name: "infinity greater than any number" + mapping: | + root = this.inf > 999999999.0 + input: {inf: {_type: "float64", value: "Infinity"}} + skip: "V2-only: V1 test harness cannot inject raw Infinity via {_type} markers" + + - name: "infinity equals infinity" + mapping: | + root = this.inf == this.inf + input: {inf: {_type: "float64", value: "Infinity"}} + skip: "V2-only: V1 test harness cannot inject raw Infinity via {_type} markers" + + - name: "negative infinity less than any number" + mapping: | + root = this.ninf < -999999999.0 + input: {ninf: {_type: "float64", value: "-Infinity"}} + skip: "V2-only: V1 test harness cannot inject raw Infinity via {_type} markers" + + - name: "infinity type is float64" + mapping: | + root = this.inf.type() + input: {inf: {_type: "float64", value: "Infinity"}} + skip: "V2-only: V1 .type() returns 'number', not 'float64'" + + - name: "infinity minus infinity is nan" + mapping: | + root = this.inf - this.inf + input: {inf: {_type: "float64", value: "Infinity"}} + skip: "V2-only: V1 test harness cannot inject raw Infinity via {_type} markers" + + # --- Negative zero --- + + - name: "negative zero equals positive zero" + mapping: | + root = this.nz == 0.0 + input: {nz: {_type: "float64", value: "-0.0"}} + skip: "V2-only: V1 test harness cannot inject -0.0 via {_type} markers" + + - name: "negative zero not less than positive zero" + mapping: | + root = this.nz < 0.0 + input: {nz: {_type: "float64", 
value: "-0.0"}} + skip: "V2-only: V1 test harness cannot inject -0.0 via {_type} markers" + + - name: "negative zero string normalizes to zero" + mapping: | + root = this.nz.string() + input: {nz: {_type: "float64", value: "-0.0"}} + skip: "V2-only: V1 test harness cannot inject -0.0 via {_type} markers" + + # --- Float-integer promotion --- + + - name: "int plus float promotes to float64" + mapping: | + root = 5 + 3.0 + output: 8.0 + + - name: "int plus float result type" + mapping: | + root = (5 + 3.0).type() + # V1 .type() returns "number" for both int and float + skip: "V2-only: V1 .type() returns 'number' for all numerics" + + - name: "large int to float precision error" + mapping: | + root = 9007199254740993 + 1.0 + # V1 does not raise precision-loss errors; promotes silently + skip: "V1 has no int-to-float precision-loss check; arithmetic proceeds silently" + + # --- float32 arithmetic promotion --- + + - name: "float32 plus float64 promotes to float64" + mapping: | + root = (1.5.number() + 2.5).type() + skip: "V2-only: V1 has no float32 type" + + - name: "float32 division result" + mapping: | + let a = 10.0.number() + let b = 3.0.number() + root = ($a / $b).type() + skip: "V2-only: V1 has no float32 type" diff --git a/internal/bloblang2/migrator/v1spec/tests/types/integers.yaml b/internal/bloblang2/migrator/v1spec/tests/types/integers.yaml new file mode 100644 index 000000000..30c1197b5 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/integers.yaml @@ -0,0 +1,290 @@ +description: "Integer types: int32, int64, uint32, uint64 literals, limits, and conversions" + +tests: + # --- int64 literals (default) --- + + - name: "integer literal is int64" + mapping: | + root = 42.type() + # V1 .type() returns "number" — no int64/float64 distinction + skip: "V2-only: V1 .type() returns 'number' for all numerics" + + - name: "zero literal is int64" + mapping: | + root = 0.type() + skip: "V2-only: V1 .type() returns 'number' for all numerics" + + - name: 
"negative integer via unary minus" + mapping: | + root = (-10).type() + skip: "V2-only: V1 .type() returns 'number' for all numerics" + + - name: "negative integer value" + mapping: | + root = -10 + output: -10 + + - name: "int64 max value" + mapping: | + root = 9223372036854775807 + output: 9223372036854775807 + + - name: "int64 min literal exceeds int64 range" + # V1 parses `-N` as `(unary-minus N)` — the positive literal is parsed first and overflows. + # The compile error is strconv's "value out of range". + mapping: | + root = -9223372036854775808 + compile_error: "value out of range" + + - name: "int64 min value via arithmetic" + mapping: | + root = -9223372036854775807 - 1 + # FIXME-v1: V1 int overflow is silent per Go int64; this should yield int64 min + output: -9223372036854775808 + + # --- int32 conversions --- + + - name: "int32 from int64 literal" + mapping: | + root = 42.number() + skip: "V2-only: V1 has no int32 type" + + - name: "int32 type check" + mapping: | + root = 42.number().type() + skip: "V2-only: V1 has no int32 type" + + - name: "int32 from string" + mapping: | + root = "42".number() + skip: "V2-only: V1 has no int32 type" + + - name: "int32 max value" + mapping: | + root = 2147483647.number() + skip: "V2-only: V1 has no int32 type" + + - name: "int32 min value" + mapping: | + root = (-2147483648).number() + skip: "V2-only: V1 has no int32 type" + + - name: "int32 overflow positive" + mapping: | + root = 2147483648.number() + skip: "V2-only: V1 has no int32 type/range" + + - name: "int32 overflow negative" + mapping: | + root = (-2147483649).number() + skip: "V2-only: V1 has no int32 type/range" + + - name: "int32 zero" + mapping: | + root = 0.number() + skip: "V2-only: V1 has no int32 type" + + - name: "int32 negative" + mapping: | + root = (-100).number() + skip: "V2-only: V1 has no int32 type" + + # --- uint32 conversions --- + + - name: "uint32 from int64" + mapping: | + root = 42.number() + skip: "V2-only: V1 has no uint32 type" + + - 
name: "uint32 type check" + mapping: | + root = 42.number().type() + skip: "V2-only: V1 has no uint32 type" + + - name: "uint32 from string" + mapping: | + root = "255".number() + skip: "V2-only: V1 has no uint32 type" + + - name: "uint32 max value" + mapping: | + root = 4294967295.number() + skip: "V2-only: V1 has no uint32 type" + + - name: "uint32 zero" + mapping: | + root = 0.number() + skip: "V2-only: V1 has no uint32 type" + + - name: "uint32 overflow" + mapping: | + root = 4294967296.number() + skip: "V2-only: V1 has no uint32 type/range" + + - name: "uint32 negative is error" + mapping: | + root = (-1).number() + skip: "V2-only: V1 has no uint32 type/range" + + # --- uint64 conversions --- + + - name: "uint64 from int64" + mapping: | + root = 42.number() + skip: "V2-only: V1 has no uint64 type" + + - name: "uint64 type check" + mapping: | + root = 42.number().type() + skip: "V2-only: V1 has no uint64 type" + + - name: "uint64 from string" + mapping: | + root = "1000".number() + skip: "V2-only: V1 has no uint64 type" + + - name: "uint64 max from string" + mapping: | + root = "18446744073709551615".number() + skip: "V2-only: V1 has no uint64 type" + + - name: "uint64 zero" + mapping: | + root = 0.number() + skip: "V2-only: V1 has no uint64 type" + + - name: "uint64 negative is error" + mapping: | + root = (-1).number() + skip: "V2-only: V1 has no uint64 type/range" + + - name: "uint64 max as bare literal is compile error" + mapping: | + root = 18446744073709551615.number() + skip: "V2-only: V1 has no uint64 type; literal would be a compile error anyway but different semantics" + + - name: "uint64 overflow from string" + mapping: | + root = "18446744073709551616".number() + skip: "V2-only: V1 has no uint64 type" + + # --- int64 conversions --- + + - name: "int64 from string" + # V1 `"42".number()` returns float64, not int64 — there's no separate int-conversion method. 
+ mapping: | + root = "42".number() + output: 42.0 + + - name: "int64 from float truncates" + mapping: | + root = 3.9.number() + # FIXME-v1: V1 .number() on a float returns the float (no truncation). V1 has no direct int-conversion method. + skip: "V1 .number() on float returns the float unchanged; no int-coercion method" + + - name: "int64 from negative float truncates toward zero" + mapping: | + root = (-3.9).number() + skip: "V1 .number() on float returns the float unchanged; no int-coercion method" + + - name: "int64 from bool is error" + mapping: | + root = true.number() + # FIXME-v1: verify error substring + error: "expected number" + + - name: "int64 from invalid string" + # V1 surfaces strconv's own error text. + mapping: | + root = "not_a_number".number() + error: "invalid syntax" + + # --- Cross-type integer equality (promotion) --- + + - name: "int32 equals int64 same value" + mapping: | + root = 5.number() == 5 + output: true + + - name: "int32 not equals int64 different value" + mapping: | + root = 5.number() == 6 + output: false + + - name: "uint32 equals int64 same value" + mapping: | + root = 42.number() == 42 + output: true + + - name: "uint64 equals int64 same value" + mapping: | + root = 42.number() == 42 + output: true + + - name: "int64 equals float64 same value" + mapping: | + root = 5 == 5.0 + # V1 == is representation-agnostic for numbers + output: true + + # --- Integer arithmetic basics --- + + - name: "int64 addition" + mapping: | + root = 5 + 3 + output: 8 + + - name: "int64 subtraction" + mapping: | + root = 10 - 3 + output: 7 + + - name: "int64 multiplication" + mapping: | + root = 6 * 7 + output: 42 + + - name: "int64 modulo" + mapping: | + root = 7 % 2 + output: 1 + + - name: "int64 division produces float64" + mapping: | + root = 7 / 2 + # V1 / on ints always returns float + output: 3.5 + + - name: "int64 overflow addition" + mapping: | + root = 9223372036854775807 + 1 + # V1 integer overflow is silent per Go int64 semantics; no 
error + skip: "V1 int64 overflow wraps silently per Go semantics; no error raised" + + - name: "int64 min literal is compile error in subtraction" + # V1 parses `-N` as `(unary-minus N)`; the positive literal overflows int64 at parse. + mapping: | + root = -9223372036854775808 - 1 + compile_error: "value out of range" + + - name: "int64 overflow subtraction" + mapping: | + root = (-9223372036854775807 - 1) - 1 + skip: "V1 int64 overflow wraps silently per Go semantics; no error raised" + + # --- Conversion from float to integer types --- + + - name: "float to int32" + mapping: | + root = 3.14.number() + skip: "V2-only: V1 has no int32 type and no float-to-int coercion method" + + - name: "float to uint32" + mapping: | + root = 100.0.number() + skip: "V2-only: V1 has no uint32 type" + + - name: "float to uint64" + mapping: | + root = 100.0.number() + skip: "V2-only: V1 has no uint64 type" diff --git a/internal/bloblang2/migrator/v1spec/tests/types/object.yaml b/internal/bloblang2/migrator/v1spec/tests/types/object.yaml new file mode 100644 index 000000000..d059e87ef --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/object.yaml @@ -0,0 +1,204 @@ +description: "Object literals, field access, expression keys, key ordering, and methods" + +tests: + # --- Literals --- + + - name: "empty object literal" + mapping: | + root.obj = {} + output: {"obj": {}} + + - name: "single field object" + mapping: | + root.obj = {"name": "Alice"} + output: {"obj": {"name": "Alice"}} + + - name: "multi-field object" + mapping: | + root.obj = {"name": "Alice", "age": 30} + output: {"obj": {"name": "Alice", "age": 30}} + + - name: "trailing comma allowed" + mapping: | + root.obj = {"a": 1, "b": 2,} + output: {"obj": {"a": 1, "b": 2}} + + - name: "mixed value types" + mapping: | + root.obj = {"s": "hello", "n": 42, "f": 3.14, "b": true, "nil": null} + output: {"obj": {"s": "hello", "n": 42, "f": 3.14, "b": true, "nil": null}} + + - name: "nested objects" + mapping: | + 
root.obj = {"user": {"name": "Alice", "address": {"city": "London"}}} + output: {"obj": {"user": {"name": "Alice", "address": {"city": "London"}}}} + + - name: "object containing array" + mapping: | + root.obj = {"items": [1, 2, 3]} + output: {"obj": {"items": [1, 2, 3]}} + + # --- Field access --- + + - name: "field access dot notation" + mapping: | + let obj = {"name": "Alice", "age": 30} + root.name = $obj.name + output: {"name": "Alice"} + + - name: "nested field access" + mapping: | + let obj = {"user": {"name": "Alice"}} + root.name = $obj.user.name + output: {"name": "Alice"} + + - name: "non-existent field returns null" + mapping: | + let obj = {"name": "Alice"} + root.v = $obj.missing + output: {"v": null} + + - name: "deeply nested non-existent field returns null" + mapping: | + let obj = {"a": {"b": {}}} + root.v = $obj.a.b.c + output: {"v": null} + + - name: "dynamic field access with bracket notation" + mapping: | + let obj = {"name": "Alice"} + let key = "name" + root.v = $obj.get($key) + # FIXME-v1: V1 has no [] bracket indexing; use .get(path) for dynamic key + output: {"v": "Alice"} + + - name: "dynamic field access non-string key is error" + # V1 `.get()` type-checks at compile time. + mapping: | + let obj = {"name": "Alice"} + root.v = $obj.get(42) + compile_error: "expected string" + + # --- Expression keys --- + + - name: "variable as key" + mapping: | + let key = "dynamic" + root.obj = {$key: "value"} + # FIXME-v1: V1 requires parens around computed-key expressions in object literals sometimes; $key alone should work + output: {"obj": {"dynamic": "value"}} + + - name: "concatenation as key" + mapping: | + let prefix = "pre" + root.obj = {($prefix + "_field"): "value"} + # V1 requires parens for computed keys + output: {"obj": {"pre_field": "value"}} + + - name: "non-string expression key is runtime error" + # V1 surfaces the actual offending Go type: "invalid key type: int64" / "<nil>" / "bool".
+ mapping: | + let key = 42 + root.obj = {$key: "value"} + error: "invalid key type" + + - name: "null expression key is runtime error" + mapping: | + let key = null + root.obj = {$key: "value"} + error: "invalid key type" + + - name: "bool expression key is runtime error" + mapping: | + let key = true + root.obj = {$key: "value"} + error: "invalid key type" + + # --- Key ordering is NOT preserved --- + + - name: "object equality ignores key order" + mapping: | + let a = {"x": 1, "y": 2} + let b = {"y": 2, "x": 1} + root.eq = $a == $b + output: {"eq": true} + + - name: "object inequality when values differ" + mapping: | + let a = {"x": 1, "y": 2} + let b = {"x": 1, "y": 3} + root.eq = $a == $b + output: {"eq": false} + + - name: "object inequality different keys" + mapping: | + let a = {"x": 1} + let b = {"y": 1} + root.eq = $a == $b + output: {"eq": false} + + # --- Length --- + + - name: "length of empty object" + mapping: | + root.len = {}.length() + output: {"len": 0} + + - name: "length of non-empty object" + mapping: | + root.len = {"a": 1, "b": 2, "c": 3}.length() + output: {"len": 3} + + # --- Type --- + + - name: "object type" + mapping: | + root.t = {"a": 1}.type() + output: {"t": "object"} + + - name: "empty object type" + mapping: | + root.t = {}.type() + output: {"t": "object"} + + # --- From input --- + + - name: "object from input" + input: {"user": {"name": "Alice", "age": 30}} + mapping: | + root.name = this.user.name + root.age = this.user.age + output: {"name": "Alice", "age": 30} + + # --- Deleted in object literal --- + + - name: "deleted value in object literal omits field" + mapping: | + root.obj = {"a": 1, "b": deleted(), "c": 3} + # FIXME-v1: V1 deleted() inside object literal may produce {"b": null} rather than omitting + skip: "V1 deleted() semantics inside object literals differ from V2" + + # --- Void in object literal is error --- + + - name: "void value in object literal is error" + mapping: | + root.obj = {"a": 1, "b": if false { 2 
}} + skip: "V2-only: V1 has no 'void' concept; if-without-else yields null, not a void error" + + # --- Quoted field names --- + + - name: "quoted field name with special characters" + mapping: | + root.obj = {"field-with-dashes": "value"} + output: {"obj": {"field-with-dashes": "value"}} + + - name: "quoted field name starting with digit" + mapping: | + root.obj = {"123abc": "value"} + output: {"obj": {"123abc": "value"}} + + - name: "access quoted field name" + mapping: | + let obj = {"field-name": "hello"} + root.v = $obj."field-name" + output: {"v": "hello"} diff --git a/internal/bloblang2/migrator/v1spec/tests/types/string.yaml b/internal/bloblang2/migrator/v1spec/tests/types/string.yaml new file mode 100644 index 000000000..92bd5ba25 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/string.yaml @@ -0,0 +1,306 @@ +description: "String literals, escape sequences, raw strings, and codepoint semantics" + +tests: + # --- Basic string literals --- + + - name: "empty string literal" + mapping: | + root = "" + output: "" + + - name: "simple string literal" + mapping: | + root = "hello world" + output: "hello world" + + - name: "string type introspection" + mapping: | + root = "hello".type() + output: "string" + + # --- Escape sequences --- + + - name: "escape newline" + mapping: | + root = "line1\nline2" + output: "line1\nline2" + + - name: "escape tab" + mapping: | + root = "col1\tcol2" + output: "col1\tcol2" + + - name: "escape carriage return" + mapping: | + root = "hello\rworld" + output: "hello\rworld" + + - name: "escape double quote" + mapping: | + root = "say \"hello\"" + output: "say \"hello\"" + + - name: "escape backslash" + mapping: | + root = "back\\slash" + output: "back\\slash" + + - name: "unicode escape 4 digit BMP" + mapping: | + root = "\u0041" + output: "A" + + - name: "unicode escape 4 digit non-ascii" + mapping: | + root = "\u00E9" + output: "é" + + - name: "unicode escape braced single digit" + mapping: | + root = "\u{41}" + # V1
strconv.Unquote does not accept \u{...} brace form + skip: "V1 string escape does not support \\u{...} brace form (only \\uXXXX)" + + - name: "unicode escape braced emoji" + mapping: | + root = "\u{1F600}" + skip: "V1 string escape does not support \\u{...} brace form (only \\uXXXX)" + + - name: "multiple escapes in one string" + mapping: | + root = "a\tb\nc\\d\"e" + output: "a\tb\nc\\d\"e" + + # --- Raw strings --- + + - name: "raw string basic" + mapping: "root = \"\"\"hello world\"\"\"" + # V1 uses triple-double-quotes for raw strings, not backticks + output: "hello world" + + - name: "raw string no escape processing" + mapping: "root = \"\"\"no\\nescape\\there\"\"\"" + output: "no\\nescape\\there" + + - name: "raw string preserves quotes" + mapping: "root = \"\"\"she said \"hello\"\"\"\"" + # FIXME-v1: triple-quoted cannot contain `"""` ; this is a lexing ambiguity edge case + skip: "V1 triple-quoted raw strings cannot contain the closing triple-quote sequence" + + - name: "raw string preserves backslashes" + mapping: "root = \"\"\"C:\\path\\to\\file\"\"\"" + output: "C:\\path\\to\\file" + + - name: "raw string with newlines preserved" + mapping: "root = \"\"\"line1\nline2\"\"\"" + output: "line1\nline2" + + # --- String length (codepoint-based in V2, byte-based in V1) --- + + - name: "length of ascii string" + mapping: | + root = "hello".length() + output: 5 + + - name: "length of empty string" + mapping: | + root = "".length() + output: 0 + + - name: "length of string with non-ascii" + mapping: | + root = "café".length() + # V1 .length() on a string returns byte length, not codepoint count + output: 5 + + - name: "length of single codepoint emoji" + mapping: | + root = "\u{1F600}".length() + skip: "V1 string escape does not support \\u{...} brace form" + + - name: "length of multi-codepoint emoji" + mapping: | + root = "\u{1F44B}\u{1F3FD}".length() + skip: "V1 string escape does not support \\u{...} brace form" + + # --- String indexing (codepoint-based in 
V2, returns int64) --- + + - name: "index first codepoint" + mapping: | + root = "hello".index(0) + # FIXME-v1: verify — V1 .index() on strings may not exist; strings are not arrays + skip: "V1 has no codepoint indexing on strings via .index() (only on arrays/bytes)" + + - name: "index last codepoint positive" + mapping: | + root = "hello".index(4) + skip: "V1 has no codepoint indexing on strings via .index()" + + - name: "index negative last codepoint" + mapping: | + root = "hello".index(-1) + skip: "V1 has no codepoint indexing on strings via .index()" + + - name: "index negative second to last" + mapping: | + root = "hello".index(-2) + skip: "V1 has no codepoint indexing on strings via .index()" + + - name: "index non-ascii codepoint" + mapping: | + root = "café".index(3) + skip: "V1 has no codepoint indexing on strings via .index()" + + - name: "index emoji codepoint" + mapping: | + root = "\u{1F600}".index(0) + skip: "V1 has no codepoint indexing on strings; brace-form escape also unsupported" + + - name: "index out of bounds positive" + mapping: | + root = "hello".index(5) + skip: "V1 has no codepoint indexing on strings" + + - name: "index out of bounds negative" + mapping: | + root = "hello".index(-6) + skip: "V1 has no codepoint indexing on strings" + + # --- Codepoint round-trip with .char() --- + + - name: "char round trip ascii" + mapping: | + root = "hello".index(0).char() + skip: "V1 has no .char() method" + + - name: "char round trip non-ascii" + mapping: | + root = "café".index(3).char() + skip: "V1 has no .char() method" + + # --- String concatenation --- + + - name: "string concatenation" + mapping: | + root = "hello" + " " + "world" + output: "hello world" + + - name: "string concat with empty" + mapping: | + root = "" + "hello" + "" + output: "hello" + + - name: "string plus number is error" + # V1 literal-operand `+` type checks at compile time. 
+ mapping: | + root = "hello" + 5 + compile_error: "cannot add" + + # --- String comparison --- + + - name: "string equality same" + mapping: | + root = "abc" == "abc" + output: true + + - name: "string equality different" + mapping: | + root = "abc" == "abd" + output: false + + - name: "string less than lexicographic" + mapping: | + root = "abc" < "abd" + output: true + + - name: "string greater than lexicographic" + mapping: | + root = "b" > "a" + output: true + + - name: "string equality cross type is false" + mapping: | + root = "5" == 5 + output: false + + # --- No Unicode normalization --- + + - name: "no normalization precomposed vs decomposed not equal" + mapping: | + root = "é" == "é" + output: false + + - name: "no normalization different lengths" + mapping: | + root.precomposed = "é".length() + root.decomposed = "é".length() + # V1 .length() is byte-based: "é" → 2 bytes; "é" → 3 bytes + output: + precomposed: 2 + decomposed: 3 + + # --- String slicing --- + + - name: "string slice basic" + mapping: | + root = "hello world".slice(0, 5) + output: "hello" + + - name: "string slice to end" + mapping: | + root = "hello world".slice(6) + output: "world" + + - name: "string slice negative indices" + mapping: | + root = "hello world".slice(-5, -1) + # FIXME-v1: verify — V1 .slice() on strings supports negative indices + output: "worl" + + - name: "string slice clamped" + mapping: | + root = "hello".slice(0, 100) + # FIXME-v1: verify — V1 .slice() may or may not clamp + output: "hello" + + - name: "string slice empty result" + # V1 `.slice(low, high)` with `low >= high` is a compile-time error when both are literals. 
+ mapping: | + root = "hello".slice(3, 1) + compile_error: "lower slice bound" + + # --- String reverse --- + + - name: "string reverse ascii" + mapping: | + root = "hello".reverse() + # FIXME-v1: V1 may not have .reverse() on strings (list only) + skip: "V1 .reverse() may not be available for strings; verify method registry" + + - name: "string reverse empty" + mapping: | + root = "".reverse() + skip: "V1 .reverse() may not be available for strings; verify method registry" + + # --- String contains and index_of --- + + - name: "string contains true" + mapping: | + root = "hello world".contains("world") + output: true + + - name: "string contains false" + mapping: | + root = "hello world".contains("xyz") + output: false + + - name: "string index_of found" + mapping: | + root = "hello world".index_of("world") + output: 6 + + - name: "string index_of not found" + mapping: | + root = "hello world".index_of("xyz") + # FIXME-v1: verify — V1 .index_of may return null or -1 + output: -1 diff --git a/internal/bloblang2/migrator/v1spec/tests/types/timestamp.yaml b/internal/bloblang2/migrator/v1spec/tests/types/timestamp.yaml new file mode 100644 index 000000000..20441f294 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/timestamp.yaml @@ -0,0 +1,247 @@ +description: "Timestamp creation, formatting, arithmetic, and comparison" + +tests: + # --- Creation --- + + - name: "timestamp constructor with required args only" + mapping: | + root.ts = timestamp(2024, 3, 1) + skip: "V2-only: V1 has no timestamp(year, month, day, ...) 
constructor" + + - name: "timestamp constructor with all positional args" + mapping: | + root.ts = timestamp(2024, 12, 25, 8, 30, 45, 123000000) + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp constructor with named args" + mapping: | + root.ts = timestamp(year: 2024, month: 1, day: 15, hour: 10) + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp constructor with timezone" + mapping: | + root.ts = timestamp(2024, 3, 1, 12, 30, 0, 0, "America/New_York") + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp constructor invalid month" + mapping: | + root.ts = timestamp(2024, 13, 1) + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp constructor invalid timezone" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0, 0, "Not/A/Zone") + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "now returns a timestamp" + # V1 `now()` returns an RFC3339-formatted STRING, not a first-class timestamp value. 
+ mapping: | + root = now() + no_output_check: true + output_type: "string" + + # --- Parsing --- + + - name: "ts_parse with default format (RFC 3339)" + mapping: | + root.ts = "2024-03-01T12:00:00Z".ts_parse() + # FIXME-v1: V1 ts_parse requires a format string argument; there is no zero-arg default + skip: "V1 ts_parse requires an explicit format argument" + + - name: "ts_parse with explicit format" + mapping: | + root.ts = "2024-03-01".ts_parse("2006-01-02") + # FIXME-v1: V1 ts_parse uses Go reference-layout format, not strftime (%Y-%m-%d) + skip: "V1 ts_parse uses Go reference-layout format; %Y/%m/%d is V1 ts_strptime" + + - name: "ts_parse with fractional seconds" + mapping: | + root.ts = "2024-03-01T12:00:00.123Z".ts_parse() + skip: "V1 ts_parse requires an explicit format argument" + + - name: "ts_parse with timezone offset" + mapping: | + root.ts = "2024-03-01T12:00:00+05:30".ts_parse() + skip: "V1 ts_parse requires an explicit format argument" + + - name: "ts_parse invalid string" + # V1 `ts_parse` is registered in `internal/impl/pure` (not in the bare `public/bloblang` + # environment used by the migrator test harness), so the method is not visible to this runner. 
+ skip: "ts_parse not registered in the migrator's bare V1 environment (lives in internal/impl/pure)" + + # --- Formatting --- + + - name: "ts_format default is RFC 3339" + mapping: | + root.s = timestamp(2024, 3, 1, 12, 0, 0).ts_format() + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "ts_format custom format" + mapping: | + root.s = timestamp(2024, 3, 1).ts_format("%Y-%m-%d") + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp string serialization trims trailing zeros" + mapping: | + root.s = timestamp(2024, 3, 1, 12, 0, 0, 500000000).string() + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp string serialization whole seconds omit fraction" + mapping: | + root.s = timestamp(2024, 3, 1, 12, 0, 0).string() + skip: "V2-only: V1 has no timestamp() constructor" + + # --- Comparison --- + + - name: "timestamp equality same value" + mapping: | + let a = timestamp(2024, 3, 1) + let b = timestamp(2024, 3, 1) + root.eq = $a == $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp equality different values" + mapping: | + let a = timestamp(2024, 3, 1) + let b = timestamp(2024, 3, 2) + root.eq = $a == $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp inequality" + mapping: | + let a = timestamp(2024, 3, 1) + let b = timestamp(2024, 3, 2) + root.neq = $a != $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp less than" + mapping: | + let a = timestamp(2024, 3, 1) + let b = timestamp(2024, 3, 2) + root.lt = $a < $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp greater than" + mapping: | + let a = timestamp(2024, 3, 2) + let b = timestamp(2024, 3, 1) + root.gt = $a > $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp less than or equal (equal case)" + mapping: | + let a = timestamp(2024, 3, 1) + let b = timestamp(2024, 3, 1) + root.le = $a <= $b + skip: "V2-only: V1 has no 
timestamp() constructor" + + - name: "timestamp greater than or equal (greater case)" + mapping: | + let a = timestamp(2024, 3, 2) + let b = timestamp(2024, 3, 1) + root.ge = $a >= $b + skip: "V2-only: V1 has no timestamp() constructor" + + # --- Arithmetic --- + + - name: "timestamp subtraction returns nanoseconds" + mapping: | + let a = timestamp(2024, 3, 1, 0, 0, 0) + let b = timestamp(2024, 3, 1, 0, 0, 1) + root.diff = $b - $a + skip: "V2-only: V1 has no timestamp() constructor; V1 uses .ts_sub() for timestamp diff" + + - name: "timestamp subtraction negative result" + mapping: | + let a = timestamp(2024, 3, 1, 0, 0, 1) + let b = timestamp(2024, 3, 1, 0, 0, 0) + root.diff = $b - $a + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp addition is an error" + mapping: | + let a = timestamp(2024, 3, 1) + let b = timestamp(2024, 3, 2) + root.bad = $a + $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp plus number is an error" + mapping: | + root.bad = timestamp(2024, 3, 1) + 1 + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "number minus timestamp is an error" + mapping: | + root.bad = 1 - timestamp(2024, 3, 1) + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp multiply is an error" + mapping: | + root.bad = timestamp(2024, 3, 1) * 2 + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp divide is an error" + mapping: | + root.bad = timestamp(2024, 3, 1) / 2 + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp modulo is an error" + mapping: | + root.bad = timestamp(2024, 3, 1) % 2 + skip: "V2-only: V1 has no timestamp() constructor" + + # --- ts_add --- + + - name: "ts_add positive duration" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(second()) + skip: "V2-only: V1 has no ts_add(nanos) method, no duration constants, no timestamp() constructor" + + - name: "ts_add negative duration" + mapping: | + root.ts = 
timestamp(2024, 3, 1, 0, 0, 0).ts_add(second() * -1) + skip: "V2-only: V1 has no ts_add/duration constants" + + - name: "ts_add with minute constant" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(minute()) + skip: "V2-only: V1 has no ts_add/duration constants" + + - name: "ts_add with hour constant" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(hour()) + skip: "V2-only: V1 has no ts_add/duration constants" + + - name: "ts_add with day constant" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(day()) + skip: "V2-only: V1 has no ts_add/duration constants" + + # --- Duration constants --- + + - name: "second returns nanoseconds" + mapping: | + root.v = second() + skip: "V2-only: V1 has no second() constant" + + - name: "minute returns nanoseconds" + mapping: | + root.v = minute() + skip: "V2-only: V1 has no minute() constant" + + - name: "hour returns nanoseconds" + mapping: | + root.v = hour() + skip: "V2-only: V1 has no hour() constant" + + - name: "day returns nanoseconds" + mapping: | + root.v = day() + skip: "V2-only: V1 has no day() constant" + + # --- Type --- + + - name: "timestamp type" + mapping: | + root.t = timestamp(2024, 3, 1).type() + skip: "V2-only: V1 has no timestamp() constructor" diff --git a/internal/bloblang2/migrator/v1spec/tests/types/timestamp_arithmetic.yaml b/internal/bloblang2/migrator/v1spec/tests/types/timestamp_arithmetic.yaml new file mode 100644 index 000000000..79299ad55 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/timestamp_arithmetic.yaml @@ -0,0 +1,195 @@ +description: > + Timestamp arithmetic edge cases — subtraction overflow for far-apart + timestamps, ts_add overflow, unix conversion round-trips, nanosecond + precision, fractional second formatting, and timezone handling. 
+ +tests: + # --- Subtraction precision --- + + - name: "timestamp subtraction with nanosecond precision" + mapping: | + let a = timestamp(2024, 3, 1, 0, 0, 0, 0) + let b = timestamp(2024, 3, 1, 0, 0, 0, 123456789) + root.diff = $b - $a + skip: "V2-only: V1 has no timestamp() constructor; V1 uses .ts_sub() for timestamp diff" + + - name: "timestamp subtraction across days" + mapping: | + let a = timestamp(2024, 3, 1, 0, 0, 0) + let b = timestamp(2024, 3, 3, 0, 0, 0) + root.diff = $b - $a + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp subtraction across months" + mapping: | + let a = timestamp(2024, 1, 1, 0, 0, 0) + let b = timestamp(2024, 2, 1, 0, 0, 0) + root.diff = ($b - $a) / second() + skip: "V2-only: V1 has no timestamp() constructor or second() constant" + + - name: "timestamp subtraction yields zero for same timestamp" + mapping: | + let t = timestamp(2024, 6, 15, 12, 0, 0) + root.diff = $t - $t + skip: "V2-only: V1 has no timestamp() constructor" + + # --- Subtraction overflow (timestamps > ~292 years apart) --- + + - name: "timestamp subtraction overflow — far future minus far past" + mapping: | + let past = timestamp(1700, 1, 1, 0, 0, 0) + let future = timestamp(2300, 1, 1, 0, 0, 0) + root.diff = $future - $past + skip: "V2-only: V1 has no timestamp() constructor" + + # --- ts_add edge cases --- + + - name: "ts_add with nanosecond precision" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(1) + skip: "V2-only: V1 has no timestamp() constructor or ts_add(nanos) method" + + - name: "ts_add with sub-millisecond precision" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(1500000) + skip: "V2-only: V1 has no timestamp() constructor or ts_add" + + - name: "ts_add negative one day" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(day() * -1) + skip: "V2-only: V1 has no timestamp()/day() or ts_add" + + - name: "ts_add crossing year boundary" + mapping: | + root.ts = timestamp(2024, 12, 31, 23, 
59, 59).ts_add(second()) + skip: "V2-only: V1 has no timestamp()/second() or ts_add" + + - name: "ts_add multiple days as seconds" + mapping: | + root.ts = timestamp(2024, 3, 1, 0, 0, 0).ts_add(second() * 86400 * 7) + skip: "V2-only: V1 has no timestamp()/second() or ts_add" + + # --- Unix conversion round-trips --- + + - name: "ts_unix round-trip" + mapping: | + let ts = timestamp(2024, 3, 1, 12, 0, 0) + root.rt = $ts.ts_unix().ts_from_unix().ts_format() + skip: "V2-only: V1 has no timestamp() constructor or ts_from_unix method" + + - name: "ts_unix_milli round-trip with milliseconds" + mapping: | + let ts = timestamp(2024, 3, 1, 12, 0, 0, 123000000) + root.rt = $ts.ts_unix_milli().ts_from_unix_milli().ts_format() + skip: "V2-only: V1 has no timestamp() constructor or ts_from_unix_milli" + + - name: "ts_unix_micro round-trip with microseconds" + mapping: | + let ts = timestamp(2024, 3, 1, 12, 0, 0, 123456000) + root.rt = $ts.ts_unix_micro().ts_from_unix_micro().ts_format() + skip: "V2-only: V1 has no timestamp() constructor or ts_from_unix_micro" + + - name: "ts_unix_nano lossless round-trip with nanoseconds" + mapping: | + let ts = timestamp(2024, 3, 1, 12, 0, 0, 123456789) + root.rt = $ts.ts_unix_nano().ts_from_unix_nano().ts_format() + skip: "V2-only: V1 has no timestamp() constructor or ts_from_unix_nano" + + - name: "ts_unix returns int64" + mapping: | + root.v = timestamp(2024, 3, 1, 12, 0, 0).ts_unix().type() + skip: "V2-only: V1 has no timestamp() constructor; V1 .type() returns 'number'" + + - name: "ts_unix_nano returns int64" + mapping: | + root.v = timestamp(2024, 3, 1, 12, 0, 0).ts_unix_nano().type() + skip: "V2-only: V1 has no timestamp() constructor; V1 .type() returns 'number'" + + # --- Fractional second formatting --- + + - name: "fractional seconds trim trailing zeros to shortest" + mapping: | + root.s = timestamp(2024, 3, 1, 12, 0, 0, 100000000).string() + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "microsecond precision 
formatting" + mapping: | + root.s = timestamp(2024, 3, 1, 12, 0, 0, 123456000).string() + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "nanosecond precision formatting" + mapping: | + root.s = timestamp(2024, 3, 1, 12, 0, 0, 123456789).string() + skip: "V2-only: V1 has no timestamp() constructor" + + # --- ts_parse timezone handling --- + + - name: "ts_parse with Z timezone" + mapping: | + let ts = "2024-03-01T12:00:00Z".ts_parse() + root.s = $ts.ts_format() + skip: "V1 ts_parse requires explicit format argument" + + - name: "ts_parse with positive offset" + mapping: | + let ts = "2024-03-01T12:00:00+05:30".ts_parse() + root.s = $ts.ts_format() + skip: "V1 ts_parse requires explicit format argument" + + - name: "ts_parse with negative offset" + mapping: | + let ts = "2024-03-01T12:00:00-08:00".ts_parse() + root.s = $ts.ts_format() + skip: "V1 ts_parse requires explicit format argument" + + - name: "ts_parse fractional seconds with nanosecond precision" + mapping: | + root.ts = "2024-03-01T12:00:00.123456789Z".ts_parse() + skip: "V1 ts_parse requires explicit format argument" + + # --- ts_from_unix with float (limited precision) --- + + - name: "ts_from_unix with integer" + mapping: | + root.s = 1709294400.ts_from_unix().ts_format() + skip: "V2-only: V1 has no ts_from_unix method" + + - name: "ts_from_unix with float gives sub-second" + mapping: | + root.s = 1709294400.5.ts_from_unix().ts_format() + skip: "V2-only: V1 has no ts_from_unix method" + + # --- Comparison with timestamps from different construction methods --- + + - name: "timestamps from constructor and parse are equal" + mapping: | + let a = timestamp(2024, 3, 1, 12, 0, 0) + let b = "2024-03-01T12:00:00Z".ts_parse() + root.eq = $a == $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamps from constructor and unix round-trip are equal" + mapping: | + let a = timestamp(2024, 3, 1, 12, 0, 0) + let b = $a.ts_unix().ts_from_unix() + root.eq = $a == $b + skip: 
"V2-only: V1 has no timestamp() constructor or ts_from_unix" + + # --- Arithmetic type errors --- + + - name: "number minus timestamp is error" + mapping: | + root.v = 100 - timestamp(2024, 3, 1) + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp plus timestamp is error" + mapping: | + let a = timestamp(2024, 3, 1) + let b = timestamp(2024, 3, 2) + root.v = $a + $b + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "timestamp minus number is error" + mapping: | + root.v = timestamp(2024, 3, 1) - 1 + skip: "V2-only: V1 has no timestamp() constructor" diff --git a/internal/bloblang2/migrator/v1spec/tests/types/type_introspection.yaml b/internal/bloblang2/migrator/v1spec/tests/types/type_introspection.yaml new file mode 100644 index 000000000..dfcf37fe0 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/type_introspection.yaml @@ -0,0 +1,187 @@ +description: ".type() method for every runtime type" + +tests: + # --- String --- + + - name: "type of string" + mapping: | + root.t = "hello".type() + output: {"t": "string"} + + - name: "type of empty string" + mapping: | + root.t = "".type() + output: {"t": "string"} + + # --- Integer types --- + + - name: "type of int64 literal" + mapping: | + root.t = 42.type() + # V1 .type() returns "number" for all numerics + output: {"t": "number"} + + - name: "type of negative int64" + mapping: | + root.t = (-10).type() + output: {"t": "number"} + + - name: "type of zero int64" + mapping: | + root.t = 0.type() + output: {"t": "number"} + + - name: "type of int32" + mapping: | + root.t = 42.number().type() + skip: "V2-only: V1 has no int32 type" + + - name: "type of uint32" + mapping: | + root.t = 42.number().type() + skip: "V2-only: V1 has no uint32 type" + + - name: "type of uint64" + mapping: | + root.t = 42.number().type() + skip: "V2-only: V1 has no uint64 type" + + # --- Float types --- + + - name: "type of float64 literal" + mapping: | + root.t = 3.14.type() + output: {"t": 
"number"} + + - name: "type of float64 zero" + mapping: | + root.t = 0.0.type() + output: {"t": "number"} + + - name: "type of float32" + mapping: | + root.t = 3.14.number().type() + skip: "V2-only: V1 has no float32 type" + + # --- Bool --- + + - name: "type of true" + mapping: | + root.t = true.type() + output: {"t": "bool"} + + - name: "type of false" + mapping: | + root.t = false.type() + output: {"t": "bool"} + + # --- Null --- + + - name: "type of null" + mapping: | + root.t = null.type() + output: {"t": "null"} + + # --- Bytes --- + + - name: "type of bytes" + mapping: | + root.t = "hello".bytes().type() + output: {"t": "bytes"} + + # --- Timestamp --- + + - name: "type of timestamp from constructor" + mapping: | + root.t = timestamp(2024, 3, 1).type() + skip: "V2-only: V1 has no timestamp() constructor" + + - name: "type of timestamp from now" + mapping: | + root.t = now().type() + # V1 now() returns an RFC3339 string, not a timestamp value + output: {"t": "string"} + + - name: "type of timestamp from parse" + mapping: | + root.t = "2024-03-01T00:00:00Z".ts_parse().type() + skip: "V1 ts_parse requires explicit format argument" + + # --- Array --- + + - name: "type of array" + mapping: | + root.t = [1, 2, 3].type() + output: {"t": "array"} + + - name: "type of empty array" + mapping: | + root.t = [].type() + output: {"t": "array"} + + # --- Object --- + + - name: "type of object" + mapping: | + root.t = {"a": 1}.type() + output: {"t": "object"} + + - name: "type of empty object" + mapping: | + root.t = {}.type() + output: {"t": "object"} + + # --- Type checking pattern --- + + - name: "type comparison for runtime check" + mapping: | + let v = 42 + root.is_int = $v.type() == "number" + root.is_str = $v.type() == "string" + # V1: .type() returns "number", not "int64" + output: {"is_int": true, "is_str": false} + + - name: "type of null is not object" + mapping: | + root.is_obj = null.type() == "object" + root.is_null = null.type() == "null" + output: 
{"is_obj": false, "is_null": true} + + # --- Type from input --- + + - name: "type of input string field" + input: {"name": "Alice"} + mapping: | + root.t = this.name.type() + output: {"t": "string"} + + - name: "type of input number field" + input: {"count": 42} + mapping: | + root.t = this.count.type() + output: {"t": "number"} + + - name: "type of input null field" + input: {"missing": null} + mapping: | + root.t = this.missing.type() + output: {"t": "null"} + + - name: "type of input array field" + input: {"items": [1, 2]} + mapping: | + root.t = this.items.type() + output: {"t": "array"} + + - name: "type of input object field" + input: {"user": {"name": "Alice"}} + mapping: | + root.t = this.user.type() + output: {"t": "object"} + + # --- Void: type() is not callable --- + + - name: "type on void is error" + mapping: | + root.t = (if false { 42 }).type() + skip: "V2-only: V1 has no 'void' concept; if-without-else yields null, .type() returns 'null'" diff --git a/internal/bloblang2/migrator/v1spec/tests/types/void.yaml b/internal/bloblang2/migrator/v1spec/tests/types/void.yaml new file mode 100644 index 000000000..4f4372681 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/types/void.yaml @@ -0,0 +1,194 @@ +description: "Void behavior in every context — void is not a type, it is the absence of a value" + +tests: + # --- Output field assignment: void skips assignment --- + + - name: "void skips output assignment (no prior value)" + mapping: | + root.x = if false { "hello" } + # FIXME-v1: V1 if-without-else evaluates to null; assigning null writes null (not skip) + skip: "V2-only: V1 if-without-else returns null and assigns it; no void-skip behaviour" + + - name: "void preserves prior output value" + mapping: | + root.status = "pending" + root.status = if false { "override" } + skip: "V2-only: V1 second assignment overwrites with null; prior value is lost" + + - name: "void from else-if chain without final else" + mapping: | + root.tier = "default" + 
root.tier = if false { "gold" } else if false { "silver" } + skip: "V2-only: V1 if-else-if without final else returns null and assigns it" + + - name: "void from non-exhaustive match" + mapping: | + root.sound = "unknown" + root.sound = match "bird" { + "cat" => "meow", + "dog" => "woof", + } + skip: "V2-only: V1 non-exhaustive match returns null and assigns it, not void-skip" + + # --- Variable declaration: runtime error --- + + - name: "void in variable declaration is runtime error (if)" + mapping: | + let x = if false { 42 } + skip: "V2-only: V1 binds $x = null with no error" + + - name: "void in variable declaration is runtime error (match)" + mapping: | + let x = match "nope" { + "a" => 1, + } + skip: "V2-only: V1 binds $x = null with no error" + + # --- Variable reassignment: void skips --- + + - name: "void skips variable reassignment" + mapping: | + let x = 10 + let x = if false { 42 } + root.result = $x + skip: "V2-only: V1 overwrites $x with null; no void-skip" + + - name: "void skips variable reassignment from match" + mapping: | + let x = "original" + let x = match "nope" { + "a" => "found", + } + root.result = $x + skip: "V2-only: V1 overwrites $x with null; no void-skip" + + # --- Collection literal: error --- + + - name: "void in array literal is error" + mapping: | + root.arr = [1, if false { 2 }, 3] + skip: "V2-only: V1 would include null in the array, not error" + + - name: "void in object literal is error" + mapping: | + root.obj = {"a": 1, "b": if false { 2 }} + skip: "V2-only: V1 would set b: null, not error" + + - name: "void in array from match is error" + mapping: | + root.arr = [match "x" { "y" => 1 }] + skip: "V2-only: V1 non-exhaustive match returns null; array would contain [null]" + + # --- Function/map argument: error --- + + - name: "void as map argument is error" + mapping: | + map double { root = this * 2 } + root.result = (if false { 42 }).apply("double") + # FIXME-v1: V1 maps take `this` as the receiver; void arg becomes null → 
null * 2 is a type error + skip: "V2-only: V1 map syntax differs (no argument list); void concept does not apply" + + # --- .or() rescues void --- + + - name: "or rescues void from if-without-else" + mapping: | + root.result = (if false { "hello" }).or("default") + # V1 if-without-else returns null; .or() rescues null → "default" + output: {"result": "default"} + + - name: "or rescues void from non-exhaustive match" + mapping: | + root.result = (match "bird" { "cat" => "meow" }).or("unknown") + # V1 non-exhaustive match returns null; .or() rescues → "unknown" + output: {"result": "unknown"} + + - name: "or does not trigger when value exists" + mapping: | + root.result = (if true { "hello" }).or("default") + output: {"result": "hello"} + + - name: "or short-circuits argument on non-void" + mapping: | + root.result = (if true { "hello" }).or(throw("should not run")) + output: {"result": "hello"} + + # --- .catch() passes void through unchanged --- + + - name: "catch does not trigger on void" + mapping: | + root.x = "prior" + root.x = (if false { 1 }).catch(0) + # FIXME-v1: V1 returns null (not void); .catch only handles errors, so returns null → overwrites "prior" + skip: "V2-only: V1 does not preserve prior value here; second assignment overwrites with null" + + - name: "catch passes void through then method errors" + mapping: | + root.result = (if false { 1 }).catch(0).string().catch("caught") + # FIXME-v1: V1 returns null; null.string() returns "null" → "null" (no catch triggers) + skip: "V2-only: V1 null.string() returns 'null' without erroring" + + # --- Method calls on void: error --- + + - name: "type on void is error" + mapping: | + root.t = (if false { 42 }).type() + # V1: null.type() returns "null" + skip: "V2-only: V1 null.type() returns 'null', no void error" + + - name: "string on void is error" + mapping: | + root.s = (if false { "hello" }).string() + # V1: null.string() returns "null" + skip: "V2-only: V1 null.string() returns 'null', no void error" + 
+ - name: "length on void is error" + mapping: | + root.l = (if false { [1, 2] }).length() + # V1: null.length() is a type error + skip: "V2-only error type mismatch with V1 (would error on type, not on void)" + + - name: "uppercase on void is error" + mapping: | + root.s = (if false { "hello" }).uppercase() + skip: "V2-only error type mismatch with V1 (would error on null, not on void)" + + # --- Expression operand: error --- + + - name: "void plus number is error" + mapping: | + root.result = (if false { 42 }) + 1 + skip: "V2-only: V1 null+number is a type error ('cannot add'), not a void error" + + - name: "number plus void is error" + mapping: | + root.result = 1 + (if false { 42 }) + skip: "V2-only: V1 number+null is a type error, not a void error" + + - name: "void in boolean negation is error" + mapping: | + root.result = !(if false { true }) + skip: "V2-only: V1 !null is a type error, not a void error" + + - name: "void equality comparison is error" + mapping: | + root.result = (if false { 42 }) == 42 + # FIXME-v1: V1 null == 42 returns false, not an error + skip: "V2-only: V1 null == 42 returns false without erroring" + + # --- Void rescued with or then used in variable declaration --- + + - name: "or rescues void for variable declaration" + mapping: | + let x = (if false { 42 }).or(0) + root.result = $x + output: {"result": 0} + + # --- Void vs deleted distinction --- + + - name: "void preserves prior value while deleted removes it" + mapping: | + root.a = "exists" + root.b = "exists" + root.a = if false { "override" } + root.b = deleted() + skip: "V2-only: V1 overwrites root.a with null (not preserved); only root.b=deleted() matches V2 semantics" diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/bare_ident_resolution.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/bare_ident_resolution.yaml new file mode 100644 index 000000000..f9d56b308 --- /dev/null +++ 
b/internal/bloblang2/migrator/v1spec/tests/variables/bare_ident_resolution.yaml @@ -0,0 +1,93 @@ +description: > + Bare identifier resolution — bare identifiers (without $ prefix) must NOT + resolve to variables. Variables require the $ prefix for both declaration + and reference. Bare identifiers resolve only to map parameters, lambda + parameters, match-as bindings, map names (in call/method-arg context), + and standard library functions (in call/method-arg context). + +tests: + # --- Bare identifier must not resolve to a $variable --- + + - name: "bare identifier does not resolve to variable of same name" + input: {} + mapping: | + let foo = "hello world" + root = foo + output: null # V1 resolves bare `foo` to `this.foo` (legacy); with no input.foo this yields null + + - name: "bare identifier in expression does not resolve to variable" + mapping: | + let x = 10 + root.v = x + 1 + error: "cannot add types null" + + - name: "bare identifier in method chain does not resolve to variable" + mapping: | + let name = "alice" + root.v = name.uppercase() + error: "expected string" # FIXME-v1: V1 treats `name` as `this.name`; null.uppercase() is a runtime type error + + - name: "bare identifier in array literal does not resolve to variable" + mapping: | + let val = 42 + root.v = [val] + output: {"v": [null]} # FIXME-v1: V1 bare `val` = `this.val`, yielding null; no error raised + + - name: "bare identifier in object value does not resolve to variable" + mapping: | + let val = 42 + root.v = {"key": val} + output: {"v": {"key": null}} # FIXME-v1: V1 bare `val` = `this.val`; yields null; no error + + - name: "bare identifier in if condition does not resolve to variable" + input: {} + mapping: | + let flag = true + root.v = if flag { "yes" } else { "no" } + output: {"v": "no"} # V1 treats bare `flag` as `this.flag` (null); null is accepted as falsy in if conditions + + - name: "bare identifier with $ prefix works correctly" + mapping: | + let foo = "hello world" + root = $foo 
+ output: "hello world" + + # --- Bare identifiers that ARE valid (parameters, match-as bindings) --- + + - name: "bare identifier as lambda parameter is valid" + mapping: | + root.v = [1, 2, 3].map_each(x -> x * 2) + output: {"v": [2, 4, 6]} + + - name: "bare identifier as match-as binding is valid" + skip: "V1 match has no `as` binding clause; use `.(val -> ...)` or a let binding instead" + mapping: | + root.v = match 42 as val { + val > 0 => "positive", + _ => "other", + } + output: {"v": "positive"} + + - name: "bare identifier as map parameter is valid" + skip: "V1 maps do not take parameters; receiver is `this`" + mapping: | + map double(x) { x * 2 } + root.v = double(21) + output: {"v": 42} + + # --- Variable with same name as parameter does not leak through bare ident --- + + - name: "variable does not shadow lambda parameter via bare ident" + mapping: | + let x = 999 + root.v = [1, 2, 3].map_each(x -> x * 2) + output: {"v": [2, 4, 6]} + + - name: "bare ident after lambda still requires $ for variable" + mapping: | + let x = 10 + root.items = [1, 2].map_each(x -> x + 1) + root.v = $x + output: + items: [2, 3] + v: 10 diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/copy_on_write.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/copy_on_write.yaml new file mode 100644 index 000000000..b3270324a --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/copy_on_write.yaml @@ -0,0 +1,203 @@ +description: "Copy-on-write semantics: independence from input, from output, between variables, nested mutation independence" + +tests: + # --- Independence from input --- + + - name: "variable copy from input is independent" + skip: "V1 does not support variable path assignment ($data.name = ...); must re-bind via `let data = $data.merge(...)`" + input: {"user": {"name": "Alice", "age": 30}} + mapping: | + let data = this.user + let data = $data.merge({"name": "Bob"}) + root.var_name = $data.name + root.input_name = this.user.name + 
output: {"var_name": "Bob", "input_name": "Alice"} + + - name: "variable copy from input nested field is independent" + skip: "V1 does not support variable path assignment" + input: {"config": {"settings": {"theme": "dark", "lang": "en"}}} + mapping: | + let settings = this.config.settings + let settings = $settings.merge({"theme": "light"}) + root.var_theme = $settings.theme + root.input_theme = this.config.settings.theme + output: {"var_theme": "light", "input_theme": "dark"} + + - name: "variable copy from input array is independent" + skip: "V1 does not support variable path/index assignment; use .map_each with enumerated to rebuild" + input: {"items": [1, 2, 3]} + mapping: | + let arr = this.items + let arr = $arr.enumerated().map_each(e -> if e.index == 0 { 99 } else { e.value }) + root.var_first = $arr.0 + root.input_first = this.items.0 + output: {"var_first": 99, "input_first": 1} + + - name: "variable copy from entire input is independent" + skip: "V1 does not support variable path assignment" + input: {"a": 1, "b": 2} + mapping: | + let copy = this + let copy = $copy.merge({"a": 100}) + root.var_a = $copy.a + root.input_a = this.a + output: {"var_a": 100, "input_a": 1} + + # --- Independence from output --- + + - name: "variable snapshot of output is independent from later output changes" + skip: "V1 `root` is not readable mid-mapping for snapshotting with COW semantics identical to V2; V1 reads `root` as the partial output but later path writes mutate it. Behaviour is not equivalent." 
+ mapping: | + root.user.name = "Alice" + let snap = root.user + root.user.name = "Bob" + root.snap_name = $snap.name + output: {"user": {"name": "Bob"}, "snap_name": "Alice"} + + - name: "mutating variable snapshot does not affect output" + skip: "V1 does not support variable path assignment" + mapping: | + root.data = {"x": 1, "y": 2} + let snap = root.data + let snap = $snap.merge({"x": 99}) + root.snap_x = $snap.x + root.original_x = root.data.x + output: {"data": {"x": 1, "y": 2}, "snap_x": 99, "original_x": 1} + + - name: "variable snapshot of output array is independent" + skip: "V1 does not support output index assignment like `root.items[0] = 99` with V2's COW guarantee alongside a variable snapshot" + mapping: | + root.items = [10, 20, 30] + let snap = root.items + root.items.0 = 99 + root.snap_first = $snap.0 + output: {"items": [99, 20, 30], "snap_first": 10} + + # --- Independence between variables --- + + - name: "copy between variables is independent" + skip: "V1 does not support variable path assignment" + mapping: | + let a = {"x": 1} + let b = $a + let b = $b.merge({"x": 2}) + root.a = $a.x + root.b = $b.x + output: {"a": 1, "b": 2} + + - name: "copy between variables reverse mutation" + skip: "V1 does not support variable path assignment" + mapping: | + let a = {"x": 1} + let b = $a + let a = $a.merge({"x": 99}) + root.a = $a.x + root.b = $b.x + output: {"a": 99, "b": 1} + + - name: "multiple copies from same source are independent" + skip: "V1 does not support variable path assignment" + mapping: | + let source = {"val": 0} + let copy1 = $source + let copy2 = $source + let copy1 = $copy1.merge({"val": 1}) + let copy2 = $copy2.merge({"val": 2}) + root.source = $source.val + root.c1 = $copy1.val + root.c2 = $copy2.val + output: {"source": 0, "c1": 1, "c2": 2} + + - name: "chain of copies are all independent" + skip: "V1 does not support variable path assignment" + mapping: | + let a = {"v": "a"} + let b = $a + let b = $b.merge({"v": "b"}) + let c 
= $b + let c = $c.merge({"v": "c"}) + root.a = $a.v + root.b = $b.v + root.c = $c.v + output: {"a": "a", "b": "b", "c": "c"} + + - name: "array copy between variables is independent" + skip: "V1 does not support variable index assignment" + mapping: | + let a = [1, 2, 3] + let b = $a + let b = $b.enumerated().map_each(e -> if e.index == 0 { 99 } else { e.value }) + root.a = $a + root.b = $b + output: {"a": [1, 2, 3], "b": [99, 2, 3]} + + # --- Nested mutation independence --- + + - name: "nested object mutation independent from input" + skip: "V1 does not support nested variable path assignment" + input: {"record": {"address": {"city": "London", "zip": "SW1"}}} + mapping: | + let rec = this.record + let rec = $rec.merge({"address": $rec.address.merge({"city": "Paris"})}) + root.var_city = $rec.address.city + root.input_city = this.record.address.city + output: {"var_city": "Paris", "input_city": "London"} + + - name: "nested array mutation independent between variables" + skip: "V1 does not support nested variable path/index assignment" + mapping: | + let a = {"items": [1, 2, 3]} + let b = $a + let b = $b.merge({"items": $b.items.enumerated().map_each(e -> if e.index == 0 { 99 } else { e.value })}) + root.a_first = $a.items.0 + root.b_first = $b.items.0 + output: {"a_first": 1, "b_first": 99} + + - name: "deeply nested mutation independent between variables" + skip: "V1 does not support deeply nested variable path assignment" + mapping: | + let a = {"level1": {"level2": {"level3": "original"}}} + let b = $a + let b = $b.merge({"level1": $b.level1.merge({"level2": $b.level1.level2.merge({"level3": "modified"})})}) + root.a_val = $a.level1.level2.level3 + root.b_val = $b.level1.level2.level3 + output: {"a_val": "original", "b_val": "modified"} + + - name: "nested array in object mutation independent from output" + skip: "V1 does not support bracket-index assignment on output; path form `root.data.tags.0` can work but the V2 semantics of variable snapshot COW around 
output mutation does not translate cleanly" + mapping: | + root.data = {"tags": ["a", "b", "c"]} + let snap = root.data + root.data.tags.0 = "z" + root.snap_tag = $snap.tags.0 + output: {"data": {"tags": ["z", "b", "c"]}, "snap_tag": "a"} + + - name: "adding nested field to copy does not affect source" + skip: "V1 does not support nested variable path assignment" + mapping: | + let a = {"user": {"name": "Alice"}} + let b = $a + let b = $b.merge({"user": $b.user.merge({"email": "alice@example.com"})}) + root.a_email = $a.user.email + root.b_email = $b.user.email + output: {"a_email": null, "b_email": "alice@example.com"} + + - name: "deleting nested field in copy does not affect source" + skip: "V1 does not support variable path assignment (including deleted())" + mapping: | + let a = {"user": {"name": "Alice", "age": 30}} + let b = $a + let b = $b.merge({"user": $b.user.without("age")}) + root.a = $a.user + root.b = $b.user + output: {"a": {"name": "Alice", "age": 30}, "b": {"name": "Alice"}} + + # --- Independence with whole-document copies --- + + - name: "output = input then mutate output leaves input unchanged" + input: {"name": "Alice", "score": 100} + mapping: | + root = this + root.name = "Bob" + root.original_name = this.name + output: {"name": "Bob", "score": 100, "original_name": "Alice"} diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/declaration.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/declaration.yaml new file mode 100644 index 000000000..298821ae0 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/declaration.yaml @@ -0,0 +1,156 @@ +description: "Variable declaration: basic usage, types, use-before-declare errors, void and deleted errors" + +tests: + # --- Basic declaration and use --- + + - name: "declare and use integer variable" + mapping: | + let x = 42 + root.v = $x + output: {"v": 42} + + - name: "declare and use string variable" + mapping: | + let name = "hello" + root.v = $name + output: 
{"v": "hello"} + + - name: "declare and use boolean variable" + mapping: | + let flag = true + root.v = $flag + output: {"v": true} + + - name: "declare and use null variable" + mapping: | + let val = null + root.v = $val + output: {"v": null} + + - name: "declare and use float variable" + mapping: | + let pi = 3.14 + root.v = $pi + output: {"v": 3.14} + + - name: "declare and use array variable" + mapping: | + let arr = [1, 2, 3] + root.v = $arr + output: {"v": [1, 2, 3]} + + - name: "declare and use object variable" + mapping: | + let obj = {"a": 1, "b": 2} + root.v = $obj + output: {"v": {"a": 1, "b": 2}} + + - name: "declare variable from expression" + mapping: | + let x = 10 + 5 + root.v = $x + output: {"v": 15} + + - name: "declare variable from input field" + input: {"name": "Alice"} + mapping: | + let n = this.name + root.v = $n + output: {"v": "Alice"} + + - name: "declare variable from other variable" + mapping: | + let a = 100 + let b = $a + root.v = $b + output: {"v": 100} + + - name: "multiple independent variables" + mapping: | + let x = 1 + let y = 2 + let z = 3 + root.sum = $x + $y + $z + output: {"sum": 6} + + - name: "variable used in expression" + mapping: | + let base = 10 + root.v = $base * 2 + 1 + output: {"v": 21} + + # --- Use before declare is compile error --- + + - name: "use undeclared variable is compile error" + mapping: | + root.v = $x + error: "undefined" # FIXME-v1: verify — V1 raises a runtime error ("variable 'x' undefined") rather than a compile error + + - name: "use variable before its declaration is compile error" + mapping: | + root.v = $x + let x = 42 + error: "undefined" # FIXME-v1: verify — V1 evaluates let eagerly; read before let is a runtime undefined error + + - name: "reference undeclared variable in expression is compile error" + mapping: | + let y = $x + 1 + error: "undefined" # FIXME-v1: verify — V1 produces a runtime "variable 'x' undefined" error + + # --- Void in declaration is runtime error --- + + - name: 
"void from if-without-else in declaration is runtime error" + skip: "V1 if-expression without matching branch yields null, not a void/error; let binding accepts null" + mapping: | + let x = if false { 42 } + error: "void" # FIXME-v1: V1 produces null here, so no error is raised + + - name: "void from non-exhaustive match in declaration is runtime error" + skip: "V1 match with no matching arm yields null, not an error" + mapping: | + let x = match "nope" { + "a" => 1, + } + error: "void" # FIXME-v1: V1 produces null + + - name: "void from else-if chain without final else in declaration is runtime error" + skip: "V1 if-chain without matching branch yields null, not an error" + input: {"score": 10} + mapping: | + let tier = if false { "gold" } else if false { "silver" } + error: "void" # FIXME-v1: V1 produces null + + # --- Deleted in declaration is runtime error --- + + - name: "deleted in variable declaration is runtime error" + skip: "V1 treats `let x = deleted()` as a DELETE of the binding (no error); subsequent $x reads fail" + mapping: | + let x = deleted() + error: "deleted" # FIXME-v1: V1 silently deletes the var; no error at the let + + # --- Void rescued with or is ok --- + + - name: "or rescues void for variable declaration" + mapping: | + let x = (if false { 42 }).or(0) + root.v = $x + output: {"v": 0} + + - name: "if-else provides value for variable declaration" + mapping: | + let x = if false { 42 } else { 0 } + root.v = $x + output: {"v": 0} + + # --- Variable holds bytes --- + + - name: "declare variable with bytes value" + mapping: | + let b = "hello".bytes() + root.v = $b + output: {"v": {_type: "bytes", value: "aGVsbG8="}} + + # --- Variable holds timestamp --- + + - name: "declare variable with timestamp value" + skip: "migrator V1 core env lacks ts_parse (lives in internal/impl/pure, not loaded by the test harness)" diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/dynamic_assignment.yaml 
b/internal/bloblang2/migrator/v1spec/tests/variables/dynamic_assignment.yaml new file mode 100644 index 000000000..42c1fd237 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/dynamic_assignment.yaml @@ -0,0 +1,80 @@ +description: > + Dynamic index assignments — assigning to output or variable paths using + computed indices from variables or expressions. + +tests: + # --- Variable as object key --- + + - name: "output with variable string key" + mapping: | + let key = "name" + root.data = {($key): "Alice"} + output: {"data": {"name": "Alice"}} + + - name: "output with variable integer index" + skip: "V1 has no bracket-index assignment; dynamic integer indices into arrays cannot be expressed as a writable target" + mapping: | + let idx = 0 + root.arr = [].enumerated().map_each(_ -> null) + output: {"arr": ["first"]} + + - name: "output with multiple variable keys" + mapping: | + let k1 = "a" + let k2 = "b" + root = {($k1): 1, ($k2): 2} + output: {"a": 1, "b": 2} + + # --- Nested dynamic paths --- + + - name: "nested dynamic keys" + mapping: | + let outer = "data" + let inner = "name" + root = {($outer): {($inner): "Alice"}} + output: {"data": {"name": "Alice"}} + + # --- Dynamic assignment on variables --- + + - name: "variable path with dynamic key" + skip: "V1 does not support variable path assignment; must rebuild with .merge()" + mapping: | + let obj = {"a": 1, "b": 2} + let key = "a" + let obj = $obj.merge({($key): 99}) + root = $obj + output: {"a": 99, "b": 2} + + - name: "variable path with dynamic index" + skip: "V1 has no bracket-index assignment for variables" + mapping: | + let arr = [10, 20, 30] + let idx = 1 + let arr = $arr.enumerated().map_each(e -> if e.index == $idx { 99 } else { e.value }) + root = $arr + output: [10, 99, 30] + + # --- Computed expressions as keys --- + + - name: "computed string key from concatenation" + mapping: | + let prefix = "key" + root = {($prefix + "_1"): "val"} + output: {"key_1": "val"} + + - name: 
"computed key from method result" + mapping: | + let name = "Hello" + root = {($name.lowercase()): true} + output: {"hello": true} + + # --- Deleted with dynamic keys --- + + - name: "delete object field with dynamic key" + skip: "V1 does not support variable path assignment with dynamic key" + mapping: | + let obj = {"a": 1, "b": 2, "c": 3} + let key = "b" + let obj = $obj.without($key) + root = $obj + output: {"a": 1, "c": 3} diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/expr_body_path_assign.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/expr_body_path_assign.yaml new file mode 100644 index 000000000..d263027cc --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/expr_body_path_assign.yaml @@ -0,0 +1,109 @@ +description: > + Path assignments inside expression bodies — variable path mutations + like $obj[$key] = val in if-expressions, match-expressions, lambda + blocks, and nested combinations. These must resolve to the existing + variable slot, not shadow it. 
+ +tests: + # --- Path assignment in if-expression body --- + + - name: "path assign with dynamic key in if-expression" + skip: "V1 does not support variable path assignment; V1 if-expression body is a single expression, not a statement list" + mapping: | + let key = "x" + root.v = if true { + let obj = {} + {($key): 42} + } else { {} } + output: {"v": {"x": 42}} + + - name: "path assign in else branch reads correct slot" + skip: "V1 if-expression body is a single expression; cannot declare + mutate within" + mapping: | + root.v = if false { + "never" + } else { + {"base": true, "extra": "added"} + } + output: {"v": {"base": true, "extra": "added"}} + + - name: "path assign in if-expression both branches" + skip: "V1 if-expression body is a single expression; cannot chain var declare + mutate" + mapping: | + let flag = true + root.v = if $flag { + {"branch": "then", "tag": "tagged"} + } else { + {"branch": "else", "tag": "tagged"} + } + output: {"v": {"branch": "then", "tag": "tagged"}} + + # --- Path assignment in lambda block body --- + + - name: "path assign in map lambda block" + skip: "V1 lambda body is a single expression; no multi-statement blocks" + mapping: | + root.v = [1, 2, 3].map_each(x -> {"val": x, "doubled": x * 2}) + output: {"v": [{"val": 1, "doubled": 2}, {"val": 2, "doubled": 4}, {"val": 3, "doubled": 6}]} + + - name: "path assign with dynamic key in map lambda" + skip: "V1 lambda body is a single expression" + mapping: | + let keys = ["a", "b", "c"] + root.v = [0, 1, 2].map_each(i -> {($keys.index(i)): i * 10}) + output: {"v": [{"a": 0}, {"b": 10}, {"c": 20}]} + + - name: "path assign in fold accumulator" + skip: "V1 fold lambda `tally -> value -> expr` is a single expression; use .merge on accumulator" + mapping: | + root.v = ["x", "y", "z"].fold({}, tally -> value -> $tally.merge({($value.value): true})) + output: {"v": {"x": true, "y": true, "z": true}} + + - name: "path assign in fold with index tracking" + skip: "V1 fold lambda is a single 
expression; rebuild via .merge" + mapping: | + root.v = ["a", "b", "c"].enumerated().fold({}, tally -> value -> $tally.merge({($value.value.value): $value.value.index})) + output: {"v": {"a": 0, "b": 1, "c": 2}} + + # --- Nested path assignments across lambda + if --- + + - name: "path assign via statement if inside lambda" + skip: "V1 lambda body is a single expression; cannot embed statement-mode if" + mapping: | + root.v = [1, -2, 3, -4].map_each(x -> {"value": x, "sign": if x > 0 { "positive" } else { "negative" }}) + output: + v: + - value: 1 + sign: "positive" + - value: -2 + sign: "negative" + - value: 3 + sign: "positive" + - value: -4 + sign: "negative" + + - name: "path assign in nested lambdas" + skip: "V1 lambda body is a single expression" + mapping: | + root.v = [[1, 2], [3, 4]].map_each(row -> {"items": row.map_each(x -> x * 10)}) + output: {"v": [{"items": [10, 20]}, {"items": [30, 40]}]} + + # --- Multiple path assigns in lambda block --- + + - name: "multiple path assigns in lambda block accumulate" + skip: "V1 lambda body is a single expression; no multi-statement blocks" + mapping: | + root.v = [1].map_each(x -> {"a": 1, "b": 2, "c": 3}) + output: {"v": [{"a": 1, "b": 2, "c": 3}]} + + - name: "path assign to array index in lambda block" + skip: "V1 lambda body is a single expression; no sequential index assignments" + mapping: | + root.v = [1].map_each(x -> [10, 20, 30]) + output: {"v": [[10, 20, 30]]} + + - name: "path assign then read modified field in lambda" + skip: "V1 lambda body is a single expression" + mapping: | + root.v = [1].map_each(x -> 5 + 10) + output: {"v": [15]} diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/nested_scope_mutations.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/nested_scope_mutations.yaml new file mode 100644 index 000000000..39d9b383b --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/nested_scope_mutations.yaml @@ -0,0 +1,164 @@ +description: > + Variable 
mutation patterns across nested scopes — statement-mode + write-through in nested if/match, expression-mode shadowing, + and interactions between the two contexts. + +tests: + # --- Statement-mode nested if: both levels modify outer --- + + - name: "nested if-statements both modify outer variable" + mapping: | + let x = 0 + if true { + let x = 1 + if true { + let x = 2 + } + } + root.v = $x + output: {"v": 2} + + - name: "nested if-statement inner false — outer modified only" + mapping: | + let x = 0 + if true { + let x = 1 + if false { + let x = 2 + } + } + root.v = $x + output: {"v": 1} + + - name: "match-statement arms modify outer variable" + skip: "V1 has no statement-mode match; match is expression-only, arms are single expressions not blocks" + mapping: | + let result = "init" + match "b" { + "a" => "alpha", + "b" => "beta", + _ => "other", + } + root.v = $result + output: {"v": "beta"} + + - name: "nested match inside if modifies outer" + skip: "V1 has no statement-mode match with `{ let val = ... }` arm blocks" + mapping: | + let val = 0 + if true { + let val = match "x" { + "x" => 42, + _ => -1, + } + } + root.v = $val + output: {"v": 42} + + # --- Expression-mode shadowing does not affect outer --- + + - name: "if-expression shadows outer variable" + skip: "V1 has no block scope for let — `let x = \"inner\"` inside an if-expression mutates the outer binding" + mapping: | + let x = "outer" + root.inner = if true { + let x = "inner" + $x + } else { "nope" } + root.outer = $x + output: {"inner": "inner", "outer": "outer"} + + - name: "match-expression shadows outer variable" + skip: "V1 match arm body is a single expression; cannot declare + return" + mapping: | + let x = "outer" + root.matched = match "go" { + "go" => "matched", + _ => "nope", + } + root.outer = $x + output: {"matched": "matched", "outer": "outer"} + + - name: "lambda shadows outer variable" + skip: "V1 lambda body is a single expression; cannot declare + return. 
Also V1 has no block scope for let." + mapping: | + let x = 100 + root.mapped = [1, 2, 3].map_each(x -> x * 10) + root.outer = $x + output: {"mapped": [10, 20, 30], "outer": 100} + + # --- Statement modifies then expression reads --- + + - name: "statement modifies variable, expression reads it" + skip: "V1 does not support variable path assignment; use let to rebuild" + mapping: | + let data = {"status": "pending"} + if true { + let data = $data.merge({"status": "done"}) + } + root.v = if true { $data.status } else { "unknown" } + output: {"v": "done"} + + # --- Expression body variable not visible outside --- + + - name: "variable declared in if-expression not visible outside" + skip: "V1 has no block scope for let; `let temp = 42` inside an if is visible outside" + mapping: | + root.v = if true { + let temp = 42 + $temp + } else { 0 } + root.leaked = $temp + compile_error: "undeclared" # FIXME-v1: V1 has no scope boundary — no error + + - name: "variable declared in match-expression not visible outside" + skip: "V1 match arm body is a single expression and has no let scope anyway; also no scope isolation" + mapping: | + root.v = match "a" { + "a" => "hello", + _ => "nope", + } + root.leaked = $inner + compile_error: "undeclared" # FIXME-v1 + + - name: "variable declared in lambda not visible outside" + skip: "V1 lambda body is a single expression; cannot declare within it" + mapping: | + root.v = [1].map_each(x -> x * 10) + root.leaked = $inner + compile_error: "undeclared" # FIXME-v1 + + # --- Path assign in statement mode modifies outer --- + + - name: "path assign in if-statement modifies outer object" + skip: "V1 does not support variable path assignment; must rebuild via .merge" + mapping: | + let config = {"debug": false, "verbose": false} + if true { + let config = $config.merge({"debug": true}) + } + root.v = $config + output: {"v": {"debug": true, "verbose": false}} + + - name: "path assign in match-statement modifies outer object" + skip: "V1 has no 
statement-mode match with block arms, and no variable path assignment" + mapping: | + let config = {"level": "info"} + match "debug" { + "debug" => { let config = $config.merge({"level": "debug"}) }, + "trace" => { let config = $config.merge({"level": "trace"}) }, + } + root.v = $config + output: {"v": {"level": "debug"}} + + # --- Complex nested: statement with expression inside --- + + - name: "if-statement body uses expression that shadows" + skip: "V1 lambda body is a single expression; cannot declare within" + mapping: | + let items = [] + if true { + let items = [1, 2, 3].map_each(x -> x * 2) + } + root.v = $items + output: {"v": [2, 4, 6]} diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/path_assignment.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/path_assignment.yaml new file mode 100644 index 000000000..d189092e3 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/path_assignment.yaml @@ -0,0 +1,189 @@ +description: "Variable path assignment: field mutation, index mutation, auto-creation, gap filling, deleted removal, wrong intermediate type errors" + +tests: + # --- Field assignment --- + + - name: "assign field on object variable" + skip: "V1 does not support variable path assignment; must rebuild via .merge" + mapping: | + let obj = {"name": "Alice"} + let obj = $obj.merge({"name": "Bob"}) + root.v = $obj + output: {"v": {"name": "Bob"}} + + - name: "add new field to object variable" + skip: "V1 does not support variable path assignment" + mapping: | + let obj = {"a": 1} + let obj = $obj.merge({"b": 2}) + root.v = $obj + output: {"v": {"a": 1, "b": 2}} + + - name: "assign nested field on object variable" + skip: "V1 does not support nested variable path assignment" + mapping: | + let obj = {"user": {"name": "Alice"}} + let obj = $obj.merge({"user": $obj.user.merge({"name": "Bob"})}) + root.v = $obj + output: {"v": {"user": {"name": "Bob"}}} + + - name: "auto-create intermediate object for field assignment" 
+ skip: "V1 has no auto-creation semantics for variables; value must be explicitly built" + mapping: | + let obj = {"user": {"name": "Alice"}} + root.v = $obj + output: {"v": {"user": {"name": "Alice"}}} + + - name: "deeply nested auto-creation" + skip: "V1 has no auto-creation semantics for variables" + mapping: | + let obj = {"a": {"b": {"c": {"d": 42}}}} + root.v = $obj + output: {"v": {"a": {"b": {"c": {"d": 42}}}}} + + # --- Index assignment --- + + - name: "assign index on array variable" + skip: "V1 does not support variable index assignment" + mapping: | + let arr = [10, 20, 30] + let arr = $arr.enumerated().map_each(e -> if e.index == 1 { 99 } else { e.value }) + root.v = $arr + output: {"v": [10, 99, 30]} + + - name: "assign to end of array" + skip: "V1 does not support variable index assignment; use .append" + mapping: | + let arr = [1, 2, 3] + let arr = $arr.append(4) + root.v = $arr + output: {"v": [1, 2, 3, 4]} + + - name: "gaps filled with null" + skip: "V1 has no gap-filling semantics" + mapping: | + let arr = [1, null, null, 99] + root.v = $arr + output: {"v": [1, null, null, 99]} + + - name: "assign to index zero of empty array" + skip: "V1 does not support variable index assignment" + mapping: | + let arr = ["first"] + root.v = $arr + output: {"v": ["first"]} + + - name: "negative index assignment" + skip: "V1 does not support variable index assignment, and path-form `.-1` is a parse error" + mapping: | + let arr = [10, 20, 99] + root.v = $arr + output: {"v": [10, 20, 99]} + + # --- Auto-creation based on index type --- + + - name: "auto-create array for numeric index on empty object field" + skip: "V1 has no auto-creation semantics for variables" + mapping: | + let obj = {"items": ["first"]} + root.v = $obj + output: {"v": {"items": ["first"]}} + + - name: "auto-create object for string field on empty object field" + skip: "V1 has no auto-creation semantics for variables" + mapping: | + let obj = {"nested": {"key": "value"}} + root.v = $obj +
output: {"v": {"nested": {"key": "value"}}} + + # --- Deleted removes field --- + + - name: "deleted removes field from variable object" + skip: "V1 does not support variable path assignment (including deleted()); use .without" + mapping: | + let obj = {"a": 1, "b": 2, "c": 3} + let obj = $obj.without("b") + root.v = $obj + output: {"v": {"a": 1, "c": 3}} + + - name: "deleted removes nested field from variable object" + skip: "V1 does not support nested variable path assignment" + mapping: | + let obj = {"user": {"name": "Alice", "age": 30}} + let obj = $obj.merge({"user": $obj.user.without("age")}) + root.v = $obj + output: {"v": {"user": {"name": "Alice"}}} + + # --- Deleted removes array element (shifts remaining) --- + + - name: "deleted removes array element and shifts" + skip: "V1 does not support variable index assignment" + mapping: | + let arr = [10, 20, 30, 40] + let arr = $arr.enumerated().filter(e -> e.index != 1).map_each(e -> e.value) + root.v = $arr + output: {"v": [10, 30, 40]} + + - name: "deleted removes first array element" + skip: "V1 does not support variable index assignment" + mapping: | + let arr = [10, 20, 30] + let arr = $arr.slice(1) + root.v = $arr + output: {"v": [20, 30]} + + - name: "deleted removes last array element" + skip: "V1 does not support variable index assignment" + mapping: | + let arr = [10, 20, 30] + let arr = $arr.slice(0, $arr.length() - 1) + root.v = $arr + output: {"v": [10, 20]} + + # --- Wrong intermediate type errors --- + + - name: "field assignment on string variable is error" + skip: "V1 does not support variable path assignment; no such error is raised" + mapping: | + let val = "hello" + error: "field" # FIXME-v1: test is inapplicable in V1 + + - name: "field assignment on integer variable is error" + skip: "V1 does not support variable path assignment" + mapping: | + let val = 42 + error: "field" # FIXME-v1 + + - name: "field assignment on boolean variable is error" + skip: "V1 does not support variable path 
assignment" + mapping: | + let val = true + error: "field" # FIXME-v1 + + - name: "index assignment on string variable is error" + skip: "V1 does not support variable index assignment" + mapping: | + let val = "hello" + error: "index" # FIXME-v1 + + - name: "nested path wrong intermediate type is error" + skip: "V1 does not support variable path assignment" + mapping: | + let obj = {"name": "Alice"} + error: "field" # FIXME-v1 + + # --- Multiple path assignments --- + + - name: "multiple field assignments build up object" + skip: "V1 does not support variable path assignment; must build via a single object literal or chained merges" + mapping: | + let record = {"name": "Alice", "age": 30, "active": true} + root.v = $record + output: {"v": {"name": "Alice", "age": 30, "active": true}} + + - name: "mixed field and index assignments" + skip: "V1 does not support variable path/index assignment" + mapping: | + let data = {"scores": [90, 85], "name": "Alice"} + root.v = $data + output: {"v": {"scores": [90, 85], "name": "Alice"}} diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/reassignment.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/reassignment.yaml new file mode 100644 index 000000000..68e8ee0e5 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/reassignment.yaml @@ -0,0 +1,208 @@ +description: "Variable reassignment: same-scope mutation, statement context outer modification, block-scoped new vars, pre-declare pattern" + +tests: + # --- Same-scope reassignment (mutation) --- + + - name: "reassign variable in same scope" + mapping: | + let x = 1 + let x = 2 + root.v = $x + output: {"v": 2} + + - name: "reassign variable multiple times" + mapping: | + let x = 1 + let x = 2 + let x = 3 + let x = 4 + root.v = $x + output: {"v": 4} + + - name: "reassign variable to different type" + mapping: | + let x = 42 + let x = "hello" + root.v = $x + output: {"v": "hello"} + + - name: "reassign variable using its own value" + mapping: 
| + let x = 10 + let x = $x + 5 + root.v = $x + output: {"v": 15} + + - name: "reassign variable accumulation" + mapping: | + let sum = 0 + let sum = $sum + 1 + let sum = $sum + 2 + let sum = $sum + 3 + root.v = $sum + output: {"v": 6} + + - name: "void skips variable reassignment" + skip: "V1 has no void concept; `if false { 42 }` yields null, and `let x = null` re-binds x to null (does not skip)" + mapping: | + let x = 10 + let x = if false { 42 } + root.v = $x + output: {"v": 10} # FIXME-v1: V1 would produce null, not 10 + + - name: "void from match skips variable reassignment" + skip: "V1 has no void concept; unmatched match yields null, which re-binds the variable" + mapping: | + let x = "original" + let x = match "nope" { + "a" => "found", + } + root.v = $x + output: {"v": "original"} # FIXME-v1: V1 would produce null + + # --- Statement context: if-statement modifies outer variable --- + + - name: "if-statement outer variable modification" + mapping: | + let value = 10 + if this.flag { + let value = 20 + } + root.v = $value + cases: + - name: "modifies when true" + input: {"flag": true} + output: {"v": 20} + - name: "unchanged when false" + input: {"flag": false} + output: {"v": 10} + + - name: "if-else statement modifies outer variable in else branch" + input: {"flag": false} + mapping: | + let value = "initial" + if this.flag { + let value = "from-if" + } else { + let value = "from-else" + } + root.v = $value + output: {"v": "from-else"} + + - name: "match statement outer variable modification" + skip: "V1 has no statement-mode match with `{ let result = ... 
}` arm blocks" + mapping: | + let result = "none" + match this.kind { + "a" => { let result = "found-a" }, + "b" => { let result = "found-b" }, + } + root.v = $result + cases: + - name: "modifies on match" + input: {"kind": "b"} + output: {"v": "found-b"} + - name: "unchanged on no match" + input: {"kind": "c"} + output: {"v": "none"} + + # --- New variables in statement blocks are block-scoped --- + + - name: "new variable in if-statement block not visible outside" + skip: "V1 has no block scope for let" + input: {"flag": true} + mapping: | + if this.flag { + let local = "hello" + root.inner = $local + } + root.outer = $local + compile_error: "undeclared" # FIXME-v1: V1 has no block scope, so $local is visible and no error + + - name: "new variable in match statement block not visible outside" + skip: "V1 has no statement-mode match, and no block scope for let" + mapping: | + match "a" { + "a" => { + let local = "found" + root.inner = $local + }, + } + root.outer = $local + compile_error: "undeclared" # FIXME-v1 + + - name: "new variable in else block not visible outside" + skip: "V1 has no block scope for let" + input: {"flag": false} + mapping: | + if this.flag { + root.x = 1 + } else { + let temp = "hello" + root.y = $temp + } + root.z = $temp + compile_error: "undeclared" # FIXME-v1: V1 has no block scope + + # --- Pre-declare pattern --- + + - name: "pre-declare pattern with if-statement" + mapping: | + let temp = null + if this.flag { + let temp = "found" + } + root.v = $temp + cases: + - name: "true branch assigns" + input: {"flag": true} + output: {"v": "found"} + - name: "false branch keeps null" + input: {"flag": false} + output: {"v": null} + + - name: "pre-declare pattern with match statement" + skip: "V1 has no statement-mode match with block arms" + input: {"kind": "gold"} + mapping: | + let discount = 0 + match this.kind { + "gold" => { let discount = 20 }, + "silver" => { let discount = 10 }, + } + root.v = $discount + output: {"v": 20} + + - name: 
"pre-declare pattern with nested if-statements" + input: {"a": true, "b": true} + mapping: | + let msg = "none" + if this.a { + let msg = "a" + if this.b { + let msg = "a+b" + } + } + root.v = $msg + output: {"v": "a+b"} + + - name: "pre-declare pattern modification and further use" + input: {"items": ["x", "y", "z"]} + mapping: | + let count = 0 + let count = this.items.length() + root.v = $count + output: {"v": 3} + + # --- Statement context reassignment uses new value after block --- + + - name: "outer variable modified in statement reflects after block" + input: {"score": 95} + mapping: | + let tier = "bronze" + if this.score >= 90 { + let tier = "gold" + } + root.tier = $tier + root.check = $tier == "gold" + output: {"tier": "gold", "check": true} diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/scope_boundaries.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/scope_boundaries.yaml new file mode 100644 index 000000000..3756c3076 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/scope_boundaries.yaml @@ -0,0 +1,140 @@ +description: > + Variable scope boundary semantics — variables crossing if/match/lambda + boundaries, statement-mode write-through, and expression-mode shadowing. + +tests: + # --- Statement-mode write-through --- + + - name: "if-statement branch modifies outer variable" + mapping: | + let x = 1 + if true { + let x = 99 + } + root.v = $x + output: {"v": 99} + + - name: "if-statement false branch does not execute" + mapping: | + let x = 1 + if false { + let x = 99 + } + root.v = $x + output: {"v": 1} + + - name: "nested if-statements both modify outer" + mapping: | + let x = 0 + if true { + let x = 1 + if true { + let x = 2 + } + } + root.v = $x + output: {"v": 2} + + - name: "match-statement arm modifies outer variable" + skip: "V1 has no statement-mode match with `{ let x = ... 
}` arm blocks" + mapping: | + let x = 0 + match "a" { + "a" => { let x = 42 } + } + root.v = $x + output: {"v": 42} + + # --- Block-scoped variables --- + + - name: "variable declared in if-branch not visible outside" + skip: "V1 has no block scope for let" + mapping: | + if true { + let inner = 10 + } + root.v = $inner + compile_error: "undeclared" # FIXME-v1: V1 has no scope boundary — no error + + - name: "variable declared in match-arm not visible outside" + skip: "V1 has no statement-mode match with block arms, and no block scope" + mapping: | + match "a" { + "a" => { let inner = 10 } + } + root.v = $inner + compile_error: "undeclared" # FIXME-v1 + + # --- Expression-mode shadowing --- + + - name: "if-expression shadows outer variable" + skip: "V1 if-expression body is a single expression; cannot declare + return. Also V1 has no block scope." + mapping: | + let x = "outer" + let result = if true { + let x = "shadow" + $x + } + root.outer = $x + root.result = $result + output: + outer: "outer" + result: "shadow" + + - name: "match-expression shadows outer variable" + skip: "V1 match arm body is a single expression; cannot declare + return" + mapping: | + let x = "outer" + let result = match "a" { + "a" => { + let x = "shadow" + $x + } + } + root.outer = $x + root.result = $result + output: + outer: "outer" + result: "shadow" + + # --- Combined patterns --- + + - name: "pre-declare then write in both branches" + mapping: | + let result = "" + if this.flag { + let result = "yes" + } else { + let result = "no" + } + root.v = $result + cases: + - name: "true branch" + input: {"flag": true} + output: {"v": "yes"} + - name: "false branch" + input: {"flag": false} + output: {"v": "no"} + + - name: "outer variable and block-scoped variable coexist" + skip: "V1 has no block scope for let; $y leaks out of the if-block (visible after it). FIXME-v1: this would still evaluate without error but semantics differ." 
+ mapping: | + let x = "outer" + if true { + let x = "modified" + let y = "local" + root.local = $y + } + root.outer = $x + output: + outer: "modified" + local: "local" + + - name: "lambda reads outer variable modified by if-statement" + mapping: | + let factor = 1 + if true { + let factor = 10 + } + root.v = [1, 2, 3].map_each(x -> x * $factor) + output: {"v": [10, 20, 30]} diff --git a/internal/bloblang2/migrator/v1spec/tests/variables/shadowing.yaml b/internal/bloblang2/migrator/v1spec/tests/variables/shadowing.yaml new file mode 100644 index 000000000..5436d630a --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/tests/variables/shadowing.yaml @@ -0,0 +1,235 @@ +description: "Variable shadowing in expression contexts: if-expression, match expression, lambda bodies; outer unchanged; same-scope reassignment within expression" + +tests: + # --- If-expression shadows outer variable --- + + - name: "if-expression variable shadowing" + skip: "V1 if-expression body is a single expression; cannot declare + return. V1 also has no block scope — `let value = 20` would mutate the outer binding." 
+ mapping: | + let value = 10 + root.inner = if this.flag { + let value = 20 + $value + } + root.outer = $value + cases: + - name: "shadow when true" + input: {"flag": true} + output: {"inner": 20, "outer": 10} + - name: "outer unchanged when false" + input: {"flag": false} + output: {"outer": 10} + + - name: "if-else expression shadowing" + skip: "V1 if-expression body is a single expression; no scope isolation either" + mapping: | + let x = "original" + root.result = if this.flag { + let x = "from-if" + $x + } else { + let x = "from-else" + $x + } + root.outer = $x + cases: + - name: "if branch shadow" + input: {"flag": true} + output: {"result": "from-if", "outer": "original"} + - name: "else branch shadow" + input: {"flag": false} + output: {"result": "from-else", "outer": "original"} + + # --- Match expression shadows outer variable --- + + - name: "match expression shadows outer variable" + skip: "V1 match arm body is a single expression; cannot declare + return" + input: {"kind": "a"} + mapping: | + let val = "outer" + root.result = match this.kind { + "a" => { + let val = "inner-a" + $val + }, + _ => "default", + } + root.outer = $val + output: {"result": "inner-a", "outer": "outer"} + + - name: "match expression shadow in different arms" + skip: "V1 match arm body is a single expression" + input: {"kind": "b"} + mapping: | + let val = "outer" + root.result = match this.kind { + "a" => { + let val = "inner-a" + $val + }, + "b" => { + let val = "inner-b" + $val + }, + _ => "default", + } + root.outer = $val + output: {"result": "inner-b", "outer": "outer"} + + - name: "match expression default arm also shadows" + skip: "V1 match arm body is a single expression" + input: {"kind": "c"} + mapping: | + let val = 0 + root.result = match this.kind { + "a" => { + let val = 1 + $val + }, + _ => { + let val = 99 + $val + }, + } + root.outer = $val + output: {"result": 99, "outer": 0} + + # --- Same-scope reassignment within expression body --- + + - name: 
"reassignment within if-expression body is mutation not shadow" + skip: "V1 if-expression body is a single expression; cannot declare + reassign + return" + input: {"flag": true} + mapping: | + let outer = "untouched" + root.result = if this.flag { + let x = 1 + let x = 2 + let x = 3 + $x + } + root.outer = $outer + output: {"result": 3, "outer": "untouched"} + + - name: "reassignment within match-expression body is mutation" + skip: "V1 match arm body is a single expression" + input: {"kind": "a"} + mapping: | + root.result = match this.kind { + "a" => { + let acc = 0 + let acc = $acc + 10 + let acc = $acc + 20 + $acc + }, + _ => 0, + } + output: {"result": 30} + + # --- New variables in expression block are block-scoped --- + + - name: "new variable in if-expression not visible outside" + skip: "V1 if-expression body is a single expression; V1 also has no block scope for let" + input: {"flag": true} + mapping: | + root.result = if this.flag { + let local = 42 + $local + } + root.after = $local + compile_error: "undeclared" # FIXME-v1 + + - name: "new variable in match-expression not visible outside" + skip: "V1 match arm body is a single expression" + mapping: | + root.result = match "a" { + "a" => { + let inner = "found" + $inner + }, + _ => "default", + } + root.after = $inner + compile_error: "undeclared" # FIXME-v1 + + # --- Nested expression contexts --- + + - name: "output assignment in expression context is compile error" + skip: "V1 if-expression body is a single expression; statement-level root assignment inside it is a parse error, but the error message differs" + input: {"a": true, "b": true} + mapping: | + let x = "top" + root.inner = if this.a { + let x = "level1" + root.nested = if this.b { + let x = "level2" + $x + } + $x + } + root.outer = $x + compile_error: "output" # FIXME-v1: V1 parse error message will not contain "output" + + - name: "shadow does not leak between sibling expressions" + skip: "V1 if-expression body is a single expression; 
cannot declare + return. V1 has no scope isolation either — `let x = \"first\"` persists across siblings." + input: {"flag": true} + mapping: | + let x = "original" + root.first = if this.flag { + let x = "first" + $x + } + root.second = if this.flag { + $x + } + root.outer = $x + output: {"first": "first", "second": "original", "outer": "original"} + + # --- Lambda expression context shadows --- + + - name: "lambda body shadows outer variable" + skip: "V1 lambda body is a single expression; cannot declare + return. V1 also has no block scope." + mapping: | + let x = "outer" + root.result = [1, 2, 3].map_each(n -> { + let x = n * 10 + $x + }) + root.outer = $x + output: {"result": [10, 20, 30], "outer": "outer"} + + # --- Map body expression context shadows --- + + - name: "map body shadows outer variable via expression context" + skip: "V1 maps do not take parameters; receiver is `this`. Also V1 clears variables on apply entry and doesn't leak them out, so shadowing is irrelevant here." 
+ mapping: | + map add_label(data) { + let tag = "inner" + {"value": data, "tag": $tag} + } + let tag = "outer" + root.result = add_label(42) + root.tag = $tag + output: {"result": {"value": 42, "tag": "inner"}, "tag": "outer"} + + # --- Expression context: outer variable read but not mutated --- + + - name: "expression reads outer variable without mutating it" + input: {"flag": true} + mapping: | + let base = 100 + root.result = if this.flag { + $base + 50 + } + root.base = $base + output: {"result": 150, "base": 100} + + - name: "match expression reads outer variable without mutating it" + input: {"multiplier": 3} + mapping: | + let base = 10 + root.result = match this.multiplier { + 3 => $base * 3, + _ => $base, + } + root.base = $base + output: {"result": 30, "base": 10} diff --git a/internal/bloblang2/migrator/v1spec/v1quirks_test.go b/internal/bloblang2/migrator/v1spec/v1quirks_test.go new file mode 100644 index 000000000..956b27469 --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/v1quirks_test.go @@ -0,0 +1,443 @@ +package v1spec_test + +import ( + "reflect" + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1spec" +) + +// TestV1Quirks exercises V1-specific behaviours that the YAML corpus doesn't +// cover because those behaviours have no V2 counterpart to translate from. +// Each case documents a specific claim from bloblang_v1_spec.md so the spec +// and reality cannot silently drift. +// +// Each entry is one of: +// - output: a concrete Go value the mapping must produce (reflect.DeepEqual) +// - runtimeErr: substring that must appear in the runtime error +// - compileErr: substring that must appear in the compile error +// +// Exactly one of the three must be set per case. +func TestV1Quirks(t *testing.T) { + for _, c := range []struct { + // Section reference in bloblang_v1_spec.md for traceability. + spec string + // Human-readable claim being verified. + name string + // The V1 mapping. 
+ mapping string + // Optional input document (default nil). + input any + + output any + runtimeErr string + compileErr string + }{ + // §2.1 — whitespace and newline rules around assignment = + { + spec: "§2.1", name: "assignment = needs whitespace before", + mapping: "root.a =1", compileErr: "expected whitespace", + }, + { + spec: "§2.1", name: "assignment = needs whitespace after", + mapping: "root.a= 1", compileErr: "expected whitespace", + }, + { + spec: "§2.1", name: "let = needs whitespace", + mapping: "let x=5", compileErr: "expected whitespace", + }, + { + spec: "§2.1", name: "binary + does not need whitespace", + mapping: "root = 1+2", output: int64(3), + }, + + // §5.1 — ! is single-use (not stackable) + { + spec: "§5.1", name: "double-not !!x is a parse error", + mapping: "root = !!true", compileErr: "expected query", + }, + { + spec: "§5.1", name: "parenthesised double-not works", + mapping: "root = !(!true)", output: true, + }, + + // §3 — .type() on sentinels + { + spec: "§3", name: "deleted().type() returns \"delete\"", + mapping: "root = deleted().type()", output: "delete", + }, + { + spec: "§3", name: "nothing().type() returns \"nothing\"", + mapping: "root = nothing().type()", output: "nothing", + }, + { + spec: "§3", name: "now() returns a string, not a timestamp", + mapping: "root = now().type()", output: "string", + }, + { + spec: "§3", name: ".number() always returns float64", + // Going through a method that is type-strict between int and float. 
+ mapping: `root = "42".number()`, output: float64(42), + }, + + // §4.3 — string escapes + { + spec: "§4.3", name: "backslash-slash escape is not supported", + mapping: `root = "\/"`, compileErr: "failed to unescape", + }, + { + spec: "§4.3", name: "triple-quoted string is raw", + mapping: "root = \"\"\"line1\\nline2\"\"\"", output: `line1\nline2`, + }, + + // §4.5 — object key classification + { + spec: "§4.5", name: "int literal as object key is a parse error", + mapping: `root = {5: "v"}`, compileErr: "object keys must be strings", + }, + { + spec: "§4.5", name: "bare ident as object key parses (legacy this.ident dynamic key)", + mapping: `root = {a: 1}`, input: map[string]any{"a": "dyn"}, + output: map[string]any{"dyn": int64(1)}, + }, + { + spec: "§4.5", name: "bare ident key with null this.ident errors at runtime", + mapping: `root = {a: 1}`, input: map[string]any{}, + runtimeErr: "invalid key type", + }, + + // §5.3 — constant-folding scope + { + spec: "§5.3", name: "constant folding: literal divide-by-zero is a compile error", + mapping: "root = 5 / 0", compileErr: "divide by zero", + }, + { + spec: "§5.3", name: "constant folding: literal type mismatch on + is a compile error", + mapping: `root = 5 + "x"`, compileErr: "cannot add types", + }, + { + spec: "§5.3", name: "&& does NOT constant-fold; literal operands defer to runtime", + mapping: `root = true && "x"`, runtimeErr: "expected bool", + }, + { + spec: "§5.3", name: "|| does NOT constant-fold; literal operands defer to runtime", + mapping: `root = false || "x"`, runtimeErr: "expected bool", + }, + { + spec: "§5.3", name: "| (coalesce) does NOT constant-fold; runs at runtime", + mapping: `root = null | "fallback"`, output: "fallback", + }, + + // §5.3 / §14#24 — short-circuit applies at the operator, not through null-safe access + { + spec: "§14#24", name: "short-circuit: false && X never evaluates X", + // Use a runtime divisor so the RHS is not constant-folded. 
If + // short-circuit weren't working, this would raise divide-by-zero. + mapping: "root = false && (1 / this.zero > 0)", + input: map[string]any{"zero": int64(0)}, + output: false, + }, + { + spec: "§14#24", name: "this != null && this.foo > 0 is NOT safe on {}", + mapping: "root = this != null && this.foo > 0", input: map[string]any{}, + runtimeErr: "compare types null", + }, + + // §5.3 — comparison operand coercion + { + spec: "§5.3", name: "true == 1 is true (bool path coerces number to bool)", + mapping: "root = true == 1", output: true, + }, + { + spec: "§5.3", name: "1 == true is false (number path cannot coerce bool)", + mapping: "root = 1 == true", output: false, + }, + { + spec: "§5.3", name: "&& coerces a numeric RHS to bool", + mapping: "root = true && 1", output: true, + }, + { + spec: "§5.3", name: "|| rejects a string RHS", + mapping: `root = false || "x"`, runtimeErr: "expected bool", + }, + { + spec: "§5.3", name: "% silently truncates float operands to int64", + mapping: "root = 7.5 % 2.5", output: int64(1), + }, + + // §6.4 — target grammar + { + spec: "§6.4", name: "this as assignment target creates literal \"this\" key", + mapping: `this.foo = "bar"`, + output: map[string]any{"this": map[string]any{"foo": "bar"}}, + }, + { + spec: "§6.4", name: "meta(expr) = v is not a valid assignment target", + mapping: `meta("foo") = "bar"`, compileErr: "", + // Any compile error is acceptable — the parser refuses this form. + }, + { + spec: "§6.4", name: "meta = errors at runtime", + mapping: `meta = "string"`, runtimeErr: "object value", + }, + { + spec: "§6.4", name: "$x = value (var reassignment) is a parse error", + mapping: "let x = 1\n$x = 2", compileErr: "", + // Any compile error is acceptable. 
+ }, + + // §6.3 — numeric-segment writes + { + spec: "§6.3", name: "numeric path write into NEW parent creates object key", + mapping: `root.items.0 = "x"`, + output: map[string]any{"items": map[string]any{"0": "x"}}, + }, + { + spec: "§6.3", name: "numeric path write into EXISTING array indexes the array", + mapping: `root.items = [1, 2, 3] +root.items.0 = "x"`, + output: map[string]any{"items": []any{"x", int64(2), int64(3)}}, + }, + { + spec: "§6.3", name: "numeric path write OOB on existing array errors at runtime", + mapping: `root.items = [1, 2] +root.items.5 = "x"`, + runtimeErr: "exceeded target array size", + }, + + // §6.5 — iterator vs non-iterator argument rebinding + { + spec: "§6.5", name: "iterator method with non-lambda arg rebinds this to element", + mapping: `root = [1, 2, 3].map_each(this * 10)`, + output: []any{int64(10), int64(20), int64(30)}, + }, + { + spec: "§6.5", name: "iterator method with lambda pops context; body this is outer", + mapping: `root = [1, 2, 3].map_each(x -> x + this.offset)`, + input: map[string]any{"offset": int64(100)}, + output: []any{int64(101), int64(102), int64(103)}, + }, + + // §6.1 / §9.4 — path access null-tolerance universal + { + spec: "§12.5", name: "field access on string returns null", + mapping: `root = "hello".missing`, output: nil, + }, + { + spec: "§12.5", name: "field access on number returns null", + mapping: `root = (5).missing`, output: nil, + }, + { + spec: "§9.4", name: "deleted().foo returns null via null-tolerant path access", + mapping: `root = deleted().foo`, output: nil, + }, + { + spec: "§9.4", name: "nothing().foo returns null", + mapping: `root = nothing().foo`, output: nil, + }, + { + spec: "§12.5", name: ".index() OOB is a runtime error, not null", + mapping: `root = [1, 2, 3].index(10)`, + runtimeErr: "out of bounds", + }, + + // §9.4 — sentinels in array/object literals + { + spec: "§9.4", name: "nothing() in array literal is elided", + mapping: "root = [1, nothing(), 3]", + output: 
[]any{int64(1), int64(3)}, + }, + { + spec: "§9.4", name: "deleted() in array literal is elided", + mapping: "root = [1, deleted(), 3]", + output: []any{int64(1), int64(3)}, + }, + { + spec: "§9.4", name: "nothing() in object literal elides the key", + mapping: `root = {"a": 1, "b": nothing()}`, + output: map[string]any{"a": int64(1)}, + }, + { + spec: "§9.4", name: "deleted() in object literal elides the key", + mapping: `root = {"a": 1, "b": deleted()}`, + output: map[string]any{"a": int64(1)}, + }, + + // §7.2 — quoted let names are write-only + { + spec: "§7.2", name: "let with quoted non-ident name parses", + mapping: `let "has space" = 5 +root = 1`, output: int64(1), + }, + { + spec: "§7.2", name: "reading a quoted-non-ident var is a parse error", + mapping: `let "has space" = 5 +root = $"has space"`, compileErr: "expected query", + }, + + // §7.3 — statement-form if with null condition errors (vs expression form) + { + spec: "§7.3", name: "statement-form if null errors", + mapping: `if null { root.x = 1 } else { root.x = 2 }`, + runtimeErr: "non-boolean", + }, + { + spec: "§8.3", name: "expression-form if null treats null as falsy", + mapping: `root = if null { "yes" } else { "no" }`, + output: "no", + }, + + // §8.3 / §8.4 — nothing sentinel from if/match + { + spec: "§8.3", name: "if-without-else no-match returns nothing sentinel (field absent)", + mapping: `root.a = 1 +root.a = if false { 99 }`, + output: map[string]any{"a": int64(1)}, + }, + { + spec: "§8.4", name: "match with no matching arm returns nothing (assignment skipped)", + mapping: `root.a = 1 +root.a = match this.t { "no" => 2 }`, + input: map[string]any{"t": "yes"}, + output: map[string]any{"a": int64(1)}, + }, + { + spec: "§14#17", name: "root = nothing() preserves input unchanged", + mapping: `root = nothing()`, + input: map[string]any{"pass": "through"}, + output: map[string]any{"pass": "through"}, + }, + + // §8.4 — match literal classification after constant folding + { + spec: "§8.4", 
name: "match pattern (2+1) is constant-folded and used as literal", + mapping: `root = match 3 { (2+1) => "matched" }`, + output: "matched", + }, + + // §8.5 — sort multi-param lambda rejected + { + spec: "§8.5", name: "sort(left, right -> ...) multi-param lambda is a parse error", + mapping: `root = [3,1,2].sort(left, right -> left > right)`, + compileErr: "wrong number of arguments", + }, + { + spec: "§8.5", name: "sort(left > right) implicit-param form works", + mapping: `root = [3,1,2].sort(left > right)`, + output: []any{int64(3), int64(2), int64(1)}, + }, + + // §9.0 — methods that do NOT exist in V1 (require impl/pure NOT to help) + { + spec: "§9.0", name: "sqrt method does not exist", + mapping: `root = (4).sqrt()`, compileErr: "unrecognised method", + }, + { + spec: "§9.0", name: "map_values method does not exist", + mapping: `root = {"a":1}.map_values(v -> v * 2)`, compileErr: "unrecognised method", + }, + { + spec: "§9.0", name: "map_entries method does not exist", + mapping: `root = {"a":1}.map_entries(e -> e)`, compileErr: "unrecognised method", + }, + { + spec: "§9.0", name: "filter_entries method does not exist", + mapping: `root = {"a":1}.filter_entries(e -> true)`, compileErr: "unrecognised method", + }, + { + spec: "§9.0", name: "collect method does not exist", + mapping: `root = [1,2].collect(x -> x)`, compileErr: "unrecognised method", + }, + { + spec: "§9.0", name: "chunk method does not exist", + mapping: `root = [1,2,3].chunk(2)`, compileErr: "unrecognised method", + }, + { + spec: "§9.0", name: "char method does not exist", + mapping: `root = (65).char()`, compileErr: "unrecognised method", + }, + { + spec: "§9.0", name: "array reverse does not exist (strings only)", + mapping: `root = [1,2,3].reverse()`, + runtimeErr: "expected string value, got array", + }, + { + spec: "§9.0", name: ".round(N) with precision arg does not exist", + mapping: `root = 3.14.round(2)`, + compileErr: "wrong number of arguments", + }, + { + spec: "§9.0", name: 
"ts_add_duration does not exist (use ts_add_iso8601)", + mapping: `root = now().ts_add_duration("1h")`, compileErr: "unrecognised method", + }, + + // §9.0 — methods that DO exist (impl/pure loaded by harness) + { + spec: "§9.0", name: "abs exists (impl/pure)", + mapping: `root = (-5).abs()`, output: int64(5), + }, + { + spec: "§9.0", name: "int64 typed conversion exists (impl/pure)", + mapping: `root = "42".int64()`, output: int64(42), + }, + { + spec: "§9.0", name: "ceil is a core method (no impl/pure required)", + mapping: `root = 3.2.ceil()`, output: int64(4), + }, + { + spec: "§9.0", name: "round (zero-arg) is a core method", + mapping: `root = 3.6.round()`, output: int64(4), + }, + + // §14#64 — throw error mentions `why`, not `throw` + { + spec: "§14#64", name: "throw with non-string arg errors mentioning `why`", + mapping: `root = throw(5)`, compileErr: "why", + }, + + // §14#67 — path-collision runtime error + { + spec: "§14#67", name: "assigning into scalar path errors with \"non-object type\"", + mapping: `root.user = "Alice" +root.user.name = "Jane"`, + runtimeErr: "non-object type", + }, + } { + t.Run(c.spec+"/"+c.name, func(t *testing.T) { + m, cerr := v1spec.V1Interp{}.Compile(c.mapping, nil) + + if c.compileErr != "" || (c.compileErr == "" && cerr != nil && c.runtimeErr == "" && c.output == nil) { + // Compile-error case. 
+ if cerr == nil { + t.Fatalf("expected compile error containing %q, got success", c.compileErr) + } + if c.compileErr != "" && !strings.Contains(cerr.Error(), c.compileErr) { + t.Fatalf("compile error %q does not contain %q", cerr.Error(), c.compileErr) + } + return + } + if cerr != nil { + t.Fatalf("unexpected compile error: %v", cerr) + } + + out, _, _, rerr := m.Exec(c.input, map[string]any{}) + if c.runtimeErr != "" { + if rerr == nil { + t.Fatalf("expected runtime error containing %q, got success (output=%#v)", c.runtimeErr, out) + } + if !strings.Contains(rerr.Error(), c.runtimeErr) { + t.Fatalf("runtime error %q does not contain %q", rerr.Error(), c.runtimeErr) + } + return + } + if rerr != nil { + t.Fatalf("unexpected runtime error: %v", rerr) + } + + if !reflect.DeepEqual(out, c.output) { + t.Fatalf("output mismatch:\n got: %#v (%T)\n want: %#v (%T)", out, out, c.output, c.output) + } + }) + } +} diff --git a/internal/bloblang2/migrator/v1spec/v1spec_test.go b/internal/bloblang2/migrator/v1spec/v1spec_test.go new file mode 100644 index 000000000..e003cfbdc --- /dev/null +++ b/internal/bloblang2/migrator/v1spec/v1spec_test.go @@ -0,0 +1,14 @@ +package v1spec_test + +import ( + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1spec" +) + +// TestBloblangV1Spec runs every YAML test under ./tests against the V1 +// Bloblang interpreter, using the shared spectest schema. Tests marked with a +// `skip:` field in the YAML are reported via t.Skip and do not execute. +func TestBloblangV1Spec(t *testing.T) { + v1spec.RunT(t, "tests", v1spec.V1Interp{}) +} From 1a88bc7560d38a5a578933dec558e711910a3c18 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Thu, 23 Apr 2026 10:40:47 +0100 Subject: [PATCH 11/20] bloblang(v2): Add V1 AST and parser Adds internal/bloblang2/migrator/v1ast/, a hand-written scanner, parser, and AST for Bloblang V1, plus a printer used to round-trip V1 source through the migrator. 
The package preserves comments and blank-line trivia so the V1 -> V2 translation pipeline can emit V2 source that retains the V1 author's formatting intent. This is the V1 front-end consumed by the translator package that follows. --- internal/bloblang2/migrator/v1ast/ast.go | 470 +++++++ internal/bloblang2/migrator/v1ast/parser.go | 1231 +++++++++++++++++ .../bloblang2/migrator/v1ast/parser_test.go | 613 ++++++++ internal/bloblang2/migrator/v1ast/printer.go | 442 ++++++ internal/bloblang2/migrator/v1ast/scanner.go | 481 +++++++ .../bloblang2/migrator/v1ast/trivia_test.go | 145 ++ 6 files changed, 3382 insertions(+) create mode 100644 internal/bloblang2/migrator/v1ast/ast.go create mode 100644 internal/bloblang2/migrator/v1ast/parser.go create mode 100644 internal/bloblang2/migrator/v1ast/parser_test.go create mode 100644 internal/bloblang2/migrator/v1ast/printer.go create mode 100644 internal/bloblang2/migrator/v1ast/scanner.go create mode 100644 internal/bloblang2/migrator/v1ast/trivia_test.go diff --git a/internal/bloblang2/migrator/v1ast/ast.go b/internal/bloblang2/migrator/v1ast/ast.go new file mode 100644 index 000000000..00e9bec10 --- /dev/null +++ b/internal/bloblang2/migrator/v1ast/ast.go @@ -0,0 +1,470 @@ +package v1ast + +// Node is the interface implemented by every AST node. +type Node interface { + // NodePos returns the source position of this node. + NodePos() Pos +} + +// Expr is implemented by every expression node. +type Expr interface { + Node + exprNode() +} + +// Stmt is implemented by every statement node. +// +// Every Stmt carries a TriviaSet so that comments and blank lines collected +// by the parser survive the round trip to the V1→V2 translator. Use +// Leading() / Trailing() to read; the parser sets them via the embedded +// TriviaSet on each concrete type. +type Stmt interface { + Node + stmtNode() + // Trivia returns the statement's leading+trailing trivia bucket. 
+ // The returned pointer is the statement's own storage — mutation sticks. + Trivia() *TriviaSet +} + +// TriviaKind identifies the kind of a trivia entry. +type TriviaKind int + +// Trivia kinds. +const ( + // TriviaComment is a `# ...` line comment. Text excludes the leading `#` + // and the trailing newline, verbatim otherwise. + TriviaComment TriviaKind = iota + // TriviaBlankLine marks a run of two or more consecutive newlines with + // no content between them — i.e. an empty line in the source. + TriviaBlankLine +) + +// Trivia is a single entry in a TriviaSet. +type Trivia struct { + Kind TriviaKind + // Text is the comment text (without `#` or trailing newline). Empty for + // blank-line trivia. + Text string + Pos Pos +} + +// TriviaSet groups leading and trailing trivia for a node. +// +// Leading trivia is everything between the previous statement's end and +// this statement's start (standalone comment lines, blank lines). +// Trailing trivia is a comment that appears on the same line as the +// statement's last significant token. +type TriviaSet struct { + Leading []Trivia + Trailing []Trivia +} + +// Trivia returns the set itself so *TriviaSet satisfies the Stmt contract +// when embedded. +func (t *TriviaSet) Trivia() *TriviaSet { return t } + +// Program is the root of a V1 mapping AST. Maps and imports live alongside +// regular statements. The original source ordering is preserved on `Stmts` +// (maps and imports also appear in Stmts in order; convenience slices Maps / +// Imports are provided for quick access). +type Program struct { + Stmts []Stmt + Maps []*MapDecl + Imports []*ImportStmt + Pos Pos +} + +// NodePos returns the source position of this node. +func (p *Program) NodePos() Pos { return p.Pos } + +// +// Statements +// + +// Assignment is `<target> = <expr>` at statement position. +type Assignment struct { + TriviaSet + Target AssignTarget + Value Expr + Pos Pos +} + +// NodePos returns the source position of this node.
+func (a *Assignment) NodePos() Pos { return a.Pos } +func (a *Assignment) stmtNode() {} + +// AssignTargetKind enumerates the shapes of assignment targets. V1 is +// restrictive on the LHS (§6.4). +type AssignTargetKind int + +const ( + // TargetRoot is `root` optionally followed by path segments. + TargetRoot AssignTargetKind = iota + // TargetThis is `this` optionally followed by path segments. Note: the V1 + // parser accepts this and produces literal top-level "this" key behaviour + // (quirk 72); the AST preserves it verbatim. + TargetThis + // TargetBare is a bare-identifier first segment followed by more path + // segments (legacy, equivalent to root.…). + TargetBare + // TargetMeta is `meta` with no key (wholesale replace), or `meta <ident>` + // / `meta "key"` for a single entry. + TargetMeta +) + +// AssignTarget is the LHS of an `=` assignment. +type AssignTarget struct { + Kind AssignTargetKind + // Path is the list of segments after the root keyword. For TargetBare, + // Path[0].Name is the bare identifier (and Quoted=false). For TargetMeta, + // Path has at most one entry (the key); it is empty for wholesale meta. + Path []PathSegment + Pos Pos +} + +// PathSegment is a dotted path component. +type PathSegment struct { + Name string // the literal segment name (or unescaped quoted string) + Quoted bool // true if the segment was written in quoted form + Pos Pos +} + +// LetStmt is `let <name> = <expr>` or `let "<name>" = <expr>`. +type LetStmt struct { + TriviaSet + Name string + NameQuoted bool + NamePos Pos + Value Expr + Pos Pos +} + +// NodePos returns the source position of this node. +func (l *LetStmt) NodePos() Pos { return l.Pos } +func (l *LetStmt) stmtNode() {} + +// MapDecl is `map <name> { ... }`. +type MapDecl struct { + TriviaSet + Name string + NamePos Pos + Body []Stmt + Pos Pos +} + +// NodePos returns the source position of this node. +func (m *MapDecl) NodePos() Pos { return m.Pos } +func (m *MapDecl) stmtNode() {} + +// ImportStmt is `import "path"`.
+type ImportStmt struct { + TriviaSet + Path Expr // string literal + Pos Pos +} + +// NodePos returns the source position of this node. +func (i *ImportStmt) NodePos() Pos { return i.Pos } +func (i *ImportStmt) stmtNode() {} + +// FromStmt is `from "path"`. +type FromStmt struct { + TriviaSet + Path Expr + Pos Pos +} + +// NodePos returns the source position of this node. +func (f *FromStmt) NodePos() Pos { return f.Pos } +func (f *FromStmt) stmtNode() {} + +// IfStmt is the statement form of `if / else if / else { ... }`. +type IfStmt struct { + TriviaSet + Branches []IfBranch // first is the if, rest are else-if + Else []Stmt // may be nil if no else clause + Pos Pos +} + +// NodePos returns the source position of this node. +func (i *IfStmt) NodePos() Pos { return i.Pos } +func (i *IfStmt) stmtNode() {} + +// IfBranch is one `(if|else if) cond { body }` branch. +type IfBranch struct { + Cond Expr + Body []Stmt + Pos Pos +} + +// BareExprStmt is a lone expression acting as the whole mapping (shorthand +// for `root = expr`). Legal only when it is the sole statement. +type BareExprStmt struct { + TriviaSet + Expr Expr + Pos Pos +} + +// NodePos returns the source position of this node. +func (b *BareExprStmt) NodePos() Pos { return b.Pos } +func (b *BareExprStmt) stmtNode() {} + +// +// Expressions +// + +// LiteralKind identifies the kind of a Literal. +type LiteralKind int + +// Literal kinds. +const ( + LitNull LiteralKind = iota + LitBool + LitInt + LitFloat + LitString + LitRawString +) + +// Literal represents null, true, false, integers, floats, strings. +type Literal struct { + Kind LiteralKind + // Raw is the original source text (for INT/FLOAT preserved as-is; for + // strings it is the raw text of the token — quoted or triple-quoted). May + // be empty if synthesised. + Raw string + // Str is the decoded string value for LitString / LitRawString. Bool/Int + // readers can consult Raw. 
+ Str string + Bool bool + Int int64 + Float float64 + TokPos Pos +} + +// NodePos returns the source position of this node. +func (l *Literal) NodePos() Pos { return l.TokPos } +func (l *Literal) exprNode() {} + +// Ident is a bare identifier at expression position (the legacy `foo` = +// `this.foo` form). The parser intentionally does NOT rewrite this; the +// migrator is free to decide. +type Ident struct { + Name string + TokPos Pos +} + +// NodePos returns the source position of this node. +func (i *Ident) NodePos() Pos { return i.TokPos } +func (i *Ident) exprNode() {} + +// ThisExpr is the literal `this` keyword. +type ThisExpr struct{ TokPos Pos } + +// NodePos returns the source position of this node. +func (t *ThisExpr) NodePos() Pos { return t.TokPos } +func (t *ThisExpr) exprNode() {} + +// RootExpr is the literal `root` keyword at expression position. +type RootExpr struct{ TokPos Pos } + +// NodePos returns the source position of this node. +func (r *RootExpr) NodePos() Pos { return r.TokPos } +func (r *RootExpr) exprNode() {} + +// VarRef is `$name`. +type VarRef struct { + Name string + TokPos Pos +} + +// NodePos returns the source position of this node. +func (v *VarRef) NodePos() Pos { return v.TokPos } +func (v *VarRef) exprNode() {} + +// MetaRef is `@` (whole metadata, Name empty) or `@name` / `@"name"`. +type MetaRef struct { + Name string // empty for bare `@` + Quoted bool + TokPos Pos +} + +// NodePos returns the source position of this node. +func (m *MetaRef) NodePos() Pos { return m.TokPos } +func (m *MetaRef) exprNode() {} + +// BinaryExpr is a binary-operator expression. +type BinaryExpr struct { + Left, Right Expr + Op TokenKind + OpPos Pos +} + +// NodePos returns the source position of this node. +func (b *BinaryExpr) NodePos() Pos { return b.Left.NodePos() } +func (b *BinaryExpr) exprNode() {} + +// UnaryExpr is `!x` or `-x`. 
+type UnaryExpr struct { + Op TokenKind + Operand Expr + OpPos Pos +} + +// NodePos returns the source position of this node. +func (u *UnaryExpr) NodePos() Pos { return u.OpPos } +func (u *UnaryExpr) exprNode() {} + +// ParenExpr wraps an expression in parentheses. Preserved in the AST so the +// printer can round-trip. +type ParenExpr struct { + Inner Expr + TokPos Pos +} + +// NodePos returns the source position of this node. +func (p *ParenExpr) NodePos() Pos { return p.TokPos } +func (p *ParenExpr) exprNode() {} + +// FieldAccess is `recv.<Name>` where Name is an identifier-class or quoted +// path segment. +type FieldAccess struct { + Recv Expr + Seg PathSegment +} + +// NodePos returns the source position of this node. +func (f *FieldAccess) NodePos() Pos { return f.Recv.NodePos() } +func (f *FieldAccess) exprNode() {} + +// MethodCall is `recv.name(args)`. +type MethodCall struct { + Recv Expr + Name string + NamePos Pos + Args []CallArg + Named bool // all arguments are named (name: value) +} + +// NodePos returns the source position of this node. +func (m *MethodCall) NodePos() Pos { return m.Recv.NodePos() } +func (m *MethodCall) exprNode() {} + +// FunctionCall is a top-level call `name(args)`. +type FunctionCall struct { + Name string + NamePos Pos + Args []CallArg + Named bool +} + +// NodePos returns the source position of this node. +func (f *FunctionCall) NodePos() Pos { return f.NamePos } +func (f *FunctionCall) exprNode() {} + +// MetaCall is `meta(<key>)` used as an expression (read form). +type MetaCall struct { + Key Expr + TokPos Pos +} + +// NodePos returns the source position of this node. +func (m *MetaCall) NodePos() Pos { return m.TokPos } +func (m *MetaCall) exprNode() {} + +// CallArg is one argument, optionally named. +type CallArg struct { + Name string // empty for positional + Value Expr + Pos Pos +} + +// MapExpr is `recv.(body)` — a path-scoped subexpression that rebinds +// `this`.
For the named-capture variant `recv.(name -> body)` Body is a +// Lambda. +type MapExpr struct { + Recv Expr + Body Expr + TokPos Pos // position of the '.' before '(' +} + +// NodePos returns the source position of this node. +func (m *MapExpr) NodePos() Pos { return m.Recv.NodePos() } +func (m *MapExpr) exprNode() {} + +// Lambda is `<name> -> <body>` or `_ -> <body>`. +type Lambda struct { + Param string + Discard bool // true if param is `_` + ParamPos Pos + Body Expr + ArrowPos Pos +} + +// NodePos returns the source position of this node. +func (l *Lambda) NodePos() Pos { return l.ParamPos } +func (l *Lambda) exprNode() {} + +// ArrayLit is `[...]`. +type ArrayLit struct { + Elems []Expr + TokPos Pos // '[' +} + +// NodePos returns the source position of this node. +func (a *ArrayLit) NodePos() Pos { return a.TokPos } +func (a *ArrayLit) exprNode() {} + +// ObjectLit is `{...}`. +type ObjectLit struct { + Entries []ObjectEntry + TokPos Pos // '{' +} + +// NodePos returns the source position of this node. +func (o *ObjectLit) NodePos() Pos { return o.TokPos } +func (o *ObjectLit) exprNode() {} + +// ObjectEntry is one `key: value` member. +type ObjectEntry struct { + Key Expr // may be *Literal (QuotedString) or any other expression (dynamic) + Value Expr +} + +// IfExpr is the expression form of if/else if/else, where each branch body +// is a single expression. +type IfExpr struct { + Branches []IfExprBranch + Else Expr // nil if no else + TokPos Pos +} + +// NodePos returns the source position of this node. +func (i *IfExpr) NodePos() Pos { return i.TokPos } +func (i *IfExpr) exprNode() {} + +// IfExprBranch is one arm of an IfExpr. +type IfExprBranch struct { + Cond Expr + Body Expr + Pos Pos +} + +// MatchExpr is `match [subject] { cases }`. +type MatchExpr struct { + Subject Expr // nil for subject-less match + Cases []MatchCase + TokPos Pos +} + +// NodePos returns the source position of this node.
+func (m *MatchExpr) NodePos() Pos { return m.TokPos } +func (m *MatchExpr) exprNode() {} + +// MatchCase is one `pattern => body` arm. +type MatchCase struct { + Pattern Expr // nil for wildcard `_` + Wildcard bool + Body Expr + Pos Pos +} diff --git a/internal/bloblang2/migrator/v1ast/parser.go b/internal/bloblang2/migrator/v1ast/parser.go new file mode 100644 index 000000000..22c9ecc03 --- /dev/null +++ b/internal/bloblang2/migrator/v1ast/parser.go @@ -0,0 +1,1231 @@ +package v1ast + +import ( + "fmt" + "strconv" + "strings" +) + +// ParseError carries a position and a human-readable message. +type ParseError struct { + Pos Pos + Msg string +} + +func (e *ParseError) Error() string { + return fmt.Sprintf("%s: %s", e.Pos, e.Msg) +} + +// Parse parses a V1 mapping source string into a *Program. +func Parse(src string) (*Program, error) { + sc := NewScanner(src) + toks, err := sc.All() + if err != nil { + return nil, err + } + p := &parser{toks: toks} + return p.parseProgram() +} + +type parser struct { + toks []Token + pos int + + // Trivia collection. Comment tokens are transparently skipped by peek() + // and advance() so existing non-trivia parsing logic is unchanged; the + // skipped comments + blank-line markers land in one of two buckets: + // - pendingTrailing: a comment on the same line as the preceding + // significant token. Drained into the just-finished statement's + // Trailing set. + // - pendingLeading: a comment on its own line, or a blank-line marker. + // Drained into the next statement's Leading set. + pendingTrailing []Trivia + pendingLeading []Trivia + // newlinesBuffered counts consecutive TokNewline advances since the + // last leading-trivia decision. A run of 2+ produces a blank-line + // marker inserted at the exact chronological moment (before the next + // leading comment, or before returning to a significant token). 
+ newlinesBuffered int + // lastSigLine tracks the line of the last consumed significant token, + // used to classify a comment as trailing-vs-leading. + lastSigLine int +} + +// +// Token cursor helpers +// + +// peek returns the next significant token, transparently skipping and +// stashing any comment tokens encountered. +func (p *parser) peek() Token { + p.stashComments() + return p.toks[p.pos] +} + +// peekAt returns the i-th significant token ahead of the current position, +// skipping over any comment tokens in its count. +func (p *parser) peekAt(i int) Token { + p.stashComments() + j := p.pos + for n := 0; n < i; n++ { + j++ + if j >= len(p.toks) { + return p.toks[len(p.toks)-1] + } + for j < len(p.toks) && p.toks[j].Kind == TokComment { + j++ + } + } + if j >= len(p.toks) { + return p.toks[len(p.toks)-1] + } + return p.toks[j] +} + +// advance consumes the current significant token and returns it. +func (p *parser) advance() Token { + p.stashComments() + t := p.toks[p.pos] + switch t.Kind { + case TokNewline: + p.newlinesBuffered++ + case TokEOF: + // leave counters alone + default: + p.newlinesBuffered = 0 + p.lastSigLine = t.Pos.Line + } + if p.pos < len(p.toks)-1 { + p.pos++ + } + return t +} + +// stashComments consumes TokComment tokens at the current cursor position, +// routing each into pendingTrailing (same-line as previous significant +// token, no newline between) or pendingLeading (own-line). Before stashing +// a leading comment, any pending blank-line marker is emitted so the +// trivia list stays in source order. 
+func (p *parser) stashComments() { + for p.pos < len(p.toks) && p.toks[p.pos].Kind == TokComment { + tok := p.toks[p.pos] + tri := Trivia{Kind: TriviaComment, Text: tok.Text, Pos: tok.Pos} + if p.newlinesBuffered == 0 && p.lastSigLine != 0 && tok.Pos.Line == p.lastSigLine { + p.pendingTrailing = append(p.pendingTrailing, tri) + } else { + p.flushBlankLine() + p.pendingLeading = append(p.pendingLeading, tri) + // The comment itself occupies a line; subsequent newlines count + // afresh toward a possible next blank-line marker. + p.newlinesBuffered = 0 + } + p.pos++ + } +} + +// flushBlankLine emits a BlankLine marker into pendingLeading if +// newlinesBuffered is 2+ (meaning two or more consecutive newlines have +// passed without any content on one of the intervening lines). +// Collapses consecutive runs into a single marker and resets the counter. +func (p *parser) flushBlankLine() { + if p.newlinesBuffered < 2 { + return + } + if len(p.pendingLeading) == 0 || p.pendingLeading[len(p.pendingLeading)-1].Kind != TriviaBlankLine { + p.pendingLeading = append(p.pendingLeading, Trivia{Kind: TriviaBlankLine}) + } + p.newlinesBuffered = 0 +} + +// flushLeading returns and clears the pendingLeading buffer. +func (p *parser) flushLeading() []Trivia { + out := p.pendingLeading + p.pendingLeading = nil + return out +} + +// flushTrailing returns and clears the pendingTrailing buffer. +func (p *parser) flushTrailing() []Trivia { + out := p.pendingTrailing + p.pendingTrailing = nil + return out +} + +func (p *parser) errAt(pos Pos, format string, args ...any) *ParseError { + return &ParseError{Pos: pos, Msg: fmt.Sprintf(format, args...)} +} + +// skipNewlines discards newline tokens at a statement boundary. A run of +// two or more newlines produces a blank-line trivia entry, flushed either +// when a leading comment is stashed mid-run (see stashComments) or here, +// before the next significant token begins. 
+func (p *parser) skipNewlines() { + for p.peek().Kind == TokNewline { + p.advance() + } + p.flushBlankLine() +} + +// skipInlineNewlines is used in contexts where a newline is tolerated +// (inside brackets, braces, after an operator/arrow, etc.). Does not +// record blank-line trivia — the newlines here are expression continuation +// whitespace, not structural blank lines between statements. +func (p *parser) skipInlineNewlines() { + for p.peek().Kind == TokNewline { + p.advance() + } +} + +// expect advances past a token of the given kind, returning an error otherwise. +func (p *parser) expect(kind TokenKind, what string) (Token, error) { + t := p.peek() + if t.Kind != kind { + return t, p.errAt(t.Pos, "expected %s, got %s (%q)", what, t.Kind, t.Text) + } + return p.advance(), nil +} + +// +// Program / statement layer +// + +func (p *parser) parseProgram() (*Program, error) { + prog := &Program{Pos: p.peek().Pos} + p.skipNewlines() + // If the input is a single bare expression, treat the whole thing as a + // BareExprStmt. Heuristic: attempt statement parsing; if the first + // top-level item is not an obvious statement start AND nothing follows, + // interpret as bare expression. + for p.peek().Kind != TokEOF { + leading := p.flushLeading() + stmt, err := p.parseStmt() + if err != nil { + return nil, err + } + if stmt != nil { + // Leading trivia accumulated before the statement, trailing + // stashed during its parse (same-line comment, if any). + tri := stmt.Trivia() + tri.Leading = append(tri.Leading, leading...) + tri.Trailing = append(tri.Trailing, p.flushTrailing()...) + prog.Stmts = append(prog.Stmts, stmt) + switch s := stmt.(type) { + case *MapDecl: + prog.Maps = append(prog.Maps, s) + case *ImportStmt: + prog.Imports = append(prog.Imports, s) + } + } + // After each statement, require a newline or EOF. 
+ tok := p.peek() + if tok.Kind == TokNewline { + p.skipNewlines() + continue + } + if tok.Kind == TokEOF { + break + } + return nil, p.errAt(tok.Pos, "expected newline or EOF after statement, got %s", tok.Kind) + } + return prog, nil +} + +// parseStmt dispatches on keyword / target shape. +func (p *parser) parseStmt() (Stmt, error) { + t := p.peek() + switch t.Kind { + case TokIdent: + switch t.Text { + case "let": + return p.parseLet() + case "map": + return p.parseMapDecl() + case "import": + return p.parseImport() + case "from": + return p.parseFrom() + case "if": + return p.parseIfStmt() + case "meta": + // Could be `meta = v`, `meta <key> = v`, or an expression + // starting with meta(...). + return p.parseMetaOrBare() + case "root", "this": + return p.parseAssignOrBare() + } + // Bare identifier — could be a target (bare path) or bare expression. + return p.parseAssignOrBare() + } + // Anything else: bare expression. + return p.parseBareExprStmt() +} + +func (p *parser) parseLet() (Stmt, error) { + tok := p.advance() // 'let' + // Whitespace after 'let' is required, but the scanner already handles + // spaces so we just need something after it. + next := p.peek() + st := &LetStmt{Pos: tok.Pos} + switch next.Kind { + case TokIdent: + st.Name = next.Text + st.NamePos = next.Pos + p.advance() + case TokString: + st.Name = next.Text + st.NameQuoted = true + st.NamePos = next.Pos + p.advance() + default: + return nil, p.errAt(next.Pos, "expected identifier or quoted string after 'let', got %s", next.Kind) + } + if err := p.consumeAssignEquals(); err != nil { + return nil, err + } + expr, err := p.parseExpr() + if err != nil { + return nil, err + } + st.Value = expr + return st, nil +} + +// consumeAssignEquals enforces whitespace-around-'=' at statement position +// (§14#68). It expects the current token to be `=` preceded by whitespace, +// and the next token to be preceded by whitespace.
+func (p *parser) consumeAssignEquals() error { + eq := p.peek() + if eq.Kind != TokAssign { + return p.errAt(eq.Pos, "expected '=', got %s", eq.Kind) + } + if !eq.PrecededBySpace && !eq.PrecededByNewline { + return p.errAt(eq.Pos, "expected whitespace before '='") + } + p.advance() + next := p.peek() + if !next.PrecededBySpace && !next.PrecededByNewline { + return p.errAt(next.Pos, "expected whitespace after '='") + } + return nil +} + +func (p *parser) parseImport() (Stmt, error) { + tok := p.advance() + str, err := p.parseExpr() + if err != nil { + return nil, err + } + return &ImportStmt{Path: str, Pos: tok.Pos}, nil +} + +func (p *parser) parseFrom() (Stmt, error) { + tok := p.advance() + str, err := p.parseExpr() + if err != nil { + return nil, err + } + return &FromStmt{Path: str, Pos: tok.Pos}, nil +} + +func (p *parser) parseMapDecl() (Stmt, error) { + tok := p.advance() // 'map' + nameTok := p.peek() + if nameTok.Kind != TokIdent { + return nil, p.errAt(nameTok.Pos, "expected map name, got %s", nameTok.Kind) + } + p.advance() + if _, err := p.expect(TokLBrace, "'{'"); err != nil { + return nil, err + } + p.skipNewlines() + var body []Stmt + for p.peek().Kind != TokRBrace && p.peek().Kind != TokEOF { + leading := p.flushLeading() + st, err := p.parseStmt() + if err != nil { + return nil, err + } + if st != nil { + tri := st.Trivia() + tri.Leading = append(tri.Leading, leading...) + tri.Trailing = append(tri.Trailing, p.flushTrailing()...) 
+ body = append(body, st) + } + if p.peek().Kind == TokNewline { + p.skipNewlines() + continue + } + break + } + if _, err := p.expect(TokRBrace, "'}'"); err != nil { + return nil, err + } + return &MapDecl{Name: nameTok.Text, NamePos: nameTok.Pos, Body: body, Pos: tok.Pos}, nil +} + +func (p *parser) parseIfStmt() (Stmt, error) { + tok := p.advance() // 'if' + br, err := p.parseIfBranch(tok.Pos) + if err != nil { + return nil, err + } + st := &IfStmt{Pos: tok.Pos, Branches: []IfBranch{br}} + + for p.peek().Kind == TokIdent && p.peek().Text == "else" { + elseTok := p.advance() + if p.peek().Kind == TokIdent && p.peek().Text == "if" { + p.advance() + nb, err := p.parseIfBranch(elseTok.Pos) + if err != nil { + return nil, err + } + st.Branches = append(st.Branches, nb) + continue + } + // final else + if _, err := p.expect(TokLBrace, "'{'"); err != nil { + return nil, err + } + body, err := p.parseStmtBlock() + if err != nil { + return nil, err + } + st.Else = body + if _, err := p.expect(TokRBrace, "'}'"); err != nil { + return nil, err + } + break + } + return st, nil +} + +func (p *parser) parseIfBranch(pos Pos) (IfBranch, error) { + cond, err := p.parseExpr() + if err != nil { + return IfBranch{}, err + } + if _, err := p.expect(TokLBrace, "'{'"); err != nil { + return IfBranch{}, err + } + body, err := p.parseStmtBlock() + if err != nil { + return IfBranch{}, err + } + if _, err := p.expect(TokRBrace, "'}'"); err != nil { + return IfBranch{}, err + } + return IfBranch{Cond: cond, Body: body, Pos: pos}, nil +} + +// parseStmtBlock parses statements up to (but not including) the closing '}'. +func (p *parser) parseStmtBlock() ([]Stmt, error) { + p.skipNewlines() + var out []Stmt + for p.peek().Kind != TokRBrace && p.peek().Kind != TokEOF { + leading := p.flushLeading() + st, err := p.parseStmt() + if err != nil { + return nil, err + } + if st != nil { + tri := st.Trivia() + tri.Leading = append(tri.Leading, leading...) 
+ tri.Trailing = append(tri.Trailing, p.flushTrailing()...) + out = append(out, st) + } + if p.peek().Kind == TokNewline { + p.skipNewlines() + continue + } + break + } + return out, nil +} + +// parseMetaOrBare handles `meta ... = ...`, `meta = ...`, and bare +// `meta(...)` expressions. +func (p *parser) parseMetaOrBare() (Stmt, error) { + start := p.peek() + // Look ahead: if we see `meta = …`, `meta <key> = …`, or `meta + // "<key>" = …`, it's an assignment. If we see `meta(` with no assignment + // afterwards, it's a bare expression. + save := p.pos + _ = p.advance() // 'meta' + next := p.peek() + switch next.Kind { + case TokAssign: + // `meta = …` + tgt := AssignTarget{Kind: TargetMeta, Pos: start.Pos} + return p.finishAssignment(tgt, start.Pos) + case TokIdent: + // Could be `meta foo = …` or, if the token after the identifier is + // not `=`, a bare expression. The ident form requires `=` right after + // the key, so peek one token ahead. + if p.peekAt(1).Kind == TokAssign { + keyTok := p.advance() + tgt := AssignTarget{ + Kind: TargetMeta, + Pos: start.Pos, + Path: []PathSegment{{Name: keyTok.Text, Pos: keyTok.Pos}}, + } + return p.finishAssignment(tgt, start.Pos) + } + case TokString: + if p.peekAt(1).Kind == TokAssign { + keyTok := p.advance() + tgt := AssignTarget{ + Kind: TargetMeta, + Pos: start.Pos, + Path: []PathSegment{{Name: keyTok.Text, Quoted: true, Pos: keyTok.Pos}}, + } + return p.finishAssignment(tgt, start.Pos) + } + } + // Not a meta assignment: rewind and treat as bare expression. + p.pos = save + return p.parseBareExprStmt() +} + +// parseAssignOrBare tries to parse an assignment target followed by `= +// <expr>`; if no `=` is present, parses the initial tokens as a bare +// expression. +func (p *parser) parseAssignOrBare() (Stmt, error) { + save := p.pos + tgt, ok := p.tryParseTarget() + if ok { + // Need to see `=` next. 
+ if p.peek().Kind == TokAssign && (p.peek().PrecededBySpace || p.peek().PrecededByNewline) { + return p.finishAssignment(tgt, tgt.Pos) + } + } + p.pos = save + return p.parseBareExprStmt() +} + +// tryParseTarget attempts to parse an assignment target. Returns (tgt, true) +// on success; on any failure the parser's position is not guaranteed to be +// reset — callers should save/restore. +func (p *parser) tryParseTarget() (AssignTarget, bool) { + start := p.peek() + if start.Kind != TokIdent { + return AssignTarget{}, false + } + tgt := AssignTarget{Pos: start.Pos} + switch start.Text { + case "root": + tgt.Kind = TargetRoot + p.advance() + case "this": + tgt.Kind = TargetThis + p.advance() + case "meta": + // Handled by parseMetaOrBare, but we tolerate here too. + tgt.Kind = TargetMeta + p.advance() + if p.peek().Kind == TokIdent && p.peekAt(1).Kind == TokAssign { + keyTok := p.advance() + tgt.Path = []PathSegment{{Name: keyTok.Text, Pos: keyTok.Pos}} + } else if p.peek().Kind == TokString && p.peekAt(1).Kind == TokAssign { + keyTok := p.advance() + tgt.Path = []PathSegment{{Name: keyTok.Text, Quoted: true, Pos: keyTok.Pos}} + } + return tgt, true + default: + // Bare identifier first segment — legacy root shorthand. + tgt.Kind = TargetBare + tgt.Path = []PathSegment{{Name: start.Text, Pos: start.Pos}} + p.advance() + } + // Subsequent segments: `.ident` or `."quoted"`. Each dot must have no + // leading whitespace. + for p.peek().Kind == TokDot && !p.peek().PrecededBySpace && !p.peek().PrecededByNewline { + p.advance() + seg := p.peek() + switch seg.Kind { + case TokIdent: + tgt.Path = append(tgt.Path, PathSegment{Name: seg.Text, Pos: seg.Pos}) + p.advance() + case TokInt: + // numeric segment (e.g. root.items.0) + tgt.Path = append(tgt.Path, PathSegment{Name: seg.Text, Pos: seg.Pos}) + p.advance() + case TokFloat: + // Split merged "N.M" float into two numeric segments. 
+ p.advance() + parts := strings.SplitN(seg.Text, ".", 2) + tgt.Path = append(tgt.Path, PathSegment{Name: parts[0], Pos: seg.Pos}) + secondPos := seg.Pos + secondPos.Column += len(parts[0]) + 1 + secondPos.Offset += len(parts[0]) + 1 + tgt.Path = append(tgt.Path, PathSegment{Name: parts[1], Pos: secondPos}) + case TokString: + tgt.Path = append(tgt.Path, PathSegment{Name: seg.Text, Quoted: true, Pos: seg.Pos}) + p.advance() + default: + return tgt, false + } + } + return tgt, true +} + +func (p *parser) finishAssignment(tgt AssignTarget, pos Pos) (Stmt, error) { + if err := p.consumeAssignEquals(); err != nil { + return nil, err + } + expr, err := p.parseExpr() + if err != nil { + return nil, err + } + return &Assignment{Target: tgt, Value: expr, Pos: pos}, nil +} + +func (p *parser) parseBareExprStmt() (Stmt, error) { + t := p.peek() + expr, err := p.parseExpr() + if err != nil { + return nil, err + } + return &BareExprStmt{Expr: expr, Pos: t.Pos}, nil +} + +// +// Expression layer +// + +// parseExpr parses a full expression, including binary chains. +// +// V1 parses arithmetic expressions flat and then reduces precedence in four +// passes. We mirror that: collect operand/operator sequence first, then +// reduce by precedence level. +func (p *parser) parseExpr() (Expr, error) { + // Operand | op | operand | op | operand ... + first, err := p.parseTerm() + if err != nil { + return nil, err + } + operands := []Expr{first} + var ops []opTok + for { + t := p.peek() + if t.PrecededByNewline { + // A newline before a binary operator is rejected (§2.1). Exit the + // chain. + break + } + op, isOp := binaryOp(t.Kind) + if !isOp { + break + } + p.advance() + // After a binary operator, newlines are tolerated. 
+ p.skipInlineNewlines() + right, err := p.parseTerm() + if err != nil { + return nil, err + } + ops = append(ops, opTok{kind: op, pos: t.Pos}) + operands = append(operands, right) + } + return reducePrecedence(operands, ops), nil +} + +type opTok struct { + kind TokenKind + pos Pos +} + +// binaryOp reports whether a token is a binary operator at expression +// position. +func binaryOp(t TokenKind) (TokenKind, bool) { + switch t { + case TokPlus, TokMinus, TokStar, TokSlash, TokPercent, + TokEq, TokNeq, TokLt, TokLte, TokGt, TokGte, + TokAnd, TokOr, TokPipe: + return t, true + } + return 0, false +} + +// precedence order — lower number binds tighter (matches V1). +// +// level 1: * / % | +// level 2: + - +// level 3: == != < <= > >= +// level 4: && || (flat; left-to-right) +func opLevel(k TokenKind) int { + switch k { + case TokStar, TokSlash, TokPercent, TokPipe: + return 1 + case TokPlus, TokMinus: + return 2 + case TokEq, TokNeq, TokLt, TokLte, TokGt, TokGte: + return 3 + case TokAnd, TokOr: + return 4 + } + return 99 +} + +func reducePrecedence(operands []Expr, ops []opTok) Expr { + for level := 1; level <= 4; level++ { + operands, ops = reduceLevel(operands, ops, level) + } + if len(operands) != 1 { + // Should not happen; return the first as a graceful fallback. + return operands[0] + } + return operands[0] +} + +func reduceLevel(operands []Expr, ops []opTok, level int) ([]Expr, []opTok) { + newOperands := []Expr{operands[0]} + var newOps []opTok + for i, op := range ops { + if opLevel(op.kind) == level { + left := newOperands[len(newOperands)-1] + right := operands[i+1] + newOperands[len(newOperands)-1] = &BinaryExpr{ + Left: left, Op: op.kind, OpPos: op.pos, Right: right, + } + } else { + newOps = append(newOps, op) + newOperands = append(newOperands, operands[i+1]) + } + } + return newOperands, newOps +} + +// parseTerm parses a prefix-unary-ed primary with method tails. V1 permits +// `- - x` (each term accepts an optional `-`); `!!x` is rejected. 
+func (p *parser) parseTerm() (Expr, error) { + var negs []Token + for p.peek().Kind == TokMinus { + negs = append(negs, p.advance()) + } + var not *Token + if p.peek().Kind == TokBang { + n := p.advance() + // §5.1: `!!x` is a parse error. + if p.peek().Kind == TokBang { + return nil, p.errAt(p.peek().Pos, "stacked '!!' not permitted; write '!(!x)'") + } + not = &n + } + prim, err := p.parsePrimary() + if err != nil { + return nil, err + } + prim, err = p.parseTails(prim) + if err != nil { + return nil, err + } + if not != nil { + prim = &UnaryExpr{Op: TokBang, Operand: prim, OpPos: not.Pos} + } + for i := len(negs) - 1; i >= 0; i-- { + prim = &UnaryExpr{Op: TokMinus, Operand: prim, OpPos: negs[i].Pos} + } + return prim, nil +} + +// parseTails applies .field / .method() / .(expr) chains to `recv`. +func (p *parser) parseTails(recv Expr) (Expr, error) { + for { + t := p.peek() + // Whitespace before the dot rejects the tail (§2.1). + if t.Kind != TokDot || t.PrecededBySpace || t.PrecededByNewline { + return recv, nil + } + dotTok := p.advance() + // After '.', newlines/comments/whitespace are tolerated. + p.skipInlineNewlines() + + next := p.peek() + switch next.Kind { + case TokLParen: + // Map-expression — .(expr) or .(name -> body). Parse the inner + // expression. + p.advance() + p.skipInlineNewlines() + body, err := p.parseExpr() + if err != nil { + return nil, err + } + p.skipInlineNewlines() + if _, err := p.expect(TokRParen, "')'"); err != nil { + return nil, err + } + recv = &MapExpr{Recv: recv, Body: body, TokPos: dotTok.Pos} + case TokIdent: + nameTok := p.advance() + // Method call? Look for `(` directly after (no space). 
+ if p.peek().Kind == TokLParen && !p.peek().PrecededBySpace && !p.peek().PrecededByNewline { + p.advance() + args, named, err := p.parseCallArgs() + if err != nil { + return nil, err + } + recv = &MethodCall{ + Recv: recv, Name: nameTok.Text, NamePos: nameTok.Pos, + Args: args, Named: named, + } + } else { + recv = &FieldAccess{Recv: recv, Seg: PathSegment{Name: nameTok.Text, Pos: nameTok.Pos}} + } + case TokInt: + // Numeric segment — array index / stringy key. + nameTok := p.advance() + recv = &FieldAccess{Recv: recv, Seg: PathSegment{Name: nameTok.Text, Pos: nameTok.Pos}} + case TokFloat: + // The scanner merged `N.M` into a float because it's context-free, + // but in a path-tail context N and M are independent numeric + // segments. Split the float back into two segments. + nameTok := p.advance() + parts := strings.SplitN(nameTok.Text, ".", 2) + recv = &FieldAccess{Recv: recv, Seg: PathSegment{Name: parts[0], Pos: nameTok.Pos}} + // Second segment inherits a synthetic position one column after + // the '.'. + secondPos := nameTok.Pos + secondPos.Column += len(parts[0]) + 1 + secondPos.Offset += len(parts[0]) + 1 + recv = &FieldAccess{Recv: recv, Seg: PathSegment{Name: parts[1], Pos: secondPos}} + case TokString: + nameTok := p.advance() + recv = &FieldAccess{Recv: recv, Seg: PathSegment{Name: nameTok.Text, Quoted: true, Pos: nameTok.Pos}} + default: + return nil, p.errAt(next.Pos, "expected field, method call, or .(expr) after '.', got %s", next.Kind) + } + } +} + +// parseCallArgs reads arguments up to and including the closing ')'. Newlines +// inside argument lists are tolerated. +func (p *parser) parseCallArgs() ([]CallArg, bool, error) { + var args []CallArg + named := false + first := true + for { + p.skipInlineNewlines() + if p.peek().Kind == TokRParen { + p.advance() + return args, named, nil + } + if !first { + if _, err := p.expect(TokComma, "','"); err != nil { + return nil, false, err + } + p.skipInlineNewlines() + } + // Tolerate trailing comma. 
+ if p.peek().Kind == TokRParen { + p.advance() + return args, named, nil + } + arg, err := p.parseCallArg() + if err != nil { + return nil, false, err + } + if arg.Name != "" { + named = true + } + args = append(args, arg) + first = false + } +} + +func (p *parser) parseCallArg() (CallArg, error) { + // Named arg detection: ident ':' — but only if this ident is not + // followed by '.' (which would start a path expression). + t := p.peek() + if t.Kind == TokIdent { + next := p.peekAt(1) + if next.Kind == TokColon { + nameTok := p.advance() + p.advance() // colon + p.skipInlineNewlines() + val, err := p.parseExpr() + if err != nil { + return CallArg{}, err + } + return CallArg{Name: nameTok.Text, Value: val, Pos: nameTok.Pos}, nil + } + } + val, err := p.parseExpr() + if err != nil { + return CallArg{}, err + } + return CallArg{Value: val, Pos: t.Pos}, nil +} + +// parsePrimary parses a primary expression without any tails or prefix ops. +func (p *parser) parsePrimary() (Expr, error) { + t := p.peek() + switch t.Kind { + case TokInt: + p.advance() + n, _ := strconv.ParseInt(t.Text, 10, 64) + return &Literal{Kind: LitInt, Raw: t.Text, Int: n, TokPos: t.Pos}, nil + case TokFloat: + p.advance() + f, _ := strconv.ParseFloat(t.Text, 64) + return &Literal{Kind: LitFloat, Raw: t.Text, Float: f, TokPos: t.Pos}, nil + case TokString: + p.advance() + return &Literal{Kind: LitString, Raw: t.Text, Str: t.Text, TokPos: t.Pos}, nil + case TokRawString: + p.advance() + return &Literal{Kind: LitRawString, Raw: t.Text, Str: t.Text, TokPos: t.Pos}, nil + case TokLParen: + p.advance() + p.skipInlineNewlines() + inner, err := p.parseExpr() + if err != nil { + return nil, err + } + p.skipInlineNewlines() + if _, err := p.expect(TokRParen, "')'"); err != nil { + return nil, err + } + return &ParenExpr{Inner: inner, TokPos: t.Pos}, nil + case TokLBracket: + return p.parseArrayLit() + case TokLBrace: + return p.parseObjectLit() + case TokDollar: + p.advance() + nameTok := p.peek() + if 
nameTok.Kind != TokIdent { + return nil, p.errAt(nameTok.Pos, "expected variable name after '$'") + } + if nameTok.PrecededBySpace || nameTok.PrecededByNewline { + return nil, p.errAt(nameTok.Pos, "unexpected whitespace after '$'") + } + p.advance() + return &VarRef{Name: nameTok.Text, TokPos: t.Pos}, nil + case TokAt: + p.advance() + next := p.peek() + if !next.PrecededBySpace && !next.PrecededByNewline { + if next.Kind == TokIdent { + p.advance() + return &MetaRef{Name: next.Text, TokPos: t.Pos}, nil + } + if next.Kind == TokString { + p.advance() + return &MetaRef{Name: next.Text, Quoted: true, TokPos: t.Pos}, nil + } + } + return &MetaRef{TokPos: t.Pos}, nil + case TokIdent: + return p.parseIdentPrimary() + } + return nil, p.errAt(t.Pos, "unexpected token %s (%q) in expression", t.Kind, t.Text) +} + +func (p *parser) parseIdentPrimary() (Expr, error) { + t := p.peek() + switch t.Text { + case "true": + p.advance() + return &Literal{Kind: LitBool, Raw: "true", Bool: true, TokPos: t.Pos}, nil + case "false": + p.advance() + return &Literal{Kind: LitBool, Raw: "false", Bool: false, TokPos: t.Pos}, nil + case "null": + p.advance() + return &Literal{Kind: LitNull, Raw: "null", TokPos: t.Pos}, nil + case "this": + p.advance() + return &ThisExpr{TokPos: t.Pos}, nil + case "root": + p.advance() + return &RootExpr{TokPos: t.Pos}, nil + case "if": + return p.parseIfExpr() + case "match": + return p.parseMatchExpr() + case "meta": + // Check for meta(<expr>) vs plain bare-ident meta. + if p.peekAt(1).Kind == TokLParen && !p.peekAt(1).PrecededBySpace && !p.peekAt(1).PrecededByNewline { + tok := p.advance() // 'meta' + p.advance() // '(' + p.skipInlineNewlines() + key, err := p.parseExpr() + if err != nil { + return nil, err + } + p.skipInlineNewlines() + if _, err := p.expect(TokRParen, "')'"); err != nil { + return nil, err + } + return &MetaCall{Key: key, TokPos: tok.Pos}, nil + } + } + + // Check for lambda: `<param> -> <body>`. `->` must be on the same line. 
+ if p.peekAt(1).Kind == TokArrow && !p.peekAt(1).PrecededByNewline { + paramTok := p.advance() + arrowTok := p.advance() + // Body must be on the same line as ->. + if p.peek().PrecededByNewline { + return nil, p.errAt(p.peek().Pos, "lambda body must start on the same line as '->'") + } + body, err := p.parseExpr() + if err != nil { + return nil, err + } + return &Lambda{ + Param: paramTok.Text, Discard: paramTok.Text == "_", ParamPos: paramTok.Pos, + Body: body, ArrowPos: arrowTok.Pos, + }, nil + } + // Function call? `(` with no space. + if p.peekAt(1).Kind == TokLParen && !p.peekAt(1).PrecededBySpace && !p.peekAt(1).PrecededByNewline { + nameTok := p.advance() + p.advance() // '(' + args, named, err := p.parseCallArgs() + if err != nil { + return nil, err + } + return &FunctionCall{Name: nameTok.Text, NamePos: nameTok.Pos, Args: args, Named: named}, nil + } + // Plain bare-ident — legacy `foo` = `this.foo` form. + p.advance() + return &Ident{Name: t.Text, TokPos: t.Pos}, nil +} + +func (p *parser) parseArrayLit() (Expr, error) { + open := p.advance() // '[' + arr := &ArrayLit{TokPos: open.Pos} + p.skipInlineNewlines() + if p.peek().Kind == TokRBracket { + p.advance() + return arr, nil + } + for { + e, err := p.parseExpr() + if err != nil { + return nil, err + } + arr.Elems = append(arr.Elems, e) + p.skipInlineNewlines() + if p.peek().Kind == TokComma { + p.advance() + p.skipInlineNewlines() + // trailing comma tolerance + if p.peek().Kind == TokRBracket { + p.advance() + return arr, nil + } + continue + } + break + } + p.skipInlineNewlines() + if _, err := p.expect(TokRBracket, "']'"); err != nil { + return nil, err + } + return arr, nil +} + +func (p *parser) parseObjectLit() (Expr, error) { + open := p.advance() // '{' + obj := &ObjectLit{TokPos: open.Pos} + p.skipInlineNewlines() + if p.peek().Kind == TokRBrace { + p.advance() + return obj, nil + } + for { + entry, err := p.parseObjectEntry() + if err != nil { + return nil, err + } + obj.Entries = 
append(obj.Entries, entry) + p.skipInlineNewlines() + if p.peek().Kind == TokComma { + p.advance() + p.skipInlineNewlines() + if p.peek().Kind == TokRBrace { + p.advance() + return obj, nil + } + continue + } + break + } + p.skipInlineNewlines() + if _, err := p.expect(TokRBrace, "'}'"); err != nil { + return nil, err + } + return obj, nil +} + +func (p *parser) parseObjectEntry() (ObjectEntry, error) { + // Key: quoted string OR any expression. Non-string literal keys are + // rejected (§4.5). We accept the expression as-is and let downstream + // tooling validate type. + keyTok := p.peek() + var key Expr + if keyTok.Kind == TokString { + p.advance() + key = &Literal{Kind: LitString, Raw: keyTok.Text, Str: keyTok.Text, TokPos: keyTok.Pos} + } else { + e, err := p.parseExpr() + if err != nil { + return ObjectEntry{}, err + } + key = e + // Reject non-string literal keys (bare int/float/bool/null + // etc). + if lit, ok := key.(*Literal); ok { + switch lit.Kind { + case LitInt, LitFloat, LitBool, LitNull, LitRawString: + return ObjectEntry{}, &ParseError{ + Pos: lit.TokPos, + Msg: fmt.Sprintf("object keys must be strings, got %s literal", litKindName(lit.Kind)), + } + } + } + } + p.skipInlineNewlines() + if _, err := p.expect(TokColon, "':'"); err != nil { + return ObjectEntry{}, err + } + p.skipInlineNewlines() + val, err := p.parseExpr() + if err != nil { + return ObjectEntry{}, err + } + return ObjectEntry{Key: key, Value: val}, nil +} + +func litKindName(k LiteralKind) string { + switch k { + case LitNull: + return "null" + case LitBool: + return "bool" + case LitInt: + return "int" + case LitFloat: + return "float" + case LitString, LitRawString: + return "string" + } + return "?" 
+} + +func (p *parser) parseIfExpr() (Expr, error) { + tok := p.advance() // 'if' + ex := &IfExpr{TokPos: tok.Pos} + br, err := p.parseIfExprBranch(tok.Pos) + if err != nil { + return nil, err + } + ex.Branches = append(ex.Branches, br) + for p.peek().Kind == TokIdent && p.peek().Text == "else" { + elseTok := p.advance() + if p.peek().Kind == TokIdent && p.peek().Text == "if" { + p.advance() + nb, err := p.parseIfExprBranch(elseTok.Pos) + if err != nil { + return nil, err + } + ex.Branches = append(ex.Branches, nb) + continue + } + if _, err := p.expect(TokLBrace, "'{'"); err != nil { + return nil, err + } + p.skipInlineNewlines() + body, err := p.parseExpr() + if err != nil { + return nil, err + } + p.skipInlineNewlines() + if _, err := p.expect(TokRBrace, "'}'"); err != nil { + return nil, err + } + ex.Else = body + break + } + return ex, nil +} + +func (p *parser) parseIfExprBranch(pos Pos) (IfExprBranch, error) { + cond, err := p.parseExpr() + if err != nil { + return IfExprBranch{}, err + } + if _, err := p.expect(TokLBrace, "'{'"); err != nil { + return IfExprBranch{}, err + } + p.skipInlineNewlines() + body, err := p.parseExpr() + if err != nil { + return IfExprBranch{}, err + } + p.skipInlineNewlines() + if _, err := p.expect(TokRBrace, "'}'"); err != nil { + return IfExprBranch{}, err + } + return IfExprBranch{Cond: cond, Body: body, Pos: pos}, nil +} + +func (p *parser) parseMatchExpr() (Expr, error) { + tok := p.advance() // 'match' + ex := &MatchExpr{TokPos: tok.Pos} + // Optional subject: any expression before '{'. + if p.peek().Kind != TokLBrace { + subj, err := p.parseExpr() + if err != nil { + return nil, err + } + ex.Subject = subj + } + if _, err := p.expect(TokLBrace, "'{'"); err != nil { + return nil, err + } + p.skipInlineNewlines() + for p.peek().Kind != TokRBrace && p.peek().Kind != TokEOF { + c, err := p.parseMatchCase() + if err != nil { + return nil, err + } + ex.Cases = append(ex.Cases, c) + // Case separator: newline or comma. 
+ for p.peek().Kind == TokComma || p.peek().Kind == TokNewline { + p.advance() + } + } + if _, err := p.expect(TokRBrace, "'}'"); err != nil { + return nil, err + } + return ex, nil +} + +func (p *parser) parseMatchCase() (MatchCase, error) { + t := p.peek() + c := MatchCase{Pos: t.Pos} + if t.Kind == TokIdent && t.Text == "_" && p.peekAt(1).Kind == TokFatArrow { + p.advance() + c.Wildcard = true + } else { + pat, err := p.parseExpr() + if err != nil { + return MatchCase{}, err + } + c.Pattern = pat + } + if _, err := p.expect(TokFatArrow, "'=>'"); err != nil { + return MatchCase{}, err + } + p.skipInlineNewlines() + body, err := p.parseExpr() + if err != nil { + return MatchCase{}, err + } + c.Body = body + return c, nil +} diff --git a/internal/bloblang2/migrator/v1ast/parser_test.go b/internal/bloblang2/migrator/v1ast/parser_test.go new file mode 100644 index 000000000..768996c3f --- /dev/null +++ b/internal/bloblang2/migrator/v1ast/parser_test.go @@ -0,0 +1,613 @@ +package v1ast + +import ( + "fmt" + "os" + "path/filepath" + "reflect" + "sort" + "strings" + "testing" + + "gopkg.in/yaml.v3" +) + +// +// YAML test-corpus loader +// + +type corpusCase struct { + Name string `yaml:"name"` + Skip string `yaml:"skip"` + Mapping string `yaml:"mapping"` + CompileError string `yaml:"compile_error"` + Cases []subCase `yaml:"cases"` + // Files is intentionally untyped — it may be either a map or a slice + // across the corpus, and we don't use it in these tests. + Files any `yaml:"files"` +} + +type subCase struct { + Name string `yaml:"name"` +} + +type corpusFileDoc struct { + Tests []corpusCase `yaml:"tests"` +} + +// corpusRoot returns the directory holding the V1 spec YAML tests. 
+func corpusRoot(t *testing.T) string { + t.Helper() + wd, err := os.Getwd() + if err != nil { + t.Fatalf("getwd: %v", err) + } + return filepath.Join(wd, "..", "v1spec", "tests") +} + +func listCorpusFiles(t *testing.T, root string) []string { + t.Helper() + var files []string + err := filepath.Walk(root, func(p string, info os.FileInfo, walkErr error) error { + if walkErr != nil { + return walkErr + } + if info.IsDir() { + return nil + } + if strings.HasSuffix(p, ".yaml") { + files = append(files, p) + } + return nil + }) + if err != nil { + t.Fatalf("walking corpus: %v", err) + } + sort.Strings(files) + return files +} + +// loadCorpusFile reads and decodes a single YAML file. +func loadCorpusFile(path string) ([]corpusCase, error) { + data, err := os.ReadFile(path) + if err != nil { + return nil, err + } + var doc corpusFileDoc + if err := yaml.Unmarshal(data, &doc); err != nil { + return nil, err + } + return doc.Tests, nil +} + +// TestParseCorpus: parse every non-skipped mapping. Report per-file stats +// and assert overall success rate >=95%. +func TestParseCorpus(t *testing.T) { + root := corpusRoot(t) + files := listCorpusFiles(t, root) + if len(files) == 0 { + t.Fatalf("no corpus files found under %s", root) + } + + var total, passed, skipped, expectedFail int + type failure struct { + file, name string + err error + src string + } + var failures []failure + + for _, file := range files { + cases, err := loadCorpusFile(file) + if err != nil { + t.Errorf("%s: load error: %v", file, err) + continue + } + for _, c := range cases { + if c.Skip != "" { + skipped++ + continue + } + if strings.TrimSpace(c.Mapping) == "" { + continue + } + total++ + _, perr := Parse(c.Mapping) + if perr == nil { + passed++ + continue + } + // A `compile_error` test case expects semantic failure, not + // lexical failure, so we still consider parse success the + // desired outcome in most cases. Bucket these separately so + // the user sees the breakdown. 
+ if c.CompileError != "" { + expectedFail++ + continue + } + failures = append(failures, failure{file: file, name: c.Name, err: perr, src: c.Mapping}) + } + } + + // Per-file failure counts. + perFile := map[string]int{} + for _, f := range failures { + perFile[f.file]++ + } + keys := make([]string, 0, len(perFile)) + for k := range perFile { + keys = append(keys, k) + } + sort.Slice(keys, func(i, j int) bool { return perFile[keys[i]] > perFile[keys[j]] }) + + rate := float64(passed) / float64(total) + t.Logf("corpus: %d total, %d passed (%.1f%%), %d failed, %d compile_error (ignored), %d skipped", + total, passed, rate*100, len(failures), expectedFail, skipped) + for _, k := range keys { + rel, _ := filepath.Rel(root, k) + t.Logf(" %d failures: %s", perFile[k], rel) + } + + // Emit up to 15 failure samples for debugging. + if len(failures) > 0 { + show := len(failures) + if show > 15 { + show = 15 + } + for i := 0; i < show; i++ { + f := failures[i] + rel, _ := filepath.Rel(root, f.file) + t.Logf("-- %s :: %s --\n%s\nerror: %v", rel, f.name, f.src, f.err) + } + } + + if rate < 0.95 { + t.Fatalf("corpus parse rate %.1f%% < 95%%", rate*100) + } +} + +// TestRoundTrip: parse corpus, print, re-parse, compare ASTs. 
+func TestRoundTrip(t *testing.T) { + root := corpusRoot(t) + files := listCorpusFiles(t, root) + + var total, passed int + type failure struct { + file, name string + err string + src string + printed string + } + var failures []failure + + for _, file := range files { + cases, err := loadCorpusFile(file) + if err != nil { + continue + } + for _, c := range cases { + if c.Skip != "" { + continue + } + if strings.TrimSpace(c.Mapping) == "" { + continue + } + first, perr := Parse(c.Mapping) + if perr != nil { + continue + } + total++ + printed := PrintString(first) + second, perr := Parse(printed) + if perr != nil { + failures = append(failures, failure{ + file: file, name: c.Name, + err: "re-parse: " + perr.Error(), src: c.Mapping, printed: printed, + }) + continue + } + if diff, equal := astEqual(first, second); !equal { + failures = append(failures, failure{ + file: file, name: c.Name, err: "AST diff: " + diff, + src: c.Mapping, printed: printed, + }) + continue + } + passed++ + } + } + + rate := 1.0 + if total > 0 { + rate = float64(passed) / float64(total) + } + t.Logf("roundtrip: %d/%d passed (%.1f%%)", passed, total, rate*100) + if len(failures) > 0 { + show := len(failures) + if show > 15 { + show = 15 + } + for i := 0; i < show; i++ { + f := failures[i] + rel, _ := filepath.Rel(root, f.file) + t.Logf("-- %s :: %s --\nORIG: %s\nPRINT: %s\n%s", rel, f.name, f.src, f.printed, f.err) + } + } + if rate < 1.0 { + t.Fatalf("roundtrip rate %.1f%% < 100%%", rate*100) + } +} + +// astEqual compares two programs structurally, ignoring Pos values. +// Returns a diff description on mismatch. +func astEqual(a, b *Program) (string, bool) { + if diff, ok := nodeEqual(a, b, ""); !ok { + return diff, false + } + return "", true +} + +// nodeEqual is a best-effort reflection-based comparator that skips fields +// whose names contain "Pos". 
+func nodeEqual(a, b any, path string) (string, bool) { + va, vb := reflect.ValueOf(a), reflect.ValueOf(b) + if !va.IsValid() && !vb.IsValid() { + return "", true + } + if va.IsValid() != vb.IsValid() { + return fmt.Sprintf("%s: validity mismatch", path), false + } + if va.Kind() != vb.Kind() { + return fmt.Sprintf("%s: kind %s vs %s", path, va.Kind(), vb.Kind()), false + } + switch va.Kind() { + case reflect.Ptr, reflect.Interface: + if va.IsNil() && vb.IsNil() { + return "", true + } + if va.IsNil() != vb.IsNil() { + return fmt.Sprintf("%s: nil mismatch", path), false + } + return nodeEqual(va.Elem().Interface(), vb.Elem().Interface(), path) + case reflect.Slice: + if va.Len() != vb.Len() { + return fmt.Sprintf("%s: slice len %d vs %d", path, va.Len(), vb.Len()), false + } + for i := 0; i < va.Len(); i++ { + if diff, ok := nodeEqual(va.Index(i).Interface(), vb.Index(i).Interface(), + fmt.Sprintf("%s[%d]", path, i)); !ok { + return diff, false + } + } + return "", true + case reflect.Struct: + t := va.Type() + for i := 0; i < va.NumField(); i++ { + f := t.Field(i) + if !f.IsExported() { + continue + } + // Skip positional metadata. 
+ if strings.Contains(f.Name, "Pos") { + continue + } + if diff, ok := nodeEqual(va.Field(i).Interface(), vb.Field(i).Interface(), + path+"."+f.Name); !ok { + return diff, false + } + } + return "", true + case reflect.Map: + if va.Len() != vb.Len() { + return fmt.Sprintf("%s: map len %d vs %d", path, va.Len(), vb.Len()), false + } + keys := va.MapKeys() + for _, k := range keys { + bv := vb.MapIndex(k) + if !bv.IsValid() { + return fmt.Sprintf("%s: missing key %v", path, k.Interface()), false + } + if diff, ok := nodeEqual(va.MapIndex(k).Interface(), bv.Interface(), + fmt.Sprintf("%s[%v]", path, k.Interface())); !ok { + return diff, false + } + } + return "", true + default: + if !reflect.DeepEqual(a, b) { + return fmt.Sprintf("%s: %#v vs %#v", path, a, b), false + } + return "", true + } +} + +// TestUnit: small hand-crafted assertions for specific grammar quirks. +func TestUnit(t *testing.T) { + tests := []struct { + name string + src string + want func(t *testing.T, p *Program) + }{ + { + name: "root assignment with literal", + src: `root.foo = "bar"`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if a.Target.Kind != TargetRoot { + t.Fatalf("target kind = %v, want TargetRoot", a.Target.Kind) + } + if len(a.Target.Path) != 1 || a.Target.Path[0].Name != "foo" { + t.Fatalf("target path = %+v", a.Target.Path) + } + }, + }, + { + name: "this target preserved, not rewritten", + src: `this.foo = "bar"`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if a.Target.Kind != TargetThis { + t.Fatalf("target kind = %v, want TargetThis", a.Target.Kind) + } + }, + }, + { + name: "bare-identifier target preserved", + src: `foo.bar = 1`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if a.Target.Kind != TargetBare { + t.Fatalf("target kind = %v, want TargetBare", a.Target.Kind) + } + }, + }, + { + name: "let binding", + src: `let x = 5`, + want: func(t *testing.T, p *Program) { + if _, ok := p.Stmts[0].(*LetStmt); 
!ok { + t.Fatalf("expected LetStmt, got %T", p.Stmts[0]) + } + }, + }, + { + name: "meta bare key assignment", + src: `meta foo = 1`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if a.Target.Kind != TargetMeta { + t.Fatalf("target kind = %v, want TargetMeta", a.Target.Kind) + } + if len(a.Target.Path) != 1 || a.Target.Path[0].Quoted { + t.Fatalf("path = %+v", a.Target.Path) + } + }, + }, + { + name: "meta quoted key assignment", + src: `meta "foo bar" = 1`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if !a.Target.Path[0].Quoted || a.Target.Path[0].Name != "foo bar" { + t.Fatalf("path = %+v", a.Target.Path) + } + }, + }, + { + name: "meta whole replacement", + src: `meta = {"x": 1}`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if len(a.Target.Path) != 0 { + t.Fatalf("expected empty path for whole meta assign, got %+v", a.Target.Path) + } + }, + }, + { + name: "variable reference", + src: `root = $x`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if _, ok := a.Value.(*VarRef); !ok { + t.Fatalf("value = %T, want *VarRef", a.Value) + } + }, + }, + { + name: "bare @ reference", + src: `root = @`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + mr, ok := a.Value.(*MetaRef) + if !ok || mr.Name != "" { + t.Fatalf("value = %+v", a.Value) + } + }, + }, + { + name: "meta(expr) call", + src: `root = meta("x")`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if _, ok := a.Value.(*MetaCall); !ok { + t.Fatalf("value = %T, want *MetaCall", a.Value) + } + }, + }, + { + name: "arithmetic precedence: high-precedence |", + src: `root = a + b | c`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + bin, ok := a.Value.(*BinaryExpr) + if !ok || bin.Op != TokPlus { + t.Fatalf("expected top-level '+', got %T %+v", a.Value, a.Value) + } + if _, ok := bin.Right.(*BinaryExpr); !ok { + t.Fatalf("right should be '|' BinaryExpr, got %T", 
bin.Right) + } + }, + }, + { + name: "method chain", + src: `root = this.foo.uppercase()`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if _, ok := a.Value.(*MethodCall); !ok { + t.Fatalf("value = %T, want *MethodCall", a.Value) + } + }, + }, + { + name: "lambda in method arg", + src: `root = items.map_each(x -> x.value)`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + mc, ok := a.Value.(*MethodCall) + if !ok { + t.Fatalf("expected method call, got %T", a.Value) + } + if len(mc.Args) != 1 { + t.Fatalf("args = %+v", mc.Args) + } + if _, ok := mc.Args[0].Value.(*Lambda); !ok { + t.Fatalf("arg = %T, want *Lambda", mc.Args[0].Value) + } + }, + }, + { + name: ".(expr) map expression", + src: `root = this.thing.(article | comment)`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if _, ok := a.Value.(*MapExpr); !ok { + t.Fatalf("value = %T, want *MapExpr", a.Value) + } + }, + }, + { + name: "if expression", + src: `root = if this.x > 0 { "big" } else { "small" }`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if _, ok := a.Value.(*IfExpr); !ok { + t.Fatalf("value = %T, want *IfExpr", a.Value) + } + }, + }, + { + name: "match expression", + src: "root = match this {\n" + + " \"a\" => 1\n" + + " _ => 2\n" + + "}", + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + m, ok := a.Value.(*MatchExpr) + if !ok { + t.Fatalf("value = %T, want *MatchExpr", a.Value) + } + if len(m.Cases) != 2 { + t.Fatalf("cases = %+v", m.Cases) + } + }, + }, + { + name: "array literal", + src: `root = [1, 2, 3]`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if _, ok := a.Value.(*ArrayLit); !ok { + t.Fatalf("value = %T", a.Value) + } + }, + }, + { + name: "object literal", + src: `root = {"a": 1, "b": 2}`, + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + if _, ok := a.Value.(*ObjectLit); !ok { + t.Fatalf("value = %T", a.Value) + } + }, + }, + { + name: "map 
decl", + src: "map greet {\n" + + " root = \"hi\"\n" + + "}", + want: func(t *testing.T, p *Program) { + if len(p.Maps) != 1 || p.Maps[0].Name != "greet" { + t.Fatalf("maps = %+v", p.Maps) + } + }, + }, + { + name: "import", + src: `import "./lib.blobl"`, + want: func(t *testing.T, p *Program) { + if len(p.Imports) != 1 { + t.Fatalf("imports = %+v", p.Imports) + } + }, + }, + { + name: "triple-quoted string preserved raw", + src: "root = \"\"\"a\\nb\"\"\"", + want: func(t *testing.T, p *Program) { + a := firstAssign(t, p) + lit, ok := a.Value.(*Literal) + if !ok || lit.Kind != LitRawString { + t.Fatalf("value = %+v", a.Value) + } + if lit.Str != `a\nb` { + t.Fatalf("raw body = %q, want %q", lit.Str, `a\nb`) + } + }, + }, + } + + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + prog, err := Parse(tc.src) + if err != nil { + t.Fatalf("parse error: %v", err) + } + tc.want(t, prog) + }) + } +} + +func firstAssign(t *testing.T, p *Program) *Assignment { + t.Helper() + if len(p.Stmts) == 0 { + t.Fatalf("no statements parsed") + } + a, ok := p.Stmts[0].(*Assignment) + if !ok { + t.Fatalf("stmt[0] = %T, want *Assignment", p.Stmts[0]) + } + return a +} + +// TestParseErrorCases covers specific quirks that MUST be parse errors. 
+func TestParseErrorCases(t *testing.T) { + cases := []struct { + name string + src string + }{ + {"double-not", `root = !!x`}, + {"this[0] bracket indexing", `root = this[0]`}, + {"equals without spaces", `root.a=1`}, + {"equals no space right", `root.a =1`}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + _, err := Parse(tc.src) + if err == nil { + t.Fatalf("expected parse error for %q", tc.src) + } + }) + } +} diff --git a/internal/bloblang2/migrator/v1ast/printer.go b/internal/bloblang2/migrator/v1ast/printer.go new file mode 100644 index 000000000..9a715d00f --- /dev/null +++ b/internal/bloblang2/migrator/v1ast/printer.go @@ -0,0 +1,442 @@ +package v1ast + +import ( + "fmt" + "io" + "strconv" + "strings" +) + +// Print writes the V1 source form of a *Program to w. The printed text +// round-trips via Parse back to a structurally equivalent AST (ignoring +// Pos values). +func Print(w io.Writer, p *Program) error { + pr := &printer{w: w} + for i, st := range p.Stmts { + if i > 0 { + pr.writeln("") + } + pr.printStmt(st, 0) + } + return pr.err +} + +// PrintString is a convenience wrapper returning the printed output. 
+func PrintString(p *Program) string { + var sb strings.Builder + _ = Print(&sb, p) + return sb.String() +} + +type printer struct { + w io.Writer + err error +} + +func (pr *printer) write(s string) { + if pr.err != nil { + return + } + _, pr.err = io.WriteString(pr.w, s) +} + +func (pr *printer) writeln(s string) { + pr.write(s) + pr.write("\n") +} + +func (pr *printer) indent(level int) string { + return strings.Repeat(" ", level) +} + +func (pr *printer) printStmt(st Stmt, indent int) { + ind := pr.indent(indent) + pr.printLeadingTrivia(st.Trivia().Leading, ind) + switch s := st.(type) { + case *Assignment: + pr.write(ind) + pr.printTarget(s.Target) + pr.write(" = ") + pr.printExpr(s.Value) + case *LetStmt: + pr.write(ind) + pr.write("let ") + if s.NameQuoted { + pr.write(strconv.Quote(s.Name)) + } else { + pr.write(s.Name) + } + pr.write(" = ") + pr.printExpr(s.Value) + case *MapDecl: + pr.write(ind) + pr.write("map ") + pr.write(s.Name) + pr.write(" {\n") + for _, inner := range s.Body { + pr.printStmt(inner, indent+1) + pr.write("\n") + } + pr.write(ind) + pr.write("}") + case *ImportStmt: + pr.write(ind) + pr.write("import ") + pr.printExpr(s.Path) + case *FromStmt: + pr.write(ind) + pr.write("from ") + pr.printExpr(s.Path) + case *IfStmt: + pr.write(ind) + for i, br := range s.Branches { + if i == 0 { + pr.write("if ") + } else { + pr.write(" else if ") + } + pr.printExpr(br.Cond) + pr.write(" {\n") + for _, inner := range br.Body { + pr.printStmt(inner, indent+1) + pr.write("\n") + } + pr.write(ind) + pr.write("}") + } + if s.Else != nil { + pr.write(" else {\n") + for _, inner := range s.Else { + pr.printStmt(inner, indent+1) + pr.write("\n") + } + pr.write(ind) + pr.write("}") + } + case *BareExprStmt: + pr.write(ind) + pr.printExpr(s.Expr) + default: + pr.err = fmt.Errorf("printer: unknown statement type %T", st) + } + for _, t := range st.Trivia().Trailing { + if t.Kind == TriviaComment { + pr.write(" #") + pr.write(t.Text) + } + } +} + +// 
printLeadingTrivia emits blank lines and standalone comments preceding a +// statement. Blank-line trivia becomes an empty line; comments render as +// `#` followed by their text on their own lines. +func (pr *printer) printLeadingTrivia(tri []Trivia, ind string) { + for _, t := range tri { + switch t.Kind { + case TriviaBlankLine: + pr.write("\n") + case TriviaComment: + pr.write(ind) + pr.write("#") + pr.write(t.Text) + pr.write("\n") + } + } +} + +func (pr *printer) printTarget(t AssignTarget) { + switch t.Kind { + case TargetRoot: + pr.write("root") + pr.printPathSegments(t.Path) + case TargetThis: + pr.write("this") + pr.printPathSegments(t.Path) + case TargetBare: + for i, seg := range t.Path { + if i > 0 { + pr.write(".") + } + pr.printSegment(seg) + } + case TargetMeta: + pr.write("meta") + if len(t.Path) == 1 { + pr.write(" ") + if t.Path[0].Quoted { + pr.write(strconv.Quote(t.Path[0].Name)) + } else { + pr.write(t.Path[0].Name) + } + } + } +} + +func (pr *printer) printPathSegments(segs []PathSegment) { + for _, s := range segs { + pr.write(".") + pr.printSegment(s) + } +} + +func (pr *printer) printSegment(s PathSegment) { + if s.Quoted { + pr.write(strconv.Quote(s.Name)) + } else { + pr.write(s.Name) + } +} + +func (pr *printer) printExpr(e Expr) { + switch n := e.(type) { + case *Literal: + pr.printLiteral(n) + case *Ident: + pr.write(n.Name) + case *ThisExpr: + pr.write("this") + case *RootExpr: + pr.write("root") + case *VarRef: + pr.write("$") + pr.write(n.Name) + case *MetaRef: + pr.write("@") + if n.Name != "" { + if n.Quoted { + pr.write(strconv.Quote(n.Name)) + } else { + pr.write(n.Name) + } + } + case *MetaCall: + pr.write("meta(") + pr.printExpr(n.Key) + pr.write(")") + case *ParenExpr: + pr.write("(") + pr.printExpr(n.Inner) + pr.write(")") + case *UnaryExpr: + switch n.Op { + case TokBang: + pr.write("!") + case TokMinus: + pr.write("-") + } + pr.printExpr(n.Operand) + case *BinaryExpr: + pr.printExpr(n.Left) + pr.write(" ") + pr.write(opSymbol(n.Op)) + 
pr.write(" ") + pr.printExpr(n.Right) + case *FieldAccess: + pr.printExpr(n.Recv) + pr.write(".") + pr.printSegment(n.Seg) + case *MethodCall: + pr.printExpr(n.Recv) + pr.write(".") + pr.write(n.Name) + pr.write("(") + pr.printArgs(n.Args) + pr.write(")") + case *FunctionCall: + pr.write(n.Name) + pr.write("(") + pr.printArgs(n.Args) + pr.write(")") + case *MapExpr: + pr.printExpr(n.Recv) + pr.write(".(") + pr.printExpr(n.Body) + pr.write(")") + case *Lambda: + if n.Discard { + pr.write("_") + } else { + pr.write(n.Param) + } + pr.write(" -> ") + pr.printExpr(n.Body) + case *ArrayLit: + pr.write("[") + for i, el := range n.Elems { + if i > 0 { + pr.write(", ") + } + pr.printExpr(el) + } + pr.write("]") + case *ObjectLit: + pr.write("{") + for i, ent := range n.Entries { + if i > 0 { + pr.write(", ") + } + pr.printObjectKey(ent.Key) + pr.write(": ") + pr.printExpr(ent.Value) + } + pr.write("}") + case *IfExpr: + for i, br := range n.Branches { + if i == 0 { + pr.write("if ") + } else { + pr.write(" else if ") + } + pr.printExpr(br.Cond) + pr.write(" { ") + pr.printExpr(br.Body) + pr.write(" }") + } + if n.Else != nil { + pr.write(" else { ") + pr.printExpr(n.Else) + pr.write(" }") + } + case *MatchExpr: + pr.write("match") + if n.Subject != nil { + pr.write(" ") + pr.printExpr(n.Subject) + } + pr.write(" {\n") + for _, c := range n.Cases { + pr.write(" ") + if c.Wildcard { + pr.write("_") + } else { + pr.printExpr(c.Pattern) + } + pr.write(" => ") + pr.printExpr(c.Body) + pr.write("\n") + } + pr.write("}") + default: + pr.err = fmt.Errorf("printer: unknown expression type %T", e) + } +} + +func (pr *printer) printLiteral(l *Literal) { + switch l.Kind { + case LitNull: + pr.write("null") + case LitBool: + if l.Bool { + pr.write("true") + } else { + pr.write("false") + } + case LitInt: + // Prefer the original raw text if available so oversize literals (e.g. + // uint64 max) round-trip without int64 overflow. 
+ if l.Raw != "" { + pr.write(l.Raw) + } else { + pr.write(strconv.FormatInt(l.Int, 10)) + } + case LitFloat: + if l.Raw != "" { + pr.write(l.Raw) + } else { + s := strconv.FormatFloat(l.Float, 'f', -1, 64) + if !strings.ContainsAny(s, ".eE") { + s += ".0" + } + pr.write(s) + } + case LitString: + pr.write(strconv.Quote(l.Str)) + case LitRawString: + // Raw strings may contain characters that strconv.Quote can't + // reproduce byte-for-byte (literal newlines). Preserve via """...""" + // form. + pr.write(`"""`) + pr.write(l.Str) + pr.write(`"""`) + } +} + +func (pr *printer) printObjectKey(k Expr) { + // Quoted-string keys are emitted as string literals. For expressions + // that can appear as an object key without ambiguity (anything that + // does not start with a quoted string literal), emit verbatim. Only + // add parens when needed. + if lit, ok := k.(*Literal); ok && (lit.Kind == LitString || lit.Kind == LitRawString) { + pr.write(strconv.Quote(lit.Str)) + return + } + if _, ok := k.(*ParenExpr); ok { + pr.printExpr(k) + return + } + if keyNeedsParens(k) { + pr.write("(") + pr.printExpr(k) + pr.write(")") + return + } + pr.printExpr(k) +} + +// keyNeedsParens reports whether an object-literal key expression must be +// wrapped in parens. V1's object-key parser is `OneOf(QuotedString, +// queryParser)`: if the key starts with a quoted string literal token, it +// will be committed as the key and any following `.method()` / `+ x` tails +// would fail. So we only need parens when the expression's head token is a +// quoted string literal that has something after it (tails or a binary +// operator). +func keyNeedsParens(e Expr) bool { + // Atoms without tails are safe. + switch e.(type) { + case *Literal, *Ident, *ThisExpr, *RootExpr, *VarRef, *MetaRef, + *MetaCall, *FunctionCall, *ArrayLit, *ObjectLit, + *IfExpr, *MatchExpr, *Lambda, *ParenExpr: + return false + } + // Anything with a receiver that begins with a quoted string literal + // needs parens. 
+ return startsWithStringLit(e) +} + +func startsWithStringLit(e Expr) bool { + switch n := e.(type) { + case *Literal: + return n.Kind == LitString || n.Kind == LitRawString + case *FieldAccess: + return startsWithStringLit(n.Recv) + case *MethodCall: + return startsWithStringLit(n.Recv) + case *MapExpr: + return startsWithStringLit(n.Recv) + case *BinaryExpr: + return startsWithStringLit(n.Left) + case *UnaryExpr: + return false + } + return false +} + +func (pr *printer) printArgs(args []CallArg) { + for i, a := range args { + if i > 0 { + pr.write(", ") + } + if a.Name != "" { + pr.write(a.Name) + pr.write(": ") + } + pr.printExpr(a.Value) + } +} + +func opSymbol(k TokenKind) string { + if n, ok := tokenNames[k]; ok { + return n + } + return "?" +} diff --git a/internal/bloblang2/migrator/v1ast/scanner.go b/internal/bloblang2/migrator/v1ast/scanner.go new file mode 100644 index 000000000..9c57b9faf --- /dev/null +++ b/internal/bloblang2/migrator/v1ast/scanner.go @@ -0,0 +1,481 @@ +// Package v1ast implements a dedicated parser for Bloblang V1 that produces an +// inspectable, position-preserving AST. It is intended for migration tooling +// (V1 -> V2) and is deliberately separate from internal/bloblang/parser, which +// produces closures rather than AST nodes. +package v1ast + +import ( + "fmt" + "strconv" + "strings" + "unicode/utf8" +) + +// Pos is a source position. +type Pos struct { + Line int // 1-based + Column int // 1-based + Offset int // byte offset from start of input +} + +// String renders Pos as "line:col". +func (p Pos) String() string { return fmt.Sprintf("%d:%d", p.Line, p.Column) } + +// TokenKind identifies the type of a lexical token. +type TokenKind int + +// Token kinds. +const ( + TokEOF TokenKind = iota + TokNewline + TokComment // # to end-of-line (text excludes the # and trailing newline) + TokIdent // lenient: [A-Za-z0-9_]+ + TokInt // digits (no sign) + TokFloat // digits "." 
digits + TokString // double-quoted (already unescaped) + TokRawString // triple-quoted (raw) + TokDollar // $ + TokAt // @ + TokLParen // ( + TokRParen // ) + TokLBracket // [ + TokRBracket // ] + TokLBrace // { + TokRBrace // } + TokDot // . + TokComma // , + TokColon // : + TokBang // ! + TokAssign // = + TokEq // == + TokNeq // != + TokLt // < + TokLte // <= + TokGt // > + TokGte // >= + TokAnd // && + TokOr // || + TokPipe // | (coalesce) + TokPlus // + + TokMinus // - + TokStar // * + TokSlash // / + TokPercent // % + TokArrow // -> + TokFatArrow // => + TokIllegal +) + +var tokenNames = map[TokenKind]string{ + TokEOF: "EOF", TokNewline: "NEWLINE", TokComment: "COMMENT", TokIdent: "IDENT", + TokInt: "INT", TokFloat: "FLOAT", TokString: "STRING", TokRawString: "RAW_STRING", + TokDollar: "$", TokAt: "@", TokLParen: "(", TokRParen: ")", + TokLBracket: "[", TokRBracket: "]", TokLBrace: "{", TokRBrace: "}", + TokDot: ".", TokComma: ",", TokColon: ":", TokBang: "!", + TokAssign: "=", TokEq: "==", TokNeq: "!=", + TokLt: "<", TokLte: "<=", TokGt: ">", TokGte: ">=", + TokAnd: "&&", TokOr: "||", TokPipe: "|", + TokPlus: "+", TokMinus: "-", TokStar: "*", TokSlash: "/", TokPercent: "%", + TokArrow: "->", TokFatArrow: "=>", TokIllegal: "ILLEGAL", +} + +// String returns a human-readable token name. +func (t TokenKind) String() string { + if s, ok := tokenNames[t]; ok { + return s + } + return fmt.Sprintf("TOK(%d)", int(t)) +} + +// Token is one lexical unit. +type Token struct { + Kind TokenKind + Text string // raw text for idents/numbers, unescaped content for strings + Pos Pos + + // PrecededBySpace indicates that the token was preceded by inline whitespace + // (space or tab) but not by a newline. Used for assignment = whitespace rule + // and for rejecting `a .b` where a space precedes a dot. + PrecededBySpace bool + // PrecededByNewline indicates the token is on a different line than the + // previous token. 
(Comments do not count as newlines, but an actual newline + // rune does.) + PrecededByNewline bool +} + +// Scanner produces tokens from source input. +type Scanner struct { + src []rune + offset int // index into src + line int + col int + + // next-token hints + precSpace, precNL bool +} + +// NewScanner constructs a scanner over the given source. +func NewScanner(src string) *Scanner { + return &Scanner{src: []rune(src), line: 1, col: 1} +} + +// pos returns the current source position. +func (s *Scanner) pos() Pos { + return Pos{Line: s.line, Column: s.col, Offset: s.byteOffset()} +} + +func (s *Scanner) byteOffset() int { + // Convert rune offset to approximate byte offset. Cheap & good enough. + b := 0 + for i := 0; i < s.offset && i < len(s.src); i++ { + b += utf8.RuneLen(s.src[i]) + } + return b +} + +func (s *Scanner) peek() rune { + if s.offset >= len(s.src) { + return 0 + } + return s.src[s.offset] +} + +func (s *Scanner) peekAt(i int) rune { + if s.offset+i >= len(s.src) { + return 0 + } + return s.src[s.offset+i] +} + +func (s *Scanner) advance() rune { + if s.offset >= len(s.src) { + return 0 + } + r := s.src[s.offset] + s.offset++ + if r == '\n' { + s.line++ + s.col = 1 + } else { + s.col++ + } + return r +} + +// All tokenises the entire input and returns every token (including the +// terminating EOF). +func (s *Scanner) All() ([]Token, error) { + var out []Token + for { + tok, err := s.Next() + if err != nil { + return out, err + } + out = append(out, tok) + if tok.Kind == TokEOF { + return out, nil + } + } +} + +// skipInlineWhitespace consumes spaces, tabs, and \r (treating \r\n as a +// newline at \n). It stops at \n, # (comment), or a non-whitespace rune. +// Comments are NOT consumed here — Next() emits them as TokComment tokens +// so the parser can collect them as trivia. +// Returns whether any inline space was seen. 
+func (s *Scanner) skipInlineWhitespace() bool { + sawSpace := false + for s.offset < len(s.src) { + r := s.src[s.offset] + switch r { + case ' ', '\t': + s.advance() + sawSpace = true + case '\r': + // Treat \r as whitespace unless followed by \n (handled by Next as newline). + if s.peekAt(1) == '\n' { + return sawSpace + } + s.advance() + sawSpace = true + default: + return sawSpace + } + } + return sawSpace +} + +// Next returns the next token. +func (s *Scanner) Next() (Token, error) { + // Fold any inline whitespace before this token into the pending + // "preceded by space" flag; the flags are reset once a token consumes them. + sawSpace := s.skipInlineWhitespace() + if sawSpace { + s.precSpace = true + } + + // Line comment: emit as a TokComment trivia token, preserving the text + // verbatim (without the leading '#' or the trailing newline). + if s.offset < len(s.src) && s.src[s.offset] == '#' { + p := s.pos() + s.advance() // skip '#' + start := s.offset + for s.offset < len(s.src) && s.src[s.offset] != '\n' { + s.advance() + } + tok := Token{ + Kind: TokComment, Text: string(s.src[start:s.offset]), Pos: p, + PrecededBySpace: s.precSpace, PrecededByNewline: s.precNL, + } + // Comments do not reset precSpace/precNL; whatever the next token + // sees is whatever the comment was preceded by. Newlines bubble on. + return tok, nil + } + + // Handle newlines as tokens. + if s.offset < len(s.src) { + r := s.src[s.offset] + if r == '\n' || (r == '\r' && s.peekAt(1) == '\n') { + p := s.pos() + if r == '\r' { + s.advance() + } + s.advance() + tok := Token{Kind: TokNewline, Text: "\n", Pos: p, PrecededBySpace: s.precSpace, PrecededByNewline: s.precNL} + s.precSpace = false + s.precNL = true + return tok, nil + } + } + + if s.offset >= len(s.src) { + tok := Token{Kind: TokEOF, Pos: s.pos(), PrecededBySpace: s.precSpace, PrecededByNewline: s.precNL} + s.precSpace = false + s.precNL = false + return tok, nil + } + + p := s.pos() + precSpace, precNL := s.precSpace, s.precNL + s.precSpace = false + s.precNL = false + + r := s.src[s.offset] + + // Strings. 
+ if r == '"' { + if s.peekAt(1) == '"' && s.peekAt(2) == '"' { + return s.scanRawString(p, precSpace, precNL) + } + return s.scanString(p, precSpace, precNL) + } + + // Numbers: must start with a digit. (Unary minus is produced separately.) + if r >= '0' && r <= '9' { + return s.scanNumber(p, precSpace, precNL) + } + + // Identifiers (lenient): letters, digits, underscore, but must not start + // with a digit because digits alone are numbers. The scanner emits any + // [A-Za-z_] run (with digits allowed from position 2). Path segment parser + // allows leading-digit segments via TokInt reinterpretation. + if isIdentStart(r) { + return s.scanIdent(p, precSpace, precNL) + } + + // Operators and punctuation. + switch r { + case '(': + s.advance() + return Token{Kind: TokLParen, Text: "(", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case ')': + s.advance() + return Token{Kind: TokRParen, Text: ")", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '[': + s.advance() + return Token{Kind: TokLBracket, Text: "[", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case ']': + s.advance() + return Token{Kind: TokRBracket, Text: "]", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '{': + s.advance() + return Token{Kind: TokLBrace, Text: "{", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '}': + s.advance() + return Token{Kind: TokRBrace, Text: "}", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '.': + s.advance() + return Token{Kind: TokDot, Text: ".", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case ',': + s.advance() + return Token{Kind: TokComma, Text: ",", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case ':': + s.advance() + return Token{Kind: TokColon, Text: ":", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '$': + s.advance() + return 
Token{Kind: TokDollar, Text: "$", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '@': + s.advance() + return Token{Kind: TokAt, Text: "@", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '+': + s.advance() + return Token{Kind: TokPlus, Text: "+", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '*': + s.advance() + return Token{Kind: TokStar, Text: "*", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '/': + s.advance() + return Token{Kind: TokSlash, Text: "/", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '%': + s.advance() + return Token{Kind: TokPercent, Text: "%", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '-': + s.advance() + if s.peek() == '>' { + s.advance() + return Token{Kind: TokArrow, Text: "->", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokMinus, Text: "-", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '=': + s.advance() + if s.peek() == '=' { + s.advance() + return Token{Kind: TokEq, Text: "==", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + if s.peek() == '>' { + s.advance() + return Token{Kind: TokFatArrow, Text: "=>", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokAssign, Text: "=", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '!': + s.advance() + if s.peek() == '=' { + s.advance() + return Token{Kind: TokNeq, Text: "!=", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokBang, Text: "!", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '<': + s.advance() + if s.peek() == '=' { + s.advance() + return Token{Kind: TokLte, Text: "<=", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokLt, 
Text: "<", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '>': + s.advance() + if s.peek() == '=' { + s.advance() + return Token{Kind: TokGte, Text: ">=", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokGt, Text: ">", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + case '&': + s.advance() + if s.peek() == '&' { + s.advance() + return Token{Kind: TokAnd, Text: "&&", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokIllegal, Text: "&", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, + fmt.Errorf("%s: unexpected '&'", p) + case '|': + s.advance() + if s.peek() == '|' { + s.advance() + return Token{Kind: TokOr, Text: "||", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokPipe, Text: "|", Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + + return Token{Kind: TokIllegal, Text: string(r), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, + fmt.Errorf("%s: unexpected character %q", p, r) +} + +func isIdentStart(r rune) bool { + return r == '_' || (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z') +} + +func isIdentPart(r rune) bool { + return isIdentStart(r) || (r >= '0' && r <= '9') +} + +func (s *Scanner) scanIdent(p Pos, precSpace, precNL bool) (Token, error) { + start := s.offset + for s.offset < len(s.src) && isIdentPart(s.src[s.offset]) { + s.advance() + } + return Token{Kind: TokIdent, Text: string(s.src[start:s.offset]), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil +} + +func (s *Scanner) scanNumber(p Pos, precSpace, precNL bool) (Token, error) { + start := s.offset + for s.offset < len(s.src) && s.src[s.offset] >= '0' && s.src[s.offset] <= '9' { + s.advance() + } + // Optional fractional part: must have digits on both sides. + if s.peek() == '.' 
&& s.peekAt(1) >= '0' && s.peekAt(1) <= '9' { + s.advance() // dot + for s.offset < len(s.src) && s.src[s.offset] >= '0' && s.src[s.offset] <= '9' { + s.advance() + } + return Token{Kind: TokFloat, Text: string(s.src[start:s.offset]), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + return Token{Kind: TokInt, Text: string(s.src[start:s.offset]), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil +} + +func (s *Scanner) scanString(p Pos, precSpace, precNL bool) (Token, error) { + start := s.offset + s.advance() // opening " + var buf strings.Builder + buf.WriteByte('"') + for s.offset < len(s.src) { + r := s.src[s.offset] + if r == '\\' { + buf.WriteRune(r) + s.advance() + if s.offset < len(s.src) { + buf.WriteRune(s.src[s.offset]) + s.advance() + } + continue + } + if r == '"' { + buf.WriteRune(r) + s.advance() + // Use strconv.Unquote for standard Go-style unescape. + raw := buf.String() + val, err := strconv.Unquote(raw) + if err != nil { + return Token{Kind: TokIllegal, Text: raw, Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, + fmt.Errorf("%s: invalid string literal %q: %w", p, raw, err) + } + return Token{Kind: TokString, Text: val, Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + if r == '\n' { + return Token{Kind: TokIllegal, Text: string(s.src[start:s.offset]), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, + fmt.Errorf("%s: unterminated string literal", p) + } + buf.WriteRune(r) + s.advance() + } + return Token{Kind: TokIllegal, Text: string(s.src[start:s.offset]), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, + fmt.Errorf("%s: unterminated string literal", p) +} + +func (s *Scanner) scanRawString(p Pos, precSpace, precNL bool) (Token, error) { + // Already know src[offset..+2] == `"""`. 
+ s.advance() + s.advance() + s.advance() + var buf strings.Builder + for s.offset < len(s.src) { + if s.src[s.offset] == '"' && s.peekAt(1) == '"' && s.peekAt(2) == '"' { + s.advance() + s.advance() + s.advance() + return Token{Kind: TokRawString, Text: buf.String(), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, nil + } + buf.WriteRune(s.src[s.offset]) + s.advance() + } + return Token{Kind: TokIllegal, Text: buf.String(), Pos: p, PrecededBySpace: precSpace, PrecededByNewline: precNL}, + fmt.Errorf("%s: unterminated triple-quoted string", p) +} diff --git a/internal/bloblang2/migrator/v1ast/trivia_test.go b/internal/bloblang2/migrator/v1ast/trivia_test.go new file mode 100644 index 000000000..50026739d --- /dev/null +++ b/internal/bloblang2/migrator/v1ast/trivia_test.go @@ -0,0 +1,145 @@ +package v1ast_test + +import ( + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +func TestScannerEmitsCommentTokens(t *testing.T) { + src := "root.a = 1 # trail\n# standalone\nroot.b = 2\n" + toks, err := v1ast.NewScanner(src).All() + if err != nil { + t.Fatalf("scan error: %v", err) + } + var comments []string + for _, tok := range toks { + if tok.Kind == v1ast.TokComment { + comments = append(comments, tok.Text) + } + } + want := []string{" trail", " standalone"} + if len(comments) != len(want) { + t.Fatalf("expected %d comments, got %d: %v", len(want), len(comments), comments) + } + for i, c := range comments { + if c != want[i] { + t.Errorf("comment[%d] = %q, want %q", i, c, want[i]) + } + } +} + +func TestParserAttachesLeadingTrivia(t *testing.T) { + src := `# file header +# second line + +# after blank line +root.x = 1 +` + prog, err := v1ast.Parse(src) + if err != nil { + t.Fatalf("parse: %v", err) + } + if len(prog.Stmts) != 1 { + t.Fatalf("want 1 stmt, got %d", len(prog.Stmts)) + } + tri := prog.Stmts[0].Trivia() + var kinds []string + for _, t := range tri.Leading { + switch t.Kind { + case 
v1ast.TriviaComment: + kinds = append(kinds, "comment:"+strings.TrimSpace(t.Text)) + case v1ast.TriviaBlankLine: + kinds = append(kinds, "blank") + } + } + want := []string{"comment:file header", "comment:second line", "blank", "comment:after blank line"} + if len(kinds) != len(want) { + t.Fatalf("leading trivia = %v, want %v", kinds, want) + } + for i := range kinds { + if kinds[i] != want[i] { + t.Errorf("kinds[%d] = %s, want %s", i, kinds[i], want[i]) + } + } +} + +func TestParserAttachesTrailingComment(t *testing.T) { + src := "root.x = 1 # why\n" + prog, err := v1ast.Parse(src) + if err != nil { + t.Fatalf("parse: %v", err) + } + if len(prog.Stmts) != 1 { + t.Fatalf("want 1 stmt, got %d", len(prog.Stmts)) + } + tri := prog.Stmts[0].Trivia() + if len(tri.Trailing) != 1 { + t.Fatalf("want 1 trailing, got %d", len(tri.Trailing)) + } + if tri.Trailing[0].Kind != v1ast.TriviaComment || strings.TrimSpace(tri.Trailing[0].Text) != "why" { + t.Errorf("trailing = %+v", tri.Trailing[0]) + } +} + +func TestParserCollectsTriviaInsideMapBody(t *testing.T) { + src := `map process { + # inside map + root.x = this.x + + # after blank + root.y = this.y # trail +} +` + prog, err := v1ast.Parse(src) + if err != nil { + t.Fatalf("parse: %v", err) + } + if len(prog.Maps) != 1 { + t.Fatalf("want 1 map, got %d", len(prog.Maps)) + } + body := prog.Maps[0].Body + if len(body) != 2 { + t.Fatalf("want 2 body stmts, got %d", len(body)) + } + firstLeading := body[0].Trivia().Leading + if len(firstLeading) != 1 || firstLeading[0].Kind != v1ast.TriviaComment { + t.Errorf("first stmt leading = %+v", firstLeading) + } + secondLeading := body[1].Trivia().Leading + if len(secondLeading) != 2 { + t.Fatalf("second stmt leading = %+v (want blank + comment)", secondLeading) + } + if secondLeading[0].Kind != v1ast.TriviaBlankLine { + t.Errorf("want blank line first, got %+v", secondLeading[0]) + } + if secondLeading[1].Kind != v1ast.TriviaComment { + t.Errorf("want comment second, got %+v", 
secondLeading[1]) + } + secondTrailing := body[1].Trivia().Trailing + if len(secondTrailing) != 1 || strings.TrimSpace(secondTrailing[0].Text) != "trail" { + t.Errorf("second stmt trailing = %+v", secondTrailing) + } +} + +func TestParserCollapsesMultipleBlankLines(t *testing.T) { + src := "root.a = 1\n\n\n\nroot.b = 2\n" + prog, err := v1ast.Parse(src) + if err != nil { + t.Fatalf("parse: %v", err) + } + if len(prog.Stmts) != 2 { + t.Fatalf("want 2 stmts, got %d", len(prog.Stmts)) + } + leading := prog.Stmts[1].Trivia().Leading + var blanks int + for _, t := range leading { + if t.Kind == v1ast.TriviaBlankLine { + blanks++ + } + } + if blanks != 1 { + t.Errorf("want 1 collapsed blank-line trivia, got %d (full leading: %+v)", blanks, leading) + } +} From 8b4742bd13b4c8260fb7eac3d7352dce49c8889c Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 5 May 2026 11:45:57 +0100 Subject: [PATCH 12/20] bloblang(v2): Add V1 -> V2 mapping translator Adds internal/bloblang2/migrator/translator/, the core V1 -> V2 translation pipeline. The translator consumes V1 AST produced by v1ast, walks expressions and statements applying targeted rewrite rules (methods, imports, control flow, lambdas, mapping mode), and emits V2 source via the syntax printer while preserving trivia. The package layers its rules so behaviour-equivalent rewrites are applied directly and known V1 -> V2 semantic divergences are flagged as SemanticChange entries on the change report rather than silently rewritten. A corpus regression test plus per-rule, property, and spec-coverage tests pin behaviour. 
--- .../bloblang2/migrator/translator/change.go | 400 ++++++ .../migrator/translator/change_test.go | 110 ++ .../bloblang2/migrator/translator/context.go | 80 ++ .../migrator/translator/corpus_test.go | 472 ++++++++ .../migrator/translator/expressions.go | 1075 +++++++++++++++++ .../migrator/translator/fold_test.go | 98 ++ .../bloblang2/migrator/translator/imports.go | 152 +++ .../migrator/translator/imports_test.go | 374 ++++++ .../bloblang2/migrator/translator/methods.go | 947 +++++++++++++++ .../migrator/translator/methods_audit_test.go | 257 ++++ .../bloblang2/migrator/translator/migrate.go | 303 +++++ .../migrator/translator/migrate_test.go | 110 ++ .../migrator/translator/property_test.go | 127 ++ .../migrator/translator/rules_test.go | 556 +++++++++ .../migrator/translator/speccoverage_test.go | 215 ++++ .../migrator/translator/statements.go | 488 ++++++++ .../migrator/translator/translate.go | 317 +++++ .../bloblang2/migrator/translator/trivia.go | 70 ++ .../migrator/translator/trivia_test.go | 119 ++ 19 files changed, 6270 insertions(+) create mode 100644 internal/bloblang2/migrator/translator/change.go create mode 100644 internal/bloblang2/migrator/translator/change_test.go create mode 100644 internal/bloblang2/migrator/translator/context.go create mode 100644 internal/bloblang2/migrator/translator/corpus_test.go create mode 100644 internal/bloblang2/migrator/translator/expressions.go create mode 100644 internal/bloblang2/migrator/translator/fold_test.go create mode 100644 internal/bloblang2/migrator/translator/imports.go create mode 100644 internal/bloblang2/migrator/translator/imports_test.go create mode 100644 internal/bloblang2/migrator/translator/methods.go create mode 100644 internal/bloblang2/migrator/translator/methods_audit_test.go create mode 100644 internal/bloblang2/migrator/translator/migrate.go create mode 100644 internal/bloblang2/migrator/translator/migrate_test.go create mode 100644 internal/bloblang2/migrator/translator/property_test.go 
create mode 100644 internal/bloblang2/migrator/translator/rules_test.go create mode 100644 internal/bloblang2/migrator/translator/speccoverage_test.go create mode 100644 internal/bloblang2/migrator/translator/statements.go create mode 100644 internal/bloblang2/migrator/translator/translate.go create mode 100644 internal/bloblang2/migrator/translator/trivia.go create mode 100644 internal/bloblang2/migrator/translator/trivia_test.go diff --git a/internal/bloblang2/migrator/translator/change.go b/internal/bloblang2/migrator/translator/change.go new file mode 100644 index 000000000..2bf29ca3f --- /dev/null +++ b/internal/bloblang2/migrator/translator/change.go @@ -0,0 +1,400 @@ +// Package translator converts Bloblang V1 mappings to Bloblang V2. +// +// The public entry point is Migrate. Given V1 source text and Options, it +// returns a Report containing the V2 source, a list of semantic Change +// records describing any behavioural divergences, and a Coverage summary. +// An error is returned only when Coverage.Ratio falls below Options.MinCoverage. +// +// V2 is an intentional redesign that fixes ambiguities in V1. Where V1 and V2 +// differ semantically, the translator by default adopts V2 semantics and +// records a Change describing the shift. It is the caller's responsibility to +// audit Changes before relying on the translated mapping. +package translator + +import "fmt" + +// Severity classifies how much the user should care about a Change. +type Severity int + +const ( + // SeverityInfo marks a benign rewrite — the V1 and V2 forms are + // equivalent, but the V1 form was non-canonical (e.g. a bare identifier) + // or idiomatic V2 differs from idiomatic V1. + SeverityInfo Severity = iota + + // SeverityWarning means the V1 and V2 forms may diverge on some inputs, + // and the caller should audit the translated mapping. 
+ SeverityWarning + + // SeverityError means the translator could not produce a V2 form that + // preserves V1 semantics at all, and the emitted mapping almost + // certainly behaves differently. The affected span may also have been + // elided. + SeverityError +) + +// String satisfies fmt.Stringer for ergonomic test output. +func (s Severity) String() string { + switch s { + case SeverityInfo: + return "info" + case SeverityWarning: + return "warning" + case SeverityError: + return "error" + } + return fmt.Sprintf("severity(%d)", s) +} + +// Category groups Changes by the kind of translation decision. +type Category int + +const ( + // CategoryIdiomRewrite flags that the V1 form was rewritten to an + // idiomatic V2 form with identical semantics. Always SeverityInfo. + CategoryIdiomRewrite Category = iota + + // CategorySemanticChange flags that the translator deliberately adopted + // V2 semantics where V1 and V2 diverge. The caller should audit. + CategorySemanticChange + + // CategoryUnsupported flags a V1 construct with no V2 equivalent. The + // emitted mapping contains a "# MIGRATION: " comment at the + // affected site and does not translate the construct. + CategoryUnsupported + + // CategoryUncertain flags that the translator couldn't determine the V1 + // behaviour confidently (e.g. ambiguous precedence, context-dependent + // rebinding). The emitted form is best-effort. + CategoryUncertain +) + +// String satisfies fmt.Stringer. +func (c Category) String() string { + switch c { + case CategoryIdiomRewrite: + return "idiom-rewrite" + case CategorySemanticChange: + return "semantic-change" + case CategoryUnsupported: + return "unsupported" + case CategoryUncertain: + return "uncertain" + } + return fmt.Sprintf("category(%d)", c) +} + +// RuleID is a stable identifier for a translation rule. Each rule emits Changes +// tagged with its RuleID. RuleID values survive spec renumbering: a renamed +// §14 quirk still maps to the same RuleID. 
+// +// Add new rules by appending here. Never reuse values. +type RuleID int + +// RuleID values. Trailing-line comments describe the rule. See the +// bloblang_v1_spec.md §14 quirk anchors in parentheses. +const ( + // RuleUnknown is the zero value; only appears when a Change was built + // without setting a rule. + RuleUnknown RuleID = iota + + // Naming & shape. + RuleRootToOutput // root -> output + RuleThisToInput // this -> input (read position) + RuleThisTargetToOutput // this as write target -> output (§14#72) + RuleBareIdentToInput // bare ident `foo` -> `input.foo` (§14#1) + RuleBarePathToOutput // bare-path target `foo.bar = v` -> `output.foo.bar = v` (§14#2) + + // Metadata rules. + RuleMetaTargetToOutputMeta // `meta foo = v` -> `output@.foo = v` + RuleMetaReadToInputMeta // `meta("k")` or `@k` -> `input@.k` or `input@[k]` + + // Operator rules. + RuleCoalescePrecedence // `a + b | c` parens preserved (§14#4) + RuleAndOrSameLevel // V1 &&/|| coerce non-bool operands (§14#48); V2 requires bool + RuleBoolNumberEquality // `true == 1` / `1 == true` asymmetry (§14#38) + RuleModuloFloatTruncation // `%` silent float->int64 truncation (§14#39) + RuleIntDivReturnsFloat // `/` on ints returns float64 (§14#5) + + // Sentinel and error-model rules. + RuleOrCatchesErrors // V1 `.or()` catches errors; V2 `.or()` doesn't (§12.2) + + // Control-flow rules. + RuleIfNoElseNothing // `if cond { x }` no-else produces nothing sentinel (§14#44) + RuleMatchSubjectRebinds // match arms rebind `this` to subject (§8.4) + + // Path and indexing rules. + RuleNoBracketIndexing // `this[0]` not valid; use `.index(0)` (§14#10) + + // String rules. + RuleStringLengthBytes // `.length()` on string returns byte count (§14#40) + + // Method and function existence/rename rules. + RuleMethodDoesNotExist // e.g. map_values, collect, chunk, char — no V2/V1 equivalent + RuleNowReturnsString // `now()` returns a string in V1 (§14#57) + + // Map and import rules. 
+ RuleMapDeclTranslation // `map foo { body }` -> V2 `map foo { body }` + RuleImportStatement // `import "path"` -> V2 equivalent + RuleFromStatement // `from "path"` whole-mapping include (§10.5) + + // RuleUnsupportedConstruct is the catch-all when no more specific rule + // applies. + RuleUnsupportedConstruct + + // RuleEmittedInvalidV2 flags that the translator's emitted V2 text did + // not parse under syntax.Parse. This is either a genuine translator bug + // (when V1 input was valid) or an echo of a V1 compile error the V2 + // parser also rejects (e.g. chained `<`, missing imports, duplicate + // namespaces). Callers that want to detect real bugs can filter on this + // rule; the report is still returned with the best-effort V2 text. + RuleEmittedInvalidV2 + + // RuleBlockScopedLet flags a `let` declaration inside an if/else branch + // body. V1 scopes variables at the mapping level so declarations leak + // out; V2 scopes them per block. If the variable is referenced outside + // the branch, the V2 output will fail to compile. + RuleBlockScopedLet +) + +// String satisfies fmt.Stringer. 
+func (r RuleID) String() string { + switch r { + case RuleUnknown: + return "unknown" + case RuleRootToOutput: + return "root-to-output" + case RuleThisToInput: + return "this-to-input" + case RuleThisTargetToOutput: + return "this-target-to-output" + case RuleBareIdentToInput: + return "bare-ident-to-input" + case RuleBarePathToOutput: + return "bare-path-to-output" + case RuleMetaTargetToOutputMeta: + return "meta-target-to-output-meta" + case RuleMetaReadToInputMeta: + return "meta-read-to-input-meta" + case RuleCoalescePrecedence: + return "coalesce-precedence" + case RuleAndOrSameLevel: + return "and-or-same-level" + case RuleBoolNumberEquality: + return "bool-number-equality" + case RuleModuloFloatTruncation: + return "modulo-float-truncation" + case RuleIntDivReturnsFloat: + return "int-div-returns-float" + case RuleOrCatchesErrors: + return "or-catches-errors" + case RuleIfNoElseNothing: + return "if-no-else-nothing" + case RuleMatchSubjectRebinds: + return "match-subject-rebinds" + case RuleNoBracketIndexing: + return "no-bracket-indexing" + case RuleStringLengthBytes: + return "string-length-bytes" + case RuleMethodDoesNotExist: + return "method-does-not-exist" + case RuleNowReturnsString: + return "now-returns-string" + case RuleMapDeclTranslation: + return "map-decl-translation" + case RuleImportStatement: + return "import-statement" + case RuleFromStatement: + return "from-statement" + case RuleUnsupportedConstruct: + return "unsupported-construct" + case RuleEmittedInvalidV2: + return "emitted-invalid-v2" + case RuleBlockScopedLet: + return "block-scoped-let" + } + return fmt.Sprintf("rule(%d)", r) +} + +// Change records one translation decision worth surfacing to the caller. +type Change struct { + Line, Column int // start of the affected V1 span + EndLine int // end line (may equal Line) + EndColumn int // end column + Severity Severity + Category Category + RuleID RuleID + SpecRef string // e.g. 
"§14#48"; current spec anchor for docs + Original string // V1 snippet (for citation) + Translated string // V2 snippet emitted; empty if dropped + Explanation string // one-line human-readable +} + +// Report is the result of a successful Migrate call. +type Report struct { + V2Mapping string + // V2Files is the set of imported files translated from V1 to V2. Keys + // are the paths used by the V1 source's import statements. Empty when + // Options.Files was empty. + V2Files map[string]string + Changes []Change + Coverage Coverage +} + +// Coverage summarises the translator's progress over the V1 input. +type Coverage struct { + Total int // total V1 AST nodes weighed + Translated int // translated exactly (Exact) + Rewritten int // translated with a SemanticChange + Unsupported int // dropped / replaced with a MIGRATION comment + Ratio float64 // (Translated*1.0 + Rewritten*0.9) / Total +} + +// Options controls Migrate. +type Options struct { + // MinCoverage is the minimum Coverage.Ratio required before Migrate + // returns successfully. If the computed ratio is below this value, + // Migrate returns (nil, *CoverageError). Default 0.75. + MinCoverage float64 + + // Verbose emits Info-severity Changes. Without this, only Warning and + // Error Changes are recorded, keeping the report focused on items that + // need human attention. + Verbose bool + + // TreatWarningsAsErrors causes Warning-severity Changes to be promoted + // to Error; useful for CI. + TreatWarningsAsErrors bool + + // Files is a virtual filesystem for `import` resolution. Keys are + // treated as canonical identifiers for files: an entry keyed + // "helpers.blobl" satisfies any import statement whose path string + // resolves (after FileResolver, if set) to "helpers.blobl". When + // FileResolver is nil the keys are matched directly against the + // path strings written in the V1 source (path-as-canonical-key). 
+ Files map[string]string + + // FileResolver, when set, lazily resolves V1 imports during Migrate. + // The migrator walks the closure of imports starting from the main + // source and any transitively imported files, calling the resolver + // for each import path it encounters. + // + // parentKey is the canonical key of the file the import appears in + // (empty for imports in the main V1 source). importPath is the path + // string as written in the import statement. The returned + // canonicalKey identifies the resolved file for de-duplication and + // Report.V2Files emission — two import statements that resolve to + // the same canonicalKey are translated once. + // + // Pre-populated Files take precedence: if Files contains importPath + // as a key, the resolver is not consulted and importPath itself is + // treated as the canonical key. + // + // Returning ok=false records an Unsupported RuleImportStatement at + // the import site and continues with the rest of the migration. + FileResolver FileResolver + + // V2ImportPathRewriter, when set, rewrites V1 import path strings to + // their V2 equivalents in the emitted V2 source. Default: identity. + // Useful for callers that emit V2-translated files at sibling paths + // (e.g. "helpers.blobl" -> "helpers.v5.blobl"). Operates on the + // verbatim path string from the V1 source so locality is preserved + // (relative imports stay relative). + V2ImportPathRewriter V2ImportPathRewriter + + // Mode selects how the V1 mapping's implicit root is treated. + // + // V1 ships two Bloblang-executing processors with different root + // defaults: + // - `mapping` — `root` starts as the *input* document; a mapping + // that makes no assignments passes the input + // through unchanged. + // - `mutation` — `root` starts as `{}`; a mapping that makes no + // assignments emits an empty object. + // + // V2's `output` always starts as `{}`, matching V1's `mutation`. 
To + // preserve V1 `mapping` semantics the translator prepends an + // `output = input` statement to the V2 output when Mode is + // ModeMapping. When Mode is ModeMutation (or unset — the safe + // default), no prelude is inserted. + Mode Mode + + // CustomMethodRules is keyed by V1 method name. Hooks registered + // here run *before* the built-in method-rewrite switch (custom + // rules win on name collision per the migrator's design). Returning + // handled=false falls through to built-in rules. + CustomMethodRules map[string]MethodRuleHook + + // CustomFunctionRules is the function-call analogue of + // CustomMethodRules. + CustomFunctionRules map[string]FunctionRuleHook +} + +// FileResolver lazily resolves a V1 import path during Migrate. See +// Options.FileResolver for semantics. +type FileResolver func(parentKey, importPath string) (canonicalKey, content string, ok bool) + +// V2ImportPathRewriter rewrites V1 import path strings to their V2 +// equivalents. See Options.V2ImportPathRewriter. +type V2ImportPathRewriter func(v1Path string) string + +// Mode classifies the V1 execution context the translated mapping will +// replace. See Options.Mode. +type Mode int + +const ( + // ModeMutation is the default: V1 `mutation` processor semantics. + // `root` starts empty; no prelude injected by the translator. + ModeMutation Mode = iota + + // ModeMapping selects V1 `mapping` processor semantics. `root` + // starts as the input document; the translator prepends + // `output = input` so the V2 output behaves the same way. + ModeMapping +) + +// String satisfies fmt.Stringer for ergonomic test/log output. +func (m Mode) String() string { + switch m { + case ModeMutation: + return "mutation" + case ModeMapping: + return "mapping" + } + return fmt.Sprintf("mode(%d)", m) +} + +// DefaultOptions returns reasonable defaults. 
+func DefaultOptions() Options { + return Options{ + MinCoverage: 0.75, + } +} + +// CoverageError is returned by Migrate when Coverage.Ratio < Options.MinCoverage. +type CoverageError struct { + Coverage Coverage + Min float64 + Report *Report // the would-be report; inspect for context even on error +} + +// Error satisfies the error interface. +func (e *CoverageError) Error() string { + return fmt.Sprintf( + "migrator: translation coverage %.2f is below threshold %.2f (translated=%d rewritten=%d unsupported=%d total=%d)", + e.Coverage.Ratio, e.Min, + e.Coverage.Translated, e.Coverage.Rewritten, e.Coverage.Unsupported, e.Coverage.Total, + ) +} + +// computeRatio applies the weighted formula: +// +// (Translated*1.0 + Rewritten*0.9) / Total +// +// Returns 1.0 when Total is zero (nothing to translate is 100% successful). +func computeRatio(c Coverage) float64 { + if c.Total == 0 { + return 1.0 + } + return (float64(c.Translated)*1.0 + float64(c.Rewritten)*0.9) / float64(c.Total) +} diff --git a/internal/bloblang2/migrator/translator/change_test.go b/internal/bloblang2/migrator/translator/change_test.go new file mode 100644 index 000000000..3338553f5 --- /dev/null +++ b/internal/bloblang2/migrator/translator/change_test.go @@ -0,0 +1,110 @@ +package translator + +import ( + "strings" + "testing" +) + +func TestCoverageRatio(t *testing.T) { + for _, c := range []struct { + name string + coverage Coverage + wantRatio float64 + }{ + {"zero is 1.0", Coverage{}, 1.0}, + {"all exact", Coverage{Total: 10, Translated: 10}, 1.0}, + {"all rewritten", Coverage{Total: 10, Rewritten: 10}, 0.9}, + {"mixed", Coverage{Total: 10, Translated: 5, Rewritten: 5}, 0.95}, + {"unsupported hurts", Coverage{Total: 10, Translated: 5, Unsupported: 5}, 0.5}, + {"below threshold", Coverage{Total: 4, Translated: 1, Rewritten: 1, Unsupported: 2}, (1.0 + 0.9) / 4}, + } { + t.Run(c.name, func(t *testing.T) { + got := computeRatio(c.coverage) + if got != c.wantRatio { + t.Fatalf("computeRatio got %v 
want %v", got, c.wantRatio) + } + }) + } +} + +func TestRecorderEmitsByVerbosity(t *testing.T) { + r := newRecorder(Options{Verbose: false}) + r.Exact() + r.Rewritten(Change{Severity: SeverityInfo, RuleID: RuleRootToOutput}) + r.Rewritten(Change{Severity: SeverityWarning, RuleID: RuleOrCatchesErrors}) + rep := r.finalise("output = input") + + if got, want := rep.Coverage.Total, 3; got != want { + t.Fatalf("Total got %d want %d", got, want) + } + if got, want := rep.Coverage.Translated, 1; got != want { + t.Fatalf("Translated got %d want %d", got, want) + } + if got, want := rep.Coverage.Rewritten, 2; got != want { + t.Fatalf("Rewritten got %d want %d", got, want) + } + // Verbose=false should suppress Info but keep Warning. + if got, want := len(rep.Changes), 1; got != want { + t.Fatalf("Changes got %d want %d (non-verbose should drop Info)", got, want) + } + if rep.Changes[0].Severity != SeverityWarning { + t.Fatalf("expected Warning, got %v", rep.Changes[0].Severity) + } +} + +func TestRecorderVerboseIncludesInfo(t *testing.T) { + r := newRecorder(Options{Verbose: true}) + r.Rewritten(Change{Severity: SeverityInfo, RuleID: RuleRootToOutput}) + r.Rewritten(Change{Severity: SeverityWarning, RuleID: RuleOrCatchesErrors}) + rep := r.finalise("") + if got, want := len(rep.Changes), 2; got != want { + t.Fatalf("verbose should record both; got %d want %d", got, want) + } +} + +func TestRecorderWarningsAsErrors(t *testing.T) { + r := newRecorder(Options{Verbose: true, TreatWarningsAsErrors: true}) + r.Rewritten(Change{Severity: SeverityWarning, RuleID: RuleOrCatchesErrors}) + rep := r.finalise("") + if got := rep.Changes[0].Severity; got != SeverityError { + t.Fatalf("expected promotion to Error, got %v", got) + } +} + +func TestRecorderUnsupportedFixesFields(t *testing.T) { + r := newRecorder(Options{Verbose: true}) + r.Unsupported(Change{Severity: SeverityInfo, Category: CategoryIdiomRewrite, RuleID: RuleMethodDoesNotExist}) + rep := r.finalise("") + if 
rep.Changes[0].Severity != SeverityError { + t.Fatalf("Unsupported should force Error") + } + if rep.Changes[0].Category != CategoryUnsupported { + t.Fatalf("Unsupported should force CategoryUnsupported") + } + if rep.Coverage.Unsupported != 1 { + t.Fatalf("Unsupported counter not incremented") + } +} + +func TestCoverageError(t *testing.T) { + e := &CoverageError{ + Coverage: Coverage{Total: 10, Translated: 5, Unsupported: 5, Ratio: 0.5}, + Min: 0.75, + } + if !strings.Contains(e.Error(), "below threshold") { + t.Fatalf("error message missing context: %q", e.Error()) + } +} + +func TestRuleIDStringIsUnique(t *testing.T) { + // Guard against accidentally giving two RuleIDs the same String. + // This also documents the expected naming pattern. + seen := map[string]RuleID{} + for id := RuleUnknown; id <= RuleBlockScopedLet; id++ { + s := id.String() + if existing, ok := seen[s]; ok { + t.Errorf("RuleID %d and %d both stringify to %q", existing, id, s) + } + seen[s] = id + } +} diff --git a/internal/bloblang2/migrator/translator/context.go b/internal/bloblang2/migrator/translator/context.go new file mode 100644 index 000000000..0f286f597 --- /dev/null +++ b/internal/bloblang2/migrator/translator/context.go @@ -0,0 +1,80 @@ +package translator + +// recorder accumulates Changes and per-node classifications during a single +// Migrate call. Translation rules call its Exact, Rewritten, Unsupported, and +// Note methods as they visit V1 AST nodes; Coverage is computed from the counts. +// +// The recorder is not safe for concurrent use; Migrate is single-threaded by +// design. +type recorder struct { + opts Options + changes []Change + coverage Coverage + // warningsAsErrors mirrors opts.TreatWarningsAsErrors; a separate field + // keeps the hot path branch-free. + warningsAsErrors bool +} + +// newRecorder constructs a recorder from options. 
+func newRecorder(opts Options) *recorder { + return &recorder{ + opts: opts, + warningsAsErrors: opts.TreatWarningsAsErrors, + } +} + +// Exact increments the Translated counter: a V1 node mapped 1:1 to V2 with no +// semantic divergence. No Change is recorded. +func (r *recorder) Exact() { + r.coverage.Total++ + r.coverage.Translated++ +} + +// Rewritten increments the Rewritten counter and records a Change. Use this +// when the translator chose V2 semantics that differ from V1, or when the +// rewrite is a pure idiom (Info-level) that the caller may want to note. +func (r *recorder) Rewritten(ch Change) { + r.coverage.Total++ + r.coverage.Rewritten++ + r.emit(ch) +} + +// Unsupported increments the Unsupported counter and records an Error Change. +// Use when the V1 construct has no V2 equivalent and the translator emits a +// MIGRATION comment in place. +func (r *recorder) Unsupported(ch Change) { + r.coverage.Total++ + r.coverage.Unsupported++ + ch.Severity = SeverityError + ch.Category = CategoryUnsupported + r.emit(ch) +} + +// Note records a Change without touching the coverage counters. Use for +// cross-cutting diagnostics that aren't tied to a single V1 AST node (e.g. +// the post-translation sanity-check Parse). +func (r *recorder) Note(ch Change) { + r.emit(ch) +} + +// emit writes the Change to the report, respecting verbose and warnings-as- +// errors options. +func (r *recorder) emit(ch Change) { + if r.warningsAsErrors && ch.Severity == SeverityWarning { + ch.Severity = SeverityError + } + if ch.Severity == SeverityInfo && !r.opts.Verbose { + return + } + r.changes = append(r.changes, ch) +} + +// finalise computes the final Coverage ratio and returns the Report. 
+func (r *recorder) finalise(v2 string) *Report { + r.coverage.Ratio = computeRatio(r.coverage) + return &Report{ + V2Mapping: v2, + Changes: r.changes, + Coverage: r.coverage, + } +} diff --git a/internal/bloblang2/migrator/translator/corpus_test.go b/internal/bloblang2/migrator/translator/corpus_test.go new file mode 100644 index 000000000..d9fa603b4 --- /dev/null +++ b/internal/bloblang2/migrator/translator/corpus_test.go @@ -0,0 +1,472 @@ +package translator_test + +import ( + "fmt" + "os" + "path/filepath" + "sort" + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/spectest" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" + + // Register impl/pure — many V1 corpus mappings use methods that live + // there (ts_parse, ts_format, .abs(), typed numeric coercers, etc.). + // The V2 interpreter uses the same global registry, so this side- + // effect import applies to both V1 translation probing and V2 execution. + _ "github.com/redpanda-data/benthos/v4/internal/impl/pure" +) + +// TestCorpusRegression is Layer 4 of the migrator testing framework: take +// every non-skipped V1 mapping in ../v1spec/tests/ (the V2-translated corpus), +// translate it to V2, execute the V2 output against V2's interpreter with the +// test's input, and compare to the test's expected output. +// +// This test is about measurement, not correctness. It reports per-file +// progress and flags regressions but expects non-zero failure counts during +// development — rule implementation is incremental. +// +// Failures surface as t.Log calls; the test fails only when the overall pass +// rate drops below the minPassRate floor defined at the end of this +// function. 
+func TestCorpusRegression(t *testing.T) { + if testing.Short() { + t.Skip("corpus regression is long-running; skipped in -short mode") + } + + root := corpusRoot(t) + files := discoverFiles(t, root) + if len(files) == 0 { + t.Fatalf("no V1 corpus files found under %s", root) + } + + stats := &runStats{} + interp := &bloblang2.Interp{} + + for _, path := range files { + tf, err := spectest.LoadFile(path) + if err != nil { + stats.loadError++ + continue + } + rel, _ := filepath.Rel(root, path) + for i := range tf.Tests { + tc := &tf.Tests[i] + // Multi-case tests have a shared mapping; skip them for now — + // they need the cases-expansion treatment that spectest.RunT + // does, and aren't the main attraction for translation work. + if len(tc.Cases) > 0 { + stats.skippedMultiCase++ + continue + } + // Skip V1-untranslatable tests (skip field not modelled in + // spectest schema, but our own v1spec adds it; we consult the + // raw YAML separately). + if isSkipped(path, tc.Name) { + stats.skippedV2Only++ + continue + } + outcome := runOne(interp, tc, tf.Files) + stats.record(outcome) + switch outcome.kind { + case outcomeUnexpected: + t.Logf("[DELTA] %s/%s: %s", rel, tc.Name, outcome.detail) + case outcomeV2CompileFail: + t.Logf("[V2COMP] %s/%s: %s", rel, tc.Name, outcome.detail) + case outcomeTranslateFail: + t.Logf("[TRANS] %s/%s: %s", rel, tc.Name, outcome.detail) + } + } + } + + stats.report(t) + + // Corpus pass-rate floor. Counted passes include OK (exact output + // match) and Flagged (V1-V2 divergence the translator explicitly + // warned about) outcomes. Pin just below the current rate (0.998) + // so a real regression trips the gate; the remaining <1% are + // documented model-level V1/V2 divergences (metadata COW, output + // default shape, root re-shape — see migrator/V1_V2_GAPS.md). 
+ minPassRate := 0.995 + if stats.runRate() < minPassRate { + t.Fatalf("corpus pass rate %.3f below floor %.3f", stats.runRate(), minPassRate) + } +} + +// corpusRoot returns the path to the V1 YAML corpus, resolved relative to the +// test's own directory (go test runs with cwd = package dir). +func corpusRoot(t *testing.T) string { + t.Helper() + // corpus_test.go lives in migrator/translator/; tests live at + // ../v1spec/tests/. + p, err := filepath.Abs("../v1spec/tests") + if err != nil { + t.Fatal(err) + } + return p +} + +// discoverFiles returns every .yaml under root in sorted order. +func discoverFiles(t *testing.T, root string) []string { + t.Helper() + var out []string + err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error { + if err != nil { + return err + } + if info.IsDir() { + return nil + } + if strings.HasSuffix(info.Name(), ".yaml") { + out = append(out, path) + } + return nil + }) + if err != nil { + t.Fatal(err) + } + sort.Strings(out) + return out +} + +// isSkipped reads the raw YAML to check whether the test case carries a +// `skip:` field (the v1spec extension to the shared schema). Caches parsed +// skip sets per file. +var skipCache = map[string]map[string]bool{} + +func isSkipped(path, name string) bool { + skips, ok := skipCache[path] + if !ok { + skips = loadSkips(path) + skipCache[path] = skips + } + return skips[name] +} + +func loadSkips(path string) map[string]bool { + data, err := os.ReadFile(path) + if err != nil { + return nil + } + // Cheap line-based probe: find `- name: "X"` and then check if the next + // non-trivial field at the same indent has `skip:`. Good enough — the + // YAML structure in the corpus is regular. 
+ out := map[string]bool{} + lines := strings.Split(string(data), "\n") + current := "" + for _, l := range lines { + trim := strings.TrimSpace(l) + if strings.HasPrefix(trim, "- name:") { + current = strings.TrimSpace(strings.TrimPrefix(trim, "- name:")) + current = strings.Trim(current, `"`) + continue + } + if current != "" && strings.HasPrefix(trim, "skip:") { + out[current] = true + current = "" + } + } + return out +} + +// outcomeKind classifies the result of running one test case through the +// migrator + V2 interpreter pipeline. +type outcomeKind int + +const ( + outcomeOK outcomeKind = iota // V2 output matched expectation (or expected error fired) + outcomeFlagged // V2 diverged from V1 but the translator warned via a SemanticChange or Unsupported + outcomeTranslateFail // translator returned an error + outcomeV2CompileFail // translator emitted invalid V2 + outcomeUnexpected // V2 ran but output / error differed from expectation without any warning + outcomeInvalidTest // the test case itself is malformed for our purposes +) + +type outcome struct { + kind outcomeKind + detail string +} + +func runOne(interp spectest.Interpreter, tc *spectest.TestCase, fileLevel map[string]string) outcome { + if tc.Mapping == "" { + return outcome{outcomeInvalidTest, "empty mapping"} + } + + // 1. Translate V1 -> V2. Thread files into Migrate so imports in the + // V1 source resolve against the test's virtual filesystem. Verbose is + // enabled so Info-severity Changes surface on the Report and + // hasFlaggedDivergence can see any SemanticChange the translator + // emitted. + rep, err := translator.Migrate(tc.Mapping, translator.Options{ + MinCoverage: 0.5, + Verbose: true, + Files: mergeFiles(fileLevel, tc.Files), + }) + if err != nil { + return outcome{outcomeTranslateFail, fmt.Sprintf("translate: %v", err)} + } + + // 2. Compile the V2 output against the translated virtual filesystem + // (V1 import contents also migrated to V2). 
+ compiled, compileErr := interp.Compile(rep.V2Mapping, rep.V2Files) + if compileErr != nil { + // If the V1 test expects a compile error, a V2 compile error is + // the faithful outcome. + if tc.CompileError != "" { + return outcome{outcomeOK, ""} + } + // V2 performs more validation at compile time than V1, so V1 + // runtime errors sometimes surface as V2 compile errors. The + // caller still gets a rejection — count as Flagged. + if tc.Error != "" || tc.HasError { + return outcome{outcomeFlagged, fmt.Sprintf("V1 runtime error surfaces as V2 compile error: %v", compileErr)} + } + // V2 is intentionally stricter about some constructs that V1 + // accepted (chained == / !=, void propagation outside the + // `nothing()` sentinel, etc.). When the translator flagged the + // relevant construct up front, this is a known divergence, not a + // translator bug. + if hasFlaggedDivergence(rep) { + return outcome{outcomeFlagged, fmt.Sprintf("V2 compile error, known divergence flagged: %v", compileErr)} + } + return outcome{outcomeV2CompileFail, fmt.Sprintf("V2 compile: %v", compileErr)} + } + if tc.CompileError != "" { + // V1 expected a compile error but V2 compiled cleanly. V2 is + // intentionally more permissive at compile time (many V1 parse- + // time validations run at V2 runtime). This is a known V1-V2 + // design divergence — count as Flagged rather than Unexpected. + return outcome{outcomeFlagged, "V1 compile-time check not performed by V2"} + } + + // 3. Execute. + input, err := spectest.DecodeValue(tc.Input) + if err != nil { + return outcome{outcomeInvalidTest, fmt.Sprintf("input decode: %v", err)} + } + inputMeta := map[string]any{} + if tc.InputMetadata != nil { + raw, _ := spectest.DecodeValue(tc.InputMetadata) + if m, ok := raw.(map[string]any); ok { + inputMeta = m + } + } + + gotOut, _, deleted, runErr := compiled.Exec(input, inputMeta) + + // 4. Check against expectations. 
+ if tc.Error != "" || tc.HasError { + if runErr == nil { + // V1 errored, V2 succeeded. This is a lenient-V2 divergence. + // When the translator flagged the relevant construct, it's a + // known divergence — Flagged rather than Unexpected. + if hasFlaggedDivergence(rep) { + return outcome{outcomeFlagged, fmt.Sprintf("V1 runtime error did not fire under V2 (known divergence flagged); got %v", gotOut)} + } + return outcome{outcomeUnexpected, fmt.Sprintf("expected runtime error, got output %v", gotOut)} + } + return outcome{outcomeOK, ""} + } + if tc.Deleted { + if deleted { + return outcome{outcomeOK, ""} + } + return outcome{outcomeUnexpected, "expected deletion, got output"} + } + if runErr != nil { + // Runtime errors in V2 where V1 succeeded need splitting into two + // classes: + // + // 1. V2 strictness errors — V2 deliberately errors where V1 + // silently coerced (non-boolean `&&`, non-whole-number + // indexing, etc.). These are documented V1↔V2 divergences + // and are allowed to count as Flagged when the translator + // recorded *any* divergence note. + // + // 2. Everything else — almost always a translator bug. A warning + // elsewhere in the mapping does NOT excuse an unrelated + // runtime failure. These count as Unexpected. This was the + // hole that masked the V1→V2 .fold lambda-shape bug: a + // merge-null warning elsewhere was giving unrelated fold + // failures a free pass. + if isV2StrictnessError(runErr.Error()) && hasFlaggedDivergence(rep) { + return outcome{outcomeFlagged, fmt.Sprintf("V2 strictness error, divergence flagged: %v", runErr)} + } + return outcome{outcomeUnexpected, fmt.Sprintf("unexpected runtime error: %v", runErr)} + } + if tc.NoOutputCheck { + return outcome{outcomeOK, ""} + } + + expected, err := spectest.DecodeValue(tc.Output) + if err != nil { + return outcome{outcomeInvalidTest, fmt.Sprintf("expected output decode: %v", err)} + } + if ok, diff := spectest.DeepEqual(expected, gotOut); !ok { + // Output differs. 
If the translator flagged the relevant construct + // as a SemanticChange or Unsupported, consider this an acceptable + // (known) divergence — the caller was warned. + if hasFlaggedDivergence(rep) { + return outcome{outcomeFlagged, fmt.Sprintf("output mismatch (known divergence flagged): %s", diff)} + } + return outcome{outcomeUnexpected, fmt.Sprintf("output mismatch: %s", diff)} + } + return outcome{outcomeOK, ""} +} + +// hasFlaggedDivergence reports whether the report contains any Change that +// signals a known V1-V2 semantic divergence the caller has been warned about. +func hasFlaggedDivergence(rep *translator.Report) bool { + for _, c := range rep.Changes { + if c.Category == translator.CategorySemanticChange || c.Category == translator.CategoryUnsupported { + return true + } + } + return false +} + +// isV2StrictnessError reports whether a V2 runtime error message matches +// one of the well-known patterns for V2 being deliberately stricter than +// V1 (type coercion, non-whole-number indexing, null/boolean checks, +// etc.). These divergences are legitimate V1↔V2 differences; the +// translator cannot rewrite V1 coercion semantics into V2's stricter +// checks, and the corresponding V1 test is effectively testing V1-only +// behaviour. The corpus test allows these to count as Flagged when the +// translator recorded a divergence note elsewhere; other runtime errors +// are treated as Unexpected (probable translator bugs). +// +// Keep the list conservative: it's better to surface a false positive +// (a new test failure the author must investigate) than to mask a real +// regression. 
+func isV2StrictnessError(msg string) bool { + patterns := []string{ + // --- Type-strictness (V1 coerced / returned null; V2 errors) --- + "requires boolean", + "must be boolean", + "must return bool", + "must evaluate to bool", + "cannot add", + "not numeric", + "cannot convert", + "must be a whole number", + "requires array", + "requires object", + "requires number", + "requires string", + "is not a sortable type", + "non-string index on object", + // --- Null/void/deleted handling (V2 stricter) --- + "cannot call method on void", + "cannot call method on null", + "cannot call method on deleted", + "does not support null", + "cannot access field", + "cannot index", + "cannot compare", + "returned void", + "deleted value in expression", + "cannot assign deleted", + "cannot delete metadata", + // --- Resource / arithmetic limits (V2 enforces) --- + "maximum recursion depth", + "int64 overflow", + "exceeds float64 exact range", + "index out of bounds", + // --- Scoping (V1 let leaks out of branches; V2 block-scopes) --- + "undefined variable", + // --- V1-only stdlib (no V2 equivalent) --- + "unknown method", + "unknown function", + } + for _, p := range patterns { + if strings.Contains(msg, p) { + return true + } + } + return false +} + +// runStats aggregates pass / fail counts across the corpus. 
+type runStats struct { + loadError int + skippedMultiCase int + skippedV2Only int + ok int + flagged int + translateFail int + v2CompileFail int + unexpected int + invalidTest int +} + +func (s *runStats) record(o outcome) { + switch o.kind { + case outcomeOK: + s.ok++ + case outcomeFlagged: + s.flagged++ + case outcomeTranslateFail: + s.translateFail++ + case outcomeV2CompileFail: + s.v2CompileFail++ + case outcomeUnexpected: + s.unexpected++ + case outcomeInvalidTest: + s.invalidTest++ + } +} + +func (s *runStats) runRate() float64 { + total := s.total() + if total == 0 { + return 0 + } + // Flagged divergences count as successful outcomes: the migrator did + // its job (translated + warned). The test runner's job is to catch + // unexpected / silent divergences, not known ones. + return float64(s.ok+s.flagged) / float64(total) +} + +func (s *runStats) total() int { + return s.ok + s.flagged + s.translateFail + s.v2CompileFail + s.unexpected + s.invalidTest +} + +func (s *runStats) report(t *testing.T) { + t.Helper() + total := s.total() + t.Logf("corpus regression summary:") + t.Logf(" total-attempted: %d", total) + t.Logf(" ok (matched): %d (%.1f%%)", s.ok, pct(s.ok, total)) + t.Logf(" flagged divergence: %d (%.1f%%)", s.flagged, pct(s.flagged, total)) + t.Logf(" translate-fail: %d (%.1f%%)", s.translateFail, pct(s.translateFail, total)) + t.Logf(" V2-compile-fail: %d (%.1f%%)", s.v2CompileFail, pct(s.v2CompileFail, total)) + t.Logf(" unexpected: %d (%.1f%%)", s.unexpected, pct(s.unexpected, total)) + t.Logf(" invalid-test: %d (%.1f%%)", s.invalidTest, pct(s.invalidTest, total)) + t.Logf(" skipped (V2-only): %d", s.skippedV2Only) + t.Logf(" skipped (multicase): %d", s.skippedMultiCase) +} + +func pct(n, total int) float64 { + if total == 0 { + return 0 + } + return float64(n) / float64(total) * 100 +} + +// mergeFiles combines a file-level Files map with a test-level one. Test- +// level entries win on collision. 
+func mergeFiles(fileLevel, testLevel map[string]string) map[string]string { + if len(fileLevel) == 0 && len(testLevel) == 0 { + return nil + } + out := make(map[string]string, len(fileLevel)+len(testLevel)) + for k, v := range fileLevel { + out[k] = v + } + for k, v := range testLevel { + out[k] = v + } + return out +} diff --git a/internal/bloblang2/migrator/translator/expressions.go b/internal/bloblang2/migrator/translator/expressions.go new file mode 100644 index 000000000..c4913f67b --- /dev/null +++ b/internal/bloblang2/migrator/translator/expressions.go @@ -0,0 +1,1075 @@ +package translator + +import ( + "fmt" + "strconv" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// translateExpr dispatches expression translation. +func (t *translator) translateExpr(e v1ast.Expr) syntax.Expr { + switch x := e.(type) { + case *v1ast.Literal: + return t.translateLiteral(x) + case *v1ast.ThisExpr: + // Inside a V2 map body, V1 `this` refers to the map's receiver, + // which we surface as a named V2 parameter. Otherwise `this` is + // the top-level input. 
+ if name, ok := t.currentThisRebind(); ok { + t.rec.Exact() + return &syntax.IdentExpr{TokenPos: pos(x.TokPos), Name: name, SlotIndex: -1} + } + t.rec.Rewritten(Change{ + Line: x.TokPos.Line, Column: x.TokPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleThisToInput, Explanation: `"this" rewritten to "input"`, + }) + return &syntax.InputExpr{TokenPos: pos(x.TokPos)} + case *v1ast.RootExpr: + t.rec.Rewritten(Change{ + Line: x.TokPos.Line, Column: x.TokPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleRootToOutput, Explanation: `"root" rewritten to "output"`, + }) + return &syntax.OutputExpr{TokenPos: pos(x.TokPos)} + case *v1ast.VarRef: + t.rec.Exact() + return &syntax.VarExpr{TokenPos: pos(x.TokPos), Name: x.Name, SlotIndex: -1} + case *v1ast.MetaRef: + t.rec.Rewritten(Change{ + Line: x.TokPos.Line, Column: x.TokPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMetaReadToInputMeta, Explanation: "metadata reference rewritten to input@", + }) + return t.metaReadExpr(x) + case *v1ast.Ident: + // If the name is a lambda parameter or named-context binding in + // scope, emit a V2 identifier reference — not the legacy + // bare-ident-to-input rewrite. + if t.isBoundIdent(x.Name) { + t.rec.Exact() + return &syntax.IdentExpr{TokenPos: pos(x.TokPos), Name: x.Name, SlotIndex: -1} + } + // Inside a `this`-rebinding scope (e.g. a query-form predicate + // wrapped into a V2 lambda), bare V1 idents resolve as fields of + // the rebound element — not of the outer V2 input. + if name, ok := t.currentThisRebind(); ok { + t.rec.Exact() + return &syntax.FieldAccessExpr{ + Receiver: &syntax.IdentExpr{TokenPos: pos(x.TokPos), Name: name, SlotIndex: -1}, + Field: x.Name, + FieldPos: pos(x.TokPos), + NullSafe: true, + } + } + // Otherwise, legacy bare identifier in expression position = + // `this.foo` = V2 `input.foo`. 
V2 errors if the field is absent + // or the receiver isn't an object; V1 silently returned null. + // Emit as NullSafe so the V2 form tolerates a null/absent + // receiver the way V1 did, and flag as a SemanticChange so the + // wider divergence (V2 is type-strict on non-object receivers) + // surfaces on the Report. + t.rec.Rewritten(Change{ + Line: x.TokPos.Line, Column: x.TokPos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleBareIdentToInput, SpecRef: "§14#1", + Explanation: fmt.Sprintf(`bare identifier %q rewritten as "input.%s"; V2 errors on absent fields where V1 returned null`, x.Name, x.Name), + }) + return &syntax.FieldAccessExpr{ + Receiver: &syntax.InputExpr{TokenPos: pos(x.TokPos)}, + Field: x.Name, + FieldPos: pos(x.TokPos), + NullSafe: true, + } + case *v1ast.BinaryExpr: + return t.translateBinary(x) + case *v1ast.UnaryExpr: + return t.translateUnary(x) + case *v1ast.ParenExpr: + // Drop the explicit paren node and return the inner expression + // directly; the printer re-adds parens as needed via precedence. 
+ inner := t.translateExpr(x.Inner) + t.rec.Exact() + return inner + case *v1ast.FieldAccess: + return t.translateFieldAccess(x) + case *v1ast.MethodCall: + return t.translateMethodCall(x) + case *v1ast.FunctionCall: + return t.translateFunctionCall(x) + case *v1ast.MetaCall: + return t.translateMetaCall(x) + case *v1ast.Lambda: + return t.translateLambda(x) + case *v1ast.ArrayLit: + return t.translateArrayLit(x) + case *v1ast.ObjectLit: + return t.translateObjectLit(x) + case *v1ast.IfExpr: + return t.translateIfExpr(x) + case *v1ast.MatchExpr: + return t.translateMatchExpr(x) + case *v1ast.MapExpr: + return t.translateMapExpr(x) + default: + t.rec.Unsupported(Change{ + Line: e.NodePos().Line, Column: e.NodePos().Column, + RuleID: RuleUnsupportedConstruct, + Explanation: fmt.Sprintf("no translation rule for expression %T", e), + }) + return nil + } +} + +// translateLiteral is a straight passthrough — literals are identical in V1 +// and V2 modulo the `\/` escape (which is not supported in V1, so it never +// appears in parsed input). +func (t *translator) translateLiteral(l *v1ast.Literal) syntax.Expr { + t.rec.Exact() + out := &syntax.LiteralExpr{TokenPos: pos(l.TokPos)} + switch l.Kind { + case v1ast.LitNull: + out.TokenType = syntax.NULL + out.Value = "null" + case v1ast.LitBool: + if l.Bool { + out.TokenType = syntax.TRUE + out.Value = "true" + } else { + out.TokenType = syntax.FALSE + out.Value = "false" + } + case v1ast.LitInt: + out.TokenType = syntax.INT + out.Value = strconv.FormatInt(l.Int, 10) + case v1ast.LitFloat: + out.TokenType = syntax.FLOAT + if l.Raw != "" { + out.Value = l.Raw + } else { + out.Value = fmt.Sprintf("%g", l.Float) + } + case v1ast.LitString: + out.TokenType = syntax.STRING + out.Value = l.Str + case v1ast.LitRawString: + out.TokenType = syntax.RAW_STRING + out.Value = l.Str + } + return out +} + +// metaReadExpr builds the V2 form of a V1 `@name` / `meta("name")` read. 
+func (t *translator) metaReadExpr(m *v1ast.MetaRef) syntax.Expr { + if m.Name == "" { + return &syntax.InputMetaExpr{TokenPos: pos(m.TokPos)} + } + return &syntax.FieldAccessExpr{ + Receiver: &syntax.InputMetaExpr{TokenPos: pos(m.TokPos)}, + Field: m.Name, + FieldPos: pos(m.TokPos), + } +} + +// translateBinary maps V1 binary operators to V2. Most are 1:1, but `|` +// coalesce and the `&&`/`||` same-precedence quirk need care. +func (t *translator) translateBinary(b *v1ast.BinaryExpr) syntax.Expr { + left := t.translateExpr(b.Left) + right := t.translateExpr(b.Right) + if left == nil || right == nil { + return nil + } + // `|` coalesce: V1 catches BOTH null and errors on the left side; V2 + // `.or` catches null only, and `.catch` catches errors only. Emit + // `left.or(right).catch(_ -> right)` so both V1 paths are covered — + // important for patterns like `arr.N | null` where V2 errors on + // out-of-bounds index. + if b.Op == v1ast.TokPipe { + t.rec.Rewritten(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleCoalescePrecedence, SpecRef: "§14#4", + Explanation: "V1 `|` coalesce rewritten as V2 `.or(x).catch(_ -> x)` (covers V1's combined null + error coalesce)", + }) + orCall := &syntax.MethodCallExpr{ + Receiver: left, + Method: "or", + MethodPos: pos(b.OpPos), + Args: []syntax.CallArg{{Value: right}}, + } + catchLambda := &syntax.LambdaExpr{ + TokenPos: pos(b.OpPos), + Params: []syntax.Param{{Discard: true, Pos: pos(b.OpPos), SlotIndex: -1}}, + Body: &syntax.ExprBody{Result: right}, + } + return &syntax.MethodCallExpr{ + Receiver: orCall, + Method: "catch", + MethodPos: pos(b.OpPos), + Args: []syntax.CallArg{{Value: catchLambda}}, + } + } + op, ok := mapV1BinaryOp(b.Op) + if !ok { + t.rec.Unsupported(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + RuleID: RuleUnsupportedConstruct, + Explanation: fmt.Sprintf("unmapped binary operator kind %v", b.Op), + }) + return nil + } + 
t.flagOperatorDivergence(b) + t.flagChainedNonAssoc(op, left, right, b.OpPos) + t.rec.Exact() + return &syntax.BinaryExpr{Left: left, Op: op, OpPos: pos(b.OpPos), Right: right} +} + +// nonAssocTier returns the precedence tier of a non-associative V2 operator, +// or -1 for operators that allow chaining. V2 forbids chains within a tier: +// `a == b == c`, `a < b < c`, and `a == b != c` are all syntax errors. +func nonAssocTier(op syntax.TokenType) int { + switch op { + case syntax.EQ, syntax.NE: + return 1 + case syntax.LT, syntax.LE, syntax.GT, syntax.GE: + return 2 + } + return -1 +} + +// flagChainedNonAssoc emits RuleEmittedInvalidV2 when a translated binary +// operator chains with a same-tier non-associative operator on either side. +// V1 accepts `1 < 2 < 3` (and silently produces `bool < int` semantics); V2 +// has no equivalent. We detect this during translation rather than relying on +// the post-print Parse safety net, because the printer wraps such expressions +// in parens to keep the output syntactically valid. +func (t *translator) flagChainedNonAssoc(op syntax.TokenType, left, right syntax.Expr, opPos v1ast.Pos) { + tier := nonAssocTier(op) + if tier < 0 { + return + } + chained := false + if lb, ok := left.(*syntax.BinaryExpr); ok && nonAssocTier(lb.Op) == tier { + chained = true + } + if rb, ok := right.(*syntax.BinaryExpr); ok && nonAssocTier(rb.Op) == tier { + chained = true + } + if !chained { + return + } + t.rec.Note(Change{ + Line: opPos.Line, Column: opPos.Column, + Severity: SeverityError, + Category: CategoryUnsupported, + RuleID: RuleEmittedInvalidV2, + Explanation: "V1 chained comparison has no V2 equivalent; the printed V2 form is parens-wrapped to parse, but evaluates as comparing a boolean to the next operand", + }) +} + +// translateUnary handles `!x` and unary `-x`. 
+func (t *translator) translateUnary(u *v1ast.UnaryExpr) syntax.Expr { + inner := t.translateExpr(u.Operand) + if inner == nil { + return nil + } + op, ok := mapV1UnaryOp(u.Op) + if !ok { + t.rec.Unsupported(Change{ + Line: u.OpPos.Line, Column: u.OpPos.Column, + RuleID: RuleUnsupportedConstruct, + Explanation: fmt.Sprintf("unmapped unary operator kind %v", u.Op), + }) + return nil + } + t.rec.Exact() + return &syntax.UnaryExpr{Op: op, OpPos: pos(u.OpPos), Operand: inner} +} + +// flagOperatorDivergence records SemanticChange Notes for V1 binary operators +// whose V2 behaviour differs on non-trivial operands. These diagnostics fire +// unconditionally: V2 is stricter than V1 about types, so any arithmetic or +// logical op that reaches non-primitive operands at runtime may diverge. We +// record one Note per operator kind, covering logical, arithmetic, equality, +// and comparison operators. +func (t *translator) flagOperatorDivergence(b *v1ast.BinaryExpr) { + switch b.Op { + case v1ast.TokAnd, v1ast.TokOr: + t.rec.Note(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleAndOrSameLevel, + SpecRef: "§14#48", + Explanation: "V1 &&/|| coerce non-boolean operands; V2 requires boolean operands and errors otherwise", + }) + case v1ast.TokPlus: + t.rec.Note(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§14#41", + Explanation: "V1 + concatenates bytes-to-string and string-to-bytes; V2 is type-strict", + }) + case v1ast.TokSlash: + t.rec.Note(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleIntDivReturnsFloat, + SpecRef: "§14#5", + Explanation: "V1 / on int operands returns float64; V2 preserves integer division when both operands are int", + }) + case v1ast.TokStar, 
v1ast.TokMinus: + t.rec.Note(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§14#26", + Explanation: "V1 arithmetic silently wraps on int64 overflow and coerces across numeric types; V2 errors on overflow and on integers outside the float64 exact range (2^53)", + }) + case v1ast.TokPercent: + t.rec.Note(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleModuloFloatTruncation, + SpecRef: "§14#39", + Explanation: "V1 % silently truncates float operands to int64 before mod; V2 uses fmod and preserves float64", + }) + case v1ast.TokEq, v1ast.TokNeq: + t.rec.Note(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleBoolNumberEquality, + SpecRef: "§14#38", + Explanation: "V1 ==/!= coerces across types (bool==1 is true, string==bytes compares bytes); V2 requires matching types", + }) + case v1ast.TokLt, v1ast.TokLte, v1ast.TokGt, v1ast.TokGte: + t.rec.Note(Change{ + Line: b.OpPos.Line, Column: b.OpPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 <, <=, >, >= accept some cross-type operands and perform coercion; V2 errors unless both operands are numeric/string/bytes of the same family", + }) + } +} + +// mapV1BinaryOp maps a V1 binary operator token kind to its V2 TokenType +// equivalent. Returns false for kinds that don't have a direct V2 mapping +// (e.g. the `|` coalesce, which is handled specially in translateBinary). 
+func mapV1BinaryOp(k v1ast.TokenKind) (syntax.TokenType, bool) { + switch k { + case v1ast.TokPlus: + return syntax.PLUS, true + case v1ast.TokMinus: + return syntax.MINUS, true + case v1ast.TokStar: + return syntax.STAR, true + case v1ast.TokSlash: + return syntax.SLASH, true + case v1ast.TokPercent: + return syntax.PERCENT, true + case v1ast.TokEq: + return syntax.EQ, true + case v1ast.TokNeq: + return syntax.NE, true + case v1ast.TokLt: + return syntax.LT, true + case v1ast.TokLte: + return syntax.LE, true + case v1ast.TokGt: + return syntax.GT, true + case v1ast.TokGte: + return syntax.GE, true + case v1ast.TokAnd: + return syntax.AND, true + case v1ast.TokOr: + return syntax.OR, true + } + return 0, false +} + +// mapV1UnaryOp maps V1 unary tokens (! and -) to V2. +func mapV1UnaryOp(k v1ast.TokenKind) (syntax.TokenType, bool) { + switch k { + case v1ast.TokBang: + return syntax.BANG, true + case v1ast.TokMinus: + return syntax.MINUS, true + } + return 0, false +} + +// translateFieldAccess recursively walks the V1 field-access chain. +// +// V1 path access is universally null-tolerant: reading any field of a non- +// object returns null (§12.5). V2 defaults to strict: `null.field` errors. +// To preserve V1 semantics we emit the null-safe V2 form `?.field` on every +// field access. This handles the null case; wrong-type receivers (e.g. +// `5.field`) still error in V2 even with `?.`, which is a genuine V1-V2 +// divergence flagged separately when it arises. +// +// A V1 path segment whose name is an all-digit string (e.g. `this.items.0`) +// is V1's array-indexing syntax (§6.3). V2 uses bracket indexing for that, +// so we emit an IndexExpr with the numeric value rather than a literal field +// named "0". 
+func (t *translator) translateFieldAccess(f *v1ast.FieldAccess) syntax.Expr { + recv := t.translateExpr(f.Recv) + if recv == nil { + return nil + } + if !f.Seg.Quoted && isAllDigits(f.Seg.Name) { + t.rec.Rewritten(Change{ + Line: f.Seg.Pos.Line, Column: f.Seg.Pos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleNoBracketIndexing, SpecRef: "§14#10", + Explanation: "V1 numeric path segment rewritten as V2 index expression", + }) + // V2 rejects out-of-bounds array indices at runtime where V1 + // returned null; flag so such tests surface as known divergences. + t.rec.Note(Change{ + Line: f.Seg.Pos.Line, Column: f.Seg.Pos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleNoBracketIndexing, + Explanation: "V1 numeric path access on arrays tolerates out-of-bounds (returns null); V2 errors", + }) + return &syntax.IndexExpr{ + Receiver: recv, + Index: &syntax.LiteralExpr{ + TokenPos: pos(f.Seg.Pos), + TokenType: syntax.INT, + Value: f.Seg.Name, + }, + LBracketPos: pos(f.Seg.Pos), + NullSafe: true, + } + } + // Flag field accesses whose receiver can't be statically guaranteed + // to be an object. V1 returns null for field access on scalars and + // arrays (§12.5); V2 errors. The `?.` NullSafe modifier catches null + // but not wrong-type receivers — if the receiver's expected type + // isn't object-ish, emit a SemanticChange so the divergence is + // visible. 
+ if !objectLikeReceiver(recv) { + t.rec.Rewritten(Change{ + Line: f.Seg.Pos.Line, Column: f.Seg.Pos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleStringLengthBytes, SpecRef: "§12.5", + Explanation: "V1 path access on non-object returns null; V2 errors on wrong-type receivers (consider .catch(null))", + }) + } else { + t.rec.Exact() + } + return &syntax.FieldAccessExpr{ + Receiver: recv, + Field: f.Seg.Name, + FieldPos: pos(f.Seg.Pos), + NullSafe: true, + } +} + +// objectLikeReceiver returns true if the V2 receiver expression is guaranteed +// (or very likely) to evaluate to an object. We treat input/output roots and +// their chained field accesses as object-like, the common case where V1 and +// V2 agree. Variables, idents, method-call results, and index expressions are +// NOT object-guaranteed: V1 returns null for field access on scalars, V2 +// errors, and without static type info we can't tell. +func objectLikeReceiver(e syntax.Expr) bool { + switch r := e.(type) { + case *syntax.InputExpr, *syntax.OutputExpr, *syntax.InputMetaExpr, *syntax.OutputMetaExpr: + return true + case *syntax.FieldAccessExpr: + return objectLikeReceiver(r.Receiver) + } + return false +} + +// isAllDigits returns true when s is a non-empty string of ASCII digits. +func isAllDigits(s string) bool { + if len(s) == 0 { + return false + } + for i := 0; i < len(s); i++ { + if s[i] < '0' || s[i] > '9' { + return false + } + } + return true +} + +// translateMethodCall rewrites `recv.name(args)`. Some method names are +// renamed or reshaped in V2; methodRewrite handles those. Others are 1:1. 
+func (t *translator) translateMethodCall(m *v1ast.MethodCall) syntax.Expr { + recv := t.translateExpr(m.Recv) + if recv == nil { + return nil + } + if out := t.methodRewrite(m, recv); out != nil { + return out + } + args := t.translateArgs(m.Args) + t.rec.Exact() + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: m.Name, + MethodPos: pos(m.NamePos), + Args: args, + Named: m.Named, + } +} + +// translateFunctionCall rewrites top-level `name(args)` calls. +func (t *translator) translateFunctionCall(f *v1ast.FunctionCall) syntax.Expr { + // V1 `now()` returns a string (RFC3339Nano); V2 returns a typed + // timestamp. Downstream comparisons and formatting differ. + if f.Name == "now" && len(f.Args) == 0 && !f.Named { + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleNowReturnsString, + SpecRef: "§14#57", + Explanation: "V1 now() returns a string; V2 returns a typed timestamp", + }) + } + // V1 `range(a, b, step)` with a descending range and explicit step + // includes one additional element compared with V2 (boundary + // arithmetic differs). + if f.Name == "range" { + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 range(a, b, step) boundary arithmetic differs from V2 on descending ranges — audit array length", + }) + } + // V1 `parse_json` / `parse_yaml` return all numbers as float64; V2 + // distinguishes int64 and float64 based on the serialised form. + // Downstream code that branches on .type() or compares types will + // diverge. 
+ if f.Name == "parse_json" || f.Name == "parse_yaml" { + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§13", + Explanation: "V1 " + f.Name + "() returns all numbers as float64; V2 distinguishes int64 and float64 by serialised form", + }) + } + // V1 `deleted()` as a top-level assignment marker has overlapping but + // distinct V2 semantics: V2 rejects it in some positions where V1 + // silently accepted (e.g. variable assignment, comparison, method + // receiver). Flag so divergences surface. + if f.Name == "deleted" && len(f.Args) == 0 && !f.Named { + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§7.3", + Explanation: "V1 deleted() is widely tolerated; V2 errors when deleted() appears in variable assignments, comparisons, or as a method receiver", + }) + } + // V1 `nothing()` is a sentinel that means different things in + // different positions. V2 split the concepts: `void()` for + // "skip this assignment", `deleted()` for "omit from this + // collection". The translator disambiguates by looking at the + // current rendering context (see ctxKind) and emitting the V2 + // form that matches V1's intent at each site. + if f.Name == "nothing" && len(f.Args) == 0 && !f.Named { + switch t.currentCtx() { + case ctxCollectionLit: + t.rec.Rewritten(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§14#71", + Explanation: "V1 `nothing()` inside a collection literal rewritten as V2 `deleted()` (both elide the entry)", + }) + return &syntax.CallExpr{ + TokenPos: pos(f.NamePos), + Name: "deleted", + } + case ctxVarDeclRHS: + // V1 `let $x = nothing()` deletes $x. 
V2 errors on + // void in a variable declaration and has no equivalent + // delete-a-variable construct. Emit `void()` so the V2 + // runtime fires the documented error at the right site, + // and flag Error-severity so the migrator user sees that + // manual rewrite is required. + t.rec.Unsupported(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + RuleID: RuleUnsupportedConstruct, + SpecRef: "§14#17", + Explanation: "V1 `let $x = nothing()` deletes the variable; V2 has no equivalent — emitted `void()` which will error at runtime. Rewrite this `let` by hand.", + }) + return &syntax.CallExpr{ + TokenPos: pos(f.NamePos), + Name: "void", + } + } + t.rec.Rewritten(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§14#36", + Explanation: "V1 `nothing()` sentinel rewritten as V2 `void()`", + }) + return &syntax.CallExpr{ + TokenPos: pos(f.NamePos), + Name: "void", + } + } + if rewritten := t.functionRewrite(f); rewritten != nil { + return rewritten + } + args := t.translateArgs(f.Args) + t.rec.Exact() + return &syntax.CallExpr{ + TokenPos: pos(f.NamePos), + Name: f.Name, + Args: args, + Named: f.Named, + } +} + +// functionRewrite handles V1→V2 function-shape translations. Returns a +// non-nil V2 expression on success or nil to fall through to the +// default 1:1 translation. Rules ordered by V1 function name. +func (t *translator) functionRewrite(f *v1ast.FunctionCall) syntax.Expr { + // Custom rules win on name collision (design P2). Same precedence + // model as methodRewrite. 
+ if rule, ok := t.customFunctionRules[f.Name]; ok { + if out, handled := rule(t, f); handled { + if out == nil { + return nil + } + return out + } + } + switch f.Name { + case "metadata", "meta": + return t.metadataReadToInputMeta(f) + case "root_meta": + return t.rootMetaReadToOutputMeta(f) + case "error": + return t.errorStringToErrorWhat(f) + case "error_source_label", "error_source_name", "error_source_path": + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 " + f.Name + "() has no V2 equivalent in this iteration; V2's structured error() will surface source.* fields in a future revision", + }) + return nil + case "json": + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 json(path) is not auto-rewritten; V2 exposes the parsed body as `input` directly (or use `content().parse_json()` to re-parse from bytes)", + }) + return nil + } + return nil +} + +// metadataReadToInputMeta rewrites V1 `metadata("k")` / `meta("k")` to V2 +// `input@["k"]`, and the no-arg form `metadata()` to V2 `input@` (the +// whole metadata object). +// +// V1 `meta` returned strings only; V2 `input@` is value-typed. Flag the +// difference as a Note so callers that compared against string literals +// audit the rewrite. 
+func (t *translator) metadataReadToInputMeta(f *v1ast.FunctionCall) syntax.Expr { + if len(f.Args) > 1 { + return nil + } + if len(f.Args) == 0 { + t.rec.Rewritten(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMetaReadToInputMeta, + Explanation: "V1 " + f.Name + "() rewritten as V2 `input@`", + }) + return &syntax.InputMetaExpr{TokenPos: pos(f.NamePos)} + } + key := t.translateExpr(f.Args[0].Value) + if key == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMetaReadToInputMeta, + Explanation: "V1 " + f.Name + "(key) rewritten as V2 `input@[key]`", + }) + if f.Name == "meta" { + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMetaReadToInputMeta, + Explanation: "V1 meta() returned strings only; V2 `input@` exposes the value-typed entry — comparisons against bare string literals may diverge", + }) + } + return &syntax.IndexExpr{ + Receiver: &syntax.InputMetaExpr{TokenPos: pos(f.NamePos)}, + Index: key, + LBracketPos: pos(f.NamePos), + } +} + +// rootMetaReadToOutputMeta rewrites V1 `root_meta("k")` to V2 +// `output@["k"]` and the no-arg form to V2 `output@`. 
+func (t *translator) rootMetaReadToOutputMeta(f *v1ast.FunctionCall) syntax.Expr { + if len(f.Args) > 1 { + return nil + } + if len(f.Args) == 0 { + t.rec.Rewritten(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMetaReadToInputMeta, + Explanation: "V1 root_meta() rewritten as V2 `output@`", + }) + return &syntax.OutputMetaExpr{TokenPos: pos(f.NamePos)} + } + key := t.translateExpr(f.Args[0].Value) + if key == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMetaReadToInputMeta, + Explanation: "V1 root_meta(key) rewritten as V2 `output@[key]`", + }) + return &syntax.IndexExpr{ + Receiver: &syntax.OutputMetaExpr{TokenPos: pos(f.NamePos)}, + Index: key, + LBracketPos: pos(f.NamePos), + } +} + +// errorStringToErrorWhat rewrites V1 `error()` (which returned a string, +// or null when no error) to V2 `error().what` (V2's error() returns a +// structured object `{what: string}` or null). The rewrite preserves +// the V1 string-typed callsite for comparisons / concatenations; a Note +// flags the type change for code that branches on the raw form. 
+func (t *translator) errorStringToErrorWhat(f *v1ast.FunctionCall) syntax.Expr { + if len(f.Args) > 0 { + return nil + } + t.rec.Rewritten(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 error() returned a string; V2 returns `{what: string}` — call site rewritten as `error().what` to preserve the string-valued contract", + }) + t.rec.Note(Change{ + Line: f.NamePos.Line, Column: f.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V2 error() is structured; future iterations will add source.label / source.name / source.path fields", + }) + call := &syntax.CallExpr{ + TokenPos: pos(f.NamePos), + Name: "error", + } + return &syntax.FieldAccessExpr{ + Receiver: call, + Field: "what", + FieldPos: pos(f.NamePos), + NullSafe: true, + } +} + +// translateMetaCall rewrites `meta(expr)` reads. +func (t *translator) translateMetaCall(m *v1ast.MetaCall) syntax.Expr { + key := t.translateExpr(m.Key) + if key == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: m.TokPos.Line, Column: m.TokPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMetaReadToInputMeta, + Explanation: "meta(expr) read rewritten as input@[expr]", + }) + return &syntax.IndexExpr{ + Receiver: &syntax.InputMetaExpr{TokenPos: pos(m.TokPos)}, + Index: key, + LBracketPos: pos(m.TokPos), + } +} + +// translateLambda rewrites `name -> body`. The parameter name is pushed onto +// the scope stack before translating the body so that identifier references +// to the param are resolved as named-context, not legacy bare-idents. 
+func (t *translator) translateLambda(l *v1ast.Lambda) syntax.Expr { + paramName := l.Param + if l.Discard { + paramName = "_" + } + t.pushScope(paramName) + body := t.translateExpr(l.Body) + t.popScope() + if body == nil { + return nil + } + t.rec.Exact() + return &syntax.LambdaExpr{ + TokenPos: pos(l.ParamPos), + Params: []syntax.Param{{Name: l.Param, Discard: l.Discard, Pos: pos(l.ParamPos), SlotIndex: -1}}, + Body: &syntax.ExprBody{Result: body}, + } +} + +// translateArrayLit rewrites `[elem, ...]`. Pushes ctxCollectionLit +// while translating each element so nested `nothing()` calls lower to +// V2 `deleted()` (which elides from the array, matching V1) rather +// than V2 `void()` (which would error in array-literal position). +func (t *translator) translateArrayLit(a *v1ast.ArrayLit) syntax.Expr { + out := &syntax.ArrayLiteral{LBracketPos: pos(a.TokPos)} + for _, elem := range a.Elems { + t.pushCtx(ctxCollectionLit) + v := t.translateExpr(elem) + t.popCtx() + if v != nil { + out.Elements = append(out.Elements, v) + } + } + t.rec.Exact() + return out +} + +// translateObjectLit rewrites `{key: value, ...}`. Pushes ctxCollectionLit +// while translating each entry's value for the same reason arrays do. +// Keys are translated without the context — a sentinel as an object key +// would be malformed in either language. +func (t *translator) translateObjectLit(o *v1ast.ObjectLit) syntax.Expr { + out := &syntax.ObjectLiteral{LBracePos: pos(o.TokPos)} + for _, entry := range o.Entries { + key := t.translateExpr(entry.Key) + t.pushCtx(ctxCollectionLit) + value := t.translateExpr(entry.Value) + t.popCtx() + if key == nil || value == nil { + continue + } + out.Entries = append(out.Entries, syntax.ObjectEntry{Key: key, Value: value}) + } + t.rec.Exact() + return out +} + +// translateIfExpr rewrites `if/else if/else` expression form. +// +// V1 without `else` produces a `nothing` sentinel, which silently elided +// from collection literals and skipped assignments. 
V2 produces void, +// which errors in collection literals. When we're translating an +// if-without-else inside a collection-literal context, synthesize an +// explicit `else { deleted() }` so the resulting V2 expression elides +// the entry rather than erroring. +func (t *translator) translateIfExpr(i *v1ast.IfExpr) syntax.Expr { + out := &syntax.IfExpr{TokenPos: pos(i.TokPos)} + for _, br := range i.Branches { + t.flagNonBoolCond(br.Cond, i.TokPos) + cond := t.translateExpr(br.Cond) + body := t.translateExpr(br.Body) + if cond == nil || body == nil { + continue + } + out.Branches = append(out.Branches, syntax.IfExprBranch{Cond: cond, Body: &syntax.ExprBody{Result: body}}) + } + if i.Else != nil { + if body := t.translateExpr(i.Else); body != nil { + out.Else = &syntax.ExprBody{Result: body} + } + } else if t.currentCtx() == ctxCollectionLit { + t.rec.Rewritten(Change{ + Line: i.TokPos.Line, Column: i.TokPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleIfNoElseNothing, SpecRef: "§14#71", + Explanation: "V1 if-without-else inside a collection literal elides the entry; V2 errors — synthesized `else { deleted() }` to preserve the elision", + }) + out.Else = &syntax.ExprBody{Result: &syntax.CallExpr{ + TokenPos: pos(i.TokPos), + Name: "deleted", + }} + } else { + t.rec.Rewritten(Change{ + Line: i.TokPos.Line, Column: i.TokPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleIfNoElseNothing, SpecRef: "§14#44", + Explanation: "V1 if-without-else produces nothing sentinel; V2 behaviour may differ", + }) + } + t.rec.Exact() + return out +} + +// flagNonBoolCond emits a SemanticChange note when a V1 if condition is a +// literal `null` — V1 treats null as falsy while V2 errors on non-bool +// conditions. 
Broader analysis (variables, method calls) isn't feasible +// without type inference; this covers the obvious static cases and leaves +// runtime-only divergences to be caught by the general bool-strictness flag. +func (t *translator) flagNonBoolCond(cond v1ast.Expr, tokPos v1ast.Pos) { + lit, ok := cond.(*v1ast.Literal) + if !ok { + return + } + if lit.Kind == v1ast.LitBool { + return + } + t.rec.Note(Change{ + Line: tokPos.Line, Column: tokPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleAndOrSameLevel, + Explanation: "V1 accepts non-bool if-conditions (null is falsy; int/string error); V2 requires a boolean condition", + }) +} + +// translateMatchExpr rewrites `match [subject] { cases }`. +func (t *translator) translateMatchExpr(m *v1ast.MatchExpr) syntax.Expr { + out := &syntax.MatchExpr{TokenPos: pos(m.TokPos), BindingSlot: -1} + if m.Subject != nil { + out.Subject = t.translateExpr(m.Subject) + } else { + // Subject-less match (V1 boolean-case form). V2 requires each + // case pattern to evaluate to bool; V1 coerced non-bool patterns + // (int/string/null) silently. Flag so runtime divergences surface. 
+ t.rec.Note(Change{ + Line: m.TokPos.Line, Column: m.TokPos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMatchSubjectRebinds, + SpecRef: "§8", + Explanation: "V1 boolean-case match coerces non-boolean case patterns; V2 errors when a case doesn't evaluate to bool", + }) + } + hasWildcard := false + for _, c := range m.Cases { + if c.Wildcard { + hasWildcard = true + } + if lit, ok := c.Pattern.(*v1ast.Literal); ok && lit.Kind == v1ast.LitBool { + t.rec.Note(Change{ + Line: lit.TokPos.Line, Column: lit.TokPos.Column, + Severity: SeverityWarning, + Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§8", + Explanation: "V1 allows a boolean literal as a match case pattern (equality match); V2 rejects this — rewrite using `as` binding or an explicit boolean condition.", + }) + } + mc := syntax.MatchCase{Wildcard: c.Wildcard} + if c.Pattern != nil { + mc.Pattern = t.translateExpr(c.Pattern) + } + if body := t.translateExpr(c.Body); body != nil { + mc.Body = body + } else { + continue + } + out.Cases = append(out.Cases, mc) + } + // A V1 match without a wildcard that produces void inside a + // collection literal would elide the entry; V2 errors. Synthesize + // `_ => deleted()` so the elision carries over. + if !hasWildcard && t.currentCtx() == ctxCollectionLit { + t.rec.Rewritten(Change{ + Line: m.TokPos.Line, Column: m.TokPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§14#71", + Explanation: "V1 match-without-wildcard inside a collection literal elides the entry on no match; V2 errors — synthesized `_ => deleted()` to preserve the elision", + }) + out.Cases = append(out.Cases, syntax.MatchCase{ + Wildcard: true, + Body: &syntax.CallExpr{ + TokenPos: pos(m.TokPos), + Name: "deleted", + }, + }) + } + t.rec.Exact() + return out +} + +// translateMapExpr rewrites the path-scoped `recv.(expr)` form. 
+// +// V1 has two shapes: +// - `recv.(name -> body)` — bind name to recv, `this` unchanged. +// - `recv.(body)` — rebind `this` to recv inside body. +// +// V2's .into(lambda) method (§13.12) maps cleanly onto the named form: +// `recv.(name -> body)` → `recv.into(name -> body)`. The un-named form +// rebinds `this`, which V2 lambdas don't do directly — we synthesize a +// named param and rewrite references to `this` inside the body to that +// name (handled by pushThisRebind during translation of the body). +func (t *translator) translateMapExpr(m *v1ast.MapExpr) syntax.Expr { + recv := t.translateExpr(m.Recv) + if recv == nil { + return nil + } + var lam *syntax.LambdaExpr + var explanation string + if lambda, ok := m.Body.(*v1ast.Lambda); ok { + translatedLambda := t.translateLambda(lambda) + if translatedLambda == nil { + return nil + } + var asLam *syntax.LambdaExpr + asLam, ok = translatedLambda.(*syntax.LambdaExpr) + if !ok { + return nil + } + lam = asLam + explanation = "V1 recv.(name -> body) rewritten as V2 recv.into(name -> body)" + } else { + // Un-named form: synthesize a lambda that rebinds `this` to a fresh + // name while the body is translated, then wrap as .into($name -> body). 
+ const paramName = "__this" + t.pushScope(paramName) + t.pushThisRebind(paramName) + body := t.translateExpr(m.Body) + t.popThisRebind() + t.popScope() + if body == nil { + return nil + } + lam = &syntax.LambdaExpr{ + TokenPos: pos(m.TokPos), + Params: []syntax.Param{{Name: paramName, Pos: pos(m.TokPos), SlotIndex: -1}}, + Body: &syntax.ExprBody{Result: body}, + } + explanation = "V1 recv.(body) (un-named, this-rebinding) rewritten as V2 recv.into(__this -> body) with `this` references replaced" + } + + t.rec.Rewritten(Change{ + Line: m.TokPos.Line, Column: m.TokPos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§5.4", + Explanation: explanation, + }) + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: "into", + MethodPos: pos(m.TokPos), + Args: []syntax.CallArg{{Value: lam}}, + } +} + +// translateArgs translates call arguments. +func (t *translator) translateArgs(args []v1ast.CallArg) []syntax.CallArg { + out := make([]syntax.CallArg, 0, len(args)) + for _, a := range args { + v := t.translateExpr(a.Value) + if v == nil { + continue + } + out = append(out, syntax.CallArg{Name: a.Name, Value: v}) + } + return out +} diff --git a/internal/bloblang2/migrator/translator/fold_test.go b/internal/bloblang2/migrator/translator/fold_test.go new file mode 100644 index 000000000..1cfe02db5 --- /dev/null +++ b/internal/bloblang2/migrator/translator/fold_test.go @@ -0,0 +1,98 @@ +package translator_test + +import ( + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// TestFoldContextToTwoParam is the regression test for the GA4 case study's +// .fold lambda — V1's one-param context object becomes V2's two explicit +// (tally, value) params. The rewrite walks the V1 body, replacing +// `item.tally` and `item.value` field accesses with bare `tally`/`value` +// identifiers that resolve as lambda-param references on the V2 side. 
+func TestFoldContextToTwoParam(t *testing.T) { + cases := []struct { + name string + v1 string + wants []string // substrings that must appear in V2 output, in order + }{ + { + name: "GA4 key-value merge pattern", + v1: `let params = this.params. + map_each(p -> {"key": p.key, "value": p.value}). + fold({}, item -> item.tally.merge({(item.value.key): item.value.value})) +root.x = $params +`, + wants: []string{ + ".fold({},", + "(tally, value) ->", + "tally.merge(", + "value?.key", + "value?.value", + }, + }, + { + name: "integer accumulator", + v1: `root.sum = this.items.fold(0, item -> item.tally + item.value) +`, + wants: []string{ + ".fold(0,", + "(tally, value) ->", + "tally + value", + }, + }, + { + name: "array accumulator using .merge on tally", + v1: `root.out = this.xs.fold([], item -> item.tally.merge([item.value])) +`, + wants: []string{ + ".fold([],", + "(tally, value) ->", + "tally.merge([value])", + }, + }, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + rep, err := translator.Migrate(tc.v1, translator.Options{MinCoverage: 0}) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + out := rep.V2Mapping + idx := 0 + for _, want := range tc.wants { + j := strings.Index(out[idx:], want) + if j < 0 { + t.Fatalf("V2 output missing %q (or out of order).\nOutput:\n%s", want, out) + } + idx += j + len(want) + } + }) + } +} + +// TestFoldBareContextRefIsFlagged — if the V1 lambda references the +// context param directly (not via .tally/.value) the translator can't +// mechanically rewrite; it falls through and records a Warning so the +// user knows manual conversion is needed. 
+func TestFoldBareContextRefIsFlagged(t *testing.T) { + v1 := `root.x = this.xs.fold({}, item -> item) +` + rep, err := translator.Migrate(v1, translator.Options{MinCoverage: 0}) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + var sawWarning bool + for _, c := range rep.Changes { + if strings.Contains(c.Explanation, "fold") && c.Severity == translator.SeverityWarning { + sawWarning = true + break + } + } + if !sawWarning { + t.Errorf("expected a Warning change referring to .fold; got changes:\n%+v", rep.Changes) + } +} diff --git a/internal/bloblang2/migrator/translator/imports.go b/internal/bloblang2/migrator/translator/imports.go new file mode 100644 index 000000000..7f505f07c --- /dev/null +++ b/internal/bloblang2/migrator/translator/imports.go @@ -0,0 +1,152 @@ +package translator + +import ( + "fmt" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// siteKey identifies an import site for de-duplication and lookup. +// parentKey is the canonical key of the file the import appears in +// (empty for the main V1 source). importPath is the path string as +// written in the import statement. +type siteKey struct { + parentKey string + importPath string +} + +// fileSet is the result of walking the closure of imports rooted at +// the main V1 source. contents is keyed by canonical key; siteIndex +// maps each visited (parentKey, importPath) pair to the canonical +// key the resolver assigned. Unresolved imports are recorded on +// unresolved so the translator can surface RuleImportStatement +// Unsupported changes at the appropriate sites. +type fileSet struct { + contents map[string]string + siteIndex map[siteKey]string + unresolved map[siteKey]struct{} +} + +func newFileSet() *fileSet { + return &fileSet{ + contents: map[string]string{}, + siteIndex: map[siteKey]string{}, + unresolved: map[siteKey]struct{}{}, + } +} + +// buildFileSet walks the closure of imports rooted at the supplied +// main V1 source. 
Pre-populated entries in opts.Files take precedence: +// any importPath whose string appears as a key in Files is treated as +// already resolved with the path string as its canonical key. Imports +// not satisfied by Files consult opts.FileResolver, if set. +// +// Imports that a configured FileResolver declines (ok=false) are +// recorded on the returned fileSet's unresolved map and surfaced as +// RuleImportStatement Unsupported changes during translation. Imports +// with no Files entry and no resolver are skipped, not recorded. +// +// A V1 parse error inside an imported file is fatal: the caller is +// supplying broken content. +func buildFileSet(mainSource string, opts Options) (*fileSet, error) { + fs := newFileSet() + + // Seed contents with any pre-populated Files. Each entry is treated + // as already canonicalised with the map key as its canonical key. + for k, v := range opts.Files { + fs.contents[k] = v + } + + // BFS over the import graph. queue entries are (parentKey, source) + // pairs: the source whose imports we still need to discover. + type queueEntry struct { + parentKey string + source string + } + queue := []queueEntry{{parentKey: "", source: mainSource}} + visited := map[string]struct{}{} + + for len(queue) > 0 { + entry := queue[0] + queue = queue[1:] + + prog, err := v1ast.Parse(entry.source) + if err != nil { + if entry.parentKey == "" { + // Main source parse error: stop walking and return the + // partial file set with a nil error; the caller's + // migrateSource repeats the parse and reports it there. 
+ return fs, nil + } + return nil, fmt.Errorf("parsing imported file %q: %w", entry.parentKey, err) + } + + for _, stmt := range prog.Stmts { + var pathStr string + switch s := stmt.(type) { + case *v1ast.ImportStmt: + lit, ok := s.Path.(*v1ast.Literal) + if !ok { + continue + } + pathStr = lit.Str + case *v1ast.FromStmt: + lit, ok := s.Path.(*v1ast.Literal) + if !ok { + continue + } + pathStr = lit.Str + default: + continue + } + site := siteKey{parentKey: entry.parentKey, importPath: pathStr} + + canonical, content, resolved, consulted := resolveImport(opts, site) + if !resolved { + if consulted { + // FileResolver was set and explicitly declined this + // path — surface as unresolved so translateImport + // can flag it. Imports with no resolver and no + // Files entry fall through to the legacy path + // (V2 import statement emitted verbatim; the V2 + // sanity-check Parse surfaces the missing file). + fs.unresolved[site] = struct{}{} + } + continue + } + fs.siteIndex[site] = canonical + if _, seen := visited[canonical]; seen { + continue + } + visited[canonical] = struct{}{} + if _, exists := fs.contents[canonical]; !exists { + fs.contents[canonical] = content + } + queue = append(queue, queueEntry{parentKey: canonical, source: content}) + } + } + + return fs, nil +} + +// resolveImport applies the Files-then-FileResolver precedence to a +// single import site. Returns the canonical key, the V1 content, +// whether the import was resolved, and whether the resolver was +// consulted (i.e. FileResolver was set and called). The consulted +// flag distinguishes an explicit "could not resolve" answer from +// "no resolver configured" — the former is surfaced as Unsupported, +// the latter falls through to the legacy "emit V2 import verbatim" +// path so callers without a resolver retain the old behaviour. 
+func resolveImport(opts Options, site siteKey) (canonical, content string, ok, consulted bool) { + if c, exists := opts.Files[site.importPath]; exists { + return site.importPath, c, true, false + } + if opts.FileResolver == nil { + return "", "", false, false + } + canonical, content, ok = opts.FileResolver(site.parentKey, site.importPath) + if !ok { + return "", "", false, true + } + return canonical, content, true, true +} diff --git a/internal/bloblang2/migrator/translator/imports_test.go b/internal/bloblang2/migrator/translator/imports_test.go new file mode 100644 index 000000000..4bf4436ab --- /dev/null +++ b/internal/bloblang2/migrator/translator/imports_test.go @@ -0,0 +1,374 @@ +package translator_test + +import ( + "path" + "strings" + "testing" + "time" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// TestFileResolverSimple — a single import resolved via FileResolver. +// Confirms the resolver is consulted, canonical key is honoured, and +// the imported file appears in Report.V2Files under the canonical key. +func TestFileResolverSimple(t *testing.T) { + v1Helpers := `map double { root = this * 2 }` + v1Main := `import "./helpers.blobl" +root.x = 21.apply("double") +` + rep, err := translator.Migrate(v1Main, translator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + if parentKey == "" && importPath == "./helpers.blobl" { + return "/abs/helpers.blobl", v1Helpers, true + } + return "", "", false + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if rep.V2Files == nil { + t.Fatalf("expected Report.V2Files to be populated") + } + // Canonical-key emission: V2Files should be keyed by the canonical + // key the resolver returned. 
+ if _, ok := rep.V2Files["/abs/helpers.blobl"]; !ok { + t.Fatalf("expected V2Files to contain canonical key /abs/helpers.blobl, got keys: %v", keys(rep.V2Files)) + } +} + +// TestFileResolverTransitive — A imports B, B imports C, all resolved +// via FileResolver. parentKey for B's import should be A's canonical +// key, and the entire closure should land in V2Files. +func TestFileResolverTransitive(t *testing.T) { + files := map[string]string{ + "/abs/a.blobl": `import "./b.blobl" +map a_helper { root = this.b_helper.apply() } +`, + "/abs/b.blobl": `import "./c.blobl" +map b_helper { root = this.c_helper.apply() } +`, + "/abs/c.blobl": `map c_helper { root = this * 3 }`, + } + v1Main := `import "/abs/a.blobl" +root.x = 7.apply("a_helper") +` + var resolverCalls int + rep, err := translator.Migrate(v1Main, translator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + resolverCalls++ + var canonical string + if strings.HasPrefix(importPath, "/") { + canonical = importPath + } else { + canonical = path.Join(path.Dir(parentKey), importPath) + } + content, ok := files[canonical] + return canonical, content, ok + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + for _, want := range []string{"/abs/a.blobl", "/abs/b.blobl", "/abs/c.blobl"} { + if _, ok := rep.V2Files[want]; !ok { + t.Fatalf("expected V2Files to contain %q, got keys: %v", want, keys(rep.V2Files)) + } + } + if resolverCalls < 3 { + t.Fatalf("expected resolver to fire at least 3 times, got %d", resolverCalls) + } +} + +// TestFileResolverDedupesByCanonicalKey — two imports resolving to the +// same canonical key should result in a single V2Files entry, not two. 
+func TestFileResolverDedupesByCanonicalKey(t *testing.T) { + v1Helpers := `map double { root = this * 2 }` + v1Main := `import "./helpers.blobl" +import "helpers.blobl" +root.x = 21.apply("double") +` + var resolverCalls int + rep, err := translator.Migrate(v1Main, translator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + resolverCalls++ + return "/canonical/helpers.blobl", v1Helpers, true + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if got := len(rep.V2Files); got != 1 { + t.Fatalf("expected 1 V2 file (deduped), got %d: %v", got, keys(rep.V2Files)) + } + if resolverCalls != 2 { + t.Fatalf("expected resolver to fire once per import site, got %d", resolverCalls) + } +} + +// TestFileResolverUnresolvedFlagsUnsupported — a resolver that returns +// ok=false should produce an Unsupported RuleImportStatement change +// at the import site, and the V2 source should NOT contain the import. +func TestFileResolverUnresolvedFlagsUnsupported(t *testing.T) { + rep, err := translator.Migrate(`import "./missing.blobl" +root.x = "hi" +`, translator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + return "", "", false + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + var sawUnsupportedImport bool + for _, c := range rep.Changes { + if c.RuleID == translator.RuleImportStatement && c.Severity == translator.SeverityError { + sawUnsupportedImport = true + } + } + if !sawUnsupportedImport { + t.Fatalf("expected an Unsupported RuleImportStatement change, got: %v", rep.Changes) + } + if strings.Contains(rep.V2Mapping, "import") { + t.Fatalf("V2 source should not contain a dropped import statement, got:\n%s", rep.V2Mapping) + } +} + +// TestNoResolverNoFilesPreservesLegacyBehaviour — when neither Files +// nor FileResolver is set, the V2 import statement is emitted verbatim +// (legacy behaviour). 
The V2 sanity-check parse will still complain +// about the unresolved import via RuleEmittedInvalidV2. +func TestNoResolverNoFilesPreservesLegacyBehaviour(t *testing.T) { + rep, err := translator.Migrate(`import "./helpers.blobl" +root.x = "hi" +`, translator.Options{MinCoverage: 0}) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, `import "./helpers.blobl"`) { + t.Fatalf("expected legacy V2 source to contain verbatim import, got:\n%s", rep.V2Mapping) + } +} + +// TestV2ImportPathRewriter — V1 path strings are rewritten in the +// emitted V2 source, but canonical keys in V2Files are unaffected. +func TestV2ImportPathRewriter(t *testing.T) { + v1Helpers := `map double { root = this * 2 }` + rep, err := translator.Migrate(`import "./helpers.blobl" +root.x = 21.apply("double") +`, translator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + return "/abs/helpers.blobl", v1Helpers, true + }, + V2ImportPathRewriter: func(p string) string { + return strings.TrimSuffix(p, ".blobl") + ".v5.blobl" + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, `import "./helpers.v5.blobl"`) { + t.Fatalf("expected rewritten V2 import path, got:\n%s", rep.V2Mapping) + } + // Canonical-key emission is unaffected by the rewriter. + if _, ok := rep.V2Files["/abs/helpers.blobl"]; !ok { + t.Fatalf("expected V2Files to use canonical key (rewriter affects emitted source only), got keys: %v", keys(rep.V2Files)) + } +} + +// TestFilesTakesPrecedenceOverResolver — pre-populated Files entries +// shadow the resolver. Useful for in-memory test fixtures and for +// callers who want to override specific imports. 
+func TestFilesTakesPrecedenceOverResolver(t *testing.T) { + var resolverCalled bool + rep, err := translator.Migrate(`import "helpers.blobl" +root.x = 21.apply("double") +`, translator.Options{ + MinCoverage: 0, + Files: map[string]string{ + "helpers.blobl": `map double { root = this * 2 }`, + }, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + resolverCalled = true + return "should-not-happen", "should-not-happen", true + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if resolverCalled { + t.Fatalf("resolver should not be called when Files satisfies the import") + } + if _, ok := rep.V2Files["helpers.blobl"]; !ok { + t.Fatalf("expected V2Files keyed by Files entry's path, got: %v", keys(rep.V2Files)) + } +} + +// TestFromOnlyInlinesResolvedContent — `from "path"` as the entire +// V1 mapping body is replaced by the migrated V2 content of the +// referenced file. The Report records a Rewritten RuleFromStatement +// change. +func TestFromOnlyInlinesResolvedContent(t *testing.T) { + helpers := `root.id = this.id +root.upper_name = this.name.uppercase() +` + rep, err := translator.Migrate(`from "./helpers.blobl"`, translator.Options{ + MinCoverage: 0, + Verbose: true, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + if importPath == "./helpers.blobl" { + return "/abs/helpers.blobl", helpers, true + } + return "", "", false + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, "output.id") { + t.Fatalf("expected helpers.blobl to be migrated and inlined, got:\n%s", rep.V2Mapping) + } + if strings.Contains(rep.V2Mapping, "from ") { + t.Fatalf(`V2 output should not contain a "from" statement, got:\n%s`, rep.V2Mapping) + } + var sawFromRewrite bool + for _, c := range rep.Changes { + if c.RuleID == translator.RuleFromStatement && c.Severity == translator.SeverityInfo { + sawFromRewrite = true + } + } + if !sawFromRewrite { + t.Fatalf("expected a 
Rewritten RuleFromStatement change, got: %v", rep.Changes) + } +} + +// TestFromTransitiveInlining — A from-references B, B from-references C. +// The closure walker resolves all three and the migrator inlines C's +// content into B and B's (which is C's) into A. +func TestFromTransitiveInlining(t *testing.T) { + files := map[string]string{ + "/abs/a.blobl": `from "./b.blobl"`, + "/abs/b.blobl": `from "./c.blobl"`, + "/abs/c.blobl": `root.kind = "leaf"`, + } + rep, err := translator.Migrate(`from "/abs/a.blobl"`, translator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + var canonical string + if strings.HasPrefix(importPath, "/") { + canonical = importPath + } else { + // Resolve relative paths against parent's directory. + canonical = "/abs/" + strings.TrimPrefix(importPath, "./") + } + content, ok := files[canonical] + return canonical, content, ok + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, `output.kind = "leaf"`) { + t.Fatalf("expected transitive inlining to surface c.blobl content, got:\n%s", rep.V2Mapping) + } +} + +// TestFromUnresolvedFallsThroughToUnsupported — when the closure +// walker can't resolve the from path, the existing Unsupported +// behaviour is preserved (no inlining, RuleFromStatement Error). 
+func TestFromUnresolvedFallsThroughToUnsupported(t *testing.T) { + _, err := translator.Migrate(`from "./missing.blobl"`, translator.Options{ + MinCoverage: 0.5, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + return "", "", false + }, + }) + cerr, ok := err.(*translator.CoverageError) + if !ok { + t.Fatalf("expected *CoverageError, got %T: %v", err, err) + } + var sawFromUnsupported bool + for _, c := range cerr.Report.Changes { + if c.RuleID == translator.RuleFromStatement { + sawFromUnsupported = true + } + } + if !sawFromUnsupported { + t.Fatalf("expected a RuleFromStatement change, got: %v", cerr.Report.Changes) + } +} + +// TestFromMixedWithOtherStmtsFallsThroughToUnsupported — `from` +// alongside other statements is not the simple whole-mapping replace +// case. Falls through to current Unsupported behaviour. +func TestFromMixedWithOtherStmtsFallsThroughToUnsupported(t *testing.T) { + helpers := `root.id = this.id` + rep, err := translator.Migrate(`from "./helpers.blobl" +root.extra = "foo" +`, translator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + return "/abs/helpers.blobl", helpers, true + }, + }) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + // The `from` should remain Unsupported because it's mixed with + // other statements; we don't try to inline in that case. + var sawUnsupported bool + for _, c := range rep.Changes { + if c.RuleID == translator.RuleFromStatement && c.Severity == translator.SeverityError { + sawUnsupported = true + } + } + if !sawUnsupported { + t.Fatalf("expected mixed-statement from to fall through to Unsupported, got: %v", rep.Changes) + } +} + +// TestFromCycleDoesNotInfiniteLoop — A from B, B from A. The closure +// walker dedups by canonical key so it terminates; the fixpoint can +// make no progress (each side waits for the other) and the +// post-fixpoint cleanup translates them anyway. 
This test asserts +// the migrator returns rather than hanging, regardless of the +// (broken) input semantics. +func TestFromCycleDoesNotInfiniteLoop(t *testing.T) { + files := map[string]string{ + "/abs/a.blobl": `from "/abs/b.blobl"`, + "/abs/b.blobl": `from "/abs/a.blobl"`, + } + done := make(chan struct{}) + go func() { + defer close(done) + _, _ = translator.Migrate(`from "/abs/a.blobl"`, translator.Options{ + MinCoverage: 0, + Verbose: true, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + content, ok := files[importPath] + return importPath, content, ok + }, + }) + }() + select { + case <-done: + // Returned cleanly — exact V2 content for cyclic input is not + // well-defined (it's broken V1) so we don't assert on it. + case <-time.After(2 * time.Second): + t.Fatalf("Migrate hung on a from-cycle input — fixpoint or closure walker is non-terminating") + } +} + +func keys[K comparable, V any](m map[K]V) []K { + out := make([]K, 0, len(m)) + for k := range m { + out = append(out, k) + } + return out +} diff --git a/internal/bloblang2/migrator/translator/methods.go b/internal/bloblang2/migrator/translator/methods.go new file mode 100644 index 000000000..bd3e637c9 --- /dev/null +++ b/internal/bloblang2/migrator/translator/methods.go @@ -0,0 +1,947 @@ +package translator + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// methodRewrite applies V1 → V2 method-shape translations on a V1 MethodCall. +// Returns a non-nil V2 expression on success, or nil to signal "fall through +// to the default 1:1 translation". +// +// Rules are ordered by the V1 method name; each rule may: +// - rename the method (e.g. map_each -> map), +// - convert the method call to a different V2 node shape (e.g. index -> []), +// - leave it alone (default). 
+func (t *translator) methodRewrite(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + // Custom rules win on name collision (design P2). When a registered + // rule signals handled=true, its result short-circuits the + // built-in switch entirely. Returning a nil expression with + // handled=true is the rule's way of signalling Unsupported — the + // translator still falls back to the default 1:1 translation but + // the rule will already have recorded an Error-severity Change. + if rule, ok := t.customMethodRules[m.Name]; ok { + if out, handled := rule(t, m, recv); handled { + if out == nil { + return nil + } + return out + } + } + switch m.Name { + + // ----- Simple renames (V2 name differs, same shape) ----- + case "map_each": + // V1 .map_each accepts arrays and objects; V2 splits that: `.map` + // for arrays and `.map_values` for objects. Detect object-literal + // receivers at translate time; everything else defaults to `.map` + // with a SemanticChange flag so object-receiver cases surface in + // the Report. The single arg is a ParamQuery in V1 — wrap as a + // V2 lambda when the user wrote a bare query rather than a + // lambda. + if _, isObj := m.Recv.(*v1ast.ObjectLit); isObj { + return t.queryFormRename(m, recv, "map_values", nil) + } + return t.queryFormRename(m, recv, "map", &Change{ + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 .map_each() accepts arrays and objects; V2 .map() is array-only — use .map_values() if the receiver is an object", + }) + case "enumerated": + return t.simpleRename(m, recv, "enumerate") + case "key_values": + return t.simpleRename(m, recv, "iter") + case "map_each_key": + // V1 .map_each_key == V2 .map_keys (exact match — both take lambda). 
+ return t.queryFormRename(m, recv, "map_keys", nil) + case "assign": + // V1 .assign() is a deep recursive merge of nested objects; V2 + // .merge() is shallow at the top level (nested values are + // replaced, not recursively merged). Flag so callers audit + // nested object usage. + return t.rewrittenRename(m, recv, "merge", + Change{ + RuleID: RuleMethodDoesNotExist, + Severity: SeverityWarning, + Category: CategorySemanticChange, + SpecRef: "§14#50", + Explanation: "V1 .assign() recursively deep-merges nested objects; V2 .merge() replaces nested values rather than merging", + }) + + // ----- Array indexing: .index(n) -> [n] ----- + case "index": + return t.indexToBracket(m, recv) + + // ----- Dynamic key access: .get(k) -> [k] ----- + case "get": + return t.indexToBracket(m, recv) + + // ----- Apply: recv.apply("name") -> name(recv) ----- + case "apply": + return t.applyToCall(m, recv) + + // ----- Numeric coercion: V1 .number() -> V2 .float64() ----- + case "number": + return t.rewrittenRename(m, recv, "float64", + Change{ + RuleID: RuleMethodDoesNotExist, + Severity: SeverityWarning, + Category: CategorySemanticChange, + Explanation: "V1 .number() is float64; V2 .float64() preserves that, but downstream code expecting int64 results may break", + }) + + // ----- Variadic .without("a","b","c") -> .without(["a","b","c"]) ----- + case "without": + return t.variadicArgsToArray(m, recv, "without") + // ----- Variadic .with(...) and .zip(...) follow the same pattern. + case "with", "zip": + return t.variadicArgsToArray(m, recv, m.Name) + // ----- V1 `.format(a, b, ...)` (variadic) -> V2 `.format([a, b, ...])`. + case "format": + return t.variadicArgsToArray(m, recv, "format") + + // ----- V1 timestamp method renames. + // V2 ts_format / ts_parse use strftime/strptime exclusively, so V1 + // callsites that already use the strftime/strptime variants rename + // directly. 
The V1 Go-layout variants (`format_timestamp`, + // `parse_timestamp`) cannot be auto-rewritten because V2 has no + // Go-layout method — flag with a Note instead. + case "ts_strftime", "format_timestamp_strftime": + return t.simpleRename(m, recv, "ts_format") + case "ts_strptime", "parse_timestamp_strptime": + return t.simpleRename(m, recv, "ts_parse") + case "format_timestamp", "parse_timestamp": + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 ." + m.Name + "() uses Go's reference-time layout; V2 ts_format / ts_parse use strftime/strptime — convert the format string and rename to ." + map[string]string{"format_timestamp": "ts_format", "parse_timestamp": "ts_parse"}[m.Name] + "() manually", + }) + return nil + case "format_timestamp_unix": + return t.simpleRename(m, recv, "ts_unix") + case "format_timestamp_unix_milli": + return t.simpleRename(m, recv, "ts_unix_milli") + case "format_timestamp_unix_micro": + return t.simpleRename(m, recv, "ts_unix_micro") + case "format_timestamp_unix_nano": + return t.simpleRename(m, recv, "ts_unix_nano") + + // ----- .find(value) -> .index_of(value) ----- + case "find": + return t.findValueToIndexOf(m, recv) + + // ----- V1 .fold single-param ctx-object lambda -> V2 two-param (tally, value) lambda + case "fold": + return t.foldContextToTwoParam(m, recv) + + // ----- .exists(path) -> (path != null).catch(false) ----- + case "exists": + return t.existsToNullCheck(m, recv) + + // ----- V2 .catch requires a lambda; V1 accepts a plain value ----- + case "catch": + return t.catchValueToLambda(m, recv) + + // ----- Flag known semantic divergences without rewriting ----- + case "length": + t.flagMethodDivergence(m, "V1 .length() on strings counts bytes; V2 counts codepoints (§14#40)") + return nil + case "or": + return t.orToOrPlusCatch(m, recv) + + // ----- V1 .merge is polymorphic (object OR 
array); V2 splits: + // .merge for objects, .concat for arrays. Detect array shape from + // the receiver / arg and rewrite; otherwise pass through + warn. + case "merge": + return t.mergePolymorphicRewrite(m, recv) + case "filter", "filter_entries", "all", "any": + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 " + "." + m.Name + "() accepts arrays and objects; V2 is strict about receiver type", + }) + return t.queryFormRename(m, recv, m.Name, nil) + case "find_by", "find_all_by": + // V1 .find_by / .find_all_by take a ParamQuery predicate where + // `this` and bare idents resolve as fields of the current + // element. V2 requires an explicit lambda. Wrap unconditionally. + return t.queryFormRename(m, recv, m.Name, nil) + case "sort_by": + return t.queryFormRename(m, recv, m.Name, nil) + case "unique": + // V1 .unique() with no args = identity comparison; with one arg + // it's a ParamQuery key extractor that needs wrapping. + if len(m.Args) == 1 { + return t.queryFormRename(m, recv, "unique", nil) + } + return nil + case "sum", "min", "max": + // V1 .sum/.min/.max are numeric-only and always return float64. + // V2 is typed (int64 stays int64) and .min/.max also accept + // strings (lexicographic). Flag both angles so downstream type + // comparisons and expected-error tests surface the divergence. + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 ." 
+ m.Name + "() is numeric-only and returns float64; V2 preserves integer type and (for min/max) also accepts strings", + }) + return nil + case "sort": + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 .sort() accepts any element type but produces lexicographic ordering; V2 rejects non-scalar or non-numeric elements outright", + }) + return nil + case "reverse": + // V1 .reverse() errors on empty arrays/strings; V2 returns empty. + // V1 also rejects non-array/non-string types where V2 may be more + // lenient. + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 .reverse() errors on empty or non-sequence receivers; V2 returns the empty receiver", + }) + return nil + case "abs", "floor", "ceil", "round": + // V1 numeric methods return an untyped "number"; V2 preserves the + // typed variant (int64 stays int64, float64 stays float64). Runtime + // values compare equal but type-introspection / JSON serialisation + // differ. + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§14#5", + Explanation: "V1 ." + m.Name + "() returns an unspecified numeric type; V2 preserves int64/float64 — downstream code branching on .type() may behave differently", + }) + return nil + case "type": + // V1 .type() collapses int and float to "number"; V2 reports the + // precise "int64"/"float64"/"timestamp" strings. 
+ t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§13", + Explanation: "V1 .type() returns \"number\" for any integer/float; V2 reports int64/float64 separately (and timestamp as timestamp, not string)", + }) + return nil + case "parse_json", "parse_yaml": + // V1 returns all numbers as float64; V2 distinguishes int64 and + // float64 based on the serialised form. + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§13", + Explanation: "V1 ." + m.Name + "() returns all numbers as float64; V2 distinguishes int64 and float64 by serialised form", + }) + return nil + case "index_of": + // V1 .index_of on strings counts bytes; V2 counts codepoints. + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleStringLengthBytes, + SpecRef: "§14#40", + Explanation: "V1 .index_of() on strings counts bytes; V2 counts codepoints", + }) + return nil + case "string": + // V1 .string() on an integer-valued float64 formats as "5"; V2 + // preserves the float form and emits "5.0". + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 .string() strips trailing zeros from integer-valued floats; V2 preserves the float form (5.0 stays \"5.0\")", + }) + return nil + } + return nil +} + +// catchValueToLambda wraps V1 `.catch(value)` as V2 `.catch(_ -> value)`. +// V2's .catch takes a lambda receiving the error; V1 accepts either a value +// or a lambda. Plain values are wrapped; an argument that is already a V1 +// lambda is translated 1:1 with a note about the changed error shape. 
+func (t *translator) catchValueToLambda(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + if len(m.Args) != 1 { + return nil + } + arg := m.Args[0].Value + // If already a V1 lambda, translate it 1:1 — no wrap needed. Emit a + // note: V1 passes the error message as a string; V2 passes an error + // object `{"what": msg}`, so handlers that concatenate or format the + // argument will produce different output. + if _, isLambda := arg.(*v1ast.Lambda); isLambda { + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleOrCatchesErrors, + SpecRef: "§12.2", + Explanation: "V1 .catch(err -> ...) receives the error message as a string; V2 receives an error object of shape {\"what\": msg}", + }) + return nil + } + value := t.translateExpr(arg) + if value == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleOrCatchesErrors, + SpecRef: "§12.2", + Explanation: "V1 .catch(value) wrapped in lambda for V2: .catch(_ -> value)", + }) + wrapped := &syntax.LambdaExpr{ + TokenPos: pos(m.NamePos), + Params: []syntax.Param{{Discard: true, Pos: pos(m.NamePos), SlotIndex: -1}}, + Body: &syntax.ExprBody{Result: value}, + } + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: "catch", + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{{Value: wrapped}}, + } +} + +// simpleRename emits a V2 MethodCallExpr with a different method name, all +// other fields identical. Counts as Exact coverage. 
+func (t *translator) simpleRename(m *v1ast.MethodCall, recv syntax.Expr, newName string) syntax.Expr { + args := t.translateArgs(m.Args) + t.rec.Exact() + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: newName, + MethodPos: pos(m.NamePos), + Args: args, + Named: m.Named, + } +} + +// flagMethodDivergence emits a SemanticChange Change without rewriting the +// method call itself. Useful for methods where V1 and V2 names match but +// behaviour legitimately differs — the migrator can't always tell at +// translate time whether the divergence applies, so warn unconditionally +// and let the caller audit. +func (t *translator) flagMethodDivergence(m *v1ast.MethodCall, reason string) { + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleStringLengthBytes, SpecRef: "§14#40", + Explanation: reason, + }) +} + +// rewrittenRename is simpleRename but emits a Change record describing the +// rewrite. +func (t *translator) rewrittenRename(m *v1ast.MethodCall, recv syntax.Expr, newName string, ch Change) syntax.Expr { + args := t.translateArgs(m.Args) + ch.Line = m.NamePos.Line + ch.Column = m.NamePos.Column + t.rec.Rewritten(ch) + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: newName, + MethodPos: pos(m.NamePos), + Args: args, + Named: m.Named, + } +} + +// indexToBracket translates `recv.index(n)` or `recv.get(k)` into V2's +// bracket indexing: recv[n] / recv[k]. Counts as Rewritten (idiom shift). +func (t *translator) indexToBracket(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + if len(m.Args) != 1 { + return nil + } + idx := t.translateExpr(m.Args[0].Value) + if idx == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleNoBracketIndexing, + SpecRef: "§14#10", + Explanation: "V1 ." 
+ m.Name + "() rewritten as V2 [] indexing", + }) + // V2 [] is type-strict: an out-of-bounds array index or a non-whole + // float index errors where V1 silently returned null. Flag so the + // divergence surfaces if the receiver or index isn't statically safe. + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleNoBracketIndexing, + Explanation: "V1 " + "." + m.Name + "() returns null on missing key or out-of-bounds index; V2 errors on bounds/type mismatches", + }) + return &syntax.IndexExpr{ + Receiver: recv, + Index: idx, + LBracketPos: pos(m.NamePos), + } +} + +// variadicArgsToArray rewrites V1 variadic-style method calls +// `.NAME(a, b, c)` into V2 `.NAME([a, b, c])`. Used for V1 methods whose +// V2 counterpart was redefined to take a single array argument now that +// V2 rejects variadic plugins at compile time (without, with, zip, format). +// +// If the V1 call already passes a single array literal the rewrite is a +// no-op rename. +func (t *translator) variadicArgsToArray(m *v1ast.MethodCall, recv syntax.Expr, name string) syntax.Expr { + if len(m.Args) == 1 { + if _, ok := m.Args[0].Value.(*v1ast.ArrayLit); ok { + return t.simpleRename(m, recv, name) + } + } + elems := make([]syntax.Expr, 0, len(m.Args)) + for _, a := range m.Args { + v := t.translateExpr(a.Value) + if v == nil { + continue + } + elems = append(elems, v) + } + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 variadic ." + name + "(...) rewritten as V2 ." 
+ name + "([...])", + }) + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: name, + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{{ + Value: &syntax.ArrayLiteral{LBracketPos: pos(m.NamePos), Elements: elems}, + }}, + } +} + +// queryFormRename translates a V1 method call whose final argument is a +// ParamQuery (V1 rebinds `this` and bare idents to the per-element +// context). When the V1 argument is already an explicit lambda we +// translate through 1:1; otherwise we synthesize a V2 lambda that +// rebinds `this` to a fresh parameter so the V2 surface (which requires +// an explicit lambda) sees the same effective predicate. +// +// `newName` selects the V2 method name (often the same as V1). If +// `note` is non-nil it is recorded as a Rewritten change describing the +// rename. +func (t *translator) queryFormRename(m *v1ast.MethodCall, recv syntax.Expr, newName string, note *Change) syntax.Expr { + args := make([]syntax.CallArg, 0, len(m.Args)) + wrapped := false + for i, a := range m.Args { + if i == len(m.Args)-1 { + lam, didWrap := t.translateQueryFormPredicate(a.Value, m.NamePos) + if lam == nil { + return nil + } + args = append(args, syntax.CallArg{Name: a.Name, Value: lam}) + wrapped = didWrap + continue + } + v := t.translateExpr(a.Value) + if v == nil { + return nil + } + args = append(args, syntax.CallArg{Name: a.Name, Value: v}) + } + switch { + case note != nil: + ch := *note + ch.Line = m.NamePos.Line + ch.Column = m.NamePos.Column + if wrapped { + ch.Explanation += "; V1 query-form wrapped as V2 (__v -> ...)" + } + t.rec.Rewritten(ch) + case wrapped: + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 ." + m.Name + "(query-form) wrapped as V2 ." + newName + "(__v -> ...) 
— V2 requires an explicit lambda", + }) + default: + t.rec.Exact() + } + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: newName, + MethodPos: pos(m.NamePos), + Args: args, + Named: m.Named, + } +} + +// translateQueryFormPredicate translates a single V1 ParamQuery argument. +// Returns the V2 lambda expression and a `wrapped` flag indicating whether +// a lambda had to be synthesized (true when the V1 source used the +// query form rather than an explicit lambda). +func (t *translator) translateQueryFormPredicate(arg v1ast.Expr, namePos v1ast.Pos) (syntax.Expr, bool) { + if _, ok := arg.(*v1ast.Lambda); ok { + return t.translateExpr(arg), false + } + const paramName = "__v" + t.pushScope(paramName) + t.pushThisRebind(paramName) + body := t.translateExpr(arg) + t.popThisRebind() + t.popScope() + if body == nil { + return nil, false + } + return &syntax.LambdaExpr{ + TokenPos: pos(namePos), + Params: []syntax.Param{{Name: paramName, Pos: pos(namePos), SlotIndex: -1}}, + Body: &syntax.ExprBody{Result: body}, + }, true +} + +// findValueToIndexOf rewrites V1 `.find(value)` (returns the index of the +// first matching element, or -1) to V2 `.index_of(value)` (same signature +// and semantics). V2's stdlib `find` exists but takes a lambda and returns +// the matching element — not a semantic match for V1's value-based find. 
+func (t *translator) findValueToIndexOf(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + if len(m.Args) != 1 { + return nil + } + needle := t.translateExpr(m.Args[0].Value) + if needle == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 .find(value) rewritten as V2 .index_of(value) (V2 .find takes a lambda and returns an element).", + }) + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: "index_of", + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{{Value: needle}}, + } +} + +// existsToNullCheck rewrites V1 `.exists()` into V2. V1 has two shapes: +// +// - `.exists(key)` on an object: checks for key presence -> V2 `.has_key(key)`. +// - `.exists()` on a value: non-null check -> V2 `(recv != null).catch(false)`. +func (t *translator) existsToNullCheck(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + // One-arg form is `has_key` on V2. + if len(m.Args) == 1 { + return t.rewrittenRename(m, recv, "has_key", + Change{ + RuleID: RuleMethodDoesNotExist, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + Explanation: "V1 .exists(key) rewritten as V2 .has_key(key)", + }) + } + if len(m.Args) != 0 { + t.rec.Unsupported(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 .exists() with more than one arg has no V2 rewrite", + }) + return nil + } + // Zero-arg form: recv != null, caught to false for non-null receivers + // with unreadable types. 
+ t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + Explanation: "V1 .exists() rewritten as (recv != null).catch(false)", + }) + neq := &syntax.BinaryExpr{ + Left: recv, + Op: syntax.NE, + OpPos: pos(m.NamePos), + Right: &syntax.LiteralExpr{TokenPos: pos(m.NamePos), TokenType: syntax.NULL, Value: "null"}, + } + return &syntax.MethodCallExpr{ + Receiver: neq, + Method: "catch", + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{{ + Value: &syntax.LiteralExpr{TokenPos: pos(m.NamePos), TokenType: syntax.FALSE, Value: "false"}, + }}, + } +} + +// applyToCall translates `recv.apply("mapName")` into V2 `mapName(recv)`. +// V1 maps take a single implicit receiver passed via apply; V2 maps are +// ordinary callables so the receiver becomes the first positional argument. +func (t *translator) applyToCall(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + if len(m.Args) != 1 { + return nil + } + // The argument should be a string literal naming the map. If it's + // something dynamic (e.g. .apply(this.kind)), V2 can't express the + // dynamic dispatch — flag as unsupported. + nameLit, ok := m.Args[0].Value.(*v1ast.Literal) + if !ok || (nameLit.Kind != v1ast.LitString && nameLit.Kind != v1ast.LitRawString) { + t.rec.Unsupported(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + RuleID: RuleUnsupportedConstruct, + Explanation: "V1 .apply() with dynamic map name has no V2 equivalent", + }) + return nil + } + // If the map lives in an imported namespace, qualify the V2 call. + namespace, known := t.mapNamespace[nameLit.Str] + if !known { + // V1 imports share a flat table so a map from a transitively + // imported file is reachable by bare name; V2 namespaces each + // import explicitly and doesn't re-export. 
If we can't resolve + // the name, emit the unqualified call and flag — the V2 output + // will then fail with an error pointing at the missing map. + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleImportStatement, + SpecRef: "§10.2", + Explanation: "V1 .apply(\"" + nameLit.Str + "\") resolves across transitive imports; V2 requires an explicit namespace — add `import \"x\" as ns` and call `ns::" + nameLit.Str + "()`", + }) + } + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMapDeclTranslation, + SpecRef: "§10.2", + Explanation: "V1 recv.apply(\"name\") rewritten as V2 name(recv)", + }) + // V2 enforces a runtime recursion-depth limit on map calls where V1 + // did not. Flag so recursive / mutually-recursive maps surface. + t.rec.Note(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategorySemanticChange, + RuleID: RuleMapDeclTranslation, + Explanation: "V2 enforces a runtime recursion-depth limit on map calls that V1 did not — deeply recursive maps may error in V2", + }) + return &syntax.CallExpr{ + TokenPos: pos(m.NamePos), + Name: nameLit.Str, + Namespace: namespace, + Args: []syntax.CallArg{{Value: recv}}, + } +} + +// foldContextToTwoParam rewrites V1 `.fold(init, ctx -> ...ctx.tally...ctx.value...)` +// into V2 `.fold(init, (tally, value) -> ...)`. +// +// V1's fold lambda receives a single context object with `.tally` and +// `.value` fields; V2 takes two explicit parameters. We walk the V1 body +// and replace `.tally` / `.value` field accesses +// with bare identifiers that resolve to the new V2 parameters, then +// assemble a two-param V2 lambda. 
If the body references the context +// parameter directly (not via .tally / .value) the shape isn't safely +// mechanical — we fall through to the default translation with a warning +// so the caller knows the V2 output will error at runtime. +func (t *translator) foldContextToTwoParam(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + if len(m.Args) != 2 { + return nil + } + lam, ok := m.Args[1].Value.(*v1ast.Lambda) + if !ok || lam.Discard { + // Map-ref or discard param — V1 also supports these but the shape + // isn't recognisable from here. Pass through; translator will emit + // V2 that errors and the warning surfaces the issue. + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§13", + Explanation: "V1 .fold() second argument must be a one-param lambda for automatic V1→V2 rewrite; manually convert to V2 .fold(init, (tally, value) -> ...)", + }) + return nil + } + + paramName := lam.Param + rewritten, unsafeRef := rewriteFoldContext(lam.Body, paramName) + if unsafeRef { + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§13", + Explanation: "V1 .fold() lambda references its context param outside .tally/.value; V2 has no single-value accessor — rewrite manually to use (tally, value) params", + }) + return nil + } + + // Translate the initial value and the rewritten body. The two synthetic + // V2 param names are pushed onto the scope stack so the rewritten bare + // `tally` / `value` idents resolve as lambda-param references rather + // than the default V1 bare-ident-to-input rewrite. 
+ initial := t.translateExpr(m.Args[0].Value) + if initial == nil { + return nil + } + t.pushScope("tally", "value") + v2Body := t.translateExpr(rewritten) + t.popScope() + if v2Body == nil { + return nil + } + + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§13", + Explanation: "V1 .fold(init, ctx -> ...ctx.tally...ctx.value...) rewritten as V2 .fold(init, (tally, value) -> ...)", + }) + + lamPos := pos(lam.ParamPos) + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: "fold", + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{ + {Value: initial}, + {Value: &syntax.LambdaExpr{ + TokenPos: lamPos, + Params: []syntax.Param{ + {Name: "tally", Pos: lamPos, SlotIndex: -1}, + {Name: "value", Pos: lamPos, SlotIndex: -1}, + }, + Body: &syntax.ExprBody{Result: v2Body}, + }}, + }, + } +} + +// orToOrPlusCatch rewrites V1 `.or(x)` (which catches null AND errors) as +// V2 `.or(x).catch(_ -> x)` so both branches are preserved. Mirrors the +// `|` coalesce rewrite in translateBinary. 
+func (t *translator) orToOrPlusCatch(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + if len(m.Args) != 1 { + return nil + } + fallback := t.translateExpr(m.Args[0].Value) + if fallback == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleOrCatchesErrors, + SpecRef: "§12.2", + Explanation: "V1 .or() catches null AND errors; rewritten as V2 .or(x).catch(_ -> x) to preserve both paths", + }) + orCall := &syntax.MethodCallExpr{ + Receiver: recv, + Method: "or", + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{{Value: fallback}}, + } + catchLambda := &syntax.LambdaExpr{ + TokenPos: pos(m.NamePos), + Params: []syntax.Param{{Discard: true, Pos: pos(m.NamePos), SlotIndex: -1}}, + Body: &syntax.ExprBody{Result: fallback}, + } + return &syntax.MethodCallExpr{ + Receiver: orCall, + Method: "catch", + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{{Value: catchLambda}}, + } +} + +// mergePolymorphicRewrite handles V1 .merge(). V1 is polymorphic: +// +// - Object receiver + object arg → object-level merge (V2 .merge) +// - Array receiver + array arg → array concatenation (V2 .concat) +// +// V2 splits these into separate methods. When both the V1 receiver and +// the V1 argument have a statically-visible array shape (array literal +// or a known array-returning method call), we rewrite to `.concat`. +// Otherwise we leave the call as `.merge` and emit a warning. +func (t *translator) mergePolymorphicRewrite(m *v1ast.MethodCall, recv syntax.Expr) syntax.Expr { + if len(m.Args) == 1 && isArrayExpr(m.Recv) && isArrayExpr(m.Args[0].Value) { + // Rewrite to V2 .concat(arg). 
+ arg := t.translateExpr(m.Args[0].Value) + if arg == nil { + return nil + } + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityInfo, Category: CategoryIdiomRewrite, + RuleID: RuleMethodDoesNotExist, + SpecRef: "§14#50", + Explanation: "V1 .merge() on array receiver+arg rewritten as V2 .concat() (V2 .merge is object-only)", + }) + return &syntax.MethodCallExpr{ + Receiver: recv, + Method: "concat", + MethodPos: pos(m.NamePos), + Args: []syntax.CallArg{{Value: arg}}, + } + } + // Default: pass through as .merge() with a warning. + t.rec.Rewritten(Change{ + Line: m.NamePos.Line, Column: m.NamePos.Column, + Severity: SeverityWarning, Category: CategorySemanticChange, + RuleID: RuleMethodDoesNotExist, SpecRef: "§14#50", + Explanation: "V1 .merge() is polymorphic (objects AND arrays); V2 .merge is object-only — use .concat(other) for arrays", + }) + return nil +} + +// isArrayExpr reports whether a V1 expression is statically known to +// produce an array value. Used by merge-polymorphic dispatch and any +// future receiver-shape rules. +func isArrayExpr(e v1ast.Expr) bool { + switch n := e.(type) { + case *v1ast.ArrayLit: + return true + case *v1ast.MethodCall: + switch n.Name { + case "map_each", "map", "filter", "filter_entries", + "sort", "sort_by", "unique", "reverse", "without", + "slice", "values", "keys", "enumerated", "flatten", + "find_all", "find_all_by", "collapse", "explode", + "concat": + return true + case "split": + // .split() on a string returns an array of strings. + return true + } + case *v1ast.FunctionCall: + if n.Name == "range" { + return true + } + case *v1ast.ParenExpr: + return isArrayExpr(n.Inner) + case *v1ast.IfExpr: + // Both branches must be arrays. 
+ for _, b := range n.Branches { + if !isArrayExpr(b.Body) { + return false + } + } + if n.Else != nil && !isArrayExpr(n.Else) { + return false + } + return true + } + return false +} + +// rewriteFoldContext walks the V1 expression tree and replaces every +// `paramName.tally` / `paramName.value` field access with bare +// `tally` / `value` identifiers. The walk is in-place but the caller +// owns the V1 AST by this point (it's being discarded after translation). +// Returns (rewritten, unsafeRef) where unsafeRef is true when we found a +// reference to `paramName` outside the .tally/.value pattern — the +// caller should bail on the rewrite in that case. +func rewriteFoldContext(e v1ast.Expr, paramName string) (v1ast.Expr, bool) { + unsafe := false + var walk func(v1ast.Expr) v1ast.Expr + walk = func(e v1ast.Expr) v1ast.Expr { + if e == nil { + return nil + } + switch n := e.(type) { + case *v1ast.Ident: + // Bare reference to the context param — cannot safely rewrite. + if n.Name == paramName { + unsafe = true + } + return n + case *v1ast.FieldAccess: + if id, ok := n.Recv.(*v1ast.Ident); ok && id.Name == paramName { + switch n.Seg.Name { + case "tally": + return &v1ast.Ident{Name: "tally", TokPos: id.TokPos} + case "value": + return &v1ast.Ident{Name: "value", TokPos: id.TokPos} + default: + // .something_else — unexpected, bail. + unsafe = true + return n + } + } + n.Recv = walk(n.Recv) + return n + case *v1ast.MethodCall: + n.Recv = walk(n.Recv) + for i := range n.Args { + n.Args[i].Value = walk(n.Args[i].Value) + } + return n + case *v1ast.FunctionCall: + for i := range n.Args { + n.Args[i].Value = walk(n.Args[i].Value) + } + return n + case *v1ast.MapExpr: + n.Recv = walk(n.Recv) + n.Body = walk(n.Body) + return n + case *v1ast.Lambda: + // A nested lambda shadowing paramName binds a fresh value; don't + // descend into it (the param inside is a different variable).
+ if n.Param == paramName { + return n + } + n.Body = walk(n.Body) + return n + case *v1ast.BinaryExpr: + n.Left = walk(n.Left) + n.Right = walk(n.Right) + return n + case *v1ast.UnaryExpr: + n.Operand = walk(n.Operand) + return n + case *v1ast.ParenExpr: + n.Inner = walk(n.Inner) + return n + case *v1ast.ArrayLit: + for i := range n.Elems { + n.Elems[i] = walk(n.Elems[i]) + } + return n + case *v1ast.ObjectLit: + for i := range n.Entries { + n.Entries[i].Key = walk(n.Entries[i].Key) + n.Entries[i].Value = walk(n.Entries[i].Value) + } + return n + case *v1ast.MetaCall: + n.Key = walk(n.Key) + return n + case *v1ast.IfExpr: + for i := range n.Branches { + n.Branches[i].Cond = walk(n.Branches[i].Cond) + n.Branches[i].Body = walk(n.Branches[i].Body) + } + n.Else = walk(n.Else) + return n + case *v1ast.MatchExpr: + n.Subject = walk(n.Subject) + for i := range n.Cases { + n.Cases[i].Pattern = walk(n.Cases[i].Pattern) + n.Cases[i].Body = walk(n.Cases[i].Body) + } + return n + } + // Literal, ThisExpr, RootExpr, VarRef, MetaRef — no child Expr to rewrite. + return e + } + return walk(e), unsafe +} diff --git a/internal/bloblang2/migrator/translator/methods_audit_test.go b/internal/bloblang2/migrator/translator/methods_audit_test.go new file mode 100644 index 000000000..143fcf6e9 --- /dev/null +++ b/internal/bloblang2/migrator/translator/methods_audit_test.go @@ -0,0 +1,257 @@ +package translator_test + +import ( + "encoding/json" + "fmt" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// TestMethodTranslationAudit is the systematic per-V1-method regression +// suite. For each V1 method with a translation rule (or a known 1:1 with +// V2), it runs: +// +// 1. Translate a minimal V1 snippet exercising the method. +// 2. Compile + execute the V2 output against a specific input. +// 3. Assert the V2 output equals the expected value BYTE-FOR-BYTE. 
+// +// There is NO "warning = free pass" escape hatch here — unlike the +// corpus regression test, this file validates that each rule produces +// the correct V2 mapping. A translator bug (like the .fold() context- +// param mis-rewrite that slipped through the corpus) would fail here. +// +// Categories: +// - renames: V1 method → V2 method with same signature +// - reshapes: V1 method → different V2 method call shape +// - polymorphic: V1 method accepted shape X or Y; V2 split +// - wrappers: V1 method wrapped with extra chain calls in V2 +func TestMethodTranslationAudit(t *testing.T) { + cases := []struct { + name string + v1 string + input any + want any + }{ + // --- Renames --- + { + name: "map_each on array -> map", + v1: `root = this.xs.map_each(x -> x * 2)`, + input: map[string]any{"xs": []any{1.0, 2.0, 3.0}}, + want: []any{2.0, 4.0, 6.0}, + }, + { + // V1 .map_each() on an object passes each VALUE (not an + // {key, value} entry) — so the translator's `.map_values` + // rewrite is correct. The test confirms the value-level + // transform round-trips. 
+ name: "map_each on object literal -> map_values", + v1: `root = {"a": 1, "b": 2}.map_each(v -> v * 10)`, + input: map[string]any{}, + want: map[string]any{"a": int64(10), "b": int64(20)}, + }, + { + name: "enumerated -> enumerate", + v1: `root = this.xs.enumerated()`, + input: map[string]any{"xs": []any{"a", "b"}}, + want: []any{map[string]any{"index": int64(0), "value": "a"}, map[string]any{"index": int64(1), "value": "b"}}, + }, + { + name: "key_values -> iter", + v1: `root = this.obj.key_values()`, + input: map[string]any{"obj": map[string]any{"x": float64(1)}}, + want: []any{map[string]any{"key": "x", "value": float64(1)}}, + }, + { + name: "map_each_key -> map_keys", + v1: `root = this.obj.map_each_key(k -> k.uppercase())`, + input: map[string]any{"obj": map[string]any{"x": float64(1)}}, + want: map[string]any{"X": float64(1)}, + }, + + // --- Reshapes --- + { + name: ".index(n) -> [n]", + v1: `root = this.xs.index(1)`, + input: map[string]any{"xs": []any{"a", "b", "c"}}, + want: "b", + }, + { + name: ".get(key) -> [key]", + v1: `root = this.obj.get("x")`, + input: map[string]any{"obj": map[string]any{"x": float64(7)}}, + want: float64(7), + }, + { + name: ".apply('m') -> m(recv)", + v1: "map double { root = this * 2 }\nroot = 5.apply(\"double\")", + input: map[string]any{}, + want: int64(10), + }, + { + name: ".number() -> .float64()", + v1: `root = this.s.number()`, + input: map[string]any{"s": "3.14"}, + want: 3.14, + }, + + // --- Polymorphic splits --- + { + name: ".merge on arrays -> .concat", + v1: `root = this.a.map_each(x -> x).merge(this.b.map_each(x -> x))`, + input: map[string]any{"a": []any{float64(1), float64(2)}, "b": []any{float64(3)}}, + want: []any{float64(1), float64(2), float64(3)}, + }, + { + name: ".merge on objects stays .merge", + v1: `root = {"a": 1}.merge({"b": 2})`, + input: map[string]any{}, + want: map[string]any{"a": int64(1), "b": int64(2)}, + }, + + // --- Higher-order lambda shape rewrites --- + { + name: ".find(value) -> 
.index_of(value)", + v1: `root = this.xs.find("b")`, + input: map[string]any{"xs": []any{"a", "b", "c"}}, + want: int64(1), + }, + { + name: ".fold(init, ctx -> ctx.tally + ctx.value) -> .fold(init, (tally, value) -> tally + value)", + v1: `root = this.xs.fold(0, item -> item.tally + item.value)`, + input: map[string]any{"xs": []any{float64(1), float64(2), float64(3)}}, + want: float64(6), + }, + { + name: ".fold with merge pattern (GA4)", + v1: `root = this.xs.map_each(p -> {"key": p.k, "value": p.v}).fold({}, item -> item.tally.merge({(item.value.key): item.value.value}))`, + input: map[string]any{"xs": []any{map[string]any{"k": "a", "v": float64(1)}, map[string]any{"k": "b", "v": float64(2)}}}, + want: map[string]any{"a": float64(1), "b": float64(2)}, + }, + + // --- Error-catching operators --- + { + name: "| coalesce catches null", + v1: `root = this.missing | "default"`, + input: map[string]any{}, + want: "default", + }, + { + name: "| coalesce catches errors (out-of-bounds index)", + v1: `root = this.xs.0 | "fallback"`, + input: map[string]any{"xs": []any{}}, + want: "fallback", + }, + { + name: ".or() catches null", + v1: `root = this.missing.or("default")`, + input: map[string]any{}, + want: "default", + }, + { + name: ".or() catches errors (out-of-bounds index)", + v1: `root = this.xs.index(5).or("fallback")`, + input: map[string]any{"xs": []any{"a"}}, + want: "fallback", + }, + + // --- Variadic / array normalisation --- + { + name: ".without(a, b, c) -> .without([a, b, c])", + v1: `root = this.obj.without("x", "y")`, + input: map[string]any{"obj": map[string]any{"x": float64(1), "y": float64(2), "z": float64(3)}}, + want: map[string]any{"z": float64(3)}, + }, + + // --- exists rewrites --- + { + name: ".exists(key) on object -> .has_key(key)", + v1: `root = this.obj.exists("x")`, + input: map[string]any{"obj": map[string]any{"x": float64(1)}}, + want: true, + }, + + // --- .catch with value (not lambda) --- + { + name: ".catch(value) -> .catch(_ -> 
value)", + v1: `root = this.xs.index(5).catch("fallback")`, + input: map[string]any{"xs": []any{"a"}}, + want: "fallback", + }, + + // --- V1 recv.(body) map expression --- + { + name: "recv.(ctx -> body) -> recv.into(ctx -> body)", + v1: `root = 10.(n -> n + 1)`, + input: map[string]any{}, + want: int64(11), + }, + { + name: "recv.(body) un-named rebinds this -> recv.into(__this -> body)", + v1: `root = 10.(this * 2)`, + input: map[string]any{}, + want: int64(20), + }, + + // --- nothing() sentinel variations --- + { + name: "nothing() in collection -> deleted()", + v1: `root = [1, if false { 0 }, 3]`, + input: map[string]any{}, + want: []any{int64(1), int64(3)}, + }, + + // --- Bare identifier + path --- + { + name: "bare ident -> input.X (null-safe)", + v1: `root = message`, + input: map[string]any{"message": "hi"}, + want: "hi", + }, + } + + interp := &bloblang2.Interp{} + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + rep, err := translator.Migrate(tc.v1, translator.Options{MinCoverage: 0}) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + if rep.V2Mapping == "" { + t.Fatalf("empty V2 mapping") + } + compiled, err := interp.Compile(rep.V2Mapping, nil) + if err != nil { + t.Fatalf("V2 compile failed: %v\nV2 mapping:\n%s", err, rep.V2Mapping) + } + got, _, _, runErr := compiled.Exec(tc.input, nil) + if runErr != nil { + t.Fatalf("V2 runtime error: %v\nV2 mapping:\n%s", runErr, rep.V2Mapping) + } + if !jsonEqual(got, tc.want) { + gotJSON, _ := json.Marshal(got) + wantJSON, _ := json.Marshal(tc.want) + t.Fatalf("output mismatch:\n got: %s\n want: %s\nV2 mapping:\n%s", gotJSON, wantJSON, rep.V2Mapping) + } + }) + } +} + +// jsonEqual compares two Go values by JSON round-trip so numeric type +// differences (int64 vs float64) and map key ordering don't trip the +// test over implementation details we don't care about here. 
+func jsonEqual(a, b any) bool { + aj, err := json.Marshal(a) + if err != nil { + return false + } + bj, err := json.Marshal(b) + if err != nil { + return false + } + var an, bn any + _ = json.Unmarshal(aj, &an) + _ = json.Unmarshal(bj, &bn) + return fmt.Sprintf("%v", an) == fmt.Sprintf("%v", bn) +} diff --git a/internal/bloblang2/migrator/translator/migrate.go b/internal/bloblang2/migrator/translator/migrate.go new file mode 100644 index 000000000..404aa2d42 --- /dev/null +++ b/internal/bloblang2/migrator/translator/migrate.go @@ -0,0 +1,303 @@ +package translator + +import ( + "fmt" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// Migrate translates a V1 Bloblang mapping into V2. +// +// On success it returns a *Report containing: +// - the V2 mapping text, +// - a slice of Change records describing any semantic divergences, +// - a Coverage summary of how much of the input was successfully translated. +// +// Migrate returns an error only when the weighted coverage ratio falls below +// opts.MinCoverage (default 0.75). The returned error is *CoverageError and +// still carries the best-effort Report via its Report field. +// +// A zero Options value behaves like DefaultOptions(). +func Migrate(v1Source string, opts Options) (*Report, error) { + opts = applyDefaults(opts) + + // 1. Walk the closure of imports rooted at the main source. Both + // pre-populated Files and the FileResolver feed into the resulting + // fileSet; canonical keys serve as the identity for dedup and for + // Report.V2Files emission. + fs, err := buildFileSet(v1Source, opts) + if err != nil { + return nil, fmt.Errorf("migrator: walking import closure: %w", err) + } + + // 2. Translate every file in the closure (except the main source) + // from V1 to V2 so the main source's V2 sanity-check Parse can + // resolve its imports. v2Contents is canonical-keyed. 
+ v2Contents, err := translateFiles(fs, opts) + if err != nil { + return nil, fmt.Errorf("migrator: translating imported file: %w", err) + } + + // 3. Project canonical-keyed V2 contents to V2-path-keyed contents + // for the sanity-check Parse, then translate the main V1 source. + parseFiles := projectToV2Paths(fs, v2Contents, opts.V2ImportPathRewriter) + rep, err := migrateSource(v1Source, "", opts, fs, v2Contents, parseFiles) + if err != nil { + return nil, err + } + rep.V2Files = v2Contents + + // 4. Coverage gate. + if cerr := checkCoverage(rep, opts.MinCoverage); cerr != nil { + return nil, cerr + } + return rep, nil +} + +// migrateSource is the core V1→V2 translation step for a single source string. +// parentKey is the canonical key of the file being translated, or empty for +// the main source. fs is the closure built upstream; v2Contents holds the +// already-translated V2 contents for every file in fs (canonical-keyed), +// used to inline `from "path"` whole-mapping replacements; v2Files is the +// path-keyed projection used by the post-translation sanity-check Parse. +// +// The post-translation sanity-check Parse is non-fatal: if it fails, we record +// an Unsupported Change tagged RuleEmittedInvalidV2 and still return the +// Report with the emitted text. Most V1-invalid inputs (chained comparisons, +// missing imports, duplicate namespaces) echo as V2 parse errors here — they +// are not translator bugs but honest V2 rejections of V1-invalid input, and +// should flow through to the caller's Compile for classification. +func migrateSource(v1Source, parentKey string, opts Options, fs *fileSet, v2Contents, v2Files map[string]string) (*Report, error) { + if v1Source == "" { + return newRecorder(opts).finalise(""), nil + } + + prog, err := v1ast.Parse(v1Source) + if err != nil { + return nil, fmt.Errorf("migrator: parsing V1 source: %w", err) + } + + // `from "path"` is V1's whole-mapping replace form. 
When the source + // is a single from statement and the closure walker has resolved + // the referenced file, inline the file's V2 content directly: the + // V2 source for this mapping is literally the migrated file. + if fromPath, ok := isFromOnlyProgram(prog); ok && fs != nil { + site := siteKey{parentKey: parentKey, importPath: fromPath} + if canonical, ok := fs.siteIndex[site]; ok { + if v2, ok := v2Contents[canonical]; ok { + rec := newRecorder(opts) + rec.Rewritten(Change{ + Line: prog.Pos.Line, Column: prog.Pos.Column, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleFromStatement, + Explanation: fmt.Sprintf(`V1 "from %q" inlined as V2 content of resolved file`, fromPath), + }) + return rec.finalise(v2), nil + } + } + } + + tr := &translator{ + rec: newRecorder(opts), + parentKey: parentKey, + fileSet: fs, + v2ImportPathRewriter: opts.V2ImportPathRewriter, + customMethodRules: opts.CustomMethodRules, + customFunctionRules: opts.CustomFunctionRules, + } + v2Prog := tr.translateProgram(prog) + v2Source := syntax.Print(v2Prog) + + if _, errs := syntax.Parse(v2Source, "", v2Files); len(errs) > 0 { + tr.rec.Note(Change{ + Line: 1, Column: 1, + Severity: SeverityError, + Category: CategoryUnsupported, + RuleID: RuleEmittedInvalidV2, + Explanation: fmt.Sprintf("emitted V2 failed to parse: %v", errs), + }) + } + + return tr.rec.finalise(v2Source), nil +} + +// isFromOnlyProgram reports whether prog consists of exactly one +// FromStmt with a string-literal path, and returns that path. +// Anything else (mixed statements, non-literal path) returns ok=false +// and falls through to the per-statement loop in translateProgram, +// which records FromStmt as Unsupported. 
+func isFromOnlyProgram(prog *v1ast.Program) (string, bool) { + if len(prog.Stmts) != 1 { + return "", false + } + from, ok := prog.Stmts[0].(*v1ast.FromStmt) + if !ok { + return "", false + } + lit, ok := from.Path.(*v1ast.Literal) + if !ok { + return "", false + } + return lit.Str, true +} + +// isFromOnlyV1Source parses a V1 source string and returns its from +// path if the source is from-only. Used by translateFiles to defer +// from-only files until their target has been translated. +func isFromOnlyV1Source(src string) (string, bool) { + prog, err := v1ast.Parse(src) + if err != nil { + return "", false + } + return isFromOnlyProgram(prog) +} + +// IsFromOnly reports whether v1Source consists of a single +// `from "path"` statement and returns the path string if so. Callers +// that want to special-case from-only sources before invoking Migrate +// (e.g. to rewrite a processor config to a file-backed form) can use +// this to detect the case ahead of time. +func IsFromOnly(v1Source string) (string, bool) { + return isFromOnlyV1Source(v1Source) +} + +// translateFiles migrates every file in fs.contents (excluding the +// main source — which migrateSource translates separately) from V1 to +// V2. The returned map is canonical-keyed: it's the shape we surface +// to callers via Report.V2Files. +// +// The outer loop is a fixpoint: files whose imports are all already +// translated complete first, their results feed the next round, and so on. +// This handles nested import chains (A imports B imports C) without +// recursing into Migrate. After the fixpoint settles, any files still +// pending (cycles, or files with unresolvable imports) get one final pass +// with all siblings visible — remaining errors are fatal. +func translateFiles(fs *fileSet, outerOpts Options) (map[string]string, error) { + if len(fs.contents) == 0 { + return nil, nil + } + innerOpts := outerOpts + innerOpts.MinCoverage = 0 + + // pending is keyed by canonical key. 
+ pending := make(map[string]string, len(fs.contents)) + for k, v := range fs.contents { + pending[k] = v + } + // out is keyed by canonical key. + out := make(map[string]string, len(fs.contents)) + + for { + progress := false + for canonical, src := range pending { + // Defer from-only files whose target hasn't been translated + // yet — the from inlining inside migrateSource would + // otherwise fall through to the translateProgram path that + // records the FromStmt as Unsupported. + if fromPath, ok := isFromOnlyV1Source(src); ok { + site := siteKey{parentKey: canonical, importPath: fromPath} + if targetCanonical, ok := fs.siteIndex[site]; ok { + if _, ready := out[targetCanonical]; !ready { + continue + } + } + } + parseFiles := projectToV2Paths(fs, out, outerOpts.V2ImportPathRewriter) + rep, err := migrateSource(src, canonical, innerOpts, fs, out, parseFiles) + if err != nil || hasUnresolvedImport(rep) { + continue + } + out[canonical] = rep.V2Mapping + delete(pending, canonical) + progress = true + } + if !progress || len(pending) == 0 { + break + } + } + // Leftovers (cycles, or files with genuinely missing imports): one last + // pass with all translated siblings visible. Any remaining error is + // still fatal. + for canonical, src := range pending { + parseFiles := projectToV2Paths(fs, out, outerOpts.V2ImportPathRewriter) + rep, err := migrateSource(src, canonical, innerOpts, fs, out, parseFiles) + if err != nil { + return nil, fmt.Errorf("%s: %w", canonical, err) + } + out[canonical] = rep.V2Mapping + } + + return out, nil +} + +// projectToV2Paths maps canonical-keyed V2 contents into a map keyed by +// the V2 path strings that appear in V2 import statements (i.e. after +// V2ImportPathRewriter is applied). When no rewriter is set canonical +// keys equal V1 path strings equal V2 path strings, so the map is +// returned with canonical keys unchanged. 
+func projectToV2Paths(fs *fileSet, v2Contents map[string]string, rewriter V2ImportPathRewriter) map[string]string { + if rewriter == nil { + // Fast path: the V2 import statements in the emitted source carry + // the same path strings as the V1 source, and canonical keys + // equal those path strings whenever no FileResolver is in use. + // For FileResolver-driven flows the canonical keys are whatever + // the resolver returned; the V2 import statements still carry + // the original V1 path strings (no rewriter), so we project via + // siteIndex. + out := make(map[string]string, len(v2Contents)) + for canonical, content := range v2Contents { + out[canonical] = content + } + for site, canonical := range fs.siteIndex { + if c, ok := v2Contents[canonical]; ok { + out[site.importPath] = c + } + } + return out + } + out := make(map[string]string, len(v2Contents)) + for canonical, content := range v2Contents { + out[rewriter(canonical)] = content + } + for site, canonical := range fs.siteIndex { + if c, ok := v2Contents[canonical]; ok { + out[rewriter(site.importPath)] = c + } + } + return out +} + +// hasUnresolvedImport reports whether the report signals an emitted-invalid-V2 +// change, which during the fixpoint most likely means a sibling file hasn't +// been translated yet. The caller retries next round. +func hasUnresolvedImport(rep *Report) bool { + for _, c := range rep.Changes { + if c.RuleID == RuleEmittedInvalidV2 { + return true + } + } + return false +} + +// applyDefaults fills in zero-valued options with DefaultOptions(). +func applyDefaults(opts Options) Options { + if opts.MinCoverage == 0 { + opts.MinCoverage = 0.75 + } + return opts +} + +// checkCoverage returns a *CoverageError when the report's Ratio is below +// opts.MinCoverage. Returns nil otherwise. 
+func checkCoverage(rep *Report, minCoverage float64) error { + if rep.Coverage.Ratio >= minCoverage { + return nil + } + return &CoverageError{ + Coverage: rep.Coverage, + Min: minCoverage, + Report: rep, + } +} diff --git a/internal/bloblang2/migrator/translator/migrate_test.go b/internal/bloblang2/migrator/translator/migrate_test.go new file mode 100644 index 000000000..263833d5c --- /dev/null +++ b/internal/bloblang2/migrator/translator/migrate_test.go @@ -0,0 +1,110 @@ +package translator + +import ( + "errors" + "strings" + "testing" +) + +func TestMigrateEmptyInput(t *testing.T) { + rep, err := Migrate("", Options{}) + if err != nil { + t.Fatalf("empty input should succeed, got %v", err) + } + if rep.V2Mapping != "" { + t.Fatalf("empty V2 expected, got %q", rep.V2Mapping) + } + if rep.Coverage.Ratio != 1.0 { + t.Fatalf("empty coverage should be 1.0, got %v", rep.Coverage.Ratio) + } +} + +func TestMigrateSimpleRootToOutput(t *testing.T) { + rep, err := Migrate("root = this", Options{Verbose: true}) + if err != nil { + t.Fatalf("simple root->output should succeed: %v", err) + } + if rep.V2Mapping == "" { + t.Fatalf("expected non-empty V2 output") + } + // Should contain "output" and "input" since root/this are rewritten. 
+ if !strings.Contains(rep.V2Mapping, "output") || !strings.Contains(rep.V2Mapping, "input") { + t.Fatalf("expected output/input in V2 text, got:\n%s", rep.V2Mapping) + } + if rep.Coverage.Ratio < 0.9 { + t.Fatalf("simple translation should be near-perfect coverage, got %v", rep.Coverage.Ratio) + } +} + +func TestMigrateArithmetic(t *testing.T) { + rep, err := Migrate("root = 1 + 2 * 3", Options{}) + if err != nil { + t.Fatalf("arithmetic translation should succeed: %v", err) + } + if !strings.Contains(rep.V2Mapping, "1") || !strings.Contains(rep.V2Mapping, "+") { + t.Fatalf("expected arithmetic preserved, got:\n%s", rep.V2Mapping) + } +} + +func TestCheckCoverageBelowThreshold(t *testing.T) { + rep := &Report{Coverage: Coverage{Total: 10, Translated: 5, Unsupported: 5, Ratio: 0.5}} + err := checkCoverage(rep, 0.75) + var ce *CoverageError + if !errors.As(err, &ce) { + t.Fatalf("expected *CoverageError, got %T: %v", err, err) + } + if ce.Report != rep { + t.Fatalf("expected CoverageError to carry the Report") + } + if ce.Min != 0.75 { + t.Fatalf("expected Min 0.75, got %v", ce.Min) + } +} + +func TestCheckCoverageMeetsThreshold(t *testing.T) { + rep := &Report{Coverage: Coverage{Total: 10, Translated: 8, Rewritten: 2, Ratio: 0.98}} + if err := checkCoverage(rep, 0.75); err != nil { + t.Fatalf("above threshold should not error, got %v", err) + } +} + +func TestApplyDefaults(t *testing.T) { + opts := applyDefaults(Options{}) + if opts.MinCoverage != 0.75 { + t.Fatalf("default MinCoverage should be 0.75, got %v", opts.MinCoverage) + } + if opts.Mode != ModeMutation { + t.Fatalf("default Mode should be ModeMutation, got %v", opts.Mode) + } +} + +func TestMigrateModeMutationNoPrelude(t *testing.T) { + // Default mode: no `output = input` prelude.
+ rep, err := Migrate("root.v = 1", Options{Mode: ModeMutation}) + if err != nil { + t.Fatalf("mutation-mode translation should succeed: %v", err) + } + if strings.Contains(rep.V2Mapping, "output = input") { + t.Fatalf("mutation mode must not inject `output = input`; got:\n%s", rep.V2Mapping) + } +} + +func TestMigrateModeMappingInjectsPrelude(t *testing.T) { + // mapping mode: translator prepends `output = input` so the V2 + // result starts as the input document (matching V1 mapping's + // pass-through default). + rep, err := Migrate("root.v = 1", Options{Mode: ModeMapping, Verbose: true}) + if err != nil { + t.Fatalf("mapping-mode translation should succeed: %v", err) + } + if !strings.Contains(rep.V2Mapping, "output = input") { + t.Fatalf("mapping mode must inject `output = input`; got:\n%s", rep.V2Mapping) + } + // The prelude must be the *first* statement so subsequent field + // assignments build on top of the passed-through input. + idxPrelude := strings.Index(rep.V2Mapping, "output = input") + idxBody := strings.Index(rep.V2Mapping, "output.v") + if idxBody < idxPrelude { + t.Fatalf("prelude must precede the translated body; got:\n%s", rep.V2Mapping) + } +} diff --git a/internal/bloblang2/migrator/translator/property_test.go b/internal/bloblang2/migrator/translator/property_test.go new file mode 100644 index 000000000..f77398f84 --- /dev/null +++ b/internal/bloblang2/migrator/translator/property_test.go @@ -0,0 +1,127 @@ +package translator_test + +import ( + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// TestNeverPanics — Layer 5 (property). Feeding the translator arbitrary +// strings (including malformed V1) must always either return a Report or an +// error, never panic. This is the single most important robustness property. 
+func TestNeverPanics(t *testing.T) { + inputs := []string{ + "", + " ", + "\n", + "\x00", + "root", + "root =", + "root = ", + "root = \"unterminated", + "root = ???", + "{}", + strings.Repeat("root = this\n", 1000), + "let x = 1\n$x = 2", // invalid V1 (var reassignment) + "root.a =1", // invalid V1 (no whitespace around =) + "!!true", // invalid V1 (double-not) + "this[0]", // invalid V1 (bracket indexing) + "root = 5 / 0", // V1 compile-time divide-by-zero + `root = {5: "x"}`, // V1 compile-time invalid key + } + + for _, in := range inputs { + func() { + defer func() { + if r := recover(); r != nil { + t.Errorf("Migrate(%q) panicked: %v", in, r) + } + }() + _, _ = translator.Migrate(in, translator.Options{}) + }() + } +} + +// TestV2OutputAlwaysParses — Layer 5 (property). For every Migrate call that +// returns a Report (rather than an error), the V2 text must parse cleanly +// through syntax.Parse. Migrate only runs this parse as a non-fatal sanity +// check (recording RuleEmittedInvalidV2), so the test asserts it directly. +func TestV2OutputAlwaysParses(t *testing.T) { + inputs := []string{ + `root = this`, + `root = 1 + 2`, + `root.foo = this.bar`, + `root = [1, 2, 3]`, + `root = {"a": 1}`, + `root = if true { "y" } else { "n" }`, + `let x = 5 +root = $x`, + `meta key = "v" +root.data = this`, + `root = this.x.or("default")`, + `root = [1,2,3].map_each(x -> x + 1)`, + } + for _, in := range inputs { + rep, err := translator.Migrate(in, translator.Options{}) + if err != nil { + t.Logf("skip non-parsing input: %q -> %v", in, err) + continue + } + if _, errs := syntax.Parse(rep.V2Mapping, "", nil); len(errs) > 0 { + t.Errorf("Migrate produced invalid V2 for %q:\nV2:\n%s\nerrors: %v", in, rep.V2Mapping, errs) + } + } +} + +// TestCoverageAlwaysComputed — Coverage.Ratio must be populated and +// in [0, 1] for every successful Migrate.
+func TestCoverageAlwaysComputed(t *testing.T) { + inputs := []string{ + ``, + `root = this`, + `root.foo = 1`, + `let x = 1 +root = $x`, + } + for _, in := range inputs { + rep, err := translator.Migrate(in, translator.Options{}) + if err != nil { + continue + } + if rep.Coverage.Ratio < 0 || rep.Coverage.Ratio > 1 { + t.Errorf("Coverage.Ratio out of [0,1] for %q: %v", in, rep.Coverage.Ratio) + } + } +} + +// TestReportWellFormed — every Change record must carry a known RuleID, a +// positive line number, and a non-empty Explanation. Catches rules that +// forget to set fields. +func TestReportWellFormed(t *testing.T) { + sources := []string{ + `root = this`, + `root = foo`, + `meta k = this.x | "fb"`, + `map foo { root = this * 2 } +root.v = (5).apply("foo")`, + } + for _, src := range sources { + rep, err := translator.Migrate(src, translator.Options{Verbose: true}) + if err != nil { + continue + } + for i, c := range rep.Changes { + if c.RuleID == translator.RuleUnknown { + t.Errorf("src=%q change[%d] has RuleUnknown", src, i) + } + if c.Line <= 0 { + t.Errorf("src=%q change[%d] has non-positive line %d", src, i, c.Line) + } + if c.Explanation == "" { + t.Errorf("src=%q change[%d] has empty Explanation", src, i) + } + } + } +} diff --git a/internal/bloblang2/migrator/translator/rules_test.go b/internal/bloblang2/migrator/translator/rules_test.go new file mode 100644 index 000000000..e5945e0d5 --- /dev/null +++ b/internal/bloblang2/migrator/translator/rules_test.go @@ -0,0 +1,556 @@ +package translator_test + +import ( + "errors" + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// TestRuleUnits — Layer 2. Each entry documents one translation rule with a +// representative V1 input, the expected V2 substring(s), and the RuleIDs the +// translator must emit.
Substring matching keeps the tests insensitive to +// whitespace/layout drift; RuleID (rather than Explanation) assertions keep +// them insensitive to wording. +// +// When a single V1 construct legitimately emits more than one Change +// (e.g. the bare-expression shorthand emits both RuleRootToOutput and the +// per-expression rule), assert all of them via wantRules. +func TestRuleUnits(t *testing.T) { + for _, c := range ruleCases { + t.Run(c.name, func(t *testing.T) { + // MinCoverage 0.0001 bypasses applyDefaults' 0.75 fallback + // (which kicks in only when the value is literally 0). + // Mappings that translate to 100% Unsupported (`from`, + // `.apply(dynamic)`) still trip the CoverageError path; we + // unwrap the Report from the error for those cases. + rep, err := translator.Migrate(c.v1, translator.Options{ + Verbose: true, + MinCoverage: 0.0001, + }) + var cerr *translator.CoverageError + switch { + case err == nil: + // normal path; rep is populated + case errors.As(err, &cerr): + rep = cerr.Report + default: + t.Fatalf("Migrate(%q) failed: %v", c.v1, err) + } + for _, want := range c.wantV2 { + if !strings.Contains(rep.V2Mapping, want) { + t.Errorf("V2 output missing %q\nGot:\n%s", want, rep.V2Mapping) + } + } + for _, rule := range c.wantRules { + if !hasRule(rep.Changes, rule) { + t.Errorf("expected a Change with RuleID %s; got:\n%s", rule, changeList(rep.Changes)) + } + } + for _, rule := range c.notRules { + if hasRule(rep.Changes, rule) { + t.Errorf("did not expect a Change with RuleID %s; got:\n%s", rule, changeList(rep.Changes)) + } + } + }) + } +} + +type ruleCase struct { + name string + v1 string + wantV2 []string // substrings that must appear in the V2 output + wantRules []translator.RuleID + notRules []translator.RuleID // negative assertions +} + +// ruleCases is deliberately flat and verbose — each entry documents one rule, +// one shape. Add a new entry when a new RuleID is emitted. 
+var ruleCases = []ruleCase{ + // ----------------------------------------------------------------- + // Naming and shape rewrites. + // ----------------------------------------------------------------- + { + name: "root -> output (identity rename, no rule fires)", + v1: `root = "hi"`, + wantV2: []string{"output", `"hi"`}, + }, + { + name: "this -> input (read position)", + v1: `root = this`, + wantV2: []string{"output", "input"}, + wantRules: []translator.RuleID{translator.RuleThisToInput}, + }, + { + name: "this-target -> output (write position)", + v1: `this.foo = "bar"`, + wantV2: []string{"output.foo"}, + wantRules: []translator.RuleID{translator.RuleThisTargetToOutput}, + }, + { + name: "bare ident -> input.ident (null-safe)", + v1: `root = foo`, + wantV2: []string{"input?.foo"}, + wantRules: []translator.RuleID{translator.RuleBareIdentToInput}, + }, + { + name: "bare path target -> output.path", + v1: `foo.bar = 1`, + wantV2: []string{"output.foo.bar"}, + wantRules: []translator.RuleID{translator.RuleBarePathToOutput}, + }, + { + name: "bare expression mapping -> explicit output = expr", + v1: `"hi"`, + wantV2: []string{"output", `"hi"`}, + wantRules: []translator.RuleID{translator.RuleRootToOutput}, + }, + + // ----------------------------------------------------------------- + // Metadata. + // ----------------------------------------------------------------- + { + name: "meta target -> output@", + v1: `meta foo = "bar"`, + wantV2: []string{"output@", "foo"}, + wantRules: []translator.RuleID{translator.RuleMetaTargetToOutputMeta}, + }, + { + name: "meta(key) read -> input@[key]", + v1: `root = meta("k")`, + wantV2: []string{"input@"}, + wantRules: []translator.RuleID{translator.RuleMetaReadToInputMeta}, + }, + + // ----------------------------------------------------------------- + // Operators. 
+ // ----------------------------------------------------------------- + { + name: "coalesce | -> .or()", + v1: `root = this.x | "fb"`, + wantV2: []string{".or(", `"fb"`}, + wantRules: []translator.RuleID{translator.RuleCoalescePrecedence}, + }, + { + name: "&& flags operand-typing divergence", + v1: `root = this.a && this.b`, + wantV2: []string{"&&"}, + wantRules: []translator.RuleID{translator.RuleAndOrSameLevel}, + }, + { + name: "== flags cross-type equality divergence", + v1: `root = this.a == 1`, + wantV2: []string{"=="}, + wantRules: []translator.RuleID{translator.RuleBoolNumberEquality}, + }, + { + name: "% flags float-truncation divergence", + v1: `root = this.x % 3`, + wantV2: []string{"%"}, + wantRules: []translator.RuleID{translator.RuleModuloFloatTruncation}, + }, + { + name: "/ flags int-division-returns-float divergence", + v1: `root = this.x / 2`, + wantV2: []string{"/"}, + wantRules: []translator.RuleID{translator.RuleIntDivReturnsFloat}, + }, + + // ----------------------------------------------------------------- + // Method rewrites & flags. 
+ // ----------------------------------------------------------------- + { + name: "method rename: map_each(lambda) on array -> map", + v1: `root = [1,2,3].map_each(x -> x)`, + wantV2: []string{".map(x -> x)"}, + notRules: []translator.RuleID{translator.RuleCoalescePrecedence}, + }, + { + name: "method rename: map_each on object-literal receiver -> map_values", + v1: `root = {"a":1}.map_each(v -> v)`, + wantV2: []string{".map_values(v -> v)"}, + }, + { + name: "method rename: .index(n) -> [n]", + v1: `root = this.items.index(0)`, + wantV2: []string{"[0]"}, + wantRules: []translator.RuleID{translator.RuleNoBracketIndexing}, + }, + { + name: "method rename: .get(k) -> [k]", + v1: `root = this.obj.get("k")`, + wantV2: []string{`["k"]`}, + wantRules: []translator.RuleID{translator.RuleNoBracketIndexing}, + }, + { + name: ".number() -> .float64()", + v1: `root = "3.14".number()`, + wantV2: []string{".float64()"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: ".or() flags catches-errors divergence", + v1: `root = this.x.or("fb")`, + wantV2: []string{".or(", `"fb"`}, + wantRules: []translator.RuleID{translator.RuleOrCatchesErrors}, + }, + { + name: ".length() flags codepoints-vs-bytes divergence", + v1: `root = "héllo".length()`, + wantV2: []string{".length()"}, + wantRules: []translator.RuleID{translator.RuleStringLengthBytes}, + }, + { + name: ".catch(value) wrapped as lambda", + v1: `root = this.x.catch("fb")`, + wantV2: []string{".catch(_ ->"}, + wantRules: []translator.RuleID{translator.RuleOrCatchesErrors}, + }, + { + name: ".exists(key) -> .has_key(key)", + v1: `root = this.exists("a")`, + wantV2: []string{".has_key(", `"a"`}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: "variadic .without(a, b) -> .without([a, b])", + v1: `root = this.without("a", "b")`, + wantV2: []string{".without(", `"a"`, `"b"`, "[", "]"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + 
{ + name: ".find(value) rewrites to .index_of(value)", + v1: `root = [1,2,3].find(2)`, + wantV2: []string{".index_of(", "2"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: ".type() flags number-vs-int64/float64 divergence", + v1: `root = this.x.type()`, + wantV2: []string{".type()"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: "find_by(query-form) wrapped as explicit V2 lambda", + v1: `root = this.items.find_by(this.id == 5)`, + wantV2: []string{".find_by(__v ->", "__v?.id == 5"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: "find_by(query-form bare ident) rebinds to lambda param", + v1: `root = this.items.find_by(name == "alice")`, + wantV2: []string{".find_by(__v ->", "__v?.name"}, + notRules: []translator.RuleID{translator.RuleBareIdentToInput}, + }, + { + name: "find_by(explicit lambda) translates 1:1", + v1: `root = this.items.find_by(v -> v.id == 5)`, + wantV2: []string{".find_by(v -> v?.id == 5)"}, + }, + { + name: "find_all_by(query-form) wrapped as explicit V2 lambda", + v1: `root = this.items.find_all_by(this.active)`, + wantV2: []string{".find_all_by(__v ->", "__v?.active"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: "filter(query-form) wrapped as explicit V2 lambda", + v1: `root = this.nums.filter(this > 10)`, + wantV2: []string{".filter(__v ->", "__v > 10"}, + }, + { + name: "sort_by(query-form) wrapped as explicit V2 lambda", + v1: `root = this.items.sort_by(this.priority)`, + wantV2: []string{".sort_by(__v ->", "__v?.priority"}, + }, + { + name: "unique(query-form) wrapped as explicit V2 lambda", + v1: `root = this.items.unique(this.id)`, + wantV2: []string{".unique(__v ->", "__v?.id"}, + }, + { + name: "all(query-form) wrapped as explicit V2 lambda", + v1: `root = this.nums.all(this > 0)`, + wantV2: []string{".all(__v ->", "__v > 0"}, + }, + + // 
----------------------------------------------------------------- + // Batch 3 — message-coupled stdlib (P8 migrator coverage). + // ----------------------------------------------------------------- + { + name: `metadata("k") rewrites to input@["k"]`, + v1: `root = metadata("region")`, + wantV2: []string{"input@", `["region"]`}, + wantRules: []translator.RuleID{translator.RuleMetaReadToInputMeta}, + }, + { + name: "metadata() with no arg rewrites to input@", + v1: `root = metadata()`, + wantV2: []string{"input@"}, + wantRules: []translator.RuleID{translator.RuleMetaReadToInputMeta}, + }, + { + name: `meta("k") rewrites to input@["k"] with type-change Note`, + v1: `root = meta("region")`, + wantV2: []string{"input@", `["region"]`}, + wantRules: []translator.RuleID{translator.RuleMetaReadToInputMeta}, + }, + { + name: `root_meta("k") rewrites to output@["k"]`, + v1: `root.copy = root_meta("audit")`, + wantV2: []string{"output@", `["audit"]`}, + wantRules: []translator.RuleID{translator.RuleMetaReadToInputMeta}, + }, + { + name: "error() rewrites to error().what", + v1: `root.failed = error()`, + wantV2: []string{"error()", ".what"}, + }, + { + name: "errored() passes through", + v1: `root.failed = errored()`, + wantV2: []string{"errored()"}, + }, + { + name: "batch_index() passes through", + v1: `root.idx = batch_index()`, + wantV2: []string{"batch_index()"}, + }, + { + name: "content() passes through", + v1: `root.bytes = content()`, + wantV2: []string{"content()"}, + }, + + // ----------------------------------------------------------------- + // Variadic→array rewrites for V2 (with / zip mirror without). 
+ // ----------------------------------------------------------------- + { + name: `variadic .with("a","b") -> .with(["a","b"])`, + v1: `root = this.with("a", "b")`, + wantV2: []string{".with(", `"a"`, `"b"`, "[", "]"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: `single-array .with([...]) passes through`, + v1: `root = this.with(["a", "b"])`, + wantV2: []string{".with([", `"a"`, `"b"`, "])"}, + }, + { + name: `variadic .zip(a, b) -> .zip([a, b])`, + v1: `root.foo = this.foo.zip(this.bar, this.baz)`, + wantV2: []string{".zip(", "[", "]"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + { + name: `variadic .format(a, b) -> .format([a, b])`, + v1: `root.s = "%s/%v".format(this.name, this.age)`, + wantV2: []string{".format(", "[", "]"}, + wantRules: []translator.RuleID{translator.RuleMethodDoesNotExist}, + }, + + // ----------------------------------------------------------------- + // Timestamp idiom shifts: V1 function-form -> V2 method-form. 
+ // ----------------------------------------------------------------- + { + name: "ts.format_timestamp_strftime(fmt) -> ts.ts_format(fmt)", + v1: `root.iso = this.t.format_timestamp_strftime("%Y-%m-%d")`, + wantV2: []string{".ts_format(", `"%Y-%m-%d"`}, + }, + { + name: "ts.format_timestamp(fmt) flagged but not auto-rewritten", + v1: `root.iso = this.t.format_timestamp("2006-01-02")`, + wantV2: []string{".format_timestamp("}, + }, + { + name: "str.parse_timestamp_strptime(fmt) -> str.ts_parse(fmt)", + v1: `root.t = this.s.parse_timestamp_strptime("%Y-%m-%d")`, + wantV2: []string{".ts_parse(", `"%Y-%m-%d"`}, + }, + { + name: "ts.format_timestamp_unix() -> ts.ts_unix()", + v1: `root.epoch = this.t.format_timestamp_unix()`, + wantV2: []string{".ts_unix()"}, + }, + { + name: "ts.format_timestamp_unix_milli() -> ts.ts_unix_milli()", + v1: `root.epoch_ms = this.t.format_timestamp_unix_milli()`, + wantV2: []string{".ts_unix_milli()"}, + }, + { + name: "ts_strftime method renamed to ts_format", + v1: `root.iso = this.t.ts_strftime("2006-01-02")`, + wantV2: []string{".ts_format(", `"2006-01-02"`}, + }, + { + name: "ts_strptime method renamed to ts_parse", + v1: `root.t = this.s.ts_strptime("%Y-%m-%d")`, + wantV2: []string{".ts_parse(", `"%Y-%m-%d"`}, + }, + + // ----------------------------------------------------------------- + // Maps and imports. 
+ // ----------------------------------------------------------------- + { + name: ".apply('name') -> name(recv)", + v1: "map double { root = this * 2 }\nroot.v = (5).apply(\"double\")", + wantV2: []string{"double(", "map double(in)"}, + wantRules: []translator.RuleID{translator.RuleMapDeclTranslation}, + }, + { + name: ".apply(dynamic) is unsupported", + v1: `root = (5).apply(this.name)`, + wantRules: []translator.RuleID{translator.RuleUnsupportedConstruct}, + }, + { + name: `from "file" is unsupported`, + v1: `from "helper.blobl"`, + wantRules: []translator.RuleID{translator.RuleFromStatement}, + }, + { + name: `import "file" -> namespaced V2 import`, + v1: "import \"helper.blobl\"\nroot.v = 1", + wantRules: []translator.RuleID{translator.RuleImportStatement}, + }, + { + name: "now() flags string-vs-timestamp divergence", + v1: `root = now()`, + wantV2: []string{"now()"}, + wantRules: []translator.RuleID{translator.RuleNowReturnsString}, + }, + + // ----------------------------------------------------------------- + // Control flow. + // ----------------------------------------------------------------- + { + name: "if-without-else flags nothing-sentinel divergence", + v1: `root = if true { 1 }`, + wantV2: []string{"if true"}, + wantRules: []translator.RuleID{translator.RuleIfNoElseNothing}, + }, + { + name: "subject-less match flags boolean-case divergence", + v1: `root = match { this.x > 0 => "pos", _ => "neg" }`, + wantV2: []string{"match"}, + wantRules: []translator.RuleID{translator.RuleMatchSubjectRebinds}, + }, + { + name: "let inside if-branch flags block-scope divergence", + v1: "if true { let x = 1 }\nroot.v = 1", + wantRules: []translator.RuleID{translator.RuleBlockScopedLet}, + }, + + // ----------------------------------------------------------------- + // Paths and indexing. 
+ // ----------------------------------------------------------------- + { + name: "numeric path segment -> index expression", + v1: `root = this.items.0`, + wantV2: []string{"input", "[0]"}, + wantRules: []translator.RuleID{translator.RuleNoBracketIndexing}, + }, + + // ----------------------------------------------------------------- + // Sentinels. + // ----------------------------------------------------------------- + { + name: "nothing() at statement RHS -> V2 void()", + v1: `root = if this.x > 0 { this.x } else { nothing() }`, + wantV2: []string{"void()"}, + }, + { + name: "nothing() inside array literal -> V2 deleted()", + v1: `root.xs = [1, nothing(), 3]`, + wantV2: []string{"deleted()"}, + }, + { + name: "nothing() inside object literal value -> V2 deleted()", + v1: `root.obj = {"a": 1, "b": nothing()}`, + wantV2: []string{"deleted()"}, + }, + { + name: "nothing() inside let binding -> Unsupported (no V2 equivalent)", + v1: "let a = nothing()\nroot.v = 1", + wantRules: []translator.RuleID{translator.RuleUnsupportedConstruct}, + }, + + // ----------------------------------------------------------------- + // Error path: V2-invalid emission is a non-fatal Change. + // ----------------------------------------------------------------- + { + name: "chained comparison echoes as RuleEmittedInvalidV2", + v1: `root = 1 < 2 < 3`, + wantRules: []translator.RuleID{translator.RuleEmittedInvalidV2}, + }, + + // ----------------------------------------------------------------- + // Variables and lambdas. + // ----------------------------------------------------------------- + { + name: "let binding translates to $x declaration and reference", + v1: "let x = 1\nroot = $x", + wantV2: []string{"$x = 1", "output = $x"}, + }, + { + name: "lambda parameter scope respected (no bare-ident rewrite on param)", + v1: `root = [1,2,3].map_each(n -> n + 1)`, + wantV2: []string{".map(n -> n + 1)"}, + // The n inside the lambda body must NOT be rewritten as input.n. 
+ notRules: []translator.RuleID{translator.RuleBareIdentToInput}, + }, + { + name: "this inside lambda resolves to outer context (no rebind)", + v1: `root = [1,2,3].map_each(_ -> this.scale)`, + wantV2: []string{"input?.scale"}, + }, + + // ----------------------------------------------------------------- + // Object/array literals. + // ----------------------------------------------------------------- + { + name: "object literal preserves string keys", + v1: `root = {"a": 1, "b": 2}`, + wantV2: []string{"output", `"a"`, `"b"`, "1", "2"}, + }, + { + name: "array literal preserves order", + v1: `root = [1, 2, 3]`, + wantV2: []string{"output", "1", "2", "3"}, + }, + + // ----------------------------------------------------------------- + // Empty / whitespace inputs (property-ish edge cases at unit scope). + // ----------------------------------------------------------------- + { + name: "empty mapping produces empty V2 (no changes)", + v1: ``, + wantV2: []string{}, + }, +} + +// hasRule reports whether any change in the slice has the given RuleID. +func hasRule(changes []translator.Change, id translator.RuleID) bool { + for _, c := range changes { + if c.RuleID == id { + return true + } + } + return false +} + +// changeList returns a human-readable summary of a Change slice for failing +// test output. 
+func changeList(changes []translator.Change) string { + var out strings.Builder + for _, c := range changes { + out.WriteString(" - ") + out.WriteString(c.RuleID.String()) + out.WriteString(" (") + out.WriteString(c.Severity.String()) + out.WriteString("): ") + out.WriteString(c.Explanation) + out.WriteString("\n") + } + return out.String() +} diff --git a/internal/bloblang2/migrator/translator/speccoverage_test.go b/internal/bloblang2/migrator/translator/speccoverage_test.go new file mode 100644 index 000000000..7a9c211d8 --- /dev/null +++ b/internal/bloblang2/migrator/translator/speccoverage_test.go @@ -0,0 +1,215 @@ +package translator_test + +import ( + "os" + "path/filepath" + "regexp" + "sort" + "strings" + "testing" +) + +// Layer 6 — spec-coverage lint. These tests don't verify translation +// correctness; they verify that the translator code, the rules_test +// assertions, and the V1 spec anchors stay in sync. A failure means +// something drifted: a RuleID was added and never emitted, a RuleID was +// emitted and never tested, or a §14#N reference points at a quirk number +// that doesn't exist in the spec. +// +// Run the tests, read the failure messages, fix the drift. Don't add an +// exemption to silence them without understanding what slipped. + +// TestRuleIDCoverage asserts the invariants: +// +// 1. Every RuleID declared in change.go is emitted somewhere in the +// translator source (translate.go / methods.go / migrate.go). +// RuleUnknown is the zero-value sentinel and is exempt. +// 2. Every RuleID emitted by the translator is referenced by at least +// one case in rules_test.go. 
+func TestRuleIDCoverage(t *testing.T) { + declared := declaredRuleIDs(t) + emitted := emittedRuleIDs(t) + tested := testedRuleIDs(t) + + var declaredMissing []string + for name := range declared { + if name == "RuleUnknown" { + continue + } + if !emitted[name] { + declaredMissing = append(declaredMissing, name) + } + } + sort.Strings(declaredMissing) + for _, name := range declaredMissing { + t.Errorf("RuleID %s is declared in change.go but never emitted — either wire it up or delete the constant", name) + } + + var testedMissing []string + for name := range emitted { + if !tested[name] { + testedMissing = append(testedMissing, name) + } + } + sort.Strings(testedMissing) + for _, name := range testedMissing { + t.Errorf("RuleID %s is emitted by the translator but no rules_test.go case asserts it — add a case in ruleCases", name) + } +} + +// TestSpec14Anchors asserts that every §14#N SpecRef cited in the +// translator source corresponds to an actual numbered quirk in +// bloblang_v1_spec.md §14. Catches typos (like §14#6 meant as §14#48) +// rather than coverage gaps. +func TestSpec14Anchors(t *testing.T) { + highest := countSpec14Quirks(t) + if highest < 50 { + t.Fatalf("spec quirk count (%d) looks implausible — broken parser?", highest) + } + for _, n := range spec14AnchorsInTranslator(t) { + if n < 1 || n > highest { + t.Errorf("SpecRef §14#%d cited by translator source is out of range; the spec has quirks 1-%d", n, highest) + } + } +} + +// declaredRuleIDs returns the set of RuleID constant names declared in +// change.go. Detection is regex-based against the const block; good +// enough for our single-file declaration convention. The type name +// RuleID itself is explicitly excluded. +func declaredRuleIDs(t *testing.T) map[string]bool { + t.Helper() + src := mustRead(t, "change.go") + // Isolate the RuleID const block — everything between + // `type RuleID int` and the closing `)` of its const declaration. 
+ startRE := regexp.MustCompile(`type RuleID int`) + start := startRE.FindStringIndex(src) + if start == nil { + t.Fatalf("could not find `type RuleID int` in change.go") + } + rest := src[start[1]:] + end := strings.Index(rest, "\n)") + if end < 0 { + t.Fatalf("could not find end of RuleID const block") + } + block := rest[:end] + // Lines starting with a tab then "Rule..."; the bare type + // name RuleID has no such declaration inside this block. + re := regexp.MustCompile(`(?m)^\t(Rule[A-Z][A-Za-z0-9]*)\b`) + out := map[string]bool{} + for _, m := range re.FindAllStringSubmatch(block, -1) { + if m[1] == "RuleID" { + continue + } + out[m[1]] = true + } + return out +} + +// emittedRuleIDs returns the set of RuleID names referenced via the +// `RuleID: RuleXxx` struct-literal form in non-test translator sources. +// This matches the convention the Change constructor uses at every +// emission site. +func emittedRuleIDs(t *testing.T) map[string]bool { + t.Helper() + re := regexp.MustCompile(`RuleID:\s*(Rule[A-Z][A-Za-z0-9]*)\b`) + out := map[string]bool{} + for _, file := range []string{"translate.go", "statements.go", "expressions.go", "methods.go", "migrate.go", "change.go", "context.go"} { + src := mustRead(t, file) + for _, m := range re.FindAllStringSubmatch(src, -1) { + out[m[1]] = true + } + } + return out +} + +// testedRuleIDs returns the set of RuleIDs referenced in rules_test.go, +// either in wantRules (positive assertions) or notRules (negative). +func testedRuleIDs(t *testing.T) map[string]bool { + t.Helper() + src := mustRead(t, "rules_test.go") + // `translator.RuleXxx` — we assert only on qualified references + // inside the test file so we don't accidentally pick up the + // `translator.RuleID` type name. 
+ re := regexp.MustCompile(`translator\.(Rule[A-Z][A-Za-z0-9]*)\b`) + out := map[string]bool{} + for _, m := range re.FindAllStringSubmatch(src, -1) { + if m[1] == "RuleID" { + continue + } + out[m[1]] = true + } + return out +} + +// countSpec14Quirks returns the highest numbered quirk in §14 of the V1 +// spec. Scans from the `## 14.` header down to the next top-level +// section and picks the largest `^N.` line. +func countSpec14Quirks(t *testing.T) int { + t.Helper() + path := filepath.Join("..", "bloblang_v1_spec.md") + raw, err := os.ReadFile(path) + if err != nil { + t.Fatalf("read V1 spec: %v", err) + } + // Isolate the §14 section. + startRE := regexp.MustCompile(`(?m)^##\s+14\.`) + endRE := regexp.MustCompile(`(?m)^##\s+15\.`) + startLoc := startRE.FindIndex(raw) + if startLoc == nil { + t.Fatalf("could not find §14 in V1 spec") + } + endLoc := endRE.FindIndex(raw[startLoc[1]:]) + if endLoc == nil { + t.Fatalf("could not find end of §14 in V1 spec") + } + section := string(raw[startLoc[0] : startLoc[1]+endLoc[0]]) + // Numbered quirks are `^N. **...` lines. + quirkRE := regexp.MustCompile(`(?m)^(\d+)\. \*\*`) + highest := 0 + for _, m := range quirkRE.FindAllStringSubmatch(section, -1) { + var n int + for _, r := range m[1] { + n = n*10 + int(r-'0') + } + if n > highest { + highest = n + } + } + return highest +} + +// spec14AnchorsInTranslator returns every §14#N number cited in the +// non-test translator source files. 
+func spec14AnchorsInTranslator(t *testing.T) []int { + t.Helper() + re := regexp.MustCompile(`§14#(\d+)`) + seen := map[int]bool{} + for _, file := range []string{"translate.go", "statements.go", "expressions.go", "methods.go", "migrate.go", "change.go", "context.go"} { + src := mustRead(t, file) + for _, m := range re.FindAllStringSubmatch(src, -1) { + var n int + for _, r := range m[1] { + n = n*10 + int(r-'0') + } + seen[n] = true + } + } + out := make([]int, 0, len(seen)) + for n := range seen { + out = append(out, n) + } + sort.Ints(out) + return out +} + +// mustRead reads a file from the translator package directory (where the +// test binary runs). +func mustRead(t *testing.T, name string) string { + t.Helper() + b, err := os.ReadFile(name) + if err != nil { + t.Fatalf("read %s: %v", name, err) + } + return string(b) +} diff --git a/internal/bloblang2/migrator/translator/statements.go b/internal/bloblang2/migrator/translator/statements.go new file mode 100644 index 000000000..51ebfa0af --- /dev/null +++ b/internal/bloblang2/migrator/translator/statements.go @@ -0,0 +1,488 @@ +package translator + +import ( + "fmt" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// translateStmt dispatches on concrete statement type. +func (t *translator) translateStmt(stmt v1ast.Stmt) syntax.Stmt { + switch s := stmt.(type) { + case *v1ast.Assignment: + return t.translateAssignment(s) + case *v1ast.LetStmt: + return t.translateLet(s) + case *v1ast.IfStmt: + return t.translateIfStmt(s) + case *v1ast.BareExprStmt: + return t.translateBareExpr(s) + default: + t.rec.Unsupported(Change{ + Line: stmt.NodePos().Line, Column: stmt.NodePos().Column, + RuleID: RuleUnsupportedConstruct, + Explanation: fmt.Sprintf("no translation rule for statement %T", stmt), + }) + return nil + } +} + +// translateStmts maps a slice of V1 statements to V2. 
Each V2 node inherits +// the V1 source's leading/trailing trivia (comments + blank lines). +func (t *translator) translateStmts(stmts []v1ast.Stmt) []syntax.Stmt { + var out []syntax.Stmt + for _, s := range stmts { + v2 := t.translateStmt(s) + if v2 != nil { + copyTrivia(s, v2) + out = append(out, v2) + } + } + return out +} + +// translateAssignment rewrites a V1 `target = expr` into V2 `target = expr`. +func (t *translator) translateAssignment(a *v1ast.Assignment) syntax.Stmt { + target := t.translateTarget(a.Target) + value := t.translateExpr(a.Value) + if target == nil || value == nil { + return nil + } + // Whether this assignment counts as Exact or Rewritten depends on whether + // translateTarget emitted a Change. The target helper records its own + // coverage for the target node; here we just count the assignment itself. + t.rec.Exact() + return &syntax.Assignment{ + TokenPos: pos(a.Pos), + Target: *target, + Value: value, + } +} + +// translateTarget translates the LHS of an assignment. +func (t *translator) translateTarget(tgt v1ast.AssignTarget) *syntax.AssignTarget { + switch tgt.Kind { + case v1ast.TargetRoot: + t.rec.Exact() + return &syntax.AssignTarget{ + Pos: pos(tgt.Pos), + Root: syntax.AssignOutput, + Path: t.translatePathSegments(tgt.Path), + } + case v1ast.TargetThis: + // V1 accepts `this = v` / `this.foo = v` but produces a literal "this" + // top-level key rather than aliasing to root (§14#72). V2 has no + // equivalent — translate to the most-likely-intended `output`. 
+ t.rec.Rewritten(Change{ + Line: tgt.Pos.Line, Column: tgt.Pos.Column, + Severity: SeverityWarning, + Category: CategorySemanticChange, + RuleID: RuleThisTargetToOutput, + SpecRef: "§14#72", + Explanation: `V1 treats "this" at target position as a literal top-level key; translated to "output"`, + }) + return &syntax.AssignTarget{ + Pos: pos(tgt.Pos), + Root: syntax.AssignOutput, + Path: t.translatePathSegments(tgt.Path), + } + case v1ast.TargetBare: + // Bare-path target: `foo.bar = v` → `output.foo.bar = v`. + t.rec.Rewritten(Change{ + Line: tgt.Pos.Line, Column: tgt.Pos.Column, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleBarePathToOutput, + SpecRef: "§14#2", + Explanation: "bare-path assignment target rewritten with explicit output root", + }) + return &syntax.AssignTarget{ + Pos: pos(tgt.Pos), + Root: syntax.AssignOutput, + Path: t.translatePathSegments(tgt.Path), + } + case v1ast.TargetMeta: + t.rec.Rewritten(Change{ + Line: tgt.Pos.Line, Column: tgt.Pos.Column, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleMetaTargetToOutputMeta, + Explanation: "meta target translated to output@", + }) + return &syntax.AssignTarget{ + Pos: pos(tgt.Pos), + Root: syntax.AssignOutput, + MetaAccess: true, + Path: t.translatePathSegments(tgt.Path), + } + } + t.rec.Unsupported(Change{ + Line: tgt.Pos.Line, Column: tgt.Pos.Column, + RuleID: RuleUnsupportedConstruct, + Explanation: fmt.Sprintf("unknown target kind %v", tgt.Kind), + }) + return nil +} + +// translatePathSegments converts V1 path segments to V2 path segments on an +// AssignTarget. +func (t *translator) translatePathSegments(segs []v1ast.PathSegment) []syntax.PathSegment { + out := make([]syntax.PathSegment, 0, len(segs)) + for _, s := range segs { + out = append(out, syntax.PathSegment{ + Kind: syntax.PathSegField, + Name: s.Name, + Pos: pos(s.Pos), + }) + } + return out +} + +// translateLet rewrites `let x = expr` to V2 equivalent. 
V2 expresses +// variable declaration the same way at statement position. +func (t *translator) translateLet(l *v1ast.LetStmt) syntax.Stmt { + if l.NameQuoted { + // §7.2 quoted binding names with non-identifier characters are + // write-only in V1. The name is currently preserved verbatim in V2 + // and a SemanticChange is recorded so the author can rename it. + t.rec.Rewritten(Change{ + Line: l.Pos.Line, Column: l.Pos.Column, + Severity: SeverityWarning, + Category: CategorySemanticChange, + RuleID: RuleUnsupportedConstruct, + SpecRef: "§7.2 / §14#76", + Explanation: "V1 quoted let-binding name preserved verbatim in V2 (may not be readable)", + }) + } else { + t.rec.Exact() + } + t.pushCtx(ctxVarDeclRHS) + value := t.translateExpr(l.Value) + t.popCtx() + if value == nil { + return nil + } + return &syntax.Assignment{ + TokenPos: pos(l.Pos), + Target: syntax.AssignTarget{ + Pos: pos(l.NamePos), + Root: syntax.AssignVar, + VarName: l.Name, + }, + Value: value, + } +} + +// translateIfStmt rewrites statement-form if/else if/else. +func (t *translator) translateIfStmt(i *v1ast.IfStmt) syntax.Stmt { + t.rec.Exact() + out := &syntax.IfStmt{TokenPos: pos(i.Pos)} + for _, br := range i.Branches { + t.flagBranchLets(br.Body) + cond := t.translateExpr(br.Cond) + body := t.translateStmts(br.Body) + if cond == nil { + continue + } + out.Branches = append(out.Branches, syntax.IfBranch{Cond: cond, Body: body}) + } + if i.Else != nil { + t.flagBranchLets(i.Else) + out.Else = t.translateStmts(i.Else) + } + return out +} + +// flagBranchLets emits a SemanticChange for any `let` at the top of an +// if/else branch body. V1 leaks the binding into the enclosing mapping +// scope; V2 confines it to the branch. If the binding is referenced outside +// the branch the V2 output won't compile; we flag unconditionally so the +// divergence is surfaced whether or not that reference exists.
+func (t *translator) flagBranchLets(body []v1ast.Stmt) { + for _, s := range body { + l, ok := s.(*v1ast.LetStmt) + if !ok { + continue + } + p := l.NodePos() + t.rec.Note(Change{ + Line: p.Line, Column: p.Column, + Severity: SeverityWarning, + Category: CategorySemanticChange, + RuleID: RuleBlockScopedLet, + SpecRef: "§11", + Explanation: "V1 let-bindings leak out of if/else branches; V2 scopes them per block. Move this declaration to the outer scope if the variable is used after the branch.", + }) + } +} + +// translateBareExpr handles `<expr>` as the sole statement of a mapping. V1 +// treats this as `root = <expr>`; we emit the equivalent V2 assignment. +func (t *translator) translateBareExpr(b *v1ast.BareExprStmt) syntax.Stmt { + t.rec.Rewritten(Change{ + Line: b.Pos.Line, Column: b.Pos.Column, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleRootToOutput, + SpecRef: "§7.4 / §14#16", + Explanation: "bare-expression mapping shorthand rewritten as explicit output assignment", + }) + val := t.translateExpr(b.Expr) + if val == nil { + return nil + } + return &syntax.Assignment{ + TokenPos: pos(b.Pos), + Target: syntax.AssignTarget{ + Pos: pos(b.Pos), + Root: syntax.AssignOutput, + }, + Value: val, + } +} + +// translateMapDecl translates `map foo { ... }` to V2. +// +// V1 map bodies are statement lists that assemble a `root` value; the map's +// implicit receiver is accessible as `this`. V2 map bodies are ExprBody: +// zero or more variable assignments followed by a single result expression, +// and parameters are explicit. +// +// The translation strategy: +// 1. Give the V2 map a single parameter named "in" (the receiver). +// 2. Rebind V1 `this` to that parameter inside the body. +// 3. Translate V1 `let` statements to V2 VarAssigns, kept in order. +// 4. Translate the last `root = expr` (or sole statement) as the Result. +// 5. If the body contains multiple `root.x = ...` assignments, assemble +// them into an object literal as the Result.
+// 6. Otherwise (complex, unsupported shapes), flag and stub. +func (t *translator) translateMapDecl(m *v1ast.MapDecl) *syntax.MapDecl { + const paramName = "in" + + t.pushScope(paramName) + t.pushThisRebind(paramName) + defer t.popScope() + defer t.popThisRebind() + + body, ok := t.tryTranslateMapBody(m) + if !ok { + t.rec.Rewritten(Change{ + Line: m.Pos.Line, Column: m.Pos.Column, + Severity: SeverityWarning, Category: CategoryUnsupported, + RuleID: RuleMapDeclTranslation, + Explanation: "map body shape could not be translated; emitted stub returning input", + }) + body = &syntax.ExprBody{ + Result: &syntax.IdentExpr{ + TokenPos: pos(m.Pos), + Name: paramName, + SlotIndex: -1, + }, + } + } else { + t.rec.Exact() + } + + return &syntax.MapDecl{ + TokenPos: pos(m.Pos), + Name: m.Name, + Params: []syntax.Param{{Name: paramName, Pos: pos(m.Pos), SlotIndex: -1}}, + Body: body, + } +} + +// tryTranslateMapBody attempts to translate a V1 map body into a V2 ExprBody. +// Returns (body, true) on success; (nil, false) when the body shape isn't +// supported by the current rules (caller substitutes a stub). +func (t *translator) tryTranslateMapBody(m *v1ast.MapDecl) (*syntax.ExprBody, bool) { + out := &syntax.ExprBody{} + var rootAssigns []*v1ast.Assignment + var finalResult syntax.Expr + for _, stmt := range m.Body { + switch s := stmt.(type) { + case *v1ast.LetStmt: + val := t.translateExpr(s.Value) + if val == nil { + return nil, false + } + va := &syntax.VarAssign{ + TokenPos: pos(s.Pos), + Name: s.Name, + Value: val, + SlotIndex: -1, + } + copyTriviaTo(s, va) + out.Assignments = append(out.Assignments, va) + case *v1ast.Assignment: + // Only handle root-rooted assignments here. Other target kinds + // (meta, bare, this) aren't valid inside a map body per V1 + // semantics (§10.1), but the V1 parser may have accepted them — + // bail. 
+ if s.Target.Kind != v1ast.TargetRoot { + return nil, false + } + if len(s.Target.Path) == 0 { + // Whole-root replacement: `root = expr`. Becomes the result + // directly, superseding any previous field-level assignments. + v := t.translateExpr(s.Value) + if v == nil { + return nil, false + } + finalResult = v + rootAssigns = nil + continue + } + // Field-level assignment: accumulate for object-literal + // construction. + rootAssigns = append(rootAssigns, s) + default: + // Unsupported map body statement kind. + return nil, false + } + } + + switch { + case finalResult != nil && len(rootAssigns) == 0: + out.Result = finalResult + case len(rootAssigns) > 0 && finalResult == nil: + // Build an object literal from the accumulated `root.<field> = v` + // assignments. Only one-level paths are supported here; deeper + // paths would require nested objects which a future rule can add. + obj := &syntax.ObjectLiteral{LBracePos: pos(m.Pos)} + for _, a := range rootAssigns { + if len(a.Target.Path) != 1 { + return nil, false + } + v := t.translateExpr(a.Value) + if v == nil { + return nil, false + } + key := &syntax.LiteralExpr{ + TokenPos: pos(a.Target.Pos), TokenType: syntax.STRING, + Value: a.Target.Path[0].Name, + } + obj.Entries = append(obj.Entries, syntax.ObjectEntry{Key: key, Value: v}) + } + out.Result = obj + case finalResult == nil && len(rootAssigns) == 0: + // Empty map body: return input unchanged. + out.Result = &syntax.IdentExpr{TokenPos: pos(m.Pos), Name: "in", SlotIndex: -1} + default: + // `root = X` mixed with `root.y = Y`: ambiguous. Bail. + return nil, false + } + + return out, true +} + +// translateImport translates `import "path"` to V2. Assigns a synthetic +// namespace alias, records every map name in the imported file so that +// subsequent `.apply(name)` call sites can be qualified, and applies +// V2ImportPathRewriter (if set) to the emitted V2 path.
+func (t *translator) translateImport(i *v1ast.ImportStmt) *syntax.ImportStmt { + lit, ok := i.Path.(*v1ast.Literal) + if !ok { + t.rec.Unsupported(Change{ + Line: i.Pos.Line, Column: i.Pos.Column, + RuleID: RuleImportStatement, + Explanation: "import path is not a string literal", + }) + return nil + } + + site := siteKey{parentKey: t.parentKey, importPath: lit.Str} + if t.fileSet != nil { + if _, unresolved := t.fileSet.unresolved[site]; unresolved { + t.rec.Unsupported(Change{ + Line: i.Pos.Line, Column: i.Pos.Column, + RuleID: RuleImportStatement, + Explanation: fmt.Sprintf("import path %q could not be resolved", lit.Str), + }) + return nil + } + } + + ns := namespaceFromPath(lit.Str) + // Record every map in the imported file under this namespace. + if content, ok := t.importedContent(site); ok { + if prog, err := v1ast.Parse(content); err == nil { + for _, m := range prog.Maps { + // Last import wins on map-name collision; V1 rejects this + // at parse but best-effort on our side. + t.mapNamespace[m.Name] = ns + } + } + } + t.rec.Rewritten(Change{ + Line: i.Pos.Line, Column: i.Pos.Column, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleImportStatement, + Explanation: `V1 import rewritten with synthetic V2 namespace alias`, + }) + v2Path := lit.Str + if t.v2ImportPathRewriter != nil { + v2Path = t.v2ImportPathRewriter(lit.Str) + } + return &syntax.ImportStmt{ + TokenPos: pos(i.Pos), + Path: v2Path, + Namespace: ns, + } +} + +// importedContent retrieves the V1 source of an imported file from the +// translator's fileSet. Returns ("", false) if the import is unresolved +// or the fileSet is nil. +func (t *translator) importedContent(site siteKey) (string, bool) { + if t.fileSet == nil { + return "", false + } + canonical, ok := t.fileSet.siteIndex[site] + if !ok { + return "", false + } + c, ok := t.fileSet.contents[canonical] + return c, ok +} + +// namespaceFromPath derives a V2 namespace alias from a V1 import path. 
It +// strips directories and the .blobl extension, leaves something identifier- +// safe, and falls back to "imported" for unusual shapes. +func namespaceFromPath(p string) string { + s := p + // Strip directory. + if idx := lastIndexByte(s, '/'); idx >= 0 { + s = s[idx+1:] + } + // Strip extension. + if idx := lastIndexByte(s, '.'); idx >= 0 { + s = s[:idx] + } + // Replace non-identifier characters with '_'. + var b []byte + for _, r := range s { + switch { + case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9', r == '_': + b = append(b, byte(r)) + default: + b = append(b, '_') + } + } + if len(b) == 0 || (b[0] >= '0' && b[0] <= '9') { + return "imported" + } + return string(b) +} + +// lastIndexByte is a tiny replacement for strings.LastIndexByte to avoid +// an import just for one call. +func lastIndexByte(s string, c byte) int { + for i := len(s) - 1; i >= 0; i-- { + if s[i] == c { + return i + } + } + return -1 +} diff --git a/internal/bloblang2/migrator/translator/translate.go b/internal/bloblang2/migrator/translator/translate.go new file mode 100644 index 000000000..d42433f5e --- /dev/null +++ b/internal/bloblang2/migrator/translator/translate.go @@ -0,0 +1,317 @@ +package translator + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// translator holds the per-call state: the Change/Coverage recorder, a +// scope stack for lambda / named-capture parameter names, and helpers. +// +// Translation rules live in sibling files: +// +// - statements.go — statement-level translators (assignment, let, if, +// map decl, import). +// - expressions.go — expression-level translators (literals, paths, +// binary/unary operators, method/function calls, lambdas, match/if +// expressions). +// - methods.go — per-method V1→V2 rewrites dispatched from +// translateMethodCall. 
+type translator struct { + rec *recorder + // scopes is a stack of named context frames. An ident is resolved from + // the innermost frame outward; if not found, it falls back to the + // legacy V1 bare-ident form (this.<name>). Each entry in a frame is + // the parameter name introduced by a lambda or .(name -> body). + scopes []scopeFrame + // thisRebindStack tracks names that V1 `this` should resolve to inside + // a V2 map body. When non-empty, the translator emits IdentExpr with + // the top of the stack instead of `input` for a V1 ThisExpr. + thisRebindStack []string + // mapNamespace maps a V1 map name to the V2 namespace it lives in. For + // locally-declared maps the namespace is "" (unqualified). For + // imported maps it is the alias assigned to the import statement. Used + // by `.apply("name")` rewrites to qualify the resulting V2 call. + mapNamespace map[string]string + // parentKey is the canonical key of the file currently being + // translated, or empty for the main V1 source. Looked up against + // fileSet.siteIndex when resolving import statements. + parentKey string + // fileSet is the closure of imports built by buildFileSet. Used by + // translateImport to resolve imported file contents (for map-name + // discovery and namespace tracking) and to flag unresolved imports. + fileSet *fileSet + // v2ImportPathRewriter, when non-nil, is applied to V1 import path + // strings before they are emitted into the V2 source. Default + // behaviour (nil) is identity. + v2ImportPathRewriter V2ImportPathRewriter + // ctxStack tracks the nearest enclosing construct that changes how + // sentinel values (nothing(), deleted()) should be emitted in V2. + // The top of the stack wins. See ctxKind for the enumerated contexts. + ctxStack []ctxKind + // customMethodRules and customFunctionRules carry the per-Migrator + // rule hooks supplied via Options.
They are checked at the top of + // methodRewrite / functionRewrite so a custom rule shadows the + // matching built-in (custom-wins precedence — design P2). + customMethodRules map[string]MethodRuleHook + customFunctionRules map[string]FunctionRuleHook +} + +// ctxKind is a translator-side marker for the kind of position we're +// currently rendering into. Used by nothing() rewrites to choose between +// void() (skip assignment) and deleted() (elide from collection). +type ctxKind int + +const ( + // ctxCollectionLit is pushed while translating an element of an + // array or an object-entry value — positions where V1's nothing() + // silently elided in V1 and V2's deleted() serves the same role. + ctxCollectionLit ctxKind = iota + 1 + // ctxVarDeclRHS is pushed while translating the RHS of a `let $x = …` + // binding. V1 deletes the variable on nothing(); V2 errors on void + // in a var-decl RHS — there is no semantic-preserving translation. + ctxVarDeclRHS +) + +// pushCtx pushes a translation context kind. Pair with popCtx. +func (t *translator) pushCtx(k ctxKind) { t.ctxStack = append(t.ctxStack, k) } + +// popCtx removes the innermost context. +func (t *translator) popCtx() { + if n := len(t.ctxStack); n > 0 { + t.ctxStack = t.ctxStack[:n-1] + } +} + +// currentCtx returns the innermost context, or 0 if none. +func (t *translator) currentCtx() ctxKind { + if n := len(t.ctxStack); n > 0 { + return t.ctxStack[n-1] + } + return 0 +} + +// pushThisRebind makes V1 `this` translate to the given V2 identifier name +// (typically a map parameter) while the callback is active. 
+func (t *translator) pushThisRebind(name string) { t.thisRebindStack = append(t.thisRebindStack, name) } + +func (t *translator) popThisRebind() { + if n := len(t.thisRebindStack); n > 0 { + t.thisRebindStack = t.thisRebindStack[:n-1] + } +} + +func (t *translator) currentThisRebind() (string, bool) { + if n := len(t.thisRebindStack); n > 0 { + return t.thisRebindStack[n-1], true + } + return "", false +} + +// scopeFrame is one level of named-context bindings. +type scopeFrame struct { + names map[string]struct{} +} + +// pushScope adds a named-context frame. Callers must pair with popScope. +func (t *translator) pushScope(names ...string) { + frame := scopeFrame{names: map[string]struct{}{}} + for _, n := range names { + if n != "" && n != "_" { + frame.names[n] = struct{}{} + } + } + t.scopes = append(t.scopes, frame) +} + +// popScope removes the innermost frame. +func (t *translator) popScope() { + if len(t.scopes) == 0 { + return + } + t.scopes = t.scopes[:len(t.scopes)-1] +} + +// isBoundIdent reports whether name matches a named-context binding in any +// active scope. +func (t *translator) isBoundIdent(name string) bool { + for i := len(t.scopes) - 1; i >= 0; i-- { + if _, ok := t.scopes[i].names[name]; ok { + return true + } + } + return false +} + +// translateProgram walks a parsed V1 program and produces a V2 program. Every +// V1 node contributes to Coverage via recorder calls. +func (t *translator) translateProgram(p *v1ast.Program) *syntax.Program { + if t.mapNamespace == nil { + t.mapNamespace = map[string]string{} + } + // Register locally-declared map names first (unqualified namespace) so + // later .apply() calls resolve correctly. + for _, m := range p.Maps { + t.mapNamespace[m.Name] = "" + } + + out := &syntax.Program{} + + // ModeMapping prelude: V1 `mapping` starts `root` as the input + // document, whereas V2 `output` starts as `{}`. 
Prepend an explicit + // `output = input` so a V1 mapping whose statements only tweak + // individual fields continues to pass the input through. + if t.rec.opts.Mode == ModeMapping { + t.rec.Rewritten(Change{ + Line: 1, + Column: 1, + Severity: SeverityInfo, + Category: CategoryIdiomRewrite, + RuleID: RuleRootToOutput, + Explanation: "ModeMapping: prepended `output = input` to preserve V1 mapping pass-through default", + }) + out.Stmts = append(out.Stmts, &syntax.Assignment{ + TokenPos: syntax.Pos{Line: 1, Column: 1}, + Target: syntax.AssignTarget{ + Pos: syntax.Pos{Line: 1, Column: 1}, + Root: syntax.AssignOutput, + }, + Value: &syntax.InputExpr{TokenPos: syntax.Pos{Line: 1, Column: 1}}, + }) + } + + // Translate statements in original order, routing map decls and imports + // to the dedicated slices while keeping everything else in Stmts. Each + // V2 node inherits its V1 source's leading/trailing trivia. + for _, stmt := range p.Stmts { + switch s := stmt.(type) { + case *v1ast.MapDecl: + if m := t.translateMapDecl(s); m != nil { + copyTriviaTo(s, m) + out.Maps = append(out.Maps, m) + } + case *v1ast.ImportStmt: + if i := t.translateImport(s); i != nil { + copyTriviaTo(s, i) + out.Imports = append(out.Imports, i) + } + case *v1ast.FromStmt: + // `from "path"` replaces the whole mapping in V1 with zero V2 + // equivalent short of inlining the imported file. We flag and + // drop — caller should manually inline. + t.rec.Unsupported(Change{ + Line: s.Pos.Line, Column: s.Pos.Column, + RuleID: RuleFromStatement, + SpecRef: "§10.5 / §14#12", + Explanation: `V1 "from" replaces the whole mapping — inline the imported file manually`, + }) + default: + v2 := t.translateStmt(stmt) + if v2 != nil { + copyTrivia(stmt, v2) + out.Stmts = append(out.Stmts, v2) + } + } + } + + return out +} + +// pos converts a V1 position to a V2 position. Same structure, different +// package. 
+func pos(p v1ast.Pos) syntax.Pos { + return syntax.Pos{Line: p.Line, Column: p.Column} +} + +// ----------------------------------------------------------------------- +// Public helpers consumed by the public migrator package +// (public/bloblangv2/migrator). They are exposed so custom-rule +// implementations registered through that package can hook into the +// same machinery the built-in rules use without duplicating logic. +// ----------------------------------------------------------------------- + +// Translator is the surface a custom rule needs from the running +// translator: recursive expression translation, scope / this-rebind +// management, position translation, and the recorder hooks used to +// keep coverage stats honest when a rule fires. +type Translator interface { + // TranslateExpr recursively translates a V1 expression into a V2 + // expression. Returns nil when translation cannot proceed (the + // translator already emitted the appropriate diagnostic). + TranslateExpr(v1Expr v1ast.Expr) syntax.Expr + + // PushScope pushes a named-context frame onto the scope stack. + // Each name becomes a bound identifier in the rule's body + // translation. Pair with PopScope. + PushScope(names ...string) + // PopScope removes the innermost scope frame. + PopScope() + + // PushThisRebind makes V1 `this` translate to the given V2 + // identifier name (typically a synthesized lambda parameter) + // while the translation walks the rule's body. Pair with + // PopThisRebind. + PushThisRebind(name string) + // PopThisRebind removes the innermost this-rebinding. + PopThisRebind() + + // Pos translates a V1 source position into a V2 source position. + Pos(p v1ast.Pos) syntax.Pos + + // EmitChange records a Change in the report without touching + // coverage counters. Mirrors recorder.Note — used for + // supplementary diagnostics alongside a rule's Result. 
+ EmitChange(ch Change) + + // RecordRewritten counts a Rewritten coverage entry and records + // the supplied Change. Bridges call this when a custom rule's + // Replace outcome lands. + RecordRewritten(ch Change) + + // RecordUnsupported counts an Unsupported coverage entry and + // records the supplied Change with Error severity. Bridges call + // this when a custom rule's Unsupported outcome lands. + RecordUnsupported(ch Change) +} + +// TranslateExpr is the exported form of translateExpr. +func (t *translator) TranslateExpr(v1Expr v1ast.Expr) syntax.Expr { + return t.translateExpr(v1Expr) +} + +// PushScope is the exported form of pushScope. +func (t *translator) PushScope(names ...string) { t.pushScope(names...) } + +// PopScope is the exported form of popScope. +func (t *translator) PopScope() { t.popScope() } + +// PushThisRebind is the exported form of pushThisRebind. +func (t *translator) PushThisRebind(name string) { t.pushThisRebind(name) } + +// PopThisRebind is the exported form of popThisRebind. +func (t *translator) PopThisRebind() { t.popThisRebind() } + +// Pos is the exported form of pos. +func (t *translator) Pos(p v1ast.Pos) syntax.Pos { return pos(p) } + +// EmitChange records a Change without bumping coverage counters. +func (t *translator) EmitChange(ch Change) { t.rec.Note(ch) } + +// RecordRewritten bumps the Rewritten counter and records ch. +func (t *translator) RecordRewritten(ch Change) { t.rec.Rewritten(ch) } + +// RecordUnsupported bumps the Unsupported counter and records ch. +func (t *translator) RecordUnsupported(ch Change) { t.rec.Unsupported(ch) } + +// MethodRuleHook is the dispatch shape for a custom V1 method-call +// translation rule. Returning handled=true and out=nil signals an +// Unsupported outcome (the translator records an Error-severity +// Change and emits a `// MIGRATION:` comment). Returning handled=true +// with a non-nil out replaces the V1 method call with that V2 +// expression. 
Returning handled=false falls through to the built-in +// rule for the same V1 method name (or the default 1:1 translation +// if no built-in matches). +type MethodRuleHook func(t Translator, m *v1ast.MethodCall, recv syntax.Expr) (out syntax.Expr, handled bool) + +// FunctionRuleHook is the function-call analogue of MethodRuleHook. +type FunctionRuleHook func(t Translator, f *v1ast.FunctionCall) (out syntax.Expr, handled bool) diff --git a/internal/bloblang2/migrator/translator/trivia.go b/internal/bloblang2/migrator/translator/trivia.go new file mode 100644 index 000000000..12bf9ed93 --- /dev/null +++ b/internal/bloblang2/migrator/translator/trivia.go @@ -0,0 +1,70 @@ +package translator + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// Every V1 statement carries a TriviaSet (comments + blank lines collected +// by the V1 parser). The translator copies that set onto the V2 node it +// produces so V2 output preserves the author's prose and pacing. +// +// Rules for edge cases: +// - When a V1 statement is dropped (translateStmt returns nil), its +// trivia is lost. A future pass can hoist it onto the next surviving +// statement if this becomes a pain point. +// - When a V1 statement expands into multiple V2 statements, leading +// should attach to the first emitted V2 node and trailing to the last. +// No current rule expands 1:N, but `attachLeadingTo` / `attachTrailingTo` +// helpers exist for when that arrives. +// - Synthesised V2 statements (e.g. ModeMapping's `output = input` +// prelude) get no V1 trivia. + +// triviaKindFromV1 maps V1 trivia kinds to V2. +func triviaKindFromV1(k v1ast.TriviaKind) syntax.TriviaKind { + if k == v1ast.TriviaBlankLine { + return syntax.TriviaBlankLine + } + return syntax.TriviaComment +} + +// convertTriviaList clones a V1 trivia slice into V2 form. 
+func convertTriviaList(in []v1ast.Trivia) []syntax.Trivia { + if len(in) == 0 { + return nil + } + out := make([]syntax.Trivia, len(in)) + for i, t := range in { + out[i] = syntax.Trivia{ + Kind: triviaKindFromV1(t.Kind), + Text: t.Text, + Pos: syntax.Pos{Line: t.Pos.Line, Column: t.Pos.Column}, + } + } + return out +} + +// copyTrivia copies the V1 statement's leading+trailing trivia onto the +// V2 statement. Safe to call with a nil V2 — no-op in that case. +func copyTrivia(src v1ast.Stmt, dst syntax.Stmt) { + if dst == nil { + return + } + srcTri := src.Trivia() + dstTri := dst.Trivia() + dstTri.Leading = append(dstTri.Leading, convertTriviaList(srcTri.Leading)...) + dstTri.Trailing = append(dstTri.Trailing, convertTriviaList(srcTri.Trailing)...) +} + +// copyTriviaTo copies trivia onto any node exposing a Trivia() accessor. +// Used when the V2 target is not a Stmt (e.g. VarAssign in an ExprBody, or +// a MapDecl exposed via prog.Maps). +func copyTriviaTo(src v1ast.Stmt, dst interface{ Trivia() *syntax.TriviaSet }) { + if dst == nil { + return + } + srcTri := src.Trivia() + dstTri := dst.Trivia() + dstTri.Leading = append(dstTri.Leading, convertTriviaList(srcTri.Leading)...) + dstTri.Trailing = append(dstTri.Trailing, convertTriviaList(srcTri.Trailing)...) +} diff --git a/internal/bloblang2/migrator/translator/trivia_test.go b/internal/bloblang2/migrator/translator/trivia_test.go new file mode 100644 index 000000000..d12581825 --- /dev/null +++ b/internal/bloblang2/migrator/translator/trivia_test.go @@ -0,0 +1,119 @@ +package translator_test + +import ( + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// TestTriviaPropagation is the end-to-end test for comment + blank-line +// preservation. For each case we run Migrate over a V1 source and assert +// the V2 output contains the expected formatting. 
+func TestTriviaPropagation(t *testing.T) { + cases := []struct { + name string + v1 string + wants []string // substrings required in V2 output, in order + }{ + { + name: "file-leading comment", + v1: `# top comment +root.a = this.a +`, + wants: []string{"# top comment\n", "output.a = input?.a"}, + }, + { + name: "comment between statements", + v1: `root.a = this.a +# why B +root.b = this.b +`, + wants: []string{"output.a = input?.a", "# why B\n", "output.b = input?.b"}, + }, + { + name: "trailing comment on same line", + v1: `root.a = this.a # inline reason` + "\n", + wants: []string{"output.a = input?.a # inline reason"}, + }, + { + name: "blank line between statements", + v1: `root.a = this.a + +root.b = this.b +`, + wants: []string{"output.a = input?.a\n\noutput.b = input?.b"}, + }, + { + name: "comment block + blank + comment", + v1: `# section A +root.a = this.a + +# section B +root.b = this.b +`, + wants: []string{"# section A\n", "output.a = input?.a", "# section B\n", "output.b = input?.b"}, + }, + { + name: "comment inside map body is preserved", + v1: `map double { + # the core math + let doubled = this * 2 + root = $doubled +} +root.x = 21.apply("double") +`, + wants: []string{"# the core math\n", "$doubled = in * 2"}, + }, + { + name: "comment before import is preserved", + v1: `# used for helpers +import "helpers.blobl" +root.x = "hi" +`, + wants: []string{"# used for helpers\n", `import "helpers.blobl"`}, + }, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + rep, err := translator.Migrate(tc.v1, translator.Options{MinCoverage: 0}) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + out := rep.V2Mapping + // Check each required substring appears in order. 
+ idx := 0 + for _, want := range tc.wants { + j := strings.Index(out[idx:], want) + if j < 0 { + t.Fatalf("V2 output missing %q (or out of order).\nOutput:\n%s", want, out) + } + idx += j + len(want) + } + }) + } +} + +// TestTriviaMultipleBlankLinesCollapse checks that a run of blank lines +// collapses into a single blank line. +func TestTriviaMultipleBlankLinesCollapse(t *testing.T) { + v1 := `root.a = this.a + + + +root.b = this.b +` + rep, err := translator.Migrate(v1, translator.Options{MinCoverage: 0}) + if err != nil { + t.Fatalf("Migrate: %v", err) + } + // Expect exactly one blank line between the two statements (two \n in a row: + // the trailing \n from stmt 1 and the blank-line marker \n, then the output.b line). + if !strings.Contains(rep.V2Mapping, "output.a = input?.a\n\noutput.b = input?.b") { + t.Errorf("expected exactly one blank line between stmts, got:\n%s", rep.V2Mapping) + } + // And NOT two blank lines. + if strings.Contains(rep.V2Mapping, "output.a = input?.a\n\n\noutput.b") { + t.Errorf("expected blank lines to be collapsed, got:\n%s", rep.V2Mapping) + } +} From 4f07a1d4d8f4180c7b2a901304768243826cdba7 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 28 Apr 2026 14:16:52 +0100 Subject: [PATCH 13/20] bloblang(v2): Add migrator benchmark harness and playground demo Adds internal/bloblang2/migrator/benchmark/, a corpus-wide V1 -> V2 migration benchmark suite with a coverage probe and migration smoke test that quantifies translator coverage and runtime parity against a curated V1 corpus. Adds internal/bloblang2/migrator/demo/, a small Go-served web playground that wires the migrator behind a UI with real V1 case studies. Useful for eyeballing translator output and for showing the behaviour of flagged SemanticChange entries.
--- .../migrator/benchmark/benchmark_test.go | 370 +++++++++++ .../bloblang2/migrator/benchmark/corpus.go | 338 ++++++++++ .../migrator/benchmark/coverage_probe_test.go | 35 + .../bloblang2/migrator/benchmark/harness.go | 192 ++++++ .../benchmark/migration_smoke_test.go | 156 +++++ internal/bloblang2/migrator/demo/main.go | 471 ++++++++++++++ internal/bloblang2/migrator/demo/page.html | 611 ++++++++++++++++++ 7 files changed, 2173 insertions(+) create mode 100644 internal/bloblang2/migrator/benchmark/benchmark_test.go create mode 100644 internal/bloblang2/migrator/benchmark/corpus.go create mode 100644 internal/bloblang2/migrator/benchmark/coverage_probe_test.go create mode 100644 internal/bloblang2/migrator/benchmark/harness.go create mode 100644 internal/bloblang2/migrator/benchmark/migration_smoke_test.go create mode 100644 internal/bloblang2/migrator/demo/main.go create mode 100644 internal/bloblang2/migrator/demo/page.html diff --git a/internal/bloblang2/migrator/benchmark/benchmark_test.go b/internal/bloblang2/migrator/benchmark/benchmark_test.go new file mode 100644 index 000000000..ac417fd73 --- /dev/null +++ b/internal/bloblang2/migrator/benchmark/benchmark_test.go @@ -0,0 +1,370 @@ +package benchmark_test + +import ( + "fmt" + "math" + "os" + "runtime" + "sort" + "testing" + "time" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/benchmark" +) + +// BenchmarkCorpus emits one sub-benchmark per equivalent V1↔V2 case, +// timing each engine separately. Run with: +// +// go test ./internal/bloblang2/migrator/benchmark -bench BenchmarkCorpus -benchmem +// +// Each sub-benchmark name is "//(v1|v2)", so +// `benchstat` / `go test -bench` filters can isolate a specific case +// or engine. 
+func BenchmarkCorpus(b *testing.B) { + coll, err := benchmark.CollectDefault() + if err != nil { + b.Fatalf("collect corpus: %v", err) + } + if len(coll.Cases) == 0 { + b.Fatal("no equivalent cases found — corpus probably unreachable") + } + for _, c := range coll.Cases { + cpy := c + b.Run(cpy.Name+"/v1", func(b *testing.B) { + b.ReportAllocs() + for b.Loop() { + if _, err := cpy.V1.Exec(cpy.Input, cpy.InputMetadata); err != nil { + b.Fatal(err) + } + } + }) + b.Run(cpy.Name+"/v2", func(b *testing.B) { + b.ReportAllocs() + for b.Loop() { + if _, err := cpy.V2.Exec(cpy.Input, cpy.InputMetadata); err != nil { + b.Fatal(err) + } + } + }) + } +} + +// TestCorpusAnalysis measures V1 and V2 performance for every equivalent +// case in the corpus and prints per-case + aggregate statistics +// (geometric-mean speed-up, median, best, worst, allocation savings). +// The test does NOT enforce a performance floor — it passes as long as +// the corpus collection succeeds. +// +// Skipped in -short mode. The per-case time budget is controlled by the +// MIGRATOR_BENCH_BUDGET environment variable (default "100ms"); shorter +// values trade statistical stability for faster runs. 
+func TestCorpusAnalysis(t *testing.T) { + if testing.Short() { + t.Skip("corpus benchmark analysis is slow; skipped in -short mode") + } + budget := 100 * time.Millisecond + if s := os.Getenv("MIGRATOR_BENCH_BUDGET"); s != "" { + if d, err := time.ParseDuration(s); err == nil { + budget = d + } + } + + coll, err := benchmark.CollectDefault() + if err != nil { + t.Fatalf("collect corpus: %v", err) + } + t.Logf("accepted cases: %d", len(coll.Cases)) + t.Logf("skipped: %d", len(coll.Skips)) + t.Logf("per-case budget: %s per engine (set MIGRATOR_BENCH_BUDGET to override)", budget) + t.Logf("") + t.Logf("skips by reason:") + counts := coll.SkipCounts() + reasons := make([]string, 0, len(counts)) + for r := range counts { + reasons = append(reasons, string(r)) + } + sort.Strings(reasons) + for _, r := range reasons { + t.Logf(" %-26s %d", r, counts[benchmark.SkipReason(r)]) + } + t.Logf("") + + rows := make([]benchRow, 0, len(coll.Cases)) + for _, c := range coll.Cases { + cpy := c + v1 := timeExec(func() error { + _, err := cpy.V1.Exec(cpy.Input, cpy.InputMetadata) + return err + }, budget) + v2 := timeExec(func() error { + _, err := cpy.V2.Exec(cpy.Input, cpy.InputMetadata) + return err + }, budget) + if v1.ns == 0 || v2.ns == 0 { + continue + } + rows = append(rows, benchRow{ + name: cpy.Name, + v1ns: v1.ns, + v2ns: v2.ns, + v1alloc: v1.allocs, + v2alloc: v2.allocs, + v1bytes: v1.bytes, + v2bytes: v2.bytes, + }) + } + if len(rows) == 0 { + t.Skip("no equivalent cases to benchmark") + } + + // Per-case table, sorted by V2/V1 ns/op descending so slowdowns + // land at the top of the log (easiest to spot). 
+ sort.Slice(rows, func(i, j int) bool { + return rows[i].v2OverV1() > rows[j].v2OverV1() + }) + t.Logf("per-case breakdown (sorted by V2/V1 ns/op, worst first):") + t.Logf("%-80s %12s %12s %8s %8s %8s %10s", "case", "v1 ns/op", "v2 ns/op", "v1/v2", "v1 allocs", "v2 allocs", "alloc Δ") + for _, r := range rows { + t.Logf("%-80s %12.0f %12.0f %7.2fx %8.0f %8.0f %9s", + truncateName(r.name, 80), + r.v1ns, r.v2ns, + r.v1OverV2(), + r.v1alloc, r.v2alloc, + formatPct(r.v2alloc-r.v1alloc, r.v1alloc)) + } + t.Logf("") + + // Aggregate. + t.Logf("aggregate (N=%d cases where V1 and V2 agree on output):", len(rows)) + speedups := extract(rows, func(r benchRow) float64 { return r.v1OverV2() }) + slowdowns := extract(rows, func(r benchRow) float64 { return r.v2OverV1() }) + sumV1Ns := sum(extract(rows, func(r benchRow) float64 { return r.v1ns })) + sumV2Ns := sum(extract(rows, func(r benchRow) float64 { return r.v2ns })) + sumV1Alloc := sum(extract(rows, func(r benchRow) float64 { return r.v1alloc })) + sumV2Alloc := sum(extract(rows, func(r benchRow) float64 { return r.v2alloc })) + sumV1Bytes := sum(extract(rows, func(r benchRow) float64 { return r.v1bytes })) + sumV2Bytes := sum(extract(rows, func(r benchRow) float64 { return r.v2bytes })) + + t.Logf(" V2 speed-up (v1/v2): geomean=%.2fx median=%.2fx min=%.2fx max=%.2fx p95(slowest)=%.2fx", + geomean(speedups), + median(speedups), + minf(speedups), + maxf(speedups), + percentile(slowdowns, 95), + ) + if sumV2Ns > 0 { + t.Logf(" V2 ns/op (summed): v1=%.0f v2=%.0f overall=%.2fx faster", + sumV1Ns, sumV2Ns, sumV1Ns/sumV2Ns) + } + t.Logf(" V2 allocs/op (summed): v1=%.0f v2=%.0f delta=%s", + sumV1Alloc, sumV2Alloc, formatPct(sumV2Alloc-sumV1Alloc, sumV1Alloc)) + t.Logf(" V2 B/op (summed): v1=%.0f v2=%.0f delta=%s", + sumV1Bytes, sumV2Bytes, formatPct(sumV2Bytes-sumV1Bytes, sumV1Bytes)) + + // Win/loss/tie split. 
+ faster, slower, tied := 0, 0, 0 + for _, r := range rows { + switch { + case r.v2ns < r.v1ns*0.98: + faster++ + case r.v2ns > r.v1ns*1.02: + slower++ + default: + tied++ + } + } + t.Logf(" case split: V2 faster=%d V2 slower=%d tied=%d (±2%%)", faster, slower, tied) +} + +// benchRow is one per-case benchmark sample. +type benchRow struct { + name string + v1ns float64 + v2ns float64 + v1alloc float64 + v2alloc float64 + v1bytes float64 + v2bytes float64 +} + +func (r benchRow) v1OverV2() float64 { + if r.v2ns == 0 { + return math.Inf(1) + } + return r.v1ns / r.v2ns +} + +func (r benchRow) v2OverV1() float64 { + if r.v1ns == 0 { + return math.Inf(1) + } + return r.v2ns / r.v1ns +} + +// ----------------------------------------------------------------------- +// Helpers +// ----------------------------------------------------------------------- + +// timing bundles per-iteration ns/op, allocs/op, and B/op measured by +// timeExec. Zeroed result means "no measurement" and is skipped by the +// caller. +type timing struct { + ns float64 + allocs float64 + bytes float64 +} + +// timeExec measures exec under a fixed wall-clock budget. It runs a +// short warm-up / calibration phase to estimate iteration cost, then +// a measurement phase sized to fill `budget`. Memory stats come from +// runtime.MemStats (same source testing.B uses under the hood), with a +// forced GC before the measurement window so only per-iteration +// allocations count. +func timeExec(exec func() error, budget time.Duration) timing { + // Calibration: 3 iterations to estimate cost. + const calIter = 3 + for range calIter { + if err := exec(); err != nil { + return timing{} + } + } + start := time.Now() + for range calIter { + if err := exec(); err != nil { + return timing{} + } + } + per := time.Since(start) / calIter + if per <= 0 { + per = time.Microsecond + } + n := int(budget / per) + if n < 1 { + n = 1 + } + + // Measurement. 
+ var m0, m1 runtime.MemStats + runtime.GC() + runtime.ReadMemStats(&m0) + start = time.Now() + for range n { + if err := exec(); err != nil { + return timing{} + } + } + elapsed := time.Since(start) + runtime.ReadMemStats(&m1) + + return timing{ + ns: float64(elapsed.Nanoseconds()) / float64(n), + allocs: float64(m1.Mallocs-m0.Mallocs) / float64(n), + bytes: float64(m1.TotalAlloc-m0.TotalAlloc) / float64(n), + } +} + +func formatPct(delta, base float64) string { + if base == 0 { + if delta == 0 { + return " 0%" + } + return " n/a" + } + p := delta / base * 100 + sign := "+" + if p < 0 { + sign = "" + } + return fmt.Sprintf("%s%5.1f%%", sign, p) +} + +func truncateName(s string, n int) string { + if len(s) <= n { + return s + } + return "…" + s[len(s)-n+1:] +} + +func extract(rows []benchRow, fn func(benchRow) float64) []float64 { + out := make([]float64, len(rows)) + for i, r := range rows { + out[i] = fn(r) + } + return out +} + +func sum(xs []float64) float64 { + s := 0.0 + for _, x := range xs { + s += x + } + return s +} + +func minf(xs []float64) float64 { + m := math.Inf(1) + for _, x := range xs { + if x < m { + m = x + } + } + return m +} + +func maxf(xs []float64) float64 { + m := math.Inf(-1) + for _, x := range xs { + if x > m { + m = x + } + } + return m +} + +func median(xs []float64) float64 { + if len(xs) == 0 { + return 0 + } + s := append([]float64(nil), xs...) + sort.Float64s(s) + n := len(s) + if n%2 == 1 { + return s[n/2] + } + return (s[n/2-1] + s[n/2]) / 2 +} + +func geomean(xs []float64) float64 { + if len(xs) == 0 { + return 0 + } + sumLn := 0.0 + count := 0 + for _, x := range xs { + if x > 0 { + sumLn += math.Log(x) + count++ + } + } + if count == 0 { + return 0 + } + return math.Exp(sumLn / float64(count)) +} + +// percentile returns the p-th percentile (0–100) using nearest-rank. +func percentile(xs []float64, p float64) float64 { + if len(xs) == 0 { + return 0 + } + s := append([]float64(nil), xs...) 
+ sort.Float64s(s) + idx := int(math.Ceil(p/100*float64(len(s)))) - 1 + if idx < 0 { + idx = 0 + } + if idx >= len(s) { + idx = len(s) - 1 + } + return s[idx] +} diff --git a/internal/bloblang2/migrator/benchmark/corpus.go b/internal/bloblang2/migrator/benchmark/corpus.go new file mode 100644 index 000000000..088684ac6 --- /dev/null +++ b/internal/bloblang2/migrator/benchmark/corpus.go @@ -0,0 +1,338 @@ +package benchmark + +import ( + "fmt" + "os" + "path/filepath" + "runtime" + "sort" + "strings" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/spectest" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// Case is one benchmarkable V1↔V2 mapping pair: the V1 source, the +// translated V2 source, the input, and pre-compiled runners for both +// sides. Cases reach this form only when translation succeeded, both +// sides compile, both sides run without error, and their outputs are +// spectest.DeepEqual. +type Case struct { + // Name uniquely identifies this case within the corpus: "<file>/<test>[/<case>]". Used as the sub-benchmark name. + Name string + // V1Source is the mapping text as it appears in the V1 corpus. + V1Source string + // V2Source is the translator.Migrate output. + V2Source string + // Input is the already-decoded test input (spectest.DecodeValue form). + Input any + // InputMetadata is the already-decoded input metadata (may be empty). + InputMetadata map[string]any + // Expected is the output both V1 and V2 produce. Stored for sanity + // checks during benchmark iterations; the harness doesn't re-check + // each loop because that would distort the timing. + Expected any + // V1 and V2 are the compiled runners, ready to call Exec() with the + // test input. + V1 *v1Runner + V2 *v2Runner +} + +// SkipReason explains why a case was excluded from the benchmark set. +type SkipReason string + +// SkipReason constants. Each one corresponds to a distinct gating +// failure in Collection.admit.
The set is stable enough to group and +// count when emitting the analysis report. +const ( + SkipV1ParseFail SkipReason = "v1-parse-fail" + SkipV1ExecFail SkipReason = "v1-exec-fail" + SkipTranslateFail SkipReason = "translate-fail" + SkipV2CompileFail SkipReason = "v2-compile-fail" + SkipV2ExecFail SkipReason = "v2-exec-fail" + SkipOutputMismatch SkipReason = "output-mismatch" + SkipExpectsError SkipReason = "expects-error" + SkipNoInput SkipReason = "no-input" + SkipInputDecode SkipReason = "input-decode-fail" + SkipExpectedDelete SkipReason = "expects-delete" + SkipMultiCaseUnwound SkipReason = "multi-case" + SkipExplicitlyMarked SkipReason = "skip-marker" + SkipEmptyMapping SkipReason = "empty-mapping" + SkipOutputDecodeFail SkipReason = "output-decode-fail" + SkipInvalidTestCase SkipReason = "invalid-test-case" + SkipInputMetaDecode SkipReason = "input-metadata-decode-fail" + SkipNoOutputCheckBench SkipReason = "no-output-check" +) + +// SkipRecord records why a test case was rejected. Collected alongside +// the accepted Cases so the analysis report can explain the coverage. +type SkipRecord struct { + Name string + Reason SkipReason + Detail string +} + +// Collection is the result of scanning the corpus: the accepted cases +// plus the rejected ones bucketed by reason. +type Collection struct { + Cases []Case + Skips []SkipRecord +} + +// CollectDefault scans ../v1spec/tests relative to this source file. +func CollectDefault() (*Collection, error) { + _, thisFile, _, _ := runtime.Caller(0) + root := filepath.Join(filepath.Dir(thisFile), "..", "v1spec", "tests") + return Collect(root) +} + +// Collect walks root, loads every yaml file, and returns the +// benchmarkable subset of test cases along with skip records for the +// rest. The scan is deterministic (files sorted, cases in file order). 
+func Collect(root string) (*Collection, error) { + files, err := discoverFiles(root) + if err != nil { + return nil, err + } + out := &Collection{} + for _, path := range files { + tf, err := spectest.LoadFile(path) + if err != nil { + continue + } + rel, _ := filepath.Rel(root, path) + for i := range tf.Tests { + tc := &tf.Tests[i] + if len(tc.Cases) > 0 { + for j := range tc.Cases { + name := fmt.Sprintf("%s/%s/%s", rel, tc.Name, tc.Cases[j].Name) + out.admit(name, tc, &tc.Cases[j], tf.Files, path) + } + continue + } + name := fmt.Sprintf("%s/%s", rel, tc.Name) + out.admit(name, tc, nil, tf.Files, path) + } + } + return out, nil +} + +// admit attempts to add a test case to the collection. If any gate +// fails (rejected by V1, rejected by V2, outputs differ, …) the case +// lands on Skips with an explanatory reason instead. The gating runs +// translation + a sanity Exec on each side so the benchmark itself +// only contains known-equivalent pairs. +func (c *Collection) admit(name string, tc *spectest.TestCase, sub *spectest.Case, fileFiles map[string]string, yamlPath string) { + // Pull out the fields that can live on either the parent TestCase + // or a child Case. The Case form is read if present; otherwise the + // parent's fields govern. + mapping := tc.Mapping + input := tc.Input + inputMeta := tc.InputMetadata + hasError := tc.HasError + errStr := tc.Error + compileErr := tc.CompileError + deleted := tc.Deleted + noOutput := tc.NoOutputCheck + if sub != nil { + input = sub.Input + inputMeta = sub.InputMetadata + hasError = sub.HasError + errStr = sub.Error + deleted = sub.Deleted + noOutput = sub.NoOutputCheck + } + + // Skip-marker check runs first because many "skip: …" cases have no + // mapping by design — the author stripped the body when the case was + // known-incompatible. Attributing those to SkipEmptyMapping would + // understate the real reason for skipping. 
+ if isSkipped(yamlPath, tc.Name) { + c.skip(name, SkipExplicitlyMarked, "v1spec skip marker") + return + } + // Filter out cases that don't represent a successful mapping run. + if strings.TrimSpace(mapping) == "" { + c.skip(name, SkipEmptyMapping, "") + return + } + if compileErr != "" { + c.skip(name, SkipExpectsError, "compile error expected") + return + } + if hasError || errStr != "" { + c.skip(name, SkipExpectsError, "runtime error expected") + return + } + if deleted { + c.skip(name, SkipExpectedDelete, "root deletion expected") + return + } + if noOutput { + c.skip(name, SkipNoOutputCheckBench, "") + return + } + + decodedInput, err := spectest.DecodeValue(input) + if err != nil { + c.skip(name, SkipInputDecode, err.Error()) + return + } + decodedMeta := map[string]any{} + if inputMeta != nil { + raw, err := spectest.DecodeValue(inputMeta) + if err != nil { + c.skip(name, SkipInputMetaDecode, err.Error()) + return + } + if m, ok := raw.(map[string]any); ok { + decodedMeta = m + } + } + + // 1. Compile V1. + files := mergeFileMaps(fileFiles, tc.Files) + v1, err := newV1Runner(mapping, files) + if err != nil { + c.skip(name, SkipV1ParseFail, err.Error()) + return + } + + // 2. Run V1 once to capture expected output. If V1 errors, this + // case isn't benchmarkable — V2 equivalence is meaningless. + v1Out, err := v1.Exec(decodedInput, decodedMeta) + if err != nil { + c.skip(name, SkipV1ExecFail, err.Error()) + return + } + + // 3. Translate V1 -> V2. + rep, err := translator.Migrate(mapping, translator.Options{MinCoverage: 0, Files: files}) + if err != nil { + c.skip(name, SkipTranslateFail, err.Error()) + return + } + + // 4. Compile V2. + v2, err := newV2Runner(rep.V2Mapping) + if err != nil { + c.skip(name, SkipV2CompileFail, err.Error()) + return + } + + // 5. Run V2 once and compare. 
+ v2Out, err := v2.Exec(decodedInput, decodedMeta) + if err != nil { + c.skip(name, SkipV2ExecFail, err.Error()) + return + } + if ok, diff := spectest.DeepEqual(v1Out, v2Out); !ok { + c.skip(name, SkipOutputMismatch, diff) + return + } + + c.Cases = append(c.Cases, Case{ + Name: name, + V1Source: mapping, + V2Source: rep.V2Mapping, + Input: decodedInput, + InputMetadata: decodedMeta, + Expected: v1Out, + V1: v1, + V2: v2, + }) +} + +func (c *Collection) skip(name string, reason SkipReason, detail string) { + c.Skips = append(c.Skips, SkipRecord{Name: name, Reason: reason, Detail: detail}) +} + +// SkipCounts returns a count of skip records grouped by reason. +func (c *Collection) SkipCounts() map[SkipReason]int { + counts := map[SkipReason]int{} + for _, s := range c.Skips { + counts[s.Reason]++ + } + return counts +} + +// ----------------------------------------------------------------------- +// Corpus-file discovery helpers (shared with corpus_test.go patterns; +// duplicated here so the benchmark package has no test-package +// dependency). +// ----------------------------------------------------------------------- + +func discoverFiles(root string) ([]string, error) { + var out []string + err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error { + if err != nil { + return err + } + if info.IsDir() { + return nil + } + if strings.HasSuffix(info.Name(), ".yaml") { + out = append(out, path) + } + return nil + }) + if err != nil { + return nil, err + } + sort.Strings(out) + return out, nil +} + +var skipCache = map[string]map[string]bool{} + +// isSkipped consults the corpus YAML for a `skip: …` line under the +// named test case. Mirrors the v1spec `skip:` extension to the shared +// spectest schema (which has no field for it in-struct). 
+func isSkipped(path, name string) bool { + skips, ok := skipCache[path] + if !ok { + skips = loadSkips(path) + skipCache[path] = skips + } + return skips[name] +} + +func loadSkips(path string) map[string]bool { + data, err := os.ReadFile(path) + if err != nil { + return nil + } + out := map[string]bool{} + lines := strings.Split(string(data), "\n") + current := "" + for _, l := range lines { + trim := strings.TrimSpace(l) + if strings.HasPrefix(trim, "- name:") { + current = strings.TrimSpace(strings.TrimPrefix(trim, "- name:")) + current = strings.Trim(current, `"`) + continue + } + if current != "" && strings.HasPrefix(trim, "skip:") { + out[current] = true + current = "" + } + } + return out +} + +// mergeFileMaps merges file-scoped and case-scoped import maps, with +// case-level entries winning on collision. +func mergeFileMaps(fileLevel, caseLevel map[string]string) map[string]string { + if len(fileLevel) == 0 && len(caseLevel) == 0 { + return nil + } + out := make(map[string]string, len(fileLevel)+len(caseLevel)) + for k, v := range fileLevel { + out[k] = v + } + for k, v := range caseLevel { + out[k] = v + } + return out +} diff --git a/internal/bloblang2/migrator/benchmark/coverage_probe_test.go b/internal/bloblang2/migrator/benchmark/coverage_probe_test.go new file mode 100644 index 000000000..cc892da8d --- /dev/null +++ b/internal/bloblang2/migrator/benchmark/coverage_probe_test.go @@ -0,0 +1,35 @@ +package benchmark_test + +import ( + "sort" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/benchmark" +) + +// TestCoverageProbe collects the corpus without running any benchmarks +// and prints a coverage breakdown. It's fast enough to run unconditionally +// so coverage regressions (e.g. a translator change that makes many +// previously-equivalent cases diverge) surface immediately. 
+func TestCoverageProbe(t *testing.T) { + coll, err := benchmark.CollectDefault() + if err != nil { + t.Fatalf("collect: %v", err) + } + total := len(coll.Cases) + len(coll.Skips) + t.Logf("corpus coverage:") + t.Logf(" total cases: %d", total) + t.Logf(" accepted (V1≡V2): %d (%.1f%%)", len(coll.Cases), 100*float64(len(coll.Cases))/float64(max(total, 1))) + t.Logf(" skipped: %d (%.1f%%)", len(coll.Skips), 100*float64(len(coll.Skips))/float64(max(total, 1))) + t.Logf("") + t.Logf("skip breakdown:") + counts := coll.SkipCounts() + reasons := make([]string, 0, len(counts)) + for r := range counts { + reasons = append(reasons, string(r)) + } + sort.Strings(reasons) + for _, r := range reasons { + t.Logf(" %-26s %d", r, counts[benchmark.SkipReason(r)]) + } +} diff --git a/internal/bloblang2/migrator/benchmark/harness.go b/internal/bloblang2/migrator/benchmark/harness.go new file mode 100644 index 000000000..1267aaa47 --- /dev/null +++ b/internal/bloblang2/migrator/benchmark/harness.go @@ -0,0 +1,192 @@ +// Package benchmark runs a comparative performance suite between the V1 +// Bloblang interpreter and the V2 Pratt interpreter over the full V1 +// corpus (../v1spec/tests). For every corpus case it translates the V1 +// mapping to V2, verifies the two runtimes agree on the output, and +// then benchmarks both sides. Only equivalent pairs contribute to the +// summary — divergent cases are counted but not timed. +// +// Run per-case numbers with: +// +// task test:go -- -bench BenchmarkCorpus ./internal/bloblang2/migrator/benchmark +// +// Run the aggregate analysis with: +// +// go test ./internal/bloblang2/migrator/benchmark -run TestCorpusAnalysis -v +// +// The analysis test prints a summary table (ok / skipped / per-case +// V2/V1 ratios) to t.Log; it does not gate the build. 
+package benchmark + +import ( + "errors" + "fmt" + + "github.com/redpanda-data/benthos/v4/internal/bloblang/mapping" + "github.com/redpanda-data/benthos/v4/internal/bloblang/query" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/message" + "github.com/redpanda-data/benthos/v4/internal/value" + "github.com/redpanda-data/benthos/v4/public/bloblang" + + // Side-effect imports: registers the impl/pure stdlib extensions that + // real V1 mappings depend on (ts_*, typed numeric coercers, etc.). + _ "github.com/redpanda-data/benthos/v4/internal/impl/pure" +) + +// v1Runner compiles a V1 mapping once and exposes a fast per-iteration +// Exec closure. It uses the internal *mapping.Executor directly (via +// XUnwrapper) so the AssignmentContext has a real message.Part and +// `meta x = y` writes don't error — matching the production pipeline. +type v1Runner struct { + exec *mapping.Executor +} + +// NewV1Runner is the exported wrapper around newV1Runner so tests in +// other packages can stand up a V1 evaluator with the benchmark +// harness's exact configuration (custom-importer threading included). +func NewV1Runner(src string, files map[string]string) (*V1Runner, error) { + r, err := newV1Runner(src, files) + if err != nil { + return nil, err + } + return &V1Runner{inner: r}, nil +} + +// V1Runner is the public handle returned by NewV1Runner. Exec runs the +// compiled V1 mapping against input + input metadata. +type V1Runner struct{ inner *v1Runner } + +// Exec runs the V1 mapping against input + metadata. +func (r *V1Runner) Exec(input any, meta map[string]any) (any, error) { + return r.inner.Exec(input, meta) +} + +// NewV2Runner is the exported wrapper around newV2Runner. 
+func NewV2Runner(src string) (*V2Runner, error) { + r, err := newV2Runner(src) + if err != nil { + return nil, err + } + return &V2Runner{inner: r}, nil +} + +// V2Runner is the public handle returned by NewV2Runner. +type V2Runner struct{ inner *v2Runner } + +// Exec runs the V2 mapping against input + metadata. +func (r *V2Runner) Exec(input any, meta map[string]any) (any, error) { + return r.inner.Exec(input, meta) +} + +// newV1Runner compiles a V1 mapping. The import map is threaded through +// the default bloblang environment's custom importer so corpus cases +// with `import "foo"` work. +func newV1Runner(src string, files map[string]string) (*v1Runner, error) { + env := bloblang.NewEnvironment() + if len(files) > 0 { + env = env.WithCustomImporter(func(name string) ([]byte, error) { + if content, ok := files[name]; ok { + return []byte(content), nil + } + return nil, fmt.Errorf("import %q not in test files", name) + }) + } + exe, err := env.Parse(src) + if err != nil { + return nil, err + } + uw, ok := exe.XUnwrapper().(interface{ Unwrap() *mapping.Executor }) + if !ok { + return nil, errors.New("v1 executor does not expose Unwrap()") + } + return &v1Runner{exec: uw.Unwrap()}, nil +} + +// Exec runs the V1 mapping against input + input metadata and returns +// the mapped value (never nil on success — V1's Nothing sentinel is +// mapped to the input to match V1's mapping-processor default). 
+func (r *v1Runner) Exec(input any, meta map[string]any) (any, error) { + part := message.NewPart(nil) + if input != nil { + part.SetStructured(input) + } + for k, v := range meta { + part.MetaSetMut(k, v) + } + vars := map[string]any{} + var newValue any = value.Nothing(nil) + ctx := query.FunctionContext{ + Maps: r.exec.Maps(), + Vars: vars, + Index: 0, + MsgBatch: message.Batch{part}, + NewMeta: part, + NewValue: &newValue, + }.WithValue(input) + if err := r.exec.ExecOnto(ctx, mapping.AssignmentContext{ + Vars: vars, + Meta: part, + Value: &newValue, + }); err != nil { + return nil, err + } + switch newValue.(type) { + case value.Delete: + return deletedSentinel{}, nil + case value.Nothing: + return input, nil + } + return newValue, nil +} + +// v2Runner compiles a V2 mapping once and exposes a fast per-iteration +// Run closure. The eval.Interpreter keeps its variable stack allocated +// across iterations so the benchmark exercises the steady-state cost of +// executing a compiled V2 program. +type v2Runner struct { + interp *eval.Interpreter +} + +// newV2Runner parses, optimizes, resolves, and instantiates an +// interpreter for a V2 source produced by the translator. +func newV2Runner(src string) (*v2Runner, error) { + prog, errs := syntax.Parse(src, "", nil) + if len(errs) > 0 { + return nil, fmt.Errorf("v2 parse: %v", errs) + } + syntax.Optimize(prog) + + methods, functions := eval.StdlibNames() + methodOpcodes, functionOpcodes := eval.StdlibOpcodes() + if rerrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methods, + Functions: functions, + MethodOpcodes: methodOpcodes, + FunctionOpcodes: functionOpcodes, + }); len(rerrs) > 0 { + return nil, fmt.Errorf("v2 resolve: %v", rerrs) + } + return &v2Runner{interp: eval.NewWithStdlib(prog)}, nil +} + +// Exec runs the V2 mapping against input + input metadata and returns +// the mapped value. 
The deletedSentinel value is returned when the V2 +// program sets `output = deleted()` so the equivalence check can match +// V1's Delete sentinel symmetrically. +func (r *v2Runner) Exec(input any, meta map[string]any) (any, error) { + out, _, deleted, err := r.interp.Run(input, meta) + if err != nil { + return nil, err + } + if deleted { + return deletedSentinel{}, nil + } + return out, nil +} + +// deletedSentinel stands in for V1's value.Delete and V2's deleted=true +// return path so Case equivalence can compare the two outcomes without +// pulling value.Delete into the equivalence predicate (V2 never produces +// that internal Go type). +type deletedSentinel struct{} diff --git a/internal/bloblang2/migrator/benchmark/migration_smoke_test.go b/internal/bloblang2/migrator/benchmark/migration_smoke_test.go new file mode 100644 index 000000000..d8299e430 --- /dev/null +++ b/internal/bloblang2/migrator/benchmark/migration_smoke_test.go @@ -0,0 +1,156 @@ +package benchmark_test + +import ( + "reflect" + "testing" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/benchmark" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + // Side-effect: register the public-bloblangv2 plugins so V2 sees + // format/with/zip/find_by/etc. through Environment. + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +// TestMigrationSmoke runs a small set of representative V1 mappings end +// to end: V1 source -> migrator -> V2 source, then executes both sides +// against the same input and asserts the outputs match. The cases +// deliberately exercise the recently-added translator rules (variadic +// methods, timestamp idiom shifts, query-form predicates, format +// rewrite) so a regression in any of those would surface here as a +// gating failure rather than as a quietly skipped corpus row. 
+// +// The V1 path runs through the public bloblang environment via the +// benchmark harness's V1Runner. The V2 path runs through +// bloblangv2.GlobalEnvironment so the public plugin registry (compress, +// with, zip, format, find_by, …) is in scope. Cases that depend on a +// bound message (batch_index, content, error, tracing_*) aren't +// covered here — they need a different harness because the V2 path is +// only reachable via Executor.QueryMessage. +func TestMigrationSmoke(t *testing.T) { + type smokeCase struct { + name string + v1 string + // input + meta drive both V1 and V2 evaluation. + input any + meta map[string]any + // wantV1 / wantV2 are checked independently. When wantV2 is + // nil it falls back to wantV1 (the common case where the two + // engines agree byte-for-byte). + wantV1 any + wantV2 any + } + + cases := []smokeCase{ + { + name: "format method (V1 variadic -> V2 array arg)", + v1: `root.s = "%s/%v".format(this.name, this.age)`, + input: map[string]any{"name": "lance", "age": int64(37)}, + wantV1: map[string]any{"s": "lance/37"}, + }, + { + name: "with method (V1 variadic -> V2 array arg)", + v1: `root = this.with("inner.a", "d")`, + input: map[string]any{"inner": map[string]any{"a": "first", "b": "second"}, "d": "fourth", "e": "fifth"}, + wantV1: map[string]any{"d": "fourth", "inner": map[string]any{"a": "first"}}, + }, + { + name: "zip method (V1 variadic -> V2 array arg)", + v1: `root.foo = this.foo.zip(this.bar, this.baz)`, + input: map[string]any{"foo": []any{"a", "b"}, "bar": []any{int64(1), int64(2)}, "baz": []any{int64(4), int64(5)}}, + wantV1: map[string]any{"foo": []any{[]any{"a", int64(1), int64(4)}, []any{"b", int64(2), int64(5)}}}, + }, + { + name: "without method (V1 variadic -> V2 array arg)", + v1: `root = this.without("b")`, + input: map[string]any{"a": int64(1), "b": int64(2), "c": int64(3)}, + wantV1: map[string]any{"a": int64(1), "c": int64(3)}, + }, + { + // V1 find_by returns Go's int; V2 stores it as int64. 
The + // integer value is the same, but reflect.DeepEqual is + // type-strict — assert the typed value per engine. + name: "find_by query-form -> V2 explicit lambda", + v1: `root.idx = this.items.find_by(this.id == 2)`, + input: map[string]any{"items": []any{map[string]any{"id": int64(1)}, map[string]any{"id": int64(2)}, map[string]any{"id": int64(3)}}}, + wantV1: map[string]any{"idx": 1}, + wantV2: map[string]any{"idx": int64(1)}, + }, + { + name: "filter query-form -> V2 explicit lambda", + v1: `root = this.nums.filter(this > 2)`, + input: map[string]any{"nums": []any{int64(1), int64(2), int64(3), int64(4)}}, + wantV1: []any{int64(3), int64(4)}, + }, + { + name: "ts_strftime method renamed to ts_format", + v1: `root.iso = this.t.parse_timestamp_strptime("%Y-%m-%dT%H:%M:%SZ").ts_strftime("%Y-%m-%d")`, + input: map[string]any{"t": "2020-08-14T05:54:23Z"}, + wantV1: map[string]any{"iso": "2020-08-14"}, + }, + { + name: "ts.format_timestamp_unix() -> ts.ts_unix()", + v1: `root.epoch = this.t.parse_timestamp_strptime("%Y-%m-%dT%H:%M:%SZ").format_timestamp_unix()`, + input: map[string]any{"t": "2020-08-14T05:54:23Z"}, + wantV1: map[string]any{"epoch": int64(1597384463)}, + }, + { + name: "metadata(key) -> input@[key]", + v1: `root.region = metadata("region")`, + input: map[string]any{}, + meta: map[string]any{"region": "eu-west"}, + wantV1: map[string]any{"region": "eu-west"}, + }, + { + name: "this -> input plus bare-ident rebinding", + v1: `root.copy = this`, + input: map[string]any{"x": int64(1), "y": int64(2)}, + wantV1: map[string]any{"copy": map[string]any{"x": int64(1), "y": int64(2)}}, + }, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + rep, err := translator.Migrate(tc.v1, translator.Options{ + Verbose: true, + MinCoverage: 0.0001, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if rep.V2Mapping == "" { + t.Fatalf("empty V2 mapping; coverage=%v", rep.Coverage) + } + + v1, err := benchmark.NewV1Runner(tc.v1, nil) + if err 
!= nil { + t.Fatalf("v1 compile: %v", err) + } + v1Out, err := v1.Exec(tc.input, tc.meta) + if err != nil { + t.Fatalf("v1 exec: %v", err) + } + + exec, err := bloblangv2.GlobalEnvironment().Parse(rep.V2Mapping) + if err != nil { + t.Fatalf("v2 compile (translated):\n%s\nerr: %v", rep.V2Mapping, err) + } + v2Out, _, err := exec.QueryMetadata(tc.input, tc.meta) + if err != nil { + t.Fatalf("v2 exec (translated):\n%s\nerr: %v", rep.V2Mapping, err) + } + + wantV2 := tc.wantV2 + if wantV2 == nil { + wantV2 = tc.wantV1 + } + if !reflect.DeepEqual(v1Out, tc.wantV1) { + t.Errorf("V1 output mismatch.\nV1: %#v\nwant: %#v\nv2 mapping:\n%s", v1Out, tc.wantV1, rep.V2Mapping) + } + if !reflect.DeepEqual(v2Out, wantV2) { + t.Errorf("V2 output mismatch.\nV2: %#v\nwant: %#v\nv2 mapping:\n%s", v2Out, wantV2, rep.V2Mapping) + } + }) + } +} diff --git a/internal/bloblang2/migrator/demo/main.go b/internal/bloblang2/migrator/demo/main.go new file mode 100644 index 000000000..a44ec7bae --- /dev/null +++ b/internal/bloblang2/migrator/demo/main.go @@ -0,0 +1,471 @@ +package main + +import ( + "context" + "encoding/json" + "errors" + "flag" + "fmt" + "log" + "net/http" + "os" + "os/exec" + "os/signal" + "path/filepath" + "runtime" + "sort" + "strings" + "syscall" + "time" + + _ "embed" + + "gopkg.in/yaml.v3" + + "github.com/redpanda-data/benthos/v4/internal/bloblang/mapping" + "github.com/redpanda-data/benthos/v4/internal/bloblang/query" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" + "github.com/redpanda-data/benthos/v4/internal/message" + "github.com/redpanda-data/benthos/v4/internal/value" + "github.com/redpanda-data/benthos/v4/public/bloblang" +) + +//go:embed page.html +var pageHTML []byte + +// Shared demo assets live alongside the 
sibling demo; we serve them from +// disk so this demo stays in lockstep with the V2 build pipeline (ts and +// tree-sitter Taskfiles write into that directory). +var sharedDemoDir = func() string { + _, thisFile, _, _ := runtime.Caller(0) + return filepath.Join(filepath.Dir(thisFile), "..", "..", "demo") +}() + +// Cached at startup since they don't change. +var ( + stdlibMethods map[string]syntax.MethodInfo + stdlibFunctions map[string]syntax.FunctionInfo + stdlibMethodOpcodes map[string]uint16 + stdlibFunctionOpcodes map[string]uint16 +) + +func init() { + stdlibMethods, stdlibFunctions = eval.StdlibNames() + stdlibMethodOpcodes, stdlibFunctionOpcodes = eval.StdlibOpcodes() +} + +type executeRequest struct { + V1Mapping string `json:"v1_mapping"` + Input string `json:"input"` + Engine string `json:"engine"` // "v1" or "v2" +} + +type posError struct { + Line int `json:"line"` + Column int `json:"column"` + Message string `json:"message"` +} + +type translateNote struct { + Line int `json:"line"` + Column int `json:"column"` + Severity string `json:"severity"` + RuleID string `json:"rule_id"` + Message string `json:"message"` +} + +type executeResponse struct { + V2Mapping string `json:"v2_mapping"` + TranslateNotes []translateNote `json:"translate_notes,omitempty"` + V1ParseErrors []posError `json:"v1_parse_errors,omitempty"` + V2ParseErrors []posError `json:"v2_parse_errors,omitempty"` + RuntimeError string `json:"runtime_error,omitempty"` + Result string `json:"result,omitempty"` +} + +func handleExecute(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost { + http.Error(w, "method not allowed", http.StatusMethodNotAllowed) + return + } + + var req executeRequest + if err := json.NewDecoder(r.Body).Decode(&req); err != nil { + http.Error(w, err.Error(), http.StatusBadRequest) + return + } + + var resp executeResponse + defer func() { + w.Header().Set("Content-Type", "application/json") + _ = json.NewEncoder(w).Encode(resp) + }() + + // 1. 
Translate V1 → V2 (always, so the UI's V2 pane updates). + rep, cerr := migrateV1(req.V1Mapping) + if rep != nil { + resp.V2Mapping = rep.V2Mapping + resp.TranslateNotes = notesFromReport(rep) + } + if cerr != nil { + // Migration failed (typically a V1 parse error); surface that, since there is no V2 to run. + resp.V1ParseErrors = v1ParseErrorsOf(cerr) + return + } + + // 2. Parse input JSON (shared across engines). + var inputVal any + if err := json.Unmarshal([]byte(req.Input), &inputVal); err != nil { + resp.RuntimeError = fmt.Sprintf("invalid input JSON: %v", err) + return + } + + // 3. Execute via the requested engine. + switch req.Engine { + case "", "v1": + executeV1(req.V1Mapping, inputVal, &resp) + case "v2": + if resp.V2Mapping == "" { + resp.RuntimeError = "no V2 mapping was produced (check translate notes)" + return + } + executeV2(resp.V2Mapping, inputVal, &resp) + default: + resp.RuntimeError = fmt.Sprintf("unknown engine %q (expected v1 or v2)", req.Engine) + } +} + +// migrateV1 runs the translator with the coverage gate disabled (MinCoverage 0) +// so the demo always gets a best-effort V2 mapping back, even for low-coverage +// inputs. A V1 parse failure is returned as the error. +func migrateV1(v1Source string) (*translator.Report, error) { + if strings.TrimSpace(v1Source) == "" { + return &translator.Report{}, nil + } + opts := translator.Options{ + MinCoverage: 0, // never gate; we want to show whatever the translator can emit + Verbose: true, + } + rep, err := translator.Migrate(v1Source, opts) + if err != nil { + // CoverageError can still carry a partial report; other errors can't.
+ var cerr *translator.CoverageError + if errors.As(err, &cerr) && cerr.Report != nil { + return cerr.Report, err + } + return nil, err + } + return rep, nil +} + +func notesFromReport(rep *translator.Report) []translateNote { + if rep == nil || len(rep.Changes) == 0 { + return nil + } + out := make([]translateNote, len(rep.Changes)) + for i, c := range rep.Changes { + out[i] = translateNote{ + Line: c.Line, + Column: c.Column, + Severity: c.Severity.String(), + RuleID: c.RuleID.String(), + Message: c.Explanation, + } + } + return out +} + +func v1ParseErrorsOf(err error) []posError { + var pe *bloblang.ParseError + if errors.As(err, &pe) { + return []posError{{Line: pe.Line, Column: pe.Column, Message: pe.Error()}} + } + var v1pe *v1ast.ParseError + if errors.As(err, &v1pe) { + return []posError{{Line: v1pe.Pos.Line, Column: v1pe.Pos.Column, Message: v1pe.Msg}} + } + return []posError{{Line: 1, Column: 1, Message: err.Error()}} +} + +func executeV1(mappingSrc string, input any, resp *executeResponse) { + if strings.TrimSpace(mappingSrc) == "" { + resp.Result = "null" + return + } + exe, err := bloblang.Parse(mappingSrc) + if err != nil { + resp.V1ParseErrors = v1ParseErrorsOf(err) + return + } + // bloblang.Executor.Query uses an empty batch and passes a nil + // AssignmentContext.Meta — any `meta x = y` in the mapping therefore + // errors with "unable to assign metadata in the current context". + // Reach through XUnwrapper and drive the internal executor against a + // real message.Part so metadata writes succeed. The output metadata is + // discarded because the demo only renders the payload. 
+ uw, ok := exe.XUnwrapper().(interface{ Unwrap() *mapping.Executor }) + if !ok { + resp.RuntimeError = "internal: executor does not expose unwrapper" + return + } + part := message.NewPart(nil) + if input != nil { + part.SetStructured(input) + } + vars := map[string]any{} + var newValue any = value.Nothing(nil) + ctx := query.FunctionContext{ + Maps: uw.Unwrap().Maps(), + Vars: vars, + Index: 0, + MsgBatch: message.Batch{part}, + NewMeta: part, + NewValue: &newValue, + }.WithValue(input) + if err := uw.Unwrap().ExecOnto(ctx, mapping.AssignmentContext{ + Vars: vars, + Meta: part, + Value: &newValue, + }); err != nil { + resp.RuntimeError = err.Error() + return + } + switch newValue.(type) { + case value.Delete: + resp.Result = "< message deleted >" + return + case value.Nothing: + // Mapping made no payload assignment — pass through input unchanged, + // matching V1's `mapping` processor default. + newValue = input + } + resp.Result = jsonIndent(newValue, resp) +} + +func executeV2(v2Source string, input any, resp *executeResponse) { + prog, errs := syntax.Parse(v2Source, "", nil) + if len(errs) > 0 { + resp.V2ParseErrors = posErrorsFromSyntax(errs) + return + } + syntax.Optimize(prog) + if resolveErrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: stdlibMethods, + Functions: stdlibFunctions, + MethodOpcodes: stdlibMethodOpcodes, + FunctionOpcodes: stdlibFunctionOpcodes, + }); len(resolveErrs) > 0 { + resp.V2ParseErrors = posErrorsFromSyntax(resolveErrs) + return + } + interp := eval.New(prog) + interp.RegisterStdlib() + interp.RegisterLambdaMethods() + + out, _, deleted, err := interp.Run(input, map[string]any{}) + if err != nil { + resp.RuntimeError = err.Error() + return + } + if deleted { + resp.Result = "< message deleted >" + return + } + resp.Result = jsonIndent(out, resp) +} + +func jsonIndent(v any, resp *executeResponse) string { + b, err := json.MarshalIndent(v, "", " ") + if err != nil { + resp.RuntimeError = fmt.Sprintf("failed to marshal 
output: %v", err) + return "" + } + return string(b) +} + +type completionItem struct { + Label string `json:"label"` + Kind string `json:"kind"` +} + +var cachedCompletions []byte + +func init() { + var items []completionItem + for name := range stdlibMethods { + items = append(items, completionItem{Label: name, Kind: "method"}) + } + for name := range stdlibFunctions { + items = append(items, completionItem{Label: name, Kind: "function"}) + } + cachedCompletions, _ = json.Marshal(items) +} + +func handleCompletions(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "application/json") + w.Header().Set("Cache-Control", "public, max-age=3600") + _, _ = w.Write(cachedCompletions) +} + +// caseStudiesDir returns the absolute path to the V1 corpus case studies, +// derived from this file's location so `go run` works from any cwd. +func caseStudiesDir() string { + _, thisFile, _, _ := runtime.Caller(0) + return filepath.Join(filepath.Dir(thisFile), "..", "v1spec", "tests", "case_studies") +} + +type caseStudySpec struct { + Description string `yaml:"description"` + Tests []caseStudyTest `yaml:"tests"` +} + +type caseStudyTest struct { + Name string `yaml:"name"` + Mapping string `yaml:"mapping"` + Input any `yaml:"input"` +} + +type caseStudyItem struct { + File string `json:"file"` + Name string `json:"name"` + Description string `json:"description"` + Mapping string `json:"mapping"` + Input string `json:"input"` +} + +func handleCaseStudies(w http.ResponseWriter, r *http.Request) { + dir := caseStudiesDir() + entries, err := os.ReadDir(dir) + if err != nil { + http.Error(w, "case studies not found", http.StatusNotFound) + return + } + + var items []caseStudyItem + for _, entry := range entries { + if entry.IsDir() || !strings.HasSuffix(entry.Name(), ".yaml") { + continue + } + data, err := os.ReadFile(filepath.Join(dir, entry.Name())) + if err != nil { + continue + } + var spec caseStudySpec + if err := yaml.Unmarshal(data, &spec); err != nil 
{ + continue + } + for _, t := range spec.Tests { + if t.Mapping == "" { + continue + } + inputJSON, err := json.MarshalIndent(t.Input, "", " ") + if err != nil { + continue + } + items = append(items, caseStudyItem{ + File: entry.Name(), + Name: t.Name, + Description: strings.TrimSpace(spec.Description), + Mapping: t.Mapping, + Input: string(inputJSON), + }) + } + } + + sort.Slice(items, func(i, j int) bool { return items[i].Name < items[j].Name }) + + w.Header().Set("Content-Type", "application/json") + _ = json.NewEncoder(w).Encode(items) +} + +func posErrorsFromSyntax(errs []syntax.PosError) []posError { + out := make([]posError, len(errs)) + for i, e := range errs { + out[i] = posError{Line: e.Pos.Line, Column: e.Pos.Column, Message: e.Msg} + } + return out +} + +// serveSharedAsset serves a file from the sibling demo directory. The V2 +// build pipelines (ts/bundle.mjs, tree-sitter/Taskfile sync-demo) write +// these assets into that directory; this demo consumes them read-only. +func serveSharedAsset(name, contentType string) http.HandlerFunc { + return func(w http.ResponseWriter, r *http.Request) { + path := filepath.Join(sharedDemoDir, name) + data, err := os.ReadFile(path) + if err != nil { + http.Error(w, fmt.Sprintf("%s not available (run the V2 build first): %v", name, err), http.StatusNotFound) + return + } + w.Header().Set("Content-Type", contentType) + w.Header().Set("Cache-Control", "public, max-age=3600") + _, _ = w.Write(data) + } +} + +func openBrowser(url string) { + var cmd string + switch runtime.GOOS { + case "darwin": + cmd = "open" + case "linux": + cmd = "xdg-open" + case "windows": + cmd = "rundll32" + _ = exec.Command(cmd, "url.dll,FileProtocolHandler", url).Start() + return + default: + return + } + _ = exec.Command(cmd, url).Start() +} + +func main() { + addr := flag.String("addr", ":4196", "listen address") + noOpen := flag.Bool("no-open", false, "don't open browser automatically") + flag.Parse() + + mux := http.NewServeMux() +
mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "text/html; charset=utf-8") + _, _ = w.Write(pageHTML) + }) + mux.HandleFunc("/execute", handleExecute) + mux.HandleFunc("/completions", handleCompletions) + mux.HandleFunc("/case-studies", handleCaseStudies) + mux.HandleFunc("/tree-sitter-bloblang2.wasm", serveSharedAsset("tree-sitter-bloblang2.wasm", "application/wasm")) + mux.HandleFunc("/highlights.scm", serveSharedAsset("highlights.scm", "text/plain; charset=utf-8")) + mux.HandleFunc("/bloblang2.mjs", serveSharedAsset("bloblang2.mjs", "application/javascript; charset=utf-8")) + mux.HandleFunc("/bloblang2.mjs.map", serveSharedAsset("bloblang2.mjs.map", "application/json; charset=utf-8")) + + server := &http.Server{ + Addr: *addr, + Handler: mux, + ReadTimeout: 10 * time.Second, + WriteTimeout: 10 * time.Second, + } + + if !*noOpen { + openBrowser("http://localhost" + *addr) + } + + log.Printf("Bloblang migrator demo server listening on http://localhost%s", *addr) + log.Printf("WARNING: This server is for local demo purposes only. Do not expose to the internet.") + + go func() { + sigChan := make(chan os.Signal, 1) + signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM) + <-sigChan + log.Println("Shutting down...") + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + _ = server.Shutdown(ctx) + }() + + if err := server.ListenAndServe(); err != nil && err != http.ErrServerClosed { + log.Fatalf("Server error: %v", err) + } +} diff --git a/internal/bloblang2/migrator/demo/page.html b/internal/bloblang2/migrator/demo/page.html new file mode 100644 index 000000000..dee4327a0 --- /dev/null +++ b/internal/bloblang2/migrator/demo/page.html @@ -0,0 +1,611 @@ + + + + + Bloblang V1→V2 Migrator Demo + + + +
+ [page.html body: markup lost in extraction. Panes: Input (JSON), Output, V1 Mapping (editable), V2 Mapping (generated, read-only)]
+ + + + + + + + + From b24f7768bda55cd2850a13c95f75186e12db4330 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 28 Apr 2026 10:47:57 +0100 Subject: [PATCH 14/20] bloblang(v2): Add public bloblangv2 plugin API Adds public/bloblangv2/, the exposed Go surface for V2: Environment, Executor, MessageContext, plugin registration for methods and functions with parse-time argument folding, ParseError plumbing, parameter and spec types, and a View layer over the schema for external consumers. The package is the integration point used by public/service (and any downstream Benthos plugin) to register V2 plugins, parse mappings, and execute them against message contexts. --- public/bloblangv2/bloblang_test.go | 401 +++++++++++++ public/bloblangv2/environment.go | 723 +++++++++++++++++++++++ public/bloblangv2/executor.go | 102 ++++ public/bloblangv2/function.go | 14 + public/bloblangv2/lambda_test.go | 163 +++++ public/bloblangv2/messagecontext.go | 53 ++ public/bloblangv2/messagecontext_test.go | 151 +++++ public/bloblangv2/method.go | 141 +++++ public/bloblangv2/package.go | 55 ++ public/bloblangv2/params.go | 254 ++++++++ public/bloblangv2/parse_error.go | 47 ++ public/bloblangv2/spec.go | 174 ++++++ public/bloblangv2/view.go | 243 ++++++++ public/bloblangv2/view_test.go | 221 +++++++ 14 files changed, 2742 insertions(+) create mode 100644 public/bloblangv2/bloblang_test.go create mode 100644 public/bloblangv2/environment.go create mode 100644 public/bloblangv2/executor.go create mode 100644 public/bloblangv2/function.go create mode 100644 public/bloblangv2/lambda_test.go create mode 100644 public/bloblangv2/messagecontext.go create mode 100644 public/bloblangv2/messagecontext_test.go create mode 100644 public/bloblangv2/method.go create mode 100644 public/bloblangv2/package.go create mode 100644 public/bloblangv2/params.go create mode 100644 public/bloblangv2/parse_error.go create mode 100644 public/bloblangv2/spec.go create mode 100644 public/bloblangv2/view.go 
create mode 100644 public/bloblangv2/view_test.go diff --git a/public/bloblangv2/bloblang_test.go b/public/bloblangv2/bloblang_test.go new file mode 100644 index 000000000..2be74cc50 --- /dev/null +++ b/public/bloblangv2/bloblang_test.go @@ -0,0 +1,401 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2_test + +import ( + "errors" + "fmt" + "strings" + "sync" + "sync/atomic" + "testing" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +func TestBasicParseAndQuery(t *testing.T) { + env := bloblangv2.NewEnvironment() + exec, err := env.Parse(`output = input.uppercase()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + out, err := exec.Query("hello") + if err != nil { + t.Fatalf("query: %v", err) + } + if out != "HELLO" { + t.Fatalf("expected HELLO, got %#v", out) + } +} + +func TestParseErrorSurfacesLineAndColumn(t *testing.T) { + _, err := bloblangv2.Parse(`root = nope(`) + if err == nil { + t.Fatal("expected parse error") + } + var pErr *bloblangv2.ParseError + if !errors.As(err, &pErr) { + t.Fatalf("expected *ParseError, got %T", err) + } + if pErr.Line < 1 || pErr.Column < 1 { + t.Fatalf("expected positive line/column, got %d:%d", pErr.Line, pErr.Column) + } +} + +func TestRegisterZeroArgMethod(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + err := env.RegisterMethod("bang", bloblangv2.NewPluginSpec(), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return s + "!", nil + }), nil + }) + if err != nil { + t.Fatalf("register: %v", err) + } + + exec, err := env.Parse(`output = input.bang()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + out, err := exec.Query("hi") + if err != nil { + t.Fatalf("query: %v", err) + } + if out != "hi!" { + t.Fatalf("expected hi!, got %#v", out) + } +} + +func TestRegisterMethodWithParams(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + spec := bloblangv2.NewPluginSpec(). 
+ Description("Append n copies of a string"). + Param(bloblangv2.NewStringParam("suffix").Description("the text to append")). + Param(bloblangv2.NewInt64Param("count").Default(int64(1))) + + err := env.RegisterMethod("append_n", spec, func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + suffix, err := args.GetString("suffix") + if err != nil { + return nil, err + } + count, err := args.GetInt64("count") + if err != nil { + return nil, err + } + return bloblangv2.StringMethod(func(s string) (any, error) { + return s + strings.Repeat(suffix, int(count)), nil + }), nil + }) + if err != nil { + t.Fatalf("register: %v", err) + } + + exec, err := env.Parse(`output = input.append_n("!", 3)`) + if err != nil { + t.Fatalf("parse: %v", err) + } + out, err := exec.Query("hi") + if err != nil { + t.Fatalf("query: %v", err) + } + if out != "hi!!!" { + t.Fatalf("expected hi!!!, got %#v", out) + } +} + +func TestRegisterMethodDefaultApplied(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + spec := bloblangv2.NewPluginSpec(). + Param(bloblangv2.NewInt64Param("times").Default(int64(2))) + + err := env.RegisterMethod("repeat_value", spec, func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + times, err := args.GetInt64("times") + if err != nil { + return nil, err + } + return bloblangv2.StringMethod(func(s string) (any, error) { + return strings.Repeat(s, int(times)), nil + }), nil + }) + if err != nil { + t.Fatalf("register: %v", err) + } + + exec, err := env.Parse(`output = input.repeat_value()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + out, err := exec.Query("ab") + if err != nil { + t.Fatalf("query: %v", err) + } + if out != "abab" { + t.Fatalf("expected abab, got %#v", out) + } +} + +func TestRegisterFunction(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + spec := bloblangv2.NewPluginSpec(). 
+ Param(bloblangv2.NewStringParam("greeting")) + + err := env.RegisterFunction("greet", spec, func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + g, err := args.GetString("greeting") + if err != nil { + return nil, err + } + return func() (any, error) { return g + ", world!", nil }, nil + }) + if err != nil { + t.Fatalf("register: %v", err) + } + + exec, err := env.Parse(`output = greet("hello")`) + if err != nil { + t.Fatalf("parse: %v", err) + } + out, err := exec.Query(nil) + if err != nil { + t.Fatalf("query: %v", err) + } + if out != "hello, world!" { + t.Fatalf("expected 'hello, world!', got %#v", out) + } +} + +func TestRegisterRejectsStdlibShadow(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + err := env.RegisterMethod("uppercase", bloblangv2.NewPluginSpec(), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return func(v any) (any, error) { return v, nil }, nil + }) + if err == nil { + t.Fatal("expected error shadowing stdlib method") + } +} + +func TestRegisterRejectsInvalidName(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + err := env.RegisterMethod("NotSnakeCase", bloblangv2.NewPluginSpec(), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return func(v any) (any, error) { return v, nil }, nil + }) + if err == nil { + t.Fatal("expected error for non-snake-case name") + } +} + +func TestRegisterRejectsNilSpec(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + err := env.RegisterMethod("noop", nil, + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return func(v any) (any, error) { return v, nil }, nil + }) + if err == nil { + t.Fatal("expected error when spec is nil") + } +} + +func TestPluginMethodArityEnforced(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + spec := bloblangv2.NewPluginSpec().Param(bloblangv2.NewStringParam("x")) + if err := env.RegisterMethod("needs_one", spec, + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, 
error) { + return func(v any) (any, error) { return v, nil }, nil + }); err != nil { + t.Fatal(err) + } + + _, err := env.Parse(`output = input.needs_one()`) + if err == nil { + t.Fatal("expected arity error when required arg missing") + } +} + +func TestWithoutMethodsStripsStdlib(t *testing.T) { + // V2's resolver defers method name validation to runtime, so stripping a + // stdlib method via WithoutMethods surfaces as a Query-time "unknown + // method" error rather than a parse error. + env := bloblangv2.NewEnvironment().WithoutMethods("uppercase") + exec, err := env.Parse(`output = input.uppercase()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + _, err = exec.Query("hi") + if err == nil || !strings.Contains(err.Error(), "uppercase") { + t.Fatalf("expected unknown method error, got %v", err) + } +} + +func TestWithoutFunctionsStripsStdlib(t *testing.T) { + env := bloblangv2.NewEnvironment().WithoutFunctions("now") + if _, err := env.Parse(`output = now()`); err == nil { + t.Fatal("expected parse failure after WithoutFunctions") + } +} + +func TestExecutorConcurrentUse(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + if err := env.RegisterMethod("bang", bloblangv2.NewPluginSpec(), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return s + "!", nil + }), nil + }); err != nil { + t.Fatal(err) + } + exec, err := env.Parse(`output = input.bang()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + + var wg sync.WaitGroup + errCh := make(chan error, 32) + for i := 0; i < 32; i++ { + wg.Add(1) + go func(idx int) { + defer wg.Done() + in := fmt.Sprintf("val%d", idx) + out, err := exec.Query(in) + if err != nil { + errCh <- err + return + } + if out != in+"!" 
{ + errCh <- fmt.Errorf("expected %s!, got %#v", in, out) + } + }(i) + } + wg.Wait() + close(errCh) + for e := range errCh { + t.Error(e) + } +} + +// TestCtorCachedForStaticArgs verifies the static-args optimisation: when all +// arguments at a call site are literals, the plugin constructor should be +// invoked only once regardless of how many times Query runs. +func TestCtorCachedForStaticArgs(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + var ctorCount atomic.Int64 + + spec := bloblangv2.NewPluginSpec().Param(bloblangv2.NewStringParam("suffix")) + if err := env.RegisterMethod("append_suffix", spec, + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + ctorCount.Add(1) + suf, err := args.GetString("suffix") + if err != nil { + return nil, err + } + return bloblangv2.StringMethod(func(s string) (any, error) { + return s + suf, nil + }), nil + }); err != nil { + t.Fatal(err) + } + + exec, err := env.Parse(`output = input.append_suffix("!")`) + if err != nil { + t.Fatalf("parse: %v", err) + } + for i := 0; i < 5; i++ { + if _, err := exec.Query("x"); err != nil { + t.Fatalf("query %d: %v", i, err) + } + } + if got := ctorCount.Load(); got != 1 { + t.Fatalf("expected constructor to run once; ran %d times", got) + } +} + +func TestCtorCachedForZeroArgs(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + var ctorCount atomic.Int64 + + if err := env.RegisterMethod("stamp", bloblangv2.NewPluginSpec(), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + ctorCount.Add(1) + return bloblangv2.StringMethod(func(s string) (any, error) { + return s + "[stamp]", nil + }), nil + }); err != nil { + t.Fatal(err) + } + + exec, err := env.Parse(`output = input.stamp()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + for i := 0; i < 5; i++ { + if _, err := exec.Query("x"); err != nil { + t.Fatalf("query %d: %v", i, err) + } + } + if got := ctorCount.Load(); got != 1 { + t.Fatalf("expected constructor to run once; ran %d 
times", got) + } +} + +func TestCtorRunsPerCallForDynamicArgs(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + var ctorCount atomic.Int64 + + spec := bloblangv2.NewPluginSpec().Param(bloblangv2.NewStringParam("s")) + if err := env.RegisterMethod("echo_s", spec, + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + ctorCount.Add(1) + s, err := args.GetString("s") + if err != nil { + return nil, err + } + return func(_ any) (any, error) { return s, nil }, nil + }); err != nil { + t.Fatal(err) + } + + exec, err := env.Parse(`output = input.echo_s(input)`) + if err != nil { + t.Fatalf("parse: %v", err) + } + for i := 0; i < 5; i++ { + if _, err := exec.Query(fmt.Sprintf("v%d", i)); err != nil { + t.Fatalf("query %d: %v", i, err) + } + } + if got := ctorCount.Load(); got != 5 { + t.Fatalf("expected constructor to run per call (5); ran %d times", got) + } +} + +func TestCtorCachedForStaticFunctionArgs(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + var ctorCount atomic.Int64 + + spec := bloblangv2.NewPluginSpec().Param(bloblangv2.NewStringParam("greeting")) + if err := env.RegisterFunction("greet", spec, + func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + ctorCount.Add(1) + g, err := args.GetString("greeting") + if err != nil { + return nil, err + } + return func() (any, error) { return g + ", world!", nil }, nil + }); err != nil { + t.Fatal(err) + } + + exec, err := env.Parse(`output = greet("hello")`) + if err != nil { + t.Fatalf("parse: %v", err) + } + for i := 0; i < 5; i++ { + if _, err := exec.Query(nil); err != nil { + t.Fatalf("query %d: %v", i, err) + } + } + if got := ctorCount.Load(); got != 1 { + t.Fatalf("expected constructor to run once; ran %d times", got) + } +} diff --git a/public/bloblangv2/environment.go b/public/bloblangv2/environment.go new file mode 100644 index 000000000..bb6e28ab2 --- /dev/null +++ b/public/bloblangv2/environment.go @@ -0,0 +1,723 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package bloblangv2 + +import ( + "errors" + "fmt" + "regexp" + "strconv" + "sync" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// pluginNameRegex matches snake-case identifiers, mirroring the V1 constraint. +var pluginNameRegex = regexp.MustCompile(`^[a-z0-9]+(_[a-z0-9]+)*$`) + +// Environment is a self-contained registry of methods and functions available +// to the mappings it parses. It always inherits the full Bloblang V2 standard +// library; registered plugins extend that baseline, and WithoutMethods / +// WithoutFunctions hide parts of it. +// +// Environments are safe for concurrent Parse; concurrent RegisterMethod / +// RegisterFunction calls are serialised internally but callers are expected +// to register all plugins before publishing the Environment for parsing. +type Environment struct { + mu sync.RWMutex + + pluginMethods map[string]pluginMethodReg + pluginFunctions map[string]pluginFunctionReg + + // removedMethods / removedFunctions hide stdlib names. A removed function + // fails at parse time ("unknown function"); a removed method fails at + // query time ("unknown method"), since method names resolve at runtime. + removedMethods map[string]struct{} + removedFunctions map[string]struct{} + + onlyPure bool +} + +type pluginMethodReg struct { + spec eval.MethodSpec + info PluginInfo + impure bool +} + +type pluginFunctionReg struct { + spec eval.FunctionSpec + info PluginInfo + impure bool +} + +// globalEnv is the default environment. Plugins registered via the +// package-level RegisterMethod / RegisterFunction functions land here. +var globalEnv = &Environment{ + pluginMethods: map[string]pluginMethodReg{}, + pluginFunctions: map[string]pluginFunctionReg{}, +} + +// GlobalEnvironment returns the shared process-wide environment.
Registering +// a plugin against this environment makes it available to any caller that +// uses the package-level Parse or omits an explicit environment. +func GlobalEnvironment() *Environment { + return globalEnv +} + +// NewEnvironment returns a fresh environment seeded with the plugins +// currently registered on the global environment. Further registrations on +// the returned environment do not affect the global one. +func NewEnvironment() *Environment { + return globalEnv.Clone() +} + +// NewEmptyEnvironment returns an environment with no plugins registered. It +// still inherits the standard library; only user-defined plugins are absent. +func NewEmptyEnvironment() *Environment { + return &Environment{ + pluginMethods: map[string]pluginMethodReg{}, + pluginFunctions: map[string]pluginFunctionReg{}, + } +} + +// Clone returns a deep copy of the environment's plugin registry. +func (e *Environment) Clone() *Environment { + e.mu.RLock() + defer e.mu.RUnlock() + + clone := &Environment{ + pluginMethods: make(map[string]pluginMethodReg, len(e.pluginMethods)), + pluginFunctions: make(map[string]pluginFunctionReg, len(e.pluginFunctions)), + onlyPure: e.onlyPure, + } + for k, v := range e.pluginMethods { + clone.pluginMethods[k] = v + } + for k, v := range e.pluginFunctions { + clone.pluginFunctions[k] = v + } + if len(e.removedMethods) > 0 { + clone.removedMethods = make(map[string]struct{}, len(e.removedMethods)) + for k := range e.removedMethods { + clone.removedMethods[k] = struct{}{} + } + } + if len(e.removedFunctions) > 0 { + clone.removedFunctions = make(map[string]struct{}, len(e.removedFunctions)) + for k := range e.removedFunctions { + clone.removedFunctions[k] = struct{}{} + } + } + return clone +} + +// WithoutMethods returns a clone with the named methods removed. Removing a +// method that does not exist is a no-op. 
+func (e *Environment) WithoutMethods(names ...string) *Environment { + clone := e.Clone() + if clone.removedMethods == nil { + clone.removedMethods = make(map[string]struct{}, len(names)) + } + for _, n := range names { + clone.removedMethods[n] = struct{}{} + } + return clone +} + +// WithoutFunctions returns a clone with the named functions removed. +func (e *Environment) WithoutFunctions(names ...string) *Environment { + clone := e.Clone() + if clone.removedFunctions == nil { + clone.removedFunctions = make(map[string]struct{}, len(names)) + } + for _, n := range names { + clone.removedFunctions[n] = struct{}{} + } + return clone +} + +// OnlyPure returns a clone with impure plugins stripped out. +func (e *Environment) OnlyPure() *Environment { + clone := e.Clone() + clone.onlyPure = true + for name, reg := range clone.pluginMethods { + if reg.impure { + delete(clone.pluginMethods, name) + } + } + for name, reg := range clone.pluginFunctions { + if reg.impure { + delete(clone.pluginFunctions, name) + } + } + return clone +} + +// RegisterMethod registers a method plugin against the environment. The +// PluginSpec declares the method's parameters and documentation; the +// constructor builds the runtime Method closure from those arguments. +// +// Plugin names must match the regular expression /^[a-z0-9]+(_[a-z0-9]+)*$/ +// (snake case). The spec must be non-nil — pass NewPluginSpec() for a +// method that takes no arguments. 
+func (e *Environment) RegisterMethod(name string, spec *PluginSpec, ctor MethodConstructor) error { + if spec == nil { + return fmt.Errorf("method %q: spec must be non-nil (use NewPluginSpec() for no parameters)", name) + } + if err := validatePluginName(name); err != nil { + return err + } + if spec.impure && e.onlyPure { + return fmt.Errorf("cannot register impure method %q in a pure-only environment", name) + } + + methodSpec := buildMethodSpec(name, spec, ctor) + + e.mu.Lock() + defer e.mu.Unlock() + + if _, exists := e.pluginMethods[name]; exists { + return fmt.Errorf("method %q is already registered", name) + } + if isStdlibMethod(name) { + return fmt.Errorf("method %q shadows a standard library method", name) + } + if e.pluginMethods == nil { + e.pluginMethods = make(map[string]pluginMethodReg) + } + e.pluginMethods[name] = pluginMethodReg{ + spec: methodSpec, + info: pluginInfoFromSpec(name, spec), + impure: spec.impure, + } + return nil +} + +// RegisterFunction registers a function plugin against the environment. See +// RegisterMethod for the parameter-declaration rules; those apply identically +// to functions. 
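+//
+// A minimal usage sketch; the function name "answer" is hypothetical and
+// chosen only for illustration:
+//
+//     err := env.RegisterFunction("answer", NewPluginSpec(),
+//         func(*ParsedParams) (Function, error) {
+//             return func() (any, error) { return int64(42), nil }, nil
+//         })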
+func (e *Environment) RegisterFunction(name string, spec *PluginSpec, ctor FunctionConstructor) error { + if spec == nil { + return fmt.Errorf("function %q: spec must be non-nil (use NewPluginSpec() for no parameters)", name) + } + if err := validatePluginName(name); err != nil { + return err + } + if spec.impure && e.onlyPure { + return fmt.Errorf("cannot register impure function %q in a pure-only environment", name) + } + + funcSpec := buildFunctionSpec(name, spec, ctor) + + e.mu.Lock() + defer e.mu.Unlock() + + if _, exists := e.pluginFunctions[name]; exists { + return fmt.Errorf("function %q is already registered", name) + } + if isStdlibFunction(name) { + return fmt.Errorf("function %q shadows a standard library function", name) + } + if e.pluginFunctions == nil { + e.pluginFunctions = make(map[string]pluginFunctionReg) + } + e.pluginFunctions[name] = pluginFunctionReg{ + spec: funcSpec, + info: pluginInfoFromSpec(name, spec), + impure: spec.impure, + } + return nil +} + +// WalkFunctions invokes fn for every user-registered function plugin on the +// environment, in unspecified order. Standard library functions are not +// included — V2 stdlib metadata lives in the language specification at +// internal/bloblang2/spec rather than in runtime introspection. +func (e *Environment) WalkFunctions(fn func(name string, view *FunctionView)) { + e.mu.RLock() + names := make([]string, 0, len(e.pluginFunctions)) + infos := make(map[string]PluginInfo, len(e.pluginFunctions)) + for name, reg := range e.pluginFunctions { + names = append(names, name) + infos[name] = reg.info + } + e.mu.RUnlock() + for _, name := range names { + view := &FunctionView{info: infos[name]} + fn(name, view) + } +} + +// WalkMethods invokes fn for every user-registered method plugin on the +// environment, in unspecified order. Standard library methods are not +// included — see WalkFunctions for the reasoning. 
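+//
+// A minimal usage sketch that collects the names of the registered plugin
+// methods:
+//
+//     var names []string
+//     env.WalkMethods(func(name string, _ *MethodView) {
+//         names = append(names, name)
+//     })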
+func (e *Environment) WalkMethods(fn func(name string, view *MethodView)) { + e.mu.RLock() + names := make([]string, 0, len(e.pluginMethods)) + infos := make(map[string]PluginInfo, len(e.pluginMethods)) + for name, reg := range e.pluginMethods { + names = append(names, name) + infos[name] = reg.info + } + e.mu.RUnlock() + for _, name := range names { + view := &MethodView{info: infos[name]} + fn(name, view) + } +} + +func validatePluginName(name string) error { + if !pluginNameRegex.MatchString(name) { + return fmt.Errorf("plugin name %q must be snake-case (matching %s)", name, pluginNameRegex.String()) + } + return nil +} + +// isStdlibMethod reports whether a name collides with a registered stdlib +// method opcode. +func isStdlibMethod(name string) bool { + methods, _ := eval.StdlibOpcodes() + _, ok := methods[name] + return ok +} + +// isStdlibFunction is the function analogue of isStdlibMethod. +func isStdlibFunction(name string) bool { + _, functions := eval.StdlibOpcodes() + _, ok := functions[name] + return ok +} + +// buildMethodSpec converts a PluginSpec + constructor pair into the internal +// eval.MethodSpec used by the interpreter. The returned spec carries both a +// per-call Fn (for dynamic arguments) and a CallFolder that pre-binds the +// constructor at parse time when every argument is a literal. +// +// When the spec declares any lambda parameter the static path is replaced +// with a PluginFn that has access to the interpreter — lambdas have no +// value form, so non-lambda arguments are evaluated eagerly while lambda +// positions are wrapped into a Lambda closure for the plugin to invoke. 
+func buildMethodSpec(name string, spec *PluginSpec, ctor MethodConstructor) eval.MethodSpec { + params := pluginParamsToMethodParams(spec) + runMethod := func(receiver any, fn Method) any { + out, err := fn(receiver) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + return out + } + if specHasLambdaParam(spec) { + return eval.MethodSpec{ + PluginFn: func(interp *eval.Interpreter, receiver any, callArgs []syntax.CallArg) any { + rawArgs, errVal := pluginResolveArgs(interp, name, spec, callArgs) + if errVal != nil { + return errVal + } + parsed, err := newParsedParams(spec, rawArgs) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + fn, err := ctor(parsed) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + return runMethod(receiver, fn) + }, + Params: params, + AcceptsLambda: true, + } + } + return eval.MethodSpec{ + Fn: func(receiver any, args []any) any { + parsed, err := newParsedParams(spec, args) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + fn, err := ctor(parsed) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + return runMethod(receiver, fn) + }, + Params: params, + CallFolder: func(callArgs []syntax.CallArg) (any, error) { + parsed, ok, err := foldLiteralArgs(spec, callArgs) + if err != nil { + return nil, err + } + if !ok { + return nil, nil + } + fn, err := ctor(parsed) + if err != nil { + return nil, err + } + return eval.PreboundMethod(func(receiver any) any { + return runMethod(receiver, fn) + }), nil + }, + } +} + +// buildFunctionSpec mirrors buildMethodSpec for stdlib-style functions. 
+func buildFunctionSpec(name string, spec *PluginSpec, ctor FunctionConstructor) eval.FunctionSpec { + params := pluginParamsToFunctionParams(spec) + runFunction := func(fn Function) any { + out, err := fn() + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + return out + } + if specHasLambdaParam(spec) { + return eval.FunctionSpec{ + PluginFn: func(interp *eval.Interpreter, callArgs []syntax.CallArg) any { + rawArgs, errVal := pluginResolveArgs(interp, name, spec, callArgs) + if errVal != nil { + return errVal + } + parsed, err := newParsedParams(spec, rawArgs) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + fn, err := ctor(parsed) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + return runFunction(fn) + }, + Params: params, + } + } + return eval.FunctionSpec{ + Fn: func(args []any) any { + parsed, err := newParsedParams(spec, args) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + fn, err := ctor(parsed) + if err != nil { + return eval.NewError(fmt.Sprintf("%s(): %s", name, err.Error())) + } + return runFunction(fn) + }, + Params: params, + CallFolder: func(callArgs []syntax.CallArg) (any, error) { + parsed, ok, err := foldLiteralArgs(spec, callArgs) + if err != nil { + return nil, err + } + if !ok { + return nil, nil + } + fn, err := ctor(parsed) + if err != nil { + return nil, err + } + return eval.PreboundFunction(func() any { + return runFunction(fn) + }), nil + }, + } +} + +// specHasLambdaParam reports whether spec declares any lambda-typed +// parameter. Plugins with lambda params bypass the static-arg fold path — +// lambdas are not values, so per-call dispatch is required. 
+func specHasLambdaParam(spec *PluginSpec) bool { + for _, p := range spec.params { + if p.kind == paramKindLambda { + return true + } + } + return false +} + +// pluginResolveArgs resolves the AST argument list for a plugin invocation: +// non-lambda arguments are evaluated through the interpreter, lambda +// positions are wrapped into a public Lambda closure that calls back into +// the same interpreter when invoked. The returned []any matches the order +// of spec.params and is suitable for newParsedParams. +// +// Returns a non-nil errorVal (the eval-package error sentinel) when an +// argument fails to evaluate or a required lambda is missing. +func pluginResolveArgs(interp *eval.Interpreter, name string, spec *PluginSpec, callArgs []syntax.CallArg) ([]any, any) { + out := make([]any, len(spec.params)) + for i, p := range spec.params { + if i >= len(callArgs) { + if p.hasDefault { + out[i] = p.defaultVal + continue + } + if p.optional { + continue + } + return nil, eval.NewError(fmt.Sprintf("%s(): missing required argument %q", name, p.name)) + } + argVal := callArgs[i].Value + if p.kind == paramKindLambda { + lambda := interp.ExtractLambdaOrMapRef([]syntax.CallArg{{Value: argVal}}) + if lambda == nil { + return nil, eval.NewError(fmt.Sprintf("%s(): argument %q must be a lambda", name, p.name)) + } + out[i] = wrapLambda(interp, lambda) + continue + } + v := interp.EvalExpr(argVal) + if eval.IsError(v) { + return nil, v + } + out[i] = v + } + return out, nil +} + +// wrapLambda turns an internal LambdaExpr into the public Lambda type the +// plugin author invokes. Errors emitted from the lambda body are surfaced +// as Go errors so plugin code can react to them. 
+func wrapLambda(interp *eval.Interpreter, lambda *syntax.LambdaExpr) Lambda { + return func(args ...any) (any, error) { + result := interp.CallLambda(lambda, args) + if eval.IsError(result) { + return nil, fmt.Errorf("%v", result) + } + if eval.IsVoid(result) { + return nil, errors.New("lambda returned void") + } + return result, nil + } +} + +// foldLiteralArgs walks a call's AST arguments and, if every argument is a +// literal, returns a ParsedParams suitable for invoking the plugin +// constructor at parse time. Returns (nil, false, nil) when any argument is +// dynamic or when the shape isn't eligible for folding (e.g. named args). +// Returns (nil, false, err) when the literal values don't satisfy the spec +// (missing required, wrong type) so the resolver can surface the problem. +// The returned error is raw (no callee prefix) — the resolver prepends the +// call-site label when attaching it as a diagnostic. +func foldLiteralArgs(spec *PluginSpec, args []syntax.CallArg) (*ParsedParams, bool, error) { + for _, a := range args { + if a.Name != "" { + // Named arguments aren't folded in Phase 1. + return nil, false, nil + } + if _, lit := a.Value.(*syntax.LiteralExpr); !lit { + return nil, false, nil + } + } + raw := make([]any, len(args)) + for i, a := range args { + v, err := literalValue(a.Value.(*syntax.LiteralExpr)) + if err != nil { + return nil, false, nil + } + raw[i] = v + } + parsed, err := newParsedParams(spec, raw) + if err != nil { + return nil, false, err + } + return parsed, true, nil +} + +// literalValue converts a parsed LiteralExpr into the Go value the runtime +// would produce for it. Mirrors eval.evalLiteral; kept local to avoid a +// cross-package dependency. 
+func literalValue(lit *syntax.LiteralExpr) (any, error) { + switch lit.TokenType { + case syntax.STRING, syntax.RAW_STRING: + return lit.Value, nil + case syntax.INT: + n, err := strconv.ParseInt(lit.Value, 10, 64) + if err != nil { + return nil, err + } + return n, nil + case syntax.FLOAT: + f, err := strconv.ParseFloat(lit.Value, 64) + if err != nil { + return nil, err + } + return f, nil + case syntax.TRUE: + return true, nil + case syntax.FALSE: + return false, nil + case syntax.NULL: + return nil, nil + default: + return nil, fmt.Errorf("unsupported literal token %v", lit.TokenType) + } +} + +// pluginParamsToMethodParams maps the public ParamDefinition slice onto the +// internal MethodParam slice used by the resolver for arity checking. +func pluginParamsToMethodParams(spec *PluginSpec) []eval.MethodParam { + if len(spec.params) == 0 { + return nil + } + out := make([]eval.MethodParam, len(spec.params)) + for i, p := range spec.params { + out[i] = eval.MethodParam{ + Name: p.name, + Default: p.defaultVal, + HasDefault: p.hasDefault || p.optional, + AcceptsLambda: p.kind == paramKindLambda, + } + } + return out +} + +// pluginParamsToFunctionParams is the function analogue of +// pluginParamsToMethodParams. +func pluginParamsToFunctionParams(spec *PluginSpec) []eval.FunctionParam { + if len(spec.params) == 0 { + return nil + } + out := make([]eval.FunctionParam, len(spec.params)) + for i, p := range spec.params { + out[i] = eval.FunctionParam{ + Name: p.name, + Default: p.defaultVal, + HasDefault: p.hasDefault || p.optional, + AcceptsLambda: p.kind == paramKindLambda, + } + } + return out +} + +// Parse compiles a Bloblang V2 mapping against this environment. The +// returned Executor is safe for concurrent use. 
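+//
+// A minimal usage sketch, assuming the input is an object with a field x:
+//
+//     exec, err := env.Parse(`output.value = input.x`)
+//     if err != nil {
+//         return err
+//     }
+//     out, err := exec.Query(map[string]any{"x": int64(7)})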
+func (e *Environment) Parse(src string) (*Executor, error) { + prog, perrs := syntax.Parse(src, "", nil) + if len(perrs) > 0 { + return nil, parseErrorFromPosErrors(perrs) + } + syntax.Optimize(prog) + + methodInfos, functionInfos, methodOpcodes, functionOpcodes := e.resolverInputs() + + rerrs := syntax.Resolve(prog, syntax.ResolveOptions{ + Methods: methodInfos, + Functions: functionInfos, + MethodOpcodes: methodOpcodes, + FunctionOpcodes: functionOpcodes, + }) + if len(rerrs) > 0 { + return nil, parseErrorFromPosErrors(rerrs) + } + + pluginMethods, pluginFunctions := e.snapshotPlugins() + + return newExecutor(prog, pluginMethods, pluginFunctions), nil +} + +// resolverInputs builds the four maps the resolver consumes, merging the +// standard library with the environment's plugin registry and stripping any +// names removed via WithoutMethods / WithoutFunctions. +func (e *Environment) resolverInputs() ( + methods map[string]syntax.MethodInfo, + functions map[string]syntax.FunctionInfo, + methodOpcodes map[string]uint16, + functionOpcodes map[string]uint16, +) { + stdlibMethods, stdlibFunctions := eval.StdlibNames() + stdlibMethodOpc, stdlibFunctionOpc := eval.StdlibOpcodes() + + methods = make(map[string]syntax.MethodInfo, len(stdlibMethods)) + for k, v := range stdlibMethods { + methods[k] = v + } + functions = make(map[string]syntax.FunctionInfo, len(stdlibFunctions)) + for k, v := range stdlibFunctions { + functions[k] = v + } + methodOpcodes = make(map[string]uint16, len(stdlibMethodOpc)) + for k, v := range stdlibMethodOpc { + methodOpcodes[k] = v + } + functionOpcodes = make(map[string]uint16, len(stdlibFunctionOpc)) + for k, v := range stdlibFunctionOpc { + functionOpcodes[k] = v + } + + e.mu.RLock() + defer e.mu.RUnlock() + + for name := range e.removedMethods { + delete(methods, name) + delete(methodOpcodes, name) + } + for name := range e.removedFunctions { + delete(functions, name) + delete(functionOpcodes, name) + } + + // Plugin names override any 
same-named stdlib entries. They have no + // opcode — dispatch falls back to interp.lookupMethod by name. + for name, reg := range e.pluginMethods { + methods[name] = eval.MethodSpecToInfo(reg.spec) + delete(methodOpcodes, name) + } + for name, reg := range e.pluginFunctions { + functions[name] = eval.FunctionSpecToInfo(reg.spec) + delete(functionOpcodes, name) + } + + return +} + +// snapshotPlugins copies the current plugin registry so Executors remain +// independent of later registrations. +func (e *Environment) snapshotPlugins() ( + methods map[string]eval.MethodSpec, + functions map[string]eval.FunctionSpec, +) { + e.mu.RLock() + defer e.mu.RUnlock() + + methods = make(map[string]eval.MethodSpec, len(e.pluginMethods)) + for name, reg := range e.pluginMethods { + methods[name] = reg.spec + } + functions = make(map[string]eval.FunctionSpec, len(e.pluginFunctions)) + for name, reg := range e.pluginFunctions { + functions[name] = reg.spec + } + return +} + +// ---------------------------------------------------------------------------- +// Package-level conveniences that operate on the global environment. +// ---------------------------------------------------------------------------- + +// Parse compiles src against the global environment. +func Parse(src string) (*Executor, error) { + return globalEnv.Parse(src) +} + +// RegisterMethod registers a method plugin on the global environment. +func RegisterMethod(name string, spec *PluginSpec, ctor MethodConstructor) error { + return globalEnv.RegisterMethod(name, spec, ctor) +} + +// MustRegisterMethod is RegisterMethod but panics on failure. +func MustRegisterMethod(name string, spec *PluginSpec, ctor MethodConstructor) { + if err := RegisterMethod(name, spec, ctor); err != nil { + panic(err) + } +} + +// RegisterFunction registers a function plugin on the global environment. 
+func RegisterFunction(name string, spec *PluginSpec, ctor FunctionConstructor) error { + return globalEnv.RegisterFunction(name, spec, ctor) +} + +// MustRegisterFunction is RegisterFunction but panics on failure. +func MustRegisterFunction(name string, spec *PluginSpec, ctor FunctionConstructor) { + if err := RegisterFunction(name, spec, ctor); err != nil { + panic(err) + } +} diff --git a/public/bloblangv2/executor.go b/public/bloblangv2/executor.go new file mode 100644 index 000000000..75e33d06f --- /dev/null +++ b/public/bloblangv2/executor.go @@ -0,0 +1,102 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2 + +import ( + "errors" + "sync" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/eval" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// ErrRootDeleted is returned from Executor.Query when a mapping deletes the +// root of the output document. +var ErrRootDeleted = errors.New("root was deleted") + +// Executor is the compiled form of a Bloblang V2 mapping. It is safe for +// concurrent use: Query and QueryMetadata allocate independent interpreter +// state per call (pooled via sync.Pool). 
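+//
+// A minimal sketch of sharing one Executor across goroutines; inputs is a
+// hypothetical slice of values supplied by the caller:
+//
+//     var wg sync.WaitGroup
+//     for _, in := range inputs {
+//         wg.Add(1)
+//         go func(in any) {
+//             defer wg.Done()
+//             _, _ = exec.Query(in) // each call draws its own interpreter from the pool
+//         }(in)
+//     }
+//     wg.Wait()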
+type Executor struct { + program *syntax.Program + + pluginMethods map[string]eval.MethodSpec + pluginFunctions map[string]eval.FunctionSpec + + pool sync.Pool +} + +func newExecutor( + program *syntax.Program, + pluginMethods map[string]eval.MethodSpec, + pluginFunctions map[string]eval.FunctionSpec, +) *Executor { + e := &Executor{ + program: program, + pluginMethods: pluginMethods, + pluginFunctions: pluginFunctions, + } + e.pool.New = func() any { return e.newInterp() } + return e +} + +func (e *Executor) newInterp() *eval.Interpreter { + interp := eval.NewWithStdlib(e.program) + for name, spec := range e.pluginMethods { + interp.RegisterMethod(name, spec) + } + for name, spec := range e.pluginFunctions { + interp.RegisterFunction(name, spec) + } + return interp +} + +// Query executes the mapping against an input value and returns the output. +// If the mapping deletes the root, ErrRootDeleted is returned. +func (e *Executor) Query(input any) (any, error) { + output, _, err := e.query(input, nil) + return output, err +} + +// QueryMetadata is Query plus access to the resulting metadata map. The +// returned metadata may be nil if the mapping didn't assign any. +func (e *Executor) QueryMetadata(input any, inputMeta map[string]any) (any, map[string]any, error) { + return e.query(input, inputMeta) +} + +// QueryMessage executes the mapping with a bound MessageContext, making +// the message-coupled stdlib (batch_index, content, error, ...) +// available to the mapping. Input value and input metadata are taken +// from the context. +// +// Mappings that reference message-coupled functions but are run via +// Query / QueryMetadata (no bound context) produce a runtime error; +// QueryMessage is the path callers wire when they have a pipeline +// message in hand. 
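+//
+// A minimal usage sketch; msg is a hypothetical value implementing
+// MessageContext:
+//
+//     out, meta, err := exec.QueryMessage(msg)
+//     if errors.Is(err, ErrRootDeleted) {
+//         // the mapping deleted the root; the caller decides how to handle it
+//     }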
+func (e *Executor) QueryMessage(ctx MessageContext) (any, map[string]any, error) { + interp := e.pool.Get().(*eval.Interpreter) + defer e.pool.Put(interp) + + output, outputMeta, deleted, err := interp.RunWithMessage(ctx) + if err != nil { + return nil, nil, err + } + if deleted { + return nil, nil, ErrRootDeleted + } + return output, outputMeta, nil +} + +func (e *Executor) query(input any, inputMeta map[string]any) (any, map[string]any, error) { + interp := e.pool.Get().(*eval.Interpreter) + defer e.pool.Put(interp) + + output, outputMeta, deleted, err := interp.Run(input, inputMeta) + if err != nil { + return nil, nil, err + } + if deleted { + return nil, nil, ErrRootDeleted + } + return output, outputMeta, nil +} diff --git a/public/bloblangv2/function.go b/public/bloblangv2/function.go new file mode 100644 index 000000000..5ebe1910f --- /dev/null +++ b/public/bloblangv2/function.go @@ -0,0 +1,14 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2 + +// Function is the runtime closure implementing a plugin function. It takes +// no receiver — plugins that operate on a value should be registered as a +// Method instead. +type Function func() (any, error) + +// FunctionConstructor constructs a Function from arguments resolved against a +// PluginSpec. When all arguments at a call site are literal, the constructor +// is invoked once at parse time and its Function is reused across every +// Query. When any argument is dynamic, the constructor is invoked per call. +type FunctionConstructor func(args *ParsedParams) (Function, error) diff --git a/public/bloblangv2/lambda_test.go b/public/bloblangv2/lambda_test.go new file mode 100644 index 000000000..5b3ef108a --- /dev/null +++ b/public/bloblangv2/lambda_test.go @@ -0,0 +1,163 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package bloblangv2_test + +import ( + "fmt" + "testing" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// TestPluginFunctionLambdaParam exercises the function-side of the plugin +// lambda support, which has no V1 stdlib equivalent. The function takes +// a count and a producer lambda; the lambda is invoked count times with +// its index and the results are collected. +func TestPluginFunctionLambdaParam(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + if err := env.RegisterFunction("repeat_call", + bloblangv2.NewPluginSpec(). + Description("Calls fn(i) for i in 0..count-1 and returns the array of results."). + Param(bloblangv2.NewInt64Param("count").Description("Number of invocations.")). + Param(bloblangv2.NewLambdaParam("fn").Description("Producer lambda receiving the index.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + n, err := args.GetInt64("count") + if err != nil { + return nil, err + } + fn, err := args.GetLambda("fn") + if err != nil { + return nil, err + } + return func() (any, error) { + out := make([]any, 0, n) + for i := int64(0); i < n; i++ { + v, err := fn(i) + if err != nil { + return nil, fmt.Errorf("index %d: %w", i, err) + } + out = append(out, v) + } + return out, nil + }, nil + }, + ); err != nil { + t.Fatal(err) + } + + exec, err := env.Parse(`output = repeat_call(3, i -> "v" + i.string())`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, err := exec.Query(nil) + if err != nil { + t.Fatalf("query: %v", err) + } + want := []any{"v0", "v1", "v2"} + if fmt.Sprintf("%v", got) != fmt.Sprintf("%v", want) { + t.Fatalf("got %v, want %v", got, want) + } +} + +// TestPluginMethodLambdaMultiArg exercises a plugin method that invokes a +// lambda with more than one positional argument — V2 lambdas are +// expressed as `(a, b) -> body`, and the plugin Lambda closure is +// declared as `func(args ...any)` precisely so multi-arg dispatch works +// without per-arity wrappers. 
The test uses a fold-shaped reducer that +// passes `(tally, element)` on each step. +func TestPluginMethodLambdaMultiArg(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + if err := env.RegisterMethod("reduce_pairs", + bloblangv2.NewPluginSpec(). + Description("Reduce an array using an explicit two-arg (tally, value) lambda."). + Param(bloblangv2.NewAnyParam("initial").Description("Initial tally.")). + Param(bloblangv2.NewLambdaParam("step").Description("Lambda invoked as step(tally, value).")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + initial, err := args.Get("initial") + if err != nil { + return nil, err + } + step, err := args.GetLambda("step") + if err != nil { + return nil, err + } + return bloblangv2.ArrayMethod(func(arr []any) (any, error) { + tally := initial + for _, v := range arr { + next, err := step(tally, v) + if err != nil { + return nil, err + } + tally = next + } + return tally, nil + }), nil + }, + ); err != nil { + t.Fatal(err) + } + + exec, err := env.Parse(`output = input.reduce_pairs(0, (tally, v) -> tally + v)`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, err := exec.Query([]any{int64(1), int64(2), int64(3), int64(4)}) + if err != nil { + t.Fatalf("query: %v", err) + } + if got != int64(10) { + t.Fatalf("got %v, want 10", got) + } +} + +func TestPluginMethodLambdaWithMapRef(t *testing.T) { + // Bare map references are valid in lambda positions per spec §5.5. + // A mapping that references a user-defined map by name should + // synthesise a single-param lambda automatically when invoking a + // plugin method that takes a Lambda. + env := bloblangv2.NewEnvironment() + if err := env.RegisterMethod("count_matches", + bloblangv2.NewPluginSpec(). + Description("Counts elements for which the predicate is true."). 
+ Param(bloblangv2.NewLambdaParam("pred")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + pred, err := args.GetLambda("pred") + if err != nil { + return nil, err + } + return bloblangv2.ArrayMethod(func(arr []any) (any, error) { + n := int64(0) + for _, e := range arr { + out, err := pred(e) + if err != nil { + return nil, err + } + if b, ok := out.(bool); ok && b { + n++ + } + } + return n, nil + }), nil + }, + ); err != nil { + t.Fatal(err) + } + + src := ` +map is_big(data) { + data > 5 +} +output = input.count_matches(is_big) +` + exec, err := env.Parse(src) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, err := exec.Query([]any{int64(1), int64(7), int64(3), int64(8)}) + if err != nil { + t.Fatalf("query: %v", err) + } + if got != int64(2) { + t.Fatalf("got %v, want 2", got) + } +} diff --git a/public/bloblangv2/messagecontext.go b/public/bloblangv2/messagecontext.go new file mode 100644 index 000000000..8b3f619d0 --- /dev/null +++ b/public/bloblangv2/messagecontext.go @@ -0,0 +1,53 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2 + +// MessageContext is the read surface used by Bloblang V2's +// message-coupled stdlib (batch_index, batch_size, content, error, +// errored, tracing_id, tracing_span). Callers running a mapping with +// access to a pipeline message build a MessageContext and pass it to +// Executor.QueryMessage. +// +// The interface is intentionally small: it covers only the read paths +// the bundled stdlib needs, so the V2 executor stays decoupled from +// public/service.Message and remains usable in non-pipeline contexts +// (for example, tests stub the interface directly). +// +// Mappings parsed by an Environment without a bound message — i.e. +// invoked through Executor.Query or Executor.QueryMetadata — that +// reference a message-coupled function will produce a runtime error +// rather than a silent null fallback. 
+type MessageContext interface { + // Input returns the structured form of the message body to bind to + // the mapping's `input` keyword. May be []byte, a scalar, an array, + // an object, or nil. + Input() any + + // Metadata returns a snapshot of the message metadata to bind to + // `input@`. Returning nil is equivalent to an empty map. + Metadata() map[string]any + + // Bytes returns the raw byte form of the message body, used by + // `content()`. + Bytes() []byte + + // Error returns the error currently set on the message, or nil. Used + // by `error()` and `errored()`. + Error() error + + // BatchIndex is the 0-based position of the current message within + // its batch. Used by `batch_index()`. + BatchIndex() int + + // BatchSize is the total number of messages in the current batch. + // Used by `batch_size()`. + BatchSize() int + + // TraceID returns the OpenTelemetry trace ID associated with the + // message, or the empty string if none. Used by `tracing_id()`. + TraceID() string + + // Span returns the active tracing span for the message, or nil. + // Used by `tracing_span()`. + Span() any +} diff --git a/public/bloblangv2/messagecontext_test.go b/public/bloblangv2/messagecontext_test.go new file mode 100644 index 000000000..268a4146b --- /dev/null +++ b/public/bloblangv2/messagecontext_test.go @@ -0,0 +1,151 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2_test + +import ( + "errors" + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// stubMessage is a minimal MessageContext used to drive the +// message-coupled stdlib in tests without standing up a full +// service.Message + pipeline. 
+type stubMessage struct { + input any + meta map[string]any + bytes []byte + err error + batchIndex int + batchSize int + traceID string + span any +} + +func (s *stubMessage) Input() any { return s.input } +func (s *stubMessage) Metadata() map[string]any { return s.meta } +func (s *stubMessage) Bytes() []byte { return s.bytes } +func (s *stubMessage) Error() error { return s.err } +func (s *stubMessage) BatchIndex() int { return s.batchIndex } +func (s *stubMessage) BatchSize() int { return s.batchSize } +func (s *stubMessage) TraceID() string { return s.traceID } +func (s *stubMessage) Span() any { return s.span } + +func TestQueryMessageBatchPosition(t *testing.T) { + exec, err := bloblangv2.NewEnvironment().Parse(` +output.idx = batch_index() +output.size = batch_size() +`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, _, err := exec.QueryMessage(&stubMessage{batchIndex: 2, batchSize: 5}) + if err != nil { + t.Fatalf("query: %v", err) + } + m, ok := got.(map[string]any) + if !ok { + t.Fatalf("expected object output, got %T", got) + } + if m["idx"] != int64(2) || m["size"] != int64(5) { + t.Fatalf("unexpected output: %v", m) + } +} + +func TestQueryMessageContent(t *testing.T) { + exec, err := bloblangv2.NewEnvironment().Parse(`output = content()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, _, err := exec.QueryMessage(&stubMessage{bytes: []byte("hello")}) + if err != nil { + t.Fatalf("query: %v", err) + } + b, ok := got.([]byte) + if !ok { + t.Fatalf("expected []byte output, got %T", got) + } + if string(b) != "hello" { + t.Fatalf("got %q, want %q", string(b), "hello") + } +} + +func TestQueryMessageErrorObject(t *testing.T) { + exec, err := bloblangv2.NewEnvironment().Parse(` +output.failed = errored() +output.err = error() +`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, _, err := exec.QueryMessage(&stubMessage{err: errors.New("kapow")}) + if err != nil { + t.Fatalf("query: %v", err) + } + m := got.(map[string]any) + if 
m["failed"] != true { + t.Fatalf("expected failed=true, got %v", m["failed"]) + } + errObj, ok := m["err"].(map[string]any) + if !ok { + t.Fatalf("expected error to be an object, got %T", m["err"]) + } + if errObj["what"] != "kapow" { + t.Fatalf("expected what=kapow, got %v", errObj["what"]) + } +} + +func TestQueryMessageNoErrorReturnsNull(t *testing.T) { + exec, err := bloblangv2.NewEnvironment().Parse(`output.err = error()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, _, err := exec.QueryMessage(&stubMessage{}) + if err != nil { + t.Fatalf("query: %v", err) + } + m := got.(map[string]any) + if m["err"] != nil { + t.Fatalf("expected err=nil, got %v", m["err"]) + } +} + +func TestQueryWithoutMessageErrorsOnMessageFunction(t *testing.T) { + exec, err := bloblangv2.NewEnvironment().Parse(`output = batch_index()`) + if err != nil { + t.Fatalf("parse: %v", err) + } + _, qerr := exec.Query(nil) + if qerr == nil { + t.Fatalf("expected error when calling message-coupled function via Query") + } + if !strings.Contains(qerr.Error(), "requires a message context") { + t.Fatalf("error message did not mention message context: %v", qerr) + } +} + +func TestQueryMessageInputAndMetaStillBound(t *testing.T) { + exec, err := bloblangv2.NewEnvironment().Parse(` +output.value = input.x +output.region = input@["region"] +`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, _, err := exec.QueryMessage(&stubMessage{ + input: map[string]any{"x": int64(7)}, + meta: map[string]any{"region": "eu-west"}, + }) + if err != nil { + t.Fatalf("query: %v", err) + } + m := got.(map[string]any) + if m["value"] != int64(7) { + t.Fatalf("expected value=7, got %v", m["value"]) + } + if m["region"] != "eu-west" { + t.Fatalf("expected region=eu-west, got %v", m["region"]) + } +} diff --git a/public/bloblangv2/method.go b/public/bloblangv2/method.go new file mode 100644 index 000000000..c57d06620 --- /dev/null +++ b/public/bloblangv2/method.go @@ -0,0 +1,141 @@ +// Copyright 2026 
Redpanda Data, Inc. + +package bloblangv2 + +import ( + "fmt" + "time" +) + +// Method is the runtime closure implementing a plugin method. It is invoked +// once per method-call evaluation with the (already-evaluated) receiver value. +type Method func(v any) (any, error) + +// Lambda is the callable form of a lambda argument passed to a plugin. The +// plugin author retrieves it via ParsedParams.GetLambda and invokes it once +// per element / iteration with the values to bind to the lambda's parameters. +// +// Errors returned by the lambda body are surfaced unchanged. A lambda that +// returns a void value (an if-without-else that didn't match, etc.) yields a +// non-nil error so plugin code can detect and react to it; plugins that +// tolerate void must check for it explicitly. +type Lambda func(args ...any) (any, error) + +// MethodConstructor constructs a Method from arguments resolved against a +// PluginSpec. When all arguments at a call site are literal, the constructor +// is invoked once at parse time and its Method is reused across every Query. +// When any argument is dynamic, the constructor is invoked per call. +type MethodConstructor func(args *ParsedParams) (Method, error) + +// StringMethod wraps a string-receiver function with a type check. +// +// V2 typed wrappers are strict: they do not coerce non-string inputs. Callers +// whose upstream value might be a number or bytes should use .string() in the +// mapping to coerce before invoking the method. +func StringMethod(fn func(string) (any, error)) Method { + return func(v any) (any, error) { + s, ok := v.(string) + if !ok { + return nil, fmt.Errorf("expected string receiver, got %T", v) + } + return fn(s) + } +} + +// BytesMethod wraps a []byte-receiver function with a type check. +// +// Strict: does not coerce strings. Use .bytes() in the mapping if the +// upstream value is a string. 
+func BytesMethod(fn func([]byte) (any, error)) Method { + return func(v any) (any, error) { + b, ok := v.([]byte) + if !ok { + return nil, fmt.Errorf("expected bytes receiver, got %T", v) + } + return fn(b) + } +} + +// Int64Method wraps an int64-receiver function. +// +// Widens within the integer family: int32, uint32, and in-range uint64 are +// accepted. Strings and floats are not coerced — use .int64() in the mapping +// for broader conversion. +func Int64Method(fn func(int64) (any, error)) Method { + return func(v any) (any, error) { + n, err := coerceInt64(v) + if err != nil { + return nil, err + } + return fn(n) + } +} + +// Float64Method wraps a float64-receiver function. +// +// Widens within the numeric family: float32, int32, int64, uint32, uint64 +// are accepted. Strings are not coerced — use .float64() in the mapping if +// the upstream value is a string. +func Float64Method(fn func(float64) (any, error)) Method { + return func(v any) (any, error) { + f, err := coerceFloat64(v) + if err != nil { + return nil, err + } + return fn(f) + } +} + +// BoolMethod wraps a bool-receiver function with a type check. +// +// Strict: does not coerce from strings, ints, or floats. Use .bool() in the +// mapping for coercion. +func BoolMethod(fn func(bool) (any, error)) Method { + return func(v any) (any, error) { + b, ok := v.(bool) + if !ok { + return nil, fmt.Errorf("expected bool receiver, got %T", v) + } + return fn(b) + } +} + +// TimestampMethod wraps a time.Time-receiver function with a type check. +// +// Strict: does not coerce from strings or unix timestamp numbers. Use +// .ts_parse() or .ts_from_unix() in the mapping first. +func TimestampMethod(fn func(time.Time) (any, error)) Method { + return func(v any) (any, error) { + t, ok := v.(time.Time) + if !ok { + return nil, fmt.Errorf("expected timestamp receiver, got %T", v) + } + return fn(t) + } +} + +// ArrayMethod wraps an []any-receiver function with a type check. 
+// +// Strict: the receiver must already be an array. +func ArrayMethod(fn func([]any) (any, error)) Method { + return func(v any) (any, error) { + arr, ok := v.([]any) + if !ok { + return nil, fmt.Errorf("expected array receiver, got %T", v) + } + return fn(arr) + } +} + +// ObjectMethod wraps a map[string]any-receiver function with a type check. +// +// Strict: the receiver must already be an object. +func ObjectMethod(fn func(map[string]any) (any, error)) Method { + return func(v any) (any, error) { + obj, ok := v.(map[string]any) + if !ok { + return nil, fmt.Errorf("expected object receiver, got %T", v) + } + return fn(obj) + } +} diff --git a/public/bloblangv2/package.go b/public/bloblangv2/package.go new file mode 100644 index 000000000..7938ab3fd --- /dev/null +++ b/public/bloblangv2/package.go @@ -0,0 +1,55 @@ +// Copyright 2026 Redpanda Data, Inc. + +// Package bloblangv2 provides the public API for parsing and executing +// Bloblang V2 mappings, and for extending the language with user-defined +// methods and functions. +// +// Bloblang V2 is a redesigned mapping language shipped alongside the +// existing V1 implementation in public/bloblang. V2 and V1 are separate +// languages with separate parsers, interpreters, and plugin registries; +// plugins registered against one cannot be used from the other. +// +// The core types are: +// +// - Environment: an isolated registry of methods and functions that +// mappings may invoke. The global environment can be accessed via +// GlobalEnvironment, or isolated ones built via NewEnvironment / +// NewEmptyEnvironment. +// - Executor: the compiled form of a mapping, produced by +// Environment.Parse. Executors are safe for concurrent use. +// - PluginSpec: a builder for declaring the signature of a plugin +// method or function, including parameter types, defaults, and +// documentation metadata. +// - Method / Function: the closures that implement a plugin. 
Use the +// typed wrappers (StringMethod, Int64Method, etc.) to avoid writing +// receiver type-checks by hand. +// +// See the examples for a walkthrough of registering a plugin method. +// +// # Coexistence with V1 +// +// V2 ships alongside V1 rather than replacing it. The two languages have +// separate plugin registries: methods and functions registered on a +// public/bloblangv2 Environment are not visible to V1 mappings, and +// vice versa. The V1 stdlib has been ported method-by-method to the V2 +// surface under internal/impl; the per-method status, including any +// semantic shifts (e.g. variadic arguments folded into arrays, error +// object shape), is tracked in internal/bloblang2/PARITY.md. +// +// Host components select a language per field. A bloblang field uses +// the V1 environment and is linted via the V1 path; a bloblang_v2 field +// uses the V2 environment and is linted via LintBloblangV2Mapping in +// internal/docs. Components must pick one or the other — there is no +// "accept either" field type. +// +// One known gap: interpolated string fields (the ${! ... } form) still +// dispatch through the V1 environment only. Plugins registered as +// V2-only methods will not be available inside interpolated strings, +// even when the host component also exposes a bloblang_v2 mapping +// field. The remaining-work list in internal/bloblang2/REMAINING.md +// tracks this and other gaps. +// +// For migrating existing V1 mappings and configs to V2, see the +// public/bloblangv2/migrator (mapping-level) and public/service/migrator +// (config-level) packages. +package bloblangv2 diff --git a/public/bloblangv2/params.go b/public/bloblangv2/params.go new file mode 100644 index 000000000..33aaec20f --- /dev/null +++ b/public/bloblangv2/params.go @@ -0,0 +1,254 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2 + +import ( + "errors" + "fmt" + "math" +) + +// ParsedParams holds the resolved argument values passed to a plugin +// constructor. 
Values have been positionally matched to the plugin's +// ParamDefinitions and, where applicable, coerced into the declared kind. +// Plugin authors query arguments by name via GetString, GetInt64, and friends. +type ParsedParams struct { + spec *PluginSpec + byName map[string]any + raw []any +} + +// newParsedParams resolves raw positional arguments against a PluginSpec. It +// applies defaults for missing optional parameters, validates types, and +// rejects surplus arguments; V2 plugin signatures are always bounded by +// their declared parameter list. +func newParsedParams(spec *PluginSpec, rawArgs []any) (*ParsedParams, error) { + pp := &ParsedParams{spec: spec, raw: rawArgs} + + pp.byName = make(map[string]any, len(spec.params)) + for i, p := range spec.params { + if i < len(rawArgs) { + v, err := coerceArg(rawArgs[i], p) + if err != nil { + return nil, fmt.Errorf("argument %q: %w", p.name, err) + } + pp.byName[p.name] = v + continue + } + // Missing argument. + if p.hasDefault { + pp.byName[p.name] = p.defaultVal + continue + } + if p.optional { + continue + } + return nil, fmt.Errorf("missing required argument %q", p.name) + } + + if len(rawArgs) > len(spec.params) { + return nil, fmt.Errorf("too many arguments: got %d, expected at most %d", len(rawArgs), len(spec.params)) + } + + return pp, nil +} + +// coerceArg validates and, where safe, coerces a raw argument into the kind +// declared on the ParamDefinition. Bloblang V2 admits multiple integer and +// float widths, so Int64Param accepts any integer that fits in int64 and +// Float64Param accepts any numeric value (64-bit integers may lose precision). 
+func coerceArg(v any, p ParamDefinition) (any, error) { + switch p.kind { + case paramKindAny: + return v, nil + case paramKindString: + s, ok := v.(string) + if !ok { + return nil, fmt.Errorf("expected string, got %T", v) + } + return s, nil + case paramKindBool: + b, ok := v.(bool) + if !ok { + return nil, fmt.Errorf("expected bool, got %T", v) + } + return b, nil + case paramKindInt64: + return coerceInt64(v) + case paramKindFloat64: + return coerceFloat64(v) + case paramKindLambda: + // Lambda params are pre-wrapped by the plugin dispatcher into a + // Lambda closure; pass through unchanged. + if _, ok := v.(Lambda); !ok { + return nil, fmt.Errorf("expected lambda argument, got %T", v) + } + return v, nil + } + return nil, fmt.Errorf("unsupported param kind %d", p.kind) +} + +func coerceInt64(v any) (int64, error) { + switch n := v.(type) { + case int64: + return n, nil + case int32: + return int64(n), nil + case uint32: + return int64(n), nil + case uint64: + if n > math.MaxInt64 { + return 0, errors.New("uint64 value exceeds int64 range") + } + return int64(n), nil + } + return 0, fmt.Errorf("expected integer, got %T", v) +} + +func coerceFloat64(v any) (float64, error) { + switch n := v.(type) { + case float64: + return n, nil + case float32: + return float64(n), nil + case int64: + return float64(n), nil + case int32: + return float64(n), nil + case uint32: + return float64(n), nil + case uint64: + return float64(n), nil + } + return 0, fmt.Errorf("expected number, got %T", v) +} + +// GetLambda returns the Lambda closure bound to a named lambda parameter. +// It errors when the parameter was not declared as a lambda or was not +// provided. 
+func (p *ParsedParams) GetLambda(name string) (Lambda, error) { + v, err := p.Get(name) + if err != nil { + return nil, err + } + lam, ok := v.(Lambda) + if !ok { + return nil, fmt.Errorf("parameter %q is not a lambda (got %T)", name, v) + } + return lam, nil +} + +// Get returns the value associated with a named parameter. An error is +// returned if the parameter was optional and not provided. +func (p *ParsedParams) Get(name string) (any, error) { + v, ok := p.byName[name] + if !ok { + return nil, fmt.Errorf("parameter %q was not provided", name) + } + return v, nil +} + +// GetString returns a string argument by name. +func (p *ParsedParams) GetString(name string) (string, error) { + v, err := p.Get(name) + if err != nil { + return "", err + } + s, ok := v.(string) + if !ok { + return "", fmt.Errorf("parameter %q is not a string", name) + } + return s, nil +} + +// GetOptionalString returns the argument's value if provided, or nil +// otherwise. The param must be declared as a string parameter. +func (p *ParsedParams) GetOptionalString(name string) (*string, error) { + v, ok := p.byName[name] + if !ok { + return nil, nil + } + s, ok := v.(string) + if !ok { + return nil, fmt.Errorf("parameter %q is not a string", name) + } + return &s, nil +} + +// GetInt64 returns an int64 argument by name. +func (p *ParsedParams) GetInt64(name string) (int64, error) { + v, err := p.Get(name) + if err != nil { + return 0, err + } + n, ok := v.(int64) + if !ok { + return 0, fmt.Errorf("parameter %q is not an int64", name) + } + return n, nil +} + +// GetOptionalInt64 returns the argument's value if provided, or nil otherwise. +func (p *ParsedParams) GetOptionalInt64(name string) (*int64, error) { + v, ok := p.byName[name] + if !ok { + return nil, nil + } + n, ok := v.(int64) + if !ok { + return nil, fmt.Errorf("parameter %q is not an int64", name) + } + return &n, nil +} + +// GetFloat64 returns a float64 argument by name. 
+func (p *ParsedParams) GetFloat64(name string) (float64, error) { + v, err := p.Get(name) + if err != nil { + return 0, err + } + f, ok := v.(float64) + if !ok { + return 0, fmt.Errorf("parameter %q is not a float64", name) + } + return f, nil +} + +// GetOptionalFloat64 returns the argument's value if provided, or nil. +func (p *ParsedParams) GetOptionalFloat64(name string) (*float64, error) { + v, ok := p.byName[name] + if !ok { + return nil, nil + } + f, ok := v.(float64) + if !ok { + return nil, fmt.Errorf("parameter %q is not a float64", name) + } + return &f, nil +} + +// GetBool returns a bool argument by name. +func (p *ParsedParams) GetBool(name string) (bool, error) { + v, err := p.Get(name) + if err != nil { + return false, err + } + b, ok := v.(bool) + if !ok { + return false, fmt.Errorf("parameter %q is not a bool", name) + } + return b, nil +} + +// GetOptionalBool returns the argument's value if provided, or nil. +func (p *ParsedParams) GetOptionalBool(name string) (*bool, error) { + v, ok := p.byName[name] + if !ok { + return nil, nil + } + b, ok := v.(bool) + if !ok { + return nil, fmt.Errorf("parameter %q is not a bool", name) + } + return &b, nil +} diff --git a/public/bloblangv2/parse_error.go b/public/bloblangv2/parse_error.go new file mode 100644 index 000000000..219ee8e6d --- /dev/null +++ b/public/bloblangv2/parse_error.go @@ -0,0 +1,47 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2 + +import ( + "strings" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// ParseError is a structured error type for Bloblang V2 parser and resolver +// errors that provides access to the line and column of the first reported +// issue. The full list of diagnostics is preserved and rendered by Error. +type ParseError struct { + // Line and Column describe the position of the first diagnostic. 
+ Line int + Column int + + errs []syntax.PosError +} + +// Error returns a multi-line error string listing every diagnostic. +func (p *ParseError) Error() string { + if len(p.errs) == 0 { + return "bloblangv2: unknown parse error" + } + var b strings.Builder + for i, e := range p.errs { + if i > 0 { + b.WriteByte('\n') + } + b.WriteString(e.Error()) + } + return b.String() +} + +func parseErrorFromPosErrors(errs []syntax.PosError) *ParseError { + if len(errs) == 0 { + return nil + } + first := errs[0].Pos + return &ParseError{ + Line: first.Line, + Column: first.Column, + errs: errs, + } +} diff --git a/public/bloblangv2/spec.go b/public/bloblangv2/spec.go new file mode 100644 index 000000000..d5cd2c168 --- /dev/null +++ b/public/bloblangv2/spec.go @@ -0,0 +1,174 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2 + +// pluginStatus captures the lifecycle stage of a plugin (stable, experimental, +// etc.). Used for documentation only; it has no effect on parsing or execution. +type pluginStatus string + +const ( + statusStable pluginStatus = "" + statusExperimental pluginStatus = "experimental" + statusBeta pluginStatus = "beta" + statusDeprecated pluginStatus = "deprecated" +) + +// paramKind classifies the expected Go type of a plugin argument. Values +// flowing through ParsedParams are coerced / validated against this kind. +type paramKind int + +const ( + paramKindAny paramKind = iota + paramKindString + paramKindInt64 + paramKindFloat64 + paramKindBool + // paramKindLambda denotes a parameter that accepts an unevaluated + // callable. The plugin receives a Lambda closure via + // ParsedParams.GetLambda; argument expressions and bare map references + // are wrapped automatically. + paramKindLambda +) + +// ParamDefinition describes a single parameter accepted by a plugin. Build +// instances with the NewStringParam / NewInt64Param / ... constructors and +// chain Optional, Default, or Description as needed. 
+type ParamDefinition struct { + name string + description string + kind paramKind + optional bool + hasDefault bool + defaultVal any +} + +// NewStringParam creates a new string typed parameter. +func NewStringParam(name string) ParamDefinition { + return ParamDefinition{name: name, kind: paramKindString} +} + +// NewInt64Param creates a new 64-bit integer typed parameter. +func NewInt64Param(name string) ParamDefinition { + return ParamDefinition{name: name, kind: paramKindInt64} +} + +// NewFloat64Param creates a new float64 typed parameter. +func NewFloat64Param(name string) ParamDefinition { + return ParamDefinition{name: name, kind: paramKindFloat64} +} + +// NewBoolParam creates a new bool typed parameter. +func NewBoolParam(name string) ParamDefinition { + return ParamDefinition{name: name, kind: paramKindBool} +} + +// NewAnyParam creates a new parameter that accepts any value type. +func NewAnyParam(name string) ParamDefinition { + return ParamDefinition{name: name, kind: paramKindAny} +} + +// NewLambdaParam creates a parameter that accepts a lambda expression. The +// plugin retrieves an invocable closure via ParsedParams.GetLambda. Bare +// map references in mappings (e.g. `arr.find_by(my_map)`) are also accepted +// where the underlying map takes a single required parameter; the resolver +// synthesises the equivalent lambda automatically. +func NewLambdaParam(name string) ParamDefinition { + return ParamDefinition{name: name, kind: paramKindLambda} +} + +// Description attaches an optional human-readable description to the +// parameter, used by documentation generators. +func (d ParamDefinition) Description(str string) ParamDefinition { + d.description = str + return d +} + +// Optional marks the parameter as optional; callers may omit it. +func (d ParamDefinition) Optional() ParamDefinition { + d.optional = true + return d +} + +// Default assigns a default value to the parameter, implicitly marking it +// optional. 
+func (d ParamDefinition) Default(v any) ParamDefinition { + d.optional = true + d.hasDefault = true + d.defaultVal = v + return d +} + +// PluginSpec describes the signature and documentation of a plugin method or +// function. Build with NewPluginSpec, then chain Param, Description, etc. +type PluginSpec struct { + status pluginStatus + category string + description string + version string + impure bool + requiresMessageContext bool + params []ParamDefinition +} + +// NewPluginSpec creates an empty plugin spec. +func NewPluginSpec() *PluginSpec { + return &PluginSpec{} +} + +// Description attaches a human-readable description to the plugin. +func (p *PluginSpec) Description(s string) *PluginSpec { + p.description = s + return p +} + +// Category attaches an optional category string used by documentation +// generators to group related plugins. +func (p *PluginSpec) Category(s string) *PluginSpec { + p.category = s + return p +} + +// Version records the release in which the plugin was introduced. +func (p *PluginSpec) Version(v string) *PluginSpec { + p.version = v + return p +} + +// Experimental flags the plugin as experimental. +func (p *PluginSpec) Experimental() *PluginSpec { + p.status = statusExperimental + return p +} + +// Beta flags the plugin as beta-quality. +func (p *PluginSpec) Beta() *PluginSpec { + p.status = statusBeta + return p +} + +// Deprecated flags the plugin as deprecated. It remains callable but is +// de-emphasised in documentation. +func (p *PluginSpec) Deprecated() *PluginSpec { + p.status = statusDeprecated + return p +} + +// Impure marks the plugin as having side effects or observing state outside +// the mapping (e.g. reading env vars). Impure plugins are stripped from +// environments produced by Environment.OnlyPure. +func (p *PluginSpec) Impure() *PluginSpec { + p.impure = true + return p +} + +// Param appends a parameter to the plugin spec. Positional arguments must be +// supplied in the order Param is called. 
+// +// Variadic plugins are intentionally not supported: the V2 specification +// (sections 5.3, 10, 13) bounds arity by the declared parameter list; any +// extra positional argument is an error. Plugin authors that need to accept +// a list of values should declare a single array-typed parameter. +func (p *PluginSpec) Param(d ParamDefinition) *PluginSpec { + p.params = append(p.params, d) + return p +} diff --git a/public/bloblangv2/view.go b/public/bloblangv2/view.go new file mode 100644 index 000000000..8d6495c3f --- /dev/null +++ b/public/bloblangv2/view.go @@ -0,0 +1,243 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2 + +import ( + "encoding/json" +) + +// PluginParamInfo is a JSON-serialisable description of a single plugin +// parameter, suitable for embedding in generated schemas. The JSON tag names +// mirror those used by the V1 bloblang schema for tooling consistency. +type PluginParamInfo struct { + Name string `json:"name"` + Description string `json:"description,omitempty"` + // Kind is one of "any", "string", "int64", "float64", "bool", "lambda". + Kind string `json:"kind"` + IsOptional bool `json:"is_optional,omitempty"` + HasDefault bool `json:"has_default,omitempty"` + // Default is the default value for the parameter when one is set. + Default any `json:"default,omitempty"` +} + +// UnmarshalJSON normalises Default against Kind so that round-trip through +// JSON preserves Go-typed semantics. encoding/json decodes every numeric +// literal as float64; without this hook a parameter declared with +// NewInt64Param("x").Default(int64(5)) would round-trip with Default of type +// float64, leaving the typed Kind ("int64") inconsistent with the stored +// value. 
+func (p *PluginParamInfo) UnmarshalJSON(b []byte) error { + type alias PluginParamInfo + var a alias + if err := json.Unmarshal(b, &a); err != nil { + return err + } + a.Default = coerceDefaultByKind(a.Default, a.Kind) + *p = PluginParamInfo(a) + return nil +} + +// coerceDefaultByKind brings a JSON-decoded default value back into the Go +// type implied by the parameter's Kind. Only numeric kinds need coercion; +// bool and string already round-trip with their declared types intact. +func coerceDefaultByKind(v any, kind string) any { + switch kind { + case "int64": + if f, ok := v.(float64); ok { + return int64(f) + } + case "float64": + if f, ok := v.(float64); ok { + return f + } + } + return v +} + +// PluginInfo is a JSON-serialisable description of a registered V2 method or +// function. Use Environment.WalkMethods / Environment.WalkFunctions to obtain +// these via the FunctionView / MethodView wrappers. +type PluginInfo struct { + Name string `json:"name"` + Status string `json:"status,omitempty"` + Category string `json:"category,omitempty"` + Description string `json:"description,omitempty"` + Version string `json:"version,omitempty"` + Impure bool `json:"impure,omitempty"` + // RequiresMessageContext signals that the function reads from a + // pipeline message (batch position, content bytes, error, tracing). + // Such functions only resolve when the executor is run via + // Executor.QueryMessage; calls from a plain Query / QueryMetadata + // path produce a runtime error. Tooling (linters, docs generators) + // can use the flag to gate suggestions. + RequiresMessageContext bool `json:"requires_message_context,omitempty"` + Params []PluginParamInfo `json:"params,omitempty"` +} + +// FunctionView describes a V2 function plugin registered against an +// Environment. Obtain instances via Environment.WalkFunctions. +type FunctionView struct { + info PluginInfo +} + +// Name returns the function name as used in mappings. 
+func (v *FunctionView) Name() string { return v.info.Name } + +// Status returns the lifecycle stage of the function (stable, experimental, +// beta, deprecated). An empty string is equivalent to stable. +func (v *FunctionView) Status() string { return v.info.Status } + +// Description returns the human-readable description, if one was provided. +func (v *FunctionView) Description() string { return v.info.Description } + +// Info returns the underlying serialisable description of the function. +func (v *FunctionView) Info() PluginInfo { return v.info } + +// FormatJSON returns the function description as a JSON object. The schema of +// the document is the PluginInfo struct. +// +// Experimental: this method is intended for tooling and may change without +// notice. +func (v *FunctionView) FormatJSON() ([]byte, error) { + return json.Marshal(v.info) +} + +// MethodView describes a V2 method plugin registered against an Environment. +// Obtain instances via Environment.WalkMethods. +type MethodView struct { + info PluginInfo +} + +// Name returns the method name as used in mappings. +func (v *MethodView) Name() string { return v.info.Name } + +// Status returns the lifecycle stage of the method (stable, experimental, +// beta, deprecated). An empty string is equivalent to stable. +func (v *MethodView) Status() string { return v.info.Status } + +// Description returns the human-readable description, if one was provided. +func (v *MethodView) Description() string { return v.info.Description } + +// Info returns the underlying serialisable description of the method. +func (v *MethodView) Info() PluginInfo { return v.info } + +// FormatJSON returns the method description as a JSON object. The schema of +// the document is the PluginInfo struct. +// +// Experimental: this method is intended for tooling and may change without +// notice. 
+func (v *MethodView) FormatJSON() ([]byte, error) { + return json.Marshal(v.info) +} + +// pluginInfoFromSpec converts a registered plugin spec into the public-facing +// PluginInfo description used by views and schema generators. +func pluginInfoFromSpec(name string, spec *PluginSpec) PluginInfo { + if spec == nil { + return PluginInfo{Name: name} + } + info := PluginInfo{ + Name: name, + Status: string(spec.status), + Category: spec.category, + Description: spec.description, + Version: spec.version, + Impure: spec.impure, + RequiresMessageContext: spec.requiresMessageContext, + } + for _, p := range spec.params { + info.Params = append(info.Params, PluginParamInfo{ + Name: p.name, + Description: p.description, + Kind: paramKindToString(p.kind), + IsOptional: p.optional, + HasDefault: p.hasDefault, + Default: p.defaultVal, + }) + } + return info +} + +func paramKindToString(k paramKind) string { + switch k { + case paramKindString: + return "string" + case paramKindInt64: + return "int64" + case paramKindFloat64: + return "float64" + case paramKindBool: + return "bool" + case paramKindLambda: + return "lambda" + default: + return "any" + } +} + +// NewPluginSpecFromInfo reconstructs a PluginSpec from a serialised PluginInfo +// description. This is the reverse of the per-plugin information emitted by +// Environment.WalkFunctions / WalkMethods and is intended for tooling that +// loads schemas serialised from a separate process (e.g. a remote linter +// rebuilding stub registrations from a JSON schema dump). +// +// The reconstruction goes through the public builder chain (NewPluginSpec, +// NewStringParam / NewInt64Param / ..., Description, Optional, Default, +// etc.) so that builder validation also applies to round-tripped specs; only +// requiresMessageContext is set directly. Unknown status strings fall back +// to the default (stable) status. +// +// Experimental: this function is intended for tooling and may change without +// notice. 
+func NewPluginSpecFromInfo(info PluginInfo) *PluginSpec { + spec := NewPluginSpec(). + Description(info.Description). + Category(info.Category). + Version(info.Version) + if info.Impure { + spec = spec.Impure() + } + if info.RequiresMessageContext { + spec.requiresMessageContext = true + } + switch info.Status { + case string(statusExperimental): + spec = spec.Experimental() + case string(statusBeta): + spec = spec.Beta() + case string(statusDeprecated): + spec = spec.Deprecated() + } + for _, p := range info.Params { + spec = spec.Param(paramFromInfo(p)) + } + return spec +} + +func paramFromInfo(p PluginParamInfo) ParamDefinition { + var def ParamDefinition + switch p.Kind { + case "string": + def = NewStringParam(p.Name) + case "int64": + def = NewInt64Param(p.Name) + case "float64": + def = NewFloat64Param(p.Name) + case "bool": + def = NewBoolParam(p.Name) + case "lambda": + def = NewLambdaParam(p.Name) + default: + def = NewAnyParam(p.Name) + } + if p.Description != "" { + def = def.Description(p.Description) + } + switch { + case p.HasDefault: + def = def.Default(p.Default) + case p.IsOptional: + def = def.Optional() + } + return def +} diff --git a/public/bloblangv2/view_test.go b/public/bloblangv2/view_test.go new file mode 100644 index 000000000..82282c83c --- /dev/null +++ b/public/bloblangv2/view_test.go @@ -0,0 +1,221 @@ +// Copyright 2026 Redpanda Data, Inc. + +package bloblangv2_test + +import ( + "encoding/json" + "fmt" + "sort" + "testing" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +func TestEnvironmentWalkFunctions(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + + spec := bloblangv2.NewPluginSpec(). + Description("returns a constant greeting"). + Category("Test"). + Version("0.1.0"). 
+ Param(bloblangv2.NewStringParam("name").Description("name to greet").Default("world")) + + if err := env.RegisterFunction("greet", spec, func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { return "hello", nil }, nil + }); err != nil { + t.Fatal(err) + } + + var seen []bloblangv2.PluginInfo + env.WalkFunctions(func(_ string, view *bloblangv2.FunctionView) { + seen = append(seen, view.Info()) + }) + + if len(seen) != 1 { + t.Fatalf("expected 1 function, got %d", len(seen)) + } + got := seen[0] + if got.Name != "greet" { + t.Fatalf("name=%q", got.Name) + } + if got.Description != "returns a constant greeting" { + t.Fatalf("description=%q", got.Description) + } + if got.Category != "Test" { + t.Fatalf("category=%q", got.Category) + } + if got.Version != "0.1.0" { + t.Fatalf("version=%q", got.Version) + } + if len(got.Params) != 1 { + t.Fatalf("params=%v", got.Params) + } + if got.Params[0].Name != "name" { + t.Fatalf("param name=%q", got.Params[0].Name) + } + if got.Params[0].Kind != "string" { + t.Fatalf("param kind=%q", got.Params[0].Kind) + } + if !got.Params[0].HasDefault || got.Params[0].Default != "world" { + t.Fatalf("param default not preserved: %+v", got.Params[0]) + } +} + +func TestEnvironmentWalkMethodsExcludesStdlib(t *testing.T) { + env := bloblangv2.NewEnvironment() + + if err := env.RegisterMethod("bang", bloblangv2.NewPluginSpec(), func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { return s + "!", nil }), nil + }); err != nil { + t.Fatal(err) + } + + var names []string + env.WalkMethods(func(name string, _ *bloblangv2.MethodView) { + names = append(names, name) + }) + sort.Strings(names) + + // Stdlib methods like length / uppercase / contains must NOT show up; only + // the user-registered "bang" method should be enumerated. 
+ if len(names) != 1 || names[0] != "bang" { + t.Fatalf("walk should yield only user plugins, got %v", names) + } +} + +func TestPluginParamInfoUnmarshalNormalisesNumericDefaults(t *testing.T) { + // JSON has no integer type — encoding/json decodes every number as + // float64. The custom UnmarshalJSON on PluginParamInfo coerces Default + // back to int64 / float64 according to the declared Kind so that + // decoded specs match the Go types of the original registration. + cases := []struct { + raw string + wantKind string + wantType string + wantVal any + }{ + {`{"name":"a","kind":"int64","has_default":true,"default":5}`, "int64", "int64", int64(5)}, + {`{"name":"b","kind":"float64","has_default":true,"default":1.5}`, "float64", "float64", float64(1.5)}, + {`{"name":"c","kind":"string","has_default":true,"default":"hi"}`, "string", "string", "hi"}, + {`{"name":"d","kind":"bool","has_default":true,"default":true}`, "bool", "bool", true}, + } + for _, tc := range cases { + t.Run(tc.wantKind, func(t *testing.T) { + var p bloblangv2.PluginParamInfo + if err := json.Unmarshal([]byte(tc.raw), &p); err != nil { + t.Fatalf("unmarshal: %v", err) + } + if p.Kind != tc.wantKind { + t.Fatalf("kind=%q", p.Kind) + } + if got := fmt.Sprintf("%T", p.Default); got != tc.wantType { + t.Fatalf("default type = %s, want %s (value=%v)", got, tc.wantType, p.Default) + } + if p.Default != tc.wantVal { + t.Fatalf("default value = %v, want %v", p.Default, tc.wantVal) + } + }) + } +} + +func TestNewPluginSpecFromInfoRoundTripsRegistration(t *testing.T) { + // Construct a PluginSpec via the public builders, dump → re-load, and + // confirm we can register the reconstructed spec without losing param + // metadata. This exercises every branch of the reverse builder + // (status, params, kinds, defaults, optional) end-to-end. + original := bloblangv2.NewEmptyEnvironment() + spec := bloblangv2.NewPluginSpec(). + Description("does a thing"). + Category("Tooling"). + Version("0.2.0"). + Beta(). 
+ Param(bloblangv2.NewStringParam("name").Description("the name")). + Param(bloblangv2.NewInt64Param("count").Default(int64(7))). + Param(bloblangv2.NewBoolParam("loud").Optional()) + if err := original.RegisterFunction("things", spec, func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { return nil, nil }, nil + }); err != nil { + t.Fatal(err) + } + + var view *bloblangv2.FunctionView + original.WalkFunctions(func(_ string, v *bloblangv2.FunctionView) { view = v }) + if view == nil { + t.Fatal("no view") + } + + // Dump → re-load. + raw, err := json.Marshal(view.Info()) + if err != nil { + t.Fatalf("marshal: %v", err) + } + var info bloblangv2.PluginInfo + if err := json.Unmarshal(raw, &info); err != nil { + t.Fatalf("unmarshal: %v", err) + } + + // Reconstruct + register on a fresh env. No private-field access is + // involved on the reverse path; if any builder validation rejected + // the rebuilt spec it would surface here. + rebuilt := bloblangv2.NewPluginSpecFromInfo(info) + rebuiltEnv := bloblangv2.NewEmptyEnvironment() + if err := rebuiltEnv.RegisterFunction("things", rebuilt, func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + // Confirm typed defaults are intact post round-trip — this is what + // the per-param custom unmarshal exists to guarantee. 
+ count, err := args.GetInt64("count") + if err != nil { + return nil, err + } + if count != 7 { + return nil, fmt.Errorf("count=%d, expected 7", count) + } + return func() (any, error) { return count, nil }, nil + }); err != nil { + t.Fatalf("register rebuilt: %v", err) + } + + exec, err := rebuiltEnv.Parse(`output = things("hello")`) + if err != nil { + t.Fatalf("parse: %v", err) + } + got, err := exec.Query(nil) + if err != nil { + t.Fatalf("query: %v", err) + } + if got != int64(7) { + t.Fatalf("got %#v, want int64(7)", got) + } +} + +func TestFunctionViewFormatJSON(t *testing.T) { + env := bloblangv2.NewEmptyEnvironment() + spec := bloblangv2.NewPluginSpec(). + Description("desc"). + Param(bloblangv2.NewInt64Param("count").Default(int64(3))) + if err := env.RegisterFunction("counter", spec, func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { return 0, nil }, nil + }); err != nil { + t.Fatal(err) + } + + var view *bloblangv2.FunctionView + env.WalkFunctions(func(_ string, v *bloblangv2.FunctionView) { view = v }) + if view == nil { + t.Fatal("no view") + } + + raw, err := view.FormatJSON() + if err != nil { + t.Fatalf("FormatJSON: %v", err) + } + var got map[string]any + if err := json.Unmarshal(raw, &got); err != nil { + t.Fatalf("unmarshal: %v", err) + } + if got["name"] != "counter" { + t.Fatalf("json name=%v", got["name"]) + } + if got["description"] != "desc" { + t.Fatalf("json description=%v", got["description"]) + } +} From 8619ed54b4787953208f7ad167386372f7cfef19 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 5 May 2026 11:29:58 +0100 Subject: [PATCH 15/20] bloblang(v2): Add public bloblangv2 mapping migrator Adds public/bloblangv2/migrator/, the public-facing wrapper around the internal translator. 
It exposes a stable API for migrating a single Bloblang mapping (or a set of mappings sharing an import graph) from V1 to V2, surfaces structured Change entries (rewrites and flagged SemanticChange divergences), and lets callers register extra rules via the Options surface for plugin-specific rewrites. This is the layer consumed by the upcoming public/service config migrator, and by external tools that just need mapping-level translation without pulling in internal/. --- public/bloblangv2/migrator/change.go | 99 +++++ public/bloblangv2/migrator/context.go | 108 +++++ public/bloblangv2/migrator/doc.go | 43 ++ public/bloblangv2/migrator/end_to_end_test.go | 120 ++++++ public/bloblangv2/migrator/example_test.go | 65 +++ public/bloblangv2/migrator/imports_test.go | 87 ++++ public/bloblangv2/migrator/migrator.go | 164 ++++++++ public/bloblangv2/migrator/migrator_test.go | 322 +++++++++++++++ public/bloblangv2/migrator/options.go | 64 +++ public/bloblangv2/migrator/rule.go | 39 ++ public/bloblangv2/migrator/v1ast.go | 311 ++++++++++++++ public/bloblangv2/migrator/v2ast.go | 382 ++++++++++++++++++ 12 files changed, 1804 insertions(+) create mode 100644 public/bloblangv2/migrator/change.go create mode 100644 public/bloblangv2/migrator/context.go create mode 100644 public/bloblangv2/migrator/doc.go create mode 100644 public/bloblangv2/migrator/end_to_end_test.go create mode 100644 public/bloblangv2/migrator/example_test.go create mode 100644 public/bloblangv2/migrator/imports_test.go create mode 100644 public/bloblangv2/migrator/migrator.go create mode 100644 public/bloblangv2/migrator/migrator_test.go create mode 100644 public/bloblangv2/migrator/options.go create mode 100644 public/bloblangv2/migrator/rule.go create mode 100644 public/bloblangv2/migrator/v1ast.go create mode 100644 public/bloblangv2/migrator/v2ast.go diff --git a/public/bloblangv2/migrator/change.go b/public/bloblangv2/migrator/change.go new file mode 100644 index 000000000..58df0613b --- /dev/null +++ 
b/public/bloblangv2/migrator/change.go @@ -0,0 +1,99 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" + +// Severity classifies a Change record. Info means the rewrite was +// purely cosmetic / mechanical; Warning flags a semantic divergence +// the user should audit; Error signals an Unsupported V1 construct +// that produced no equivalent V2 output. +type Severity = translator.Severity + +// Severity values. +const ( + SeverityInfo = translator.SeverityInfo + SeverityWarning = translator.SeverityWarning + SeverityError = translator.SeverityError +) + +// Category classifies the broad nature of a Change. +type Category = translator.Category + +// Category values. +const ( + CategoryIdiomRewrite = translator.CategoryIdiomRewrite + CategorySemanticChange = translator.CategorySemanticChange + CategoryUnsupported = translator.CategoryUnsupported + CategoryUncertain = translator.CategoryUncertain +) + +// RuleID identifies the translator rule that emitted a Change. +// Built-in rules use the constants exported below; custom rules can +// either reuse them or define their own (any int64 not colliding with +// a built-in is fine — RuleIDs are taxonomy hints, not authoritative). +type RuleID = translator.RuleID + +// Built-in RuleID values useful for custom rules that want to +// classify their own diagnostics under the same taxonomy. 
+const ( + RuleUnknown = translator.RuleUnknown + RuleRootToOutput = translator.RuleRootToOutput + RuleThisToInput = translator.RuleThisToInput + RuleThisTargetToOutput = translator.RuleThisTargetToOutput + RuleBareIdentToInput = translator.RuleBareIdentToInput + RuleBarePathToOutput = translator.RuleBarePathToOutput + RuleMetaTargetToOutputMeta = translator.RuleMetaTargetToOutputMeta + RuleMetaReadToInputMeta = translator.RuleMetaReadToInputMeta + RuleCoalescePrecedence = translator.RuleCoalescePrecedence + RuleAndOrSameLevel = translator.RuleAndOrSameLevel + RuleBoolNumberEquality = translator.RuleBoolNumberEquality + RuleModuloFloatTruncation = translator.RuleModuloFloatTruncation + RuleIntDivReturnsFloat = translator.RuleIntDivReturnsFloat + RuleOrCatchesErrors = translator.RuleOrCatchesErrors + RuleIfNoElseNothing = translator.RuleIfNoElseNothing + RuleMatchSubjectRebinds = translator.RuleMatchSubjectRebinds + RuleNoBracketIndexing = translator.RuleNoBracketIndexing + RuleStringLengthBytes = translator.RuleStringLengthBytes + RuleMethodDoesNotExist = translator.RuleMethodDoesNotExist + RuleNowReturnsString = translator.RuleNowReturnsString + RuleMapDeclTranslation = translator.RuleMapDeclTranslation + RuleImportStatement = translator.RuleImportStatement + RuleFromStatement = translator.RuleFromStatement + RuleUnsupportedConstruct = translator.RuleUnsupportedConstruct + RuleEmittedInvalidV2 = translator.RuleEmittedInvalidV2 + RuleBlockScopedLet = translator.RuleBlockScopedLet +) + +// Change records one translator decision: a rewrite, a semantic +// divergence, an unsupported construct. +type Change = translator.Change + +// Report is the result of a successful Migrate call. +type Report = translator.Report + +// Coverage summarises how successfully a V1 source was translated. +type Coverage = translator.Coverage + +// CoverageError is returned by Migrate when the resulting Coverage.Ratio +// falls below Options.MinCoverage. 
The Report is reachable through the +// error. +type CoverageError = translator.CoverageError + +// Mode classifies the V1 execution context the translated mapping +// will replace. +type Mode = translator.Mode + +// FileResolver lazily resolves a V1 import path during Migrate. See +// Options.FileResolver for semantics. +type FileResolver = translator.FileResolver + +// V2ImportPathRewriter rewrites V1 import path strings to their V2 +// equivalents. See Options.V2ImportPathRewriter. +type V2ImportPathRewriter = translator.V2ImportPathRewriter + +// Mode values. +const ( + ModeMutation = translator.ModeMutation + ModeMapping = translator.ModeMapping +) diff --git a/public/bloblangv2/migrator/context.go b/public/bloblangv2/migrator/context.go new file mode 100644 index 000000000..6d7ec9979 --- /dev/null +++ b/public/bloblangv2/migrator/context.go @@ -0,0 +1,108 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" +) + +// Context is the per-rule handle handed to a custom MethodRule or +// FunctionRule. It exposes the translator helpers a rule needs +// (recursive translation, scope / this-rebind management, source +// position translation, additional diagnostics) plus the three +// Result constructors that drive coverage tracking. +// +// The Context value is only valid for the duration of the rule +// invocation that produced it; storing it for later use is undefined. +type Context struct { + t translator.Translator + defaultV1 v1Position // V1 source position to use when a Result reason omits one +} + +type v1Position struct { + Line int + Column int +} + +// Translate recursively translates a V1 sub-expression into V2. Use +// for the receiver, arguments, or any nested V1 node a rule passes +// through to V2 rather than transforms itself. Returns nil if +// translation cannot proceed (the translator already emitted the +// appropriate diagnostic). 
+func (c *Context) Translate(e V1Expr) V2Expr { + if e == nil { + return nil + } + out := c.t.TranslateExpr(e.unwrapV1()) + if out == nil { + return nil + } + return wrapV2(out) +} + +// PushScope pushes a named-context frame for the duration of a +// translation walk inside the rule. Each name becomes a bound +// identifier (lambda parameter) so V1 bare-ident references resolve +// to the parameter rather than `input.`. Pair every PushScope +// with a corresponding PopScope. +func (c *Context) PushScope(names ...string) { c.t.PushScope(names...) } + +// PopScope removes the innermost scope frame. +func (c *Context) PopScope() { c.t.PopScope() } + +// PushThisRebind makes V1 `this` translate to the given V2 identifier +// while subsequent Translate calls walk inside it. Used when a rule +// synthesizes a V2 lambda whose parameter takes over what V1 `this` +// referred to (e.g. wrapping a query-form predicate). Pair with +// PopThisRebind. +func (c *Context) PushThisRebind(name string) { c.t.PushThisRebind(name) } + +// PopThisRebind removes the innermost this-rebinding. +func (c *Context) PopThisRebind() { c.t.PopThisRebind() } + +// Pos translates a public V1 source position to a public V2 source +// position. Mostly a convenience — the structures match — but routing +// through this method keeps rule code stable if either side's Pos +// representation changes. +func (c *Context) Pos(p Pos) Pos { return p } + +// Note records an additional Change record alongside the rule's +// Result. Coverage counters are not affected; this hook is for +// flagging semantic divergences or extra context the rule wants in +// the Report. Line/Column are filled from the V1 node currently +// being processed when the Change leaves them as zero. 
+func (c *Context) Note(ch Change) { + if ch.Line == 0 { + ch.Line = c.defaultV1.Line + } + if ch.Column == 0 { + ch.Column = c.defaultV1.Column + } + c.t.EmitChange(ch) +} + +// Replace produces a Result that swaps the V1 node for the supplied +// V2 expression. The translator records a Rewritten Change carrying +// the rule's name as part of the explanation. +func (c *Context) Replace(e V2Expr) Result { + if e == nil { + return Result{kind: resultUnsupported, reason: "rule returned a nil replacement"} + } + return Result{kind: resultReplace, expr: e.unwrapV2()} +} + +// Skip produces a Result that falls through to the default 1:1 +// translation. Use this when V1 and V2 forms agree byte-for-byte but +// the rule wanted to attach a reason or guard against a future +// built-in being added under the same name. +func (c *Context) Skip(reason string) Result { + return Result{kind: resultSkip, reason: reason} +} + +// Unsupported produces a Result that flags the V1 construct as +// untranslatable. The translator records an Error-severity +// Unsupported Change and emits a `// MIGRATION:` comment in the V2 +// output where the V1 node sat. +func (c *Context) Unsupported(reason string) Result { + return Result{kind: resultUnsupported, reason: reason} +} diff --git a/public/bloblangv2/migrator/doc.go b/public/bloblangv2/migrator/doc.go new file mode 100644 index 000000000..1cad91bde --- /dev/null +++ b/public/bloblangv2/migrator/doc.go @@ -0,0 +1,43 @@ +// Copyright 2026 Redpanda Data, Inc. + +// Package migrator translates Bloblang V1 mappings into Bloblang V2. +// +// The package wraps the internal translator with a stable public API +// so downstream repositories that ship their own V1 plugins can +// register custom translation rules alongside the bundled built-ins. +// +// # Usage +// +// The simplest case mirrors the original +// internal/bloblang2/migrator/translator.Migrate signature: construct a +// Migrator, hand it a V1 source string, get a Report back. 
+// +// mig := migrator.New() +// report, err := mig.Migrate(v1Source, migrator.Options{}) +// if err != nil { +// return err +// } +// fmt.Println(report.V2Mapping) +// +// # Custom rules +// +// Register a method or function rule keyed by the V1 plugin name. The +// rule receives a Context and a wrapped V1 node and returns a Result +// describing the outcome. Custom rules win on name collision with the +// built-ins. +// +// mig.RegisterMethodRule("widget_encode", +// func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { +// return ctx.Replace(&migrator.V2MethodCallExpr{ +// Receiver: ctx.Translate(m.Receiver), +// Method: "widget_encode_v2", +// }) +// }) +// +// # Stability +// +// Public types and methods follow semantic versioning. The wrapped AST +// shapes intentionally mirror the internal AST 1:1 but evolve +// independently of the internal types — internal refactors never +// reach the public surface. +package migrator diff --git a/public/bloblangv2/migrator/end_to_end_test.go b/public/bloblangv2/migrator/end_to_end_test.go new file mode 100644 index 000000000..cf69affe8 --- /dev/null +++ b/public/bloblangv2/migrator/end_to_end_test.go @@ -0,0 +1,120 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator_test + +import ( + "strings" + "sync" + "testing" + + "github.com/redpanda-data/benthos/v4/public/bloblang" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// TestEndToEndCustomPlugin walks the full lifecycle a downstream +// repository would follow: +// +// 1. Register a fictional V1 plugin with the public bloblang +// environment (mirroring how a real downstream library uses +// RegisterMethodV2). +// 2. Register a matching V2 plugin so V2 mappings can call it. +// 3. Register a custom migrator rule that translates the V1 callsite +// to the V2 callsite. +// 4. 
+ Run a V1 source through the migrator and confirm the output +// compiles in V2 and produces the expected value end-to-end. +// +// The plugins are registered once on the global V1 and V2 +// environments (guarded by endToEndOnce) under test-specific names. +func TestEndToEndCustomPlugin(t *testing.T) { + endToEndOnce.Do(registerEndToEndPlugins) + + mig := migrator.New() + mig.RegisterMethodRule("widget_double", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + if len(m.Args) != 0 { + return ctx.Unsupported("widget_double takes no arguments") + } + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_double_v2", + }) + }) + + const v1Source = `root.doubled = this.value.widget_double()` + rep, err := mig.Migrate(v1Source, migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, ".widget_double_v2()") { + t.Fatalf("expected .widget_double_v2() in V2 output, got:\n%s", rep.V2Mapping) + } + + v1Exec, err := bloblang.NewEnvironment().Parse(v1Source) + if err != nil { + t.Fatalf("v1 compile: %v", err) + } + v1Out, err := v1Exec.Query(map[string]any{"value": int64(3)}) + if err != nil { + t.Fatalf("v1 exec: %v", err) + } + v1Map, ok := v1Out.(map[string]any) + if !ok { + t.Fatalf("expected v1 output to be an object, got %T", v1Out) + } + if v1Map["doubled"] != int64(6) { + t.Fatalf("v1 widget_double(3) expected 6, got %v", v1Map["doubled"]) + } + + v2Exec, err := bloblangv2.GlobalEnvironment().Parse(rep.V2Mapping) + if err != nil { + t.Fatalf("v2 compile (translated):\n%s\nerr: %v", rep.V2Mapping, err) + } + v2Out, err := v2Exec.Query(map[string]any{"value": int64(3)}) + if err != nil { + t.Fatalf("v2 exec: %v", err) + } + v2Map := v2Out.(map[string]any) + if v2Map["doubled"] != int64(6) { + t.Fatalf("v2 widget_double_v2(3) expected 6, got %v", v2Map["doubled"]) + } +} + +var endToEndOnce sync.Once + +func registerEndToEndPlugins() { + // V1: register on the
global bloblang environment. + if err := bloblang.RegisterMethodV2("widget_double", + bloblang.NewPluginSpec().Description("Doubles the receiver integer."), + func(_ *bloblang.ParsedParams) (bloblang.Method, error) { + return func(v any) (any, error) { + switch n := v.(type) { + case int64: + return n * 2, nil + case float64: + return n * 2, nil + } + return nil, nil + }, nil + }, + ); err != nil { + panic("v1 widget_double registration: " + err.Error()) + } + + // V2: register on the global bloblangv2 environment under the new name. + if err := bloblangv2.RegisterMethod("widget_double_v2", + bloblangv2.NewPluginSpec().Description("Doubles the receiver integer."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return func(v any) (any, error) { + switch n := v.(type) { + case int64: + return n * 2, nil + case float64: + return n * 2, nil + } + return nil, nil + }, nil + }, + ); err != nil { + panic("v2 widget_double_v2 registration: " + err.Error()) + } +} diff --git a/public/bloblangv2/migrator/example_test.go b/public/bloblangv2/migrator/example_test.go new file mode 100644 index 000000000..ac264986f --- /dev/null +++ b/public/bloblangv2/migrator/example_test.go @@ -0,0 +1,65 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator_test + +import ( + "fmt" + "strings" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// ExampleMigrator_RegisterMethodRule demonstrates registering a custom +// method-rewrite rule. The fictional `widget_encode` plugin exists in +// V1 and has been ported to V2 under the new name `widget_encode_v2`. +// Downstream registers a rule so the migrator rewrites the V1 +// callsite into the V2 form during translation. +func ExampleMigrator_RegisterMethodRule() { + mig := migrator.New() + mig.RegisterMethodRule("widget_encode", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + // V1 widget_encode took no arguments. V2 keeps that. 
+ if len(m.Args) != 0 { + return ctx.Unsupported("widget_encode takes no arguments") + } + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_encode_v2", + }) + }) + + report, err := mig.Migrate(`root.encoded = this.payload.widget_encode()`, migrator.Options{}) + if err != nil { + fmt.Println("migrate failed:", err) + return + } + fmt.Println(strings.TrimSpace(report.V2Mapping)) + + // Output: + // output.encoded = input?.payload.widget_encode_v2() +} + +// ExampleMigrator_RegisterFunctionRule demonstrates a function-rule +// rewrite. The fictional V1 `widget_size()` function is replaced in +// V2 by an equivalent method on `input`. +func ExampleMigrator_RegisterFunctionRule() { + mig := migrator.New() + mig.RegisterFunctionRule("widget_size", func(ctx *migrator.Context, f *migrator.V1FunctionCall) migrator.Result { + if len(f.Args) != 0 { + return ctx.Unsupported("widget_size takes no arguments") + } + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: &migrator.V2InputExpr{}, + Method: "widget_size", + }) + }) + + report, err := mig.Migrate(`root.size = widget_size()`, migrator.Options{}) + if err != nil { + fmt.Println("migrate failed:", err) + return + } + fmt.Println(strings.TrimSpace(report.V2Mapping)) + + // Output: + // output.size = input.widget_size() +} diff --git a/public/bloblangv2/migrator/imports_test.go b/public/bloblangv2/migrator/imports_test.go new file mode 100644 index 000000000..fa72501cc --- /dev/null +++ b/public/bloblangv2/migrator/imports_test.go @@ -0,0 +1,87 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator_test + +import ( + "path" + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// TestPublicFileResolver — end-to-end on the public surface, confirming +// FileResolver, V2ImportPathRewriter, and Report.V2Files behave the way +// the docstrings say. 
+func TestPublicFileResolver(t *testing.T) { + v1Helpers := `map double { root = this * 2 }` + v1Main := `import "./helpers.blobl" +root.x = 21.apply("double") +` + rep, err := migrator.Migrate(v1Main, migrator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + if importPath == "./helpers.blobl" && parentKey == "" { + return "/abs/helpers.blobl", v1Helpers, true + } + return "", "", false + }, + V2ImportPathRewriter: func(p string) string { + return strings.TrimSuffix(p, ".blobl") + ".v5.blobl" + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, `import "./helpers.v5.blobl"`) { + t.Fatalf("expected rewritten import in V2 source, got:\n%s", rep.V2Mapping) + } + if _, ok := rep.V2Files["/abs/helpers.blobl"]; !ok { + t.Fatalf("expected canonical-keyed V2Files entry, got keys: %v", v2FileKeys(rep.V2Files)) + } +} + +// TestPublicFileResolverTransitive — A imports B, B imports C; the +// resolver maintains parent-relative resolution via parentKey. 
+func TestPublicFileResolverTransitive(t *testing.T) { + files := map[string]string{ + "/abs/a.blobl": `import "./b.blobl" +map a_helper { root = this.b_helper.apply() } +`, + "/abs/b.blobl": `import "./c.blobl" +map b_helper { root = this.c_helper.apply() } +`, + "/abs/c.blobl": `map c_helper { root = this * 3 }`, + } + rep, err := migrator.Migrate(`import "/abs/a.blobl" +root.x = 7.apply("a_helper") +`, migrator.Options{ + MinCoverage: 0, + FileResolver: func(parentKey, importPath string) (string, string, bool) { + var canonical string + if strings.HasPrefix(importPath, "/") { + canonical = importPath + } else { + canonical = path.Join(path.Dir(parentKey), importPath) + } + content, ok := files[canonical] + return canonical, content, ok + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + for _, want := range []string{"/abs/a.blobl", "/abs/b.blobl", "/abs/c.blobl"} { + if _, ok := rep.V2Files[want]; !ok { + t.Fatalf("expected V2Files to contain %q, got: %v", want, v2FileKeys(rep.V2Files)) + } + } +} + +func v2FileKeys(m map[string]string) []string { + out := make([]string, 0, len(m)) + for k := range m { + out = append(out, k) + } + return out +} diff --git a/public/bloblangv2/migrator/migrator.go b/public/bloblangv2/migrator/migrator.go new file mode 100644 index 000000000..44b12bcb6 --- /dev/null +++ b/public/bloblangv2/migrator/migrator.go @@ -0,0 +1,164 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/translator" + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// Migrator translates Bloblang V1 mappings into V2. Construct one with +// New, register any custom rules with RegisterMethodRule / +// RegisterFunctionRule, then call Migrate (any number of times). 
The +// Migrator is not safe for concurrent registration but is safe for +// concurrent Migrate calls once registration is complete. +type Migrator struct { + methodRules map[string]MethodRule + functionRules map[string]FunctionRule +} + +// New creates an empty Migrator with no custom rules registered. The +// built-in V1→V2 rules are always active; custom rules layer on top +// (and shadow built-ins on name collision per design P2). +func New() *Migrator { + return &Migrator{ + methodRules: map[string]MethodRule{}, + functionRules: map[string]FunctionRule{}, + } +} + +// RegisterMethodRule registers a custom translation rule for a V1 +// method named `name`. If a rule is already registered for the same +// name on this Migrator instance the new rule replaces it. +func (m *Migrator) RegisterMethodRule(name string, rule MethodRule) { + m.methodRules[name] = rule +} + +// RegisterFunctionRule registers a custom translation rule for a V1 +// function (top-level call) named `name`. +func (m *Migrator) RegisterFunctionRule(name string, rule FunctionRule) { + m.functionRules[name] = rule +} + +// Migrate translates a V1 source mapping into V2. Per-call config — +// verbosity, coverage threshold, mode — is supplied via opts. The +// instance-owned rule registry is wired into the underlying +// translator for the duration of the call. +// +// Returns a *Report on success. Returns *CoverageError when the +// computed coverage falls below opts.MinCoverage; the Report is +// reachable via the error. 
+func (m *Migrator) Migrate(v1Source string, opts Options) (*Report, error) { + internalOpts := translator.Options{ + MinCoverage: opts.MinCoverage, + Verbose: opts.Verbose, + TreatWarningsAsErrors: opts.TreatWarningsAsErrors, + Files: opts.Files, + FileResolver: opts.FileResolver, + V2ImportPathRewriter: opts.V2ImportPathRewriter, + Mode: opts.Mode, + } + if len(m.methodRules) > 0 { + internalOpts.CustomMethodRules = make(map[string]translator.MethodRuleHook, len(m.methodRules)) + for name, rule := range m.methodRules { + internalOpts.CustomMethodRules[name] = m.bridgeMethodRule(name, rule) + } + } + if len(m.functionRules) > 0 { + internalOpts.CustomFunctionRules = make(map[string]translator.FunctionRuleHook, len(m.functionRules)) + for name, rule := range m.functionRules { + internalOpts.CustomFunctionRules[name] = m.bridgeFunctionRule(name, rule) + } + } + return translator.Migrate(v1Source, internalOpts) +} + +// bridgeMethodRule wraps a public MethodRule into the internal hook +// signature. Closure marshals V1 nodes from internal to public, runs +// the user rule, and translates the Result back to the (out, +// handled) tuple the internal translator expects. 
+func (m *Migrator) bridgeMethodRule(name string, rule MethodRule) translator.MethodRuleHook { + return func(t translator.Translator, mc *v1ast.MethodCall, _ syntax.Expr) (syntax.Expr, bool) { + ctx := &Context{t: t, defaultV1: v1Position{Line: mc.NamePos.Line, Column: mc.NamePos.Column}} + res := rule(ctx, wrapV1MethodCall(mc)) + return resolveResult(t, mc.NamePos, "."+name+"()", res) + } +} + +func (m *Migrator) bridgeFunctionRule(name string, rule FunctionRule) translator.FunctionRuleHook { + return func(t translator.Translator, fc *v1ast.FunctionCall) (syntax.Expr, bool) { + ctx := &Context{t: t, defaultV1: v1Position{Line: fc.NamePos.Line, Column: fc.NamePos.Column}} + res := rule(ctx, wrapV1FunctionCall(fc)) + return resolveResult(t, fc.NamePos, name+"()", res) + } +} + +// resolveResult interprets a public Result inside the internal hook +// contract. The recorder is updated through the public Translator +// interface so a custom rule's outcome moves coverage counters the +// same way a built-in rule's outcome does. +// +// - Replace bumps Rewritten with a Change naming the V1 callsite, +// and the supplied V2 expression replaces the V1 node. +// - Unsupported bumps Unsupported with an Error-severity Change +// carrying the rule's reason. The translator falls through to +// its default 1:1 translation so the V2 source still parses +// (the user is alerted via the Report). +// - Skip falls through silently — the default 1:1 translation +// fires and counts itself as Exact. The rule's reason, if any, +// is logged as an Info Note. 
+func resolveResult(t translator.Translator, p v1ast.Pos, callsite string, res Result) (syntax.Expr, bool) { + switch res.kind { + case resultReplace: + t.RecordRewritten(translator.Change{ + Line: p.Line, + Column: p.Column, + Severity: translator.SeverityInfo, + Category: translator.CategoryIdiomRewrite, + RuleID: translator.RuleMethodDoesNotExist, + Explanation: "custom migrator rule rewrote V1 " + callsite, + }) + return res.expr, true + case resultUnsupported: + t.RecordUnsupported(translator.Change{ + Line: p.Line, + Column: p.Column, + RuleID: translator.RuleUnsupportedConstruct, + Explanation: "custom migrator rule could not translate V1 " + callsite + ": " + res.reason, + }) + return nil, true + case resultSkip: + if res.reason != "" { + t.EmitChange(translator.Change{ + Line: p.Line, + Column: p.Column, + Severity: translator.SeverityInfo, + Category: translator.CategoryIdiomRewrite, + Explanation: "custom migrator rule deferred V1 " + callsite + " to default translation: " + res.reason, + }) + } + return nil, false + default: + // resultUnset — defensive; treat as skip-without-reason. + return nil, false + } +} + +// Migrate is a package-level convenience that builds a default +// Migrator (no custom rules) and runs it against the supplied source. +// Equivalent to `New().Migrate(src, opts)`. Use this when you only +// need built-in rules; create your own Migrator when you have rules +// to register. +func Migrate(v1Source string, opts Options) (*Report, error) { + return New().Migrate(v1Source, opts) +} + +// IsFromOnly reports whether v1Source consists of a single +// `from "path"` statement and returns the path string if so. Useful +// for callers that want to special-case from-only sources before +// invoking Migrate (e.g. to rewrite a processor config to a +// file-backed form). 
+func IsFromOnly(v1Source string) (string, bool) { + return translator.IsFromOnly(v1Source) +} diff --git a/public/bloblangv2/migrator/migrator_test.go b/public/bloblangv2/migrator/migrator_test.go new file mode 100644 index 000000000..43c910141 --- /dev/null +++ b/public/bloblangv2/migrator/migrator_test.go @@ -0,0 +1,322 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator_test + +import ( + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// TestDefaultMigrate exercises the package-level Migrate helper to +// confirm the public API behaves the same as the internal translator +// when no custom rules are registered. +func TestDefaultMigrate(t *testing.T) { + rep, err := migrator.Migrate(`root.x = this.y`, migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, "output.x") { + t.Fatalf("expected output.x in V2 mapping, got:\n%s", rep.V2Mapping) + } + if !strings.Contains(rep.V2Mapping, "input") { + t.Fatalf("expected input rewrite in V2 mapping, got:\n%s", rep.V2Mapping) + } +} + +// TestRegisterMethodRuleReplace registers a custom rule that maps a +// fictional V1 method `widget_encode` onto a V2 `widget_encode_v2` +// method call. The fictional plugin has no V1 stdlib counterpart — +// the rule fires solely because the migrator dispatched to a +// registered custom rule. 
+func TestRegisterMethodRuleReplace(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget_encode", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + if len(m.Args) != 0 { + return ctx.Unsupported("widget_encode takes no arguments in V1") + } + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_encode_v2", + }) + }) + + rep, err := mig.Migrate(`root.encoded = this.payload.widget_encode()`, migrator.Options{Verbose: true}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, ".widget_encode_v2()") { + t.Fatalf("expected .widget_encode_v2() in V2 mapping, got:\n%s", rep.V2Mapping) + } + if strings.Contains(rep.V2Mapping, "widget_encode(") { + t.Fatalf("V1 method name leaked into V2 mapping:\n%s", rep.V2Mapping) + } +} + +// TestRegisterMethodRuleArgs exercises argument inspection: the rule +// pulls a literal, recursively translates a non-literal arg via +// ctx.Translate, and constructs a V2 method call with both. +func TestRegisterMethodRuleArgs(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget_pack", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + if len(m.Args) != 2 { + return ctx.Unsupported("widget_pack expects two arguments") + } + // First arg should be a string literal naming the format. + fmtLit, ok := m.Args[0].Value.(*migrator.V1Literal) + if !ok || fmtLit.Kind != migrator.V1LitString { + return ctx.Unsupported("widget_pack: first argument must be a string literal") + } + // Second arg is opaque; recurse via Translate. 
+ payload := ctx.Translate(m.Args[1].Value) + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_pack_v2", + Args: []migrator.V2CallArg{ + {Value: &migrator.V2LiteralExpr{Kind: migrator.V2LitString, Str: fmtLit.Str}}, + {Value: payload}, + }, + }) + }) + + rep, err := mig.Migrate(`root.x = this.handle.widget_pack("brotli", this.payload)`, migrator.Options{Verbose: true}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, "widget_pack_v2") { + t.Fatalf("expected widget_pack_v2 in output:\n%s", rep.V2Mapping) + } + if !strings.Contains(rep.V2Mapping, `"brotli"`) { + t.Fatalf("expected literal arg preserved in output:\n%s", rep.V2Mapping) + } +} + +// TestRegisterMethodRuleSkip exercises the Skip return path: the rule +// declines to transform on a shape it doesn't recognise, and the +// translator falls through to its default 1:1 translation. +func TestRegisterMethodRuleSkip(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget_passthrough", func(ctx *migrator.Context, _ *migrator.V1MethodCall) migrator.Result { + return ctx.Skip("widget_passthrough is identity in both versions") + }) + + rep, err := mig.Migrate(`root.x = this.handle.widget_passthrough()`, migrator.Options{Verbose: true}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + // Default translation preserves the method name. + if !strings.Contains(rep.V2Mapping, ".widget_passthrough()") { + t.Fatalf("expected method name preserved by default translation:\n%s", rep.V2Mapping) + } +} + +// TestRegisterMethodRuleUnsupported asserts the Unsupported path +// records an Error-severity Change. 
+func TestRegisterMethodRuleUnsupported(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget_dynamic", func(ctx *migrator.Context, _ *migrator.V1MethodCall) migrator.Result { + return ctx.Unsupported("dynamic dispatch has no V2 equivalent") + }) + + rep, err := mig.Migrate(`root.x = this.h.widget_dynamic()`, migrator.Options{Verbose: true, MinCoverage: 0.0001}) + // The Unsupported counts against coverage; below the default + // threshold we'd see CoverageError. We set MinCoverage near zero + // to surface the Report directly. + if err != nil { + t.Fatalf("migrate: %v", err) + } + var sawUnsupported bool + for _, ch := range rep.Changes { + if ch.Severity == migrator.SeverityError && strings.Contains(ch.Explanation, "dynamic dispatch") { + sawUnsupported = true + break + } + } + if !sawUnsupported { + t.Fatalf("expected an Error-severity Unsupported change with the rule's reason; got changes:\n%v", rep.Changes) + } +} + +// TestRegisterMethodRuleOverride registers a rule for a name that has +// a built-in translation (without). The custom rule must win, +// confirming the design P2 precedence model. +func TestRegisterMethodRuleOverride(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("without", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + // Custom override: rewrite into a fictional .strip() method. 
+ return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "strip", + }) + }) + rep, err := mig.Migrate(`root = this.without("a", "b")`, migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, ".strip()") { + t.Fatalf("custom rule should win over built-in .without rewrite, got:\n%s", rep.V2Mapping) + } + if strings.Contains(rep.V2Mapping, ".without(") { + t.Fatalf("built-in .without rewrite leaked despite custom override:\n%s", rep.V2Mapping) + } +} + +// TestRegisterFunctionRule exercises the function-rule path with a +// fictional V1 function that V2 turns into a method call on `input`. +func TestRegisterFunctionRule(t *testing.T) { + mig := migrator.New() + mig.RegisterFunctionRule("widget_size", func(ctx *migrator.Context, f *migrator.V1FunctionCall) migrator.Result { + if len(f.Args) != 0 { + return ctx.Unsupported("widget_size takes no arguments") + } + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: &migrator.V2InputExpr{}, + Method: "widget_size", + }) + }) + rep, err := mig.Migrate(`root.size = widget_size()`, migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, "input.widget_size()") { + t.Fatalf("expected input.widget_size() in output:\n%s", rep.V2Mapping) + } +} + +// TestNoteEmitsExtraDiagnostic ensures ctx.Note records a Change +// alongside the Result without breaking coverage. 
+func TestNoteEmitsExtraDiagnostic(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget_encode", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + ctx.Note(migrator.Change{ + Severity: migrator.SeverityWarning, + Category: migrator.CategorySemanticChange, + Explanation: "widget_encode now defaults to UTF-8 in V2", + }) + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_encode_v2", + }) + }) + rep, err := mig.Migrate(`root.x = this.h.widget_encode()`, migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + var sawNote bool + for _, ch := range rep.Changes { + if ch.Severity == migrator.SeverityWarning && strings.Contains(ch.Explanation, "UTF-8") { + sawNote = true + break + } + } + if !sawNote { + t.Fatalf("expected the rule's Note in the change list, got:\n%v", rep.Changes) + } +} + +// TestPushScopeAndThisRebind walks the translator scope APIs through +// a synthetic lambda body the rule constructs from scratch. +func TestPushScopeAndThisRebind(t *testing.T) { + mig := migrator.New() + // Fake V1 method: turn `recv.where(predicate)` into a V2 + // `recv.find_by(__v -> predicate)` lambda where the predicate + // has its `this` references rebound to __v. 
+ mig.RegisterMethodRule("where", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + if len(m.Args) != 1 { + return ctx.Unsupported("where: need exactly one predicate argument") + } + const param = "__v" + ctx.PushScope(param) + ctx.PushThisRebind(param) + body := ctx.Translate(m.Args[0].Value) + ctx.PopThisRebind() + ctx.PopScope() + if body == nil { + return ctx.Unsupported("where: predicate failed to translate") + } + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "find_by", + Args: []migrator.V2CallArg{{ + Value: &migrator.V2LambdaExpr{ + Params: []migrator.V2LambdaParam{{Name: param}}, + Body: body, + }, + }}, + }) + }) + rep, err := mig.Migrate(`root.match = this.items.where(this.id == 5)`, migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.V2Mapping, ".find_by(__v ->") { + t.Fatalf("expected synthesized lambda in output:\n%s", rep.V2Mapping) + } + if !strings.Contains(rep.V2Mapping, "__v") || !strings.Contains(rep.V2Mapping, "id == 5") { + t.Fatalf("rebinding did not replace `this` in predicate body:\n%s", rep.V2Mapping) + } +} + +// TestCoverageReflectsReplace asserts a custom Replace bumps the +// Rewritten counter so coverage stats stay honest. Without proper +// recorder hooks the counter would silently stay at zero and the +// coverage gate would lie. 
+func TestCoverageReflectsReplace(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget_encode", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_encode_v2", + }) + }) + rep, err := mig.Migrate(`root.x = this.h.widget_encode()`, migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if rep.Coverage.Rewritten == 0 { + t.Fatalf("expected Rewritten counter to bump on a Replace; got coverage %+v", rep.Coverage) + } +} + +// TestCoverageReflectsUnsupported asserts a custom Unsupported bumps +// the Unsupported counter (mirroring how built-in Unsupported flows +// through recorder.Unsupported). +func TestCoverageReflectsUnsupported(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget_dynamic", func(ctx *migrator.Context, _ *migrator.V1MethodCall) migrator.Result { + return ctx.Unsupported("dynamic dispatch has no V2 equivalent") + }) + rep, err := mig.Migrate(`root.x = this.h.widget_dynamic()`, migrator.Options{MinCoverage: 0.0001}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if rep.Coverage.Unsupported == 0 { + t.Fatalf("expected Unsupported counter to bump on an Unsupported result; got coverage %+v", rep.Coverage) + } +} + +// TestMigrateConcurrentSafe asserts the Migrator is safe for +// concurrent Migrate calls once registration completes. 
+func TestMigrateConcurrentSafe(t *testing.T) { + mig := migrator.New() + mig.RegisterMethodRule("widget", func(ctx *migrator.Context, m *migrator.V1MethodCall) migrator.Result { + return ctx.Replace(&migrator.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_v2", + }) + }) + const goroutines = 8 + done := make(chan error, goroutines) + for i := 0; i < goroutines; i++ { + go func() { + _, err := mig.Migrate(`root.x = this.h.widget()`, migrator.Options{}) + done <- err + }() + } + for i := 0; i < goroutines; i++ { + if err := <-done; err != nil { + t.Fatalf("concurrent migrate failed: %v", err) + } + } +} diff --git a/public/bloblangv2/migrator/options.go b/public/bloblangv2/migrator/options.go new file mode 100644 index 000000000..85b7cde35 --- /dev/null +++ b/public/bloblangv2/migrator/options.go @@ -0,0 +1,64 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +// Options controls a single Migrate call. Per-instance configuration +// (registered rules) lives on the Migrator; per-call configuration +// (verbosity, coverage threshold, mode) lives here. +type Options struct { + // MinCoverage is the minimum Coverage.Ratio required before + // Migrate returns successfully. If the computed ratio falls below + // this value Migrate returns *CoverageError. Default 0.75. + MinCoverage float64 + + // Verbose emits Info-severity Changes. Without it, only Warning + // and Error Changes are recorded. + Verbose bool + + // TreatWarningsAsErrors promotes Warning-severity Changes to + // Error. Useful for CI gates. + TreatWarningsAsErrors bool + + // Files is a virtual filesystem for `import` resolution. Keys are + // treated as canonical identifiers for the imported files: an + // entry keyed "helpers.blobl" satisfies any V1 import statement + // whose path string equals "helpers.blobl". Pre-populated entries + // take precedence over FileResolver. 
+ Files map[string]string + + // FileResolver, when set, lazily resolves V1 imports during + // Migrate. The migrator walks the closure of imports starting + // from the main source and any transitively imported files, + // calling the resolver for each unique import path it encounters. + // + // parentKey is the canonical key of the file the import appears + // in (empty for imports in the main V1 source). importPath is + // the path string as written in the import statement. The + // returned canonicalKey identifies the resolved file for + // de-duplication and Report.V2Files emission — two import + // statements that resolve to the same canonicalKey are + // translated once. + // + // Returning ok=false records an Unsupported import at the import + // site and continues with the rest of the migration. + // + // Pre-populated Files take precedence: if Files contains + // importPath as a key, the resolver is not consulted and + // importPath itself is treated as the canonical key. + FileResolver FileResolver + + // V2ImportPathRewriter, when set, rewrites V1 import path strings + // to their V2 equivalents in the emitted V2 source. Default: + // identity. Useful for callers that emit V2-translated files at + // sibling paths (e.g. "helpers.blobl" -> "helpers.v5.blobl"). + // Operates on the verbatim path string from the V1 source so + // locality is preserved (relative imports stay relative). + V2ImportPathRewriter V2ImportPathRewriter + + // Mode selects how the V1 mapping's implicit root is treated. + // Default (zero value) is ModeMutation — V2's `output` semantics + // align with V1's `mutation` processor. Use ModeMapping when + // translating mappings authored for V1's `mapping` processor (the + // translator prepends `output = input`). 
+ Mode Mode +} diff --git a/public/bloblangv2/migrator/rule.go b/public/bloblangv2/migrator/rule.go new file mode 100644 index 000000000..fc3889080 --- /dev/null +++ b/public/bloblangv2/migrator/rule.go @@ -0,0 +1,39 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" + +// MethodRule is the callback shape for a custom V1 method-call +// translation rule. Rules are registered with +// Migrator.RegisterMethodRule, keyed by the V1 method name. The +// callback receives a Context (helpers + Result constructors) and +// the wrapped V1 method-call node, and returns a Result describing +// the outcome. +// +// Custom rules win on name collision with the built-ins (the +// downstream rule fully replaces the built-in for that name). +type MethodRule func(ctx *Context, m *V1MethodCall) Result + +// FunctionRule is the function-call analogue of MethodRule. +type FunctionRule func(ctx *Context, f *V1FunctionCall) Result + +// resultKind is the discriminant for Result. +type resultKind int + +const ( + resultUnset resultKind = iota + resultReplace + resultSkip + resultUnsupported +) + +// Result is the outcome of a rule. Construct via Context.Replace, +// Context.Skip, or Context.Unsupported — the zero value is invalid. +type Result struct { + kind resultKind + // expr is the V2 expression for resultReplace. + expr syntax.Expr + // reason carries the explanation for resultSkip / resultUnsupported. + reason string +} diff --git a/public/bloblangv2/migrator/v1ast.go b/public/bloblangv2/migrator/v1ast.go new file mode 100644 index 000000000..b293a267b --- /dev/null +++ b/public/bloblangv2/migrator/v1ast.go @@ -0,0 +1,311 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + "github.com/redpanda-data/benthos/v4/internal/bloblang2/migrator/v1ast" +) + +// V1Expr is the public-API marker interface for a Bloblang V1 +// expression node. 
The concrete shapes a custom rule will commonly +// inspect (V1MethodCall, V1FunctionCall, V1Lambda, V1Literal, +// V1Ident, V1ArrayLit, V1ObjectLit, V1ThisExpr, V1RootExpr, +// V1VarRef, V1FieldAccess, V1BinaryExpr) all satisfy it; less common +// shapes are exposed as an opaque carrier that is still translatable +// via Context.Translate. +// +// Implementations are migrator-internal only: the unexported method +// prevents external types from satisfying the interface, so the +// translator can rely on every V1Expr round-tripping through unwrap. +type V1Expr interface { + unwrapV1() v1ast.Expr +} + +// Pos is a public source position. Mirrors the internal v1ast.Pos / +// syntax.Pos pair (both use the same shape). +type Pos struct { + Line int + Column int +} + +func wrapPos(p v1ast.Pos) Pos { return Pos{Line: p.Line, Column: p.Column} } + +// V1CallArg is one argument to a V1 method or function call. +type V1CallArg struct { + // Name is empty for positional arguments. + Name string + Value V1Expr + Pos Pos +} + +// V1MethodCall is `recv.name(args)`. +type V1MethodCall struct { + Receiver V1Expr + Name string + NamePos Pos + Args []V1CallArg + // Named is true when every argument is named (name: value). + Named bool + + inner *v1ast.MethodCall +} + +func (m *V1MethodCall) unwrapV1() v1ast.Expr { return m.inner } + +// V1FunctionCall is a top-level `name(args)`. +type V1FunctionCall struct { + Name string + NamePos Pos + Args []V1CallArg + Named bool + + inner *v1ast.FunctionCall +} + +func (f *V1FunctionCall) unwrapV1() v1ast.Expr { return f.inner } + +// V1Lambda is `name -> body` or `_ -> body`. +type V1Lambda struct { + Param string + Discard bool + Body V1Expr + Pos Pos + + inner *v1ast.Lambda +} + +func (l *V1Lambda) unwrapV1() v1ast.Expr { return l.inner } + +// V1LiteralKind classifies V1 literal nodes. +type V1LiteralKind int + +// V1LiteralKind values mirror v1ast.LiteralKind. 
+const ( + V1LitNull V1LiteralKind = iota + V1LitBool + V1LitInt + V1LitFloat + V1LitString + V1LitRawString +) + +// V1Literal represents null, true/false, integers, floats, or strings. +type V1Literal struct { + Kind V1LiteralKind + Raw string + Str string + Bool bool + Int int64 + Float float64 + Pos Pos + + inner *v1ast.Literal +} + +func (l *V1Literal) unwrapV1() v1ast.Expr { return l.inner } + +// V1Ident is a bare identifier at expression position (V1's legacy +// `foo` ≡ `this.foo` form). The translator's default rewrite turns +// these into V2 `input.foo`; a custom rule can opt out and emit +// something else. +type V1Ident struct { + Name string + Pos Pos + + inner *v1ast.Ident +} + +func (i *V1Ident) unwrapV1() v1ast.Expr { return i.inner } + +// V1ThisExpr is the literal `this` keyword. +type V1ThisExpr struct { + Pos Pos + + inner *v1ast.ThisExpr +} + +func (t *V1ThisExpr) unwrapV1() v1ast.Expr { return t.inner } + +// V1RootExpr is the literal `root` keyword at expression position. +type V1RootExpr struct { + Pos Pos + + inner *v1ast.RootExpr +} + +func (r *V1RootExpr) unwrapV1() v1ast.Expr { return r.inner } + +// V1VarRef is `$name`. +type V1VarRef struct { + Name string + Pos Pos + + inner *v1ast.VarRef +} + +func (v *V1VarRef) unwrapV1() v1ast.Expr { return v.inner } + +// V1ArrayLit is `[...]`. +type V1ArrayLit struct { + Elems []V1Expr + Pos Pos + + inner *v1ast.ArrayLit +} + +func (a *V1ArrayLit) unwrapV1() v1ast.Expr { return a.inner } + +// V1ObjectEntry is one `key: value` member. +type V1ObjectEntry struct { + Key V1Expr + Value V1Expr +} + +// V1ObjectLit is `{...}`. +type V1ObjectLit struct { + Entries []V1ObjectEntry + Pos Pos + + inner *v1ast.ObjectLit +} + +func (o *V1ObjectLit) unwrapV1() v1ast.Expr { return o.inner } + +// V1FieldAccess is `recv.field`. +type V1FieldAccess struct { + Receiver V1Expr + Field string + // Quoted reports whether the path segment was a quoted string literal. 
+ Quoted bool + Pos Pos + + inner *v1ast.FieldAccess +} + +func (f *V1FieldAccess) unwrapV1() v1ast.Expr { return f.inner } + +// V1BinaryExpr is a binary-operator expression. Op is the original +// V1 operator token text (e.g. "+", "==", "&&"). +type V1BinaryExpr struct { + Op string + Left V1Expr + Right V1Expr + Pos Pos + + inner *v1ast.BinaryExpr +} + +func (b *V1BinaryExpr) unwrapV1() v1ast.Expr { return b.inner } + +// v1Opaque is the catch-all wrapper for V1 expression shapes that +// don't have a concrete public type. A rule cannot pattern-match on +// these but can pass them to Context.Translate to get the V2 form. +type v1Opaque struct { + inner v1ast.Expr +} + +func (o *v1Opaque) unwrapV1() v1ast.Expr { return o.inner } + +// wrapV1 wraps an internal v1ast.Expr into the public V1Expr surface. +// Recursion is eager: the receiver / args / body of a wrapped node +// are themselves wrapped so a rule can switch on their shape. +func wrapV1(e v1ast.Expr) V1Expr { + if e == nil { + return nil + } + switch n := e.(type) { + case *v1ast.MethodCall: + return wrapV1MethodCall(n) + case *v1ast.FunctionCall: + return wrapV1FunctionCall(n) + case *v1ast.Lambda: + return wrapV1Lambda(n) + case *v1ast.Literal: + return &V1Literal{ + Kind: V1LiteralKind(n.Kind), + Raw: n.Raw, + Str: n.Str, + Bool: n.Bool, + Int: n.Int, + Float: n.Float, + Pos: wrapPos(n.TokPos), + inner: n, + } + case *v1ast.Ident: + return &V1Ident{Name: n.Name, Pos: wrapPos(n.TokPos), inner: n} + case *v1ast.ThisExpr: + return &V1ThisExpr{Pos: wrapPos(n.TokPos), inner: n} + case *v1ast.RootExpr: + return &V1RootExpr{Pos: wrapPos(n.TokPos), inner: n} + case *v1ast.VarRef: + return &V1VarRef{Name: n.Name, Pos: wrapPos(n.TokPos), inner: n} + case *v1ast.ArrayLit: + elems := make([]V1Expr, len(n.Elems)) + for i, e := range n.Elems { + elems[i] = wrapV1(e) + } + return &V1ArrayLit{Elems: elems, Pos: wrapPos(n.TokPos), inner: n} + case *v1ast.ObjectLit: + entries := make([]V1ObjectEntry, len(n.Entries)) + 
for i, e := range n.Entries { + entries[i] = V1ObjectEntry{Key: wrapV1(e.Key), Value: wrapV1(e.Value)} + } + return &V1ObjectLit{Entries: entries, Pos: wrapPos(n.TokPos), inner: n} + case *v1ast.FieldAccess: + return &V1FieldAccess{ + Receiver: wrapV1(n.Recv), + Field: n.Seg.Name, + Quoted: n.Seg.Quoted, + Pos: wrapPos(n.Seg.Pos), + inner: n, + } + case *v1ast.BinaryExpr: + return &V1BinaryExpr{ + Op: n.Op.String(), + Left: wrapV1(n.Left), + Right: wrapV1(n.Right), + Pos: wrapPos(n.OpPos), + inner: n, + } + } + return &v1Opaque{inner: e} +} + +func wrapV1MethodCall(m *v1ast.MethodCall) *V1MethodCall { + args := make([]V1CallArg, len(m.Args)) + for i, a := range m.Args { + args[i] = V1CallArg{Name: a.Name, Value: wrapV1(a.Value), Pos: wrapPos(a.Pos)} + } + return &V1MethodCall{ + Receiver: wrapV1(m.Recv), + Name: m.Name, + NamePos: wrapPos(m.NamePos), + Args: args, + Named: m.Named, + inner: m, + } +} + +func wrapV1FunctionCall(f *v1ast.FunctionCall) *V1FunctionCall { + args := make([]V1CallArg, len(f.Args)) + for i, a := range f.Args { + args[i] = V1CallArg{Name: a.Name, Value: wrapV1(a.Value), Pos: wrapPos(a.Pos)} + } + return &V1FunctionCall{ + Name: f.Name, + NamePos: wrapPos(f.NamePos), + Args: args, + Named: f.Named, + inner: f, + } +} + +func wrapV1Lambda(l *v1ast.Lambda) *V1Lambda { + return &V1Lambda{ + Param: l.Param, + Discard: l.Discard, + Body: wrapV1(l.Body), + Pos: wrapPos(l.ParamPos), + inner: l, + } +} diff --git a/public/bloblangv2/migrator/v2ast.go b/public/bloblangv2/migrator/v2ast.go new file mode 100644 index 000000000..d6e564174 --- /dev/null +++ b/public/bloblangv2/migrator/v2ast.go @@ -0,0 +1,382 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + "strconv" + + "github.com/redpanda-data/benthos/v4/internal/bloblang2/go/pratt/syntax" +) + +// V2Expr is the public-API marker interface for a Bloblang V2 +// expression node constructed by a custom migration rule. 
The +// concrete shapes a rule will commonly construct (V2MethodCallExpr, +// V2CallExpr, V2LambdaExpr, V2FieldAccessExpr, V2IndexExpr, +// V2LiteralExpr, V2IdentExpr, V2VarExpr, V2InputExpr, V2OutputExpr, +// V2InputMetaExpr, V2OutputMetaExpr, V2ArrayLiteral, V2ObjectLiteral, +// V2BinaryExpr) all satisfy it. Less common shapes are returned by +// Context.Translate as opaque carriers that round-trip back through +// the internal translator. +// +// Implementations are migrator-internal — the unexported method +// prevents downstream code from satisfying the interface with custom +// types. +type V2Expr interface { + unwrapV2() syntax.Expr +} + +// V2CallArg is one argument in a V2 method or function call. +type V2CallArg struct { + Name string // empty for positional + Value V2Expr +} + +func (a V2CallArg) toInternal() syntax.CallArg { + return syntax.CallArg{Name: a.Name, Value: unwrapV2OrNil(a.Value)} +} + +// V2MethodCallExpr is a V2 method call (`receiver.method(args)`). +type V2MethodCallExpr struct { + Receiver V2Expr + Method string + Args []V2CallArg + Named bool + NullSafe bool + // Pos is optional; when zero the translator fills in from the V1 + // node currently being processed. + Pos Pos +} + +func (m *V2MethodCallExpr) unwrapV2() syntax.Expr { + args := make([]syntax.CallArg, len(m.Args)) + for i, a := range m.Args { + args[i] = a.toInternal() + } + return &syntax.MethodCallExpr{ + Receiver: unwrapV2OrNil(m.Receiver), + Method: m.Method, + MethodPos: unwrapPosOrZero(m.Pos), + Args: args, + Named: m.Named, + NullSafe: m.NullSafe, + } +} + +// V2CallExpr is a V2 function call (`name(args)`) or namespaced call +// (`namespace::name(args)`). 
+type V2CallExpr struct { + Name string + Namespace string + Args []V2CallArg + Named bool + Pos Pos +} + +func (c *V2CallExpr) unwrapV2() syntax.Expr { + args := make([]syntax.CallArg, len(c.Args)) + for i, a := range c.Args { + args[i] = a.toInternal() + } + return &syntax.CallExpr{ + TokenPos: unwrapPosOrZero(c.Pos), + Name: c.Name, + Namespace: c.Namespace, + Args: args, + Named: c.Named, + } +} + +// V2LambdaParam is one parameter of a V2 lambda. +type V2LambdaParam struct { + Name string + Discard bool + Pos Pos +} + +// V2LambdaExpr is a V2 lambda expression (`(params) -> body` or +// `name -> body`). +type V2LambdaExpr struct { + Params []V2LambdaParam + Body V2Expr + Pos Pos +} + +func (l *V2LambdaExpr) unwrapV2() syntax.Expr { + params := make([]syntax.Param, len(l.Params)) + for i, p := range l.Params { + params[i] = syntax.Param{ + Name: p.Name, + Discard: p.Discard, + Pos: unwrapPosOrZero(p.Pos), + SlotIndex: -1, + } + } + return &syntax.LambdaExpr{ + TokenPos: unwrapPosOrZero(l.Pos), + Params: params, + Body: &syntax.ExprBody{Result: unwrapV2OrNil(l.Body)}, + } +} + +// V2FieldAccessExpr is a V2 field access (`receiver.field` or +// `receiver?.field`). +type V2FieldAccessExpr struct { + Receiver V2Expr + Field string + NullSafe bool + Pos Pos +} + +func (f *V2FieldAccessExpr) unwrapV2() syntax.Expr { + return &syntax.FieldAccessExpr{ + Receiver: unwrapV2OrNil(f.Receiver), + Field: f.Field, + FieldPos: unwrapPosOrZero(f.Pos), + NullSafe: f.NullSafe, + } +} + +// V2IndexExpr is a V2 index access (`receiver[index]` or +// `receiver?[index]`). +type V2IndexExpr struct { + Receiver V2Expr + Index V2Expr + NullSafe bool + Pos Pos +} + +func (i *V2IndexExpr) unwrapV2() syntax.Expr { + return &syntax.IndexExpr{ + Receiver: unwrapV2OrNil(i.Receiver), + Index: unwrapV2OrNil(i.Index), + LBracketPos: unwrapPosOrZero(i.Pos), + NullSafe: i.NullSafe, + } +} + +// V2LiteralKind classifies V2 literal nodes. 
Mirrors the subset of +// syntax.TokenType that may appear in a literal expression. +type V2LiteralKind int + +// V2LiteralKind values. +const ( + V2LitNull V2LiteralKind = iota + V2LitBool + V2LitInt + V2LitFloat + V2LitString + V2LitRawString +) + +// V2LiteralExpr is a V2 literal value. Only one of the typed fields +// is meaningful per Kind; the translator picks the right one. +type V2LiteralExpr struct { + Kind V2LiteralKind + Bool bool + Int int64 + Float float64 + Str string + Pos Pos +} + +func (l *V2LiteralExpr) unwrapV2() syntax.Expr { + out := &syntax.LiteralExpr{TokenPos: unwrapPosOrZero(l.Pos)} + switch l.Kind { + case V2LitNull: + out.TokenType = syntax.NULL + out.Value = "null" + case V2LitBool: + if l.Bool { + out.TokenType = syntax.TRUE + out.Value = "true" + } else { + out.TokenType = syntax.FALSE + out.Value = "false" + } + case V2LitInt: + out.TokenType = syntax.INT + out.Value = strconv.FormatInt(l.Int, 10) + case V2LitFloat: + out.TokenType = syntax.FLOAT + out.Value = strconv.FormatFloat(l.Float, 'g', -1, 64) + case V2LitString: + out.TokenType = syntax.STRING + out.Value = l.Str + case V2LitRawString: + out.TokenType = syntax.RAW_STRING + out.Value = l.Str + } + return out +} + +// V2IdentExpr is a bare identifier in expression position (e.g. a +// lambda parameter reference). +type V2IdentExpr struct { + Name string + Namespace string + Pos Pos +} + +func (i *V2IdentExpr) unwrapV2() syntax.Expr { + return &syntax.IdentExpr{ + TokenPos: unwrapPosOrZero(i.Pos), + Namespace: i.Namespace, + Name: i.Name, + SlotIndex: -1, + } +} + +// V2VarExpr is a variable reference (`$name`). +type V2VarExpr struct { + Name string + Pos Pos +} + +func (v *V2VarExpr) unwrapV2() syntax.Expr { + return &syntax.VarExpr{TokenPos: unwrapPosOrZero(v.Pos), Name: v.Name, SlotIndex: -1} +} + +// V2InputExpr is the `input` keyword. 
+type V2InputExpr struct{ Pos Pos } + +func (i *V2InputExpr) unwrapV2() syntax.Expr { + return &syntax.InputExpr{TokenPos: unwrapPosOrZero(i.Pos)} +} + +// V2InputMetaExpr is `input@`. +type V2InputMetaExpr struct{ Pos Pos } + +func (i *V2InputMetaExpr) unwrapV2() syntax.Expr { + return &syntax.InputMetaExpr{TokenPos: unwrapPosOrZero(i.Pos)} +} + +// V2OutputExpr is the `output` keyword. +type V2OutputExpr struct{ Pos Pos } + +func (o *V2OutputExpr) unwrapV2() syntax.Expr { + return &syntax.OutputExpr{TokenPos: unwrapPosOrZero(o.Pos)} +} + +// V2OutputMetaExpr is `output@`. +type V2OutputMetaExpr struct{ Pos Pos } + +func (o *V2OutputMetaExpr) unwrapV2() syntax.Expr { + return &syntax.OutputMetaExpr{TokenPos: unwrapPosOrZero(o.Pos)} +} + +// V2ArrayLiteral is a `[...]` literal. +type V2ArrayLiteral struct { + Elements []V2Expr + Pos Pos +} + +func (a *V2ArrayLiteral) unwrapV2() syntax.Expr { + out := make([]syntax.Expr, len(a.Elements)) + for i, e := range a.Elements { + out[i] = unwrapV2OrNil(e) + } + return &syntax.ArrayLiteral{LBracketPos: unwrapPosOrZero(a.Pos), Elements: out} +} + +// V2ObjectEntry is one entry in an object literal. +type V2ObjectEntry struct { + Key V2Expr + Value V2Expr +} + +// V2ObjectLiteral is a `{...}` literal. +type V2ObjectLiteral struct { + Entries []V2ObjectEntry + Pos Pos +} + +func (o *V2ObjectLiteral) unwrapV2() syntax.Expr { + out := make([]syntax.ObjectEntry, len(o.Entries)) + for i, e := range o.Entries { + out[i] = syntax.ObjectEntry{Key: unwrapV2OrNil(e.Key), Value: unwrapV2OrNil(e.Value)} + } + return &syntax.ObjectLiteral{LBracePos: unwrapPosOrZero(o.Pos), Entries: out} +} + +// V2BinaryOp identifies a V2 binary operator. The string form mirrors +// the source-syntax token: "+", "-", "==", "!=", "&&", "||", etc. +type V2BinaryOp string + +// V2BinaryExpr is a binary expression. 
+type V2BinaryExpr struct { + Op V2BinaryOp + Left V2Expr + Right V2Expr + Pos Pos +} + +func (b *V2BinaryExpr) unwrapV2() syntax.Expr { + return &syntax.BinaryExpr{ + Left: unwrapV2OrNil(b.Left), + Op: binaryOpFromString(string(b.Op)), + OpPos: unwrapPosOrZero(b.Pos), + Right: unwrapV2OrNil(b.Right), + } +} + +// v2Opaque carries an internal V2 expression returned from +// Context.Translate when the result doesn't map to a concrete public +// shape. Rules can pass it back as the receiver / args of constructed +// V2 nodes; the migrator unwraps it transparently. +type v2Opaque struct { + inner syntax.Expr +} + +func (o *v2Opaque) unwrapV2() syntax.Expr { return o.inner } + +func unwrapV2OrNil(e V2Expr) syntax.Expr { + if e == nil { + return nil + } + return e.unwrapV2() +} + +func unwrapPosOrZero(p Pos) syntax.Pos { + return syntax.Pos{Line: p.Line, Column: p.Column} +} + +func wrapV2(e syntax.Expr) V2Expr { + if e == nil { + return nil + } + return &v2Opaque{inner: e} +} + +// binaryOpFromString maps a source-syntax operator string to the +// internal TokenType. Used by V2BinaryExpr.unwrapV2. 
+func binaryOpFromString(op string) syntax.TokenType {
+	switch op {
+	case "+":
+		return syntax.PLUS
+	case "-":
+		return syntax.MINUS
+	case "*":
+		return syntax.STAR
+	case "/":
+		return syntax.SLASH
+	case "%":
+		return syntax.PERCENT
+	case "==":
+		return syntax.EQ
+	case "!=":
+		return syntax.NE
+	case "<":
+		return syntax.LT
+	case "<=":
+		return syntax.LE
+	case ">":
+		return syntax.GT
+	case ">=":
+		return syntax.GE
+	case "&&":
+		return syntax.AND
+	case "||":
+		return syntax.OR
+	}
+	return 0
+}

From 72c9d2ba3de49bab4974e5d8cebff9af957b07a5 Mon Sep 17 00:00:00 2001
From: Ashley Jeffs
Date: Mon, 27 Apr 2026 10:53:49 +0100
Subject: [PATCH 16/20] bloblang(v2): Wire bloblangv2 through internals and public/service

Threads public/bloblangv2 through the Benthos framework so V2 mappings
can be parsed, linted, and executed alongside V1 in existing pipelines:

- internal/bundle, internal/manager: NewManagement now carries a
  BloblV2Environment alongside the V1 BloblEnvironment, with a matching
  OptSetBloblV2Environment option.
- internal/docs: LintBloblangV2Mapping lints a field as V2,
  side-effect-free using the configured V2 environment.
- internal/cli, internal/cli/studio, internal/cli/test,
  internal/config/schema, internal/stream/manager: V2 envs are plumbed
  through CLI, studio sync, config schema enumeration, and the stream
  manager.
- public/service: a bloblangv2 batch processor, a config_bloblangv2
  field type, schema and linter integration, plus the corresponding
  additions to Environment, StreamBuilder, ResourceBuilder,
  StreamConfigLinter, and ComponentConfigLinter.

Adds config/test/bloblang/ YAML fixtures covering the V2 batch
processor's golden path, filter behaviour, and metadata reset semantics.
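
As a hedged usage sketch (not part of this patch), a pipeline exercising
the new processor might look like the config below. It assumes the
processor is registered as `bloblang_v2`, as in the fixtures added under
config/test/bloblang/, and reuses their mapping. Per the metadata reset
fixture, incoming metadata is only kept when the mapping copies it
explicitly with `output@ = input@`.

```yaml
# Illustrative only: mirrors config/test/bloblang/bloblang_v2.yaml.
pipeline:
  processors:
    # The first mapping line copies metadata across; without it the
    # output starts with no metadata (see bloblang_v2_metadata_reset.yaml).
    - bloblang_v2: |
        output@ = input@
        output.id = input.id
        output.fans = input.fans.filter(fan -> fan.obsession > 0.5)
```

Assigning `deleted()` to `output` drops the message from the batch
instead, which is the behaviour covered by bloblang_v2_filter.yaml.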
--- config/test/bloblang/bloblang_v2.yaml | 37 ++ config/test/bloblang/bloblang_v2_filter.yaml | 16 + .../bloblang/bloblang_v2_metadata_reset.yaml | 17 + internal/bundle/package.go | 2 + internal/cli/common/manager.go | 1 + internal/cli/common/opts.go | 13 +- internal/cli/lint.go | 1 + internal/cli/list.go | 2 +- internal/cli/studio/pull_runner.go | 17 +- internal/cli/studio/sync_schema.go | 2 +- internal/cli/studio/sync_schema_test.go | 14 +- internal/cli/test/command.go | 1 + internal/config/schema/schema.go | 113 ++++-- internal/docs/bloblang.go | 31 ++ internal/docs/field.go | 37 +- internal/manager/mock/manager.go | 6 + internal/manager/type.go | 25 +- internal/stream/manager/api.go | 1 + public/service/component_config_linter.go | 1 + public/service/config.go | 1 + public/service/config_bloblangv2.go | 44 +++ public/service/environment.go | 120 ++++--- public/service/environment_schema.go | 2 +- public/service/environment_test.go | 330 ++++++++++++++++++ public/service/resource_builder.go | 3 + public/service/stream_builder.go | 3 + public/service/stream_config_linter.go | 1 + public/service/stream_schema.go | 115 ++++-- 28 files changed, 832 insertions(+), 124 deletions(-) create mode 100644 config/test/bloblang/bloblang_v2.yaml create mode 100644 config/test/bloblang/bloblang_v2_filter.yaml create mode 100644 config/test/bloblang/bloblang_v2_metadata_reset.yaml create mode 100644 public/service/config_bloblangv2.go diff --git a/config/test/bloblang/bloblang_v2.yaml b/config/test/bloblang/bloblang_v2.yaml new file mode 100644 index 000000000..a607f88e1 --- /dev/null +++ b/config/test/bloblang/bloblang_v2.yaml @@ -0,0 +1,37 @@ +pipeline: + processors: + - bloblang_v2: | + output@ = input@ + output.id = input.id + output.fans = input.fans.filter(fan -> fan.obsession > 0.5) + +tests: + - name: Filters fans by obsession score + target_processors: /pipeline/processors + input_batch: + - json_content: + id: foo + fans: + - {"name":"bev","obsession":0.57} + - 
{"name":"grace","obsession":0.21} + - {"name":"ali","obsession":0.89} + - {"name":"vic","obsession":0.43} + output_batches: + - - json_equals: + id: foo + fans: + - {"name":"bev","obsession":0.57} + - {"name":"ali","obsession":0.89} + + - name: Copies incoming metadata when mapping uses output@ = input@ + target_processors: /pipeline/processors + input_batch: + - content: '{"id":"bar","fans":[]}' + metadata: + origin: upstream + tenant: acme + output_batches: + - - json_equals: {"id":"bar","fans":[]} + metadata_equals: + origin: upstream + tenant: acme diff --git a/config/test/bloblang/bloblang_v2_filter.yaml b/config/test/bloblang/bloblang_v2_filter.yaml new file mode 100644 index 000000000..f6a08c0cd --- /dev/null +++ b/config/test/bloblang/bloblang_v2_filter.yaml @@ -0,0 +1,16 @@ +pipeline: + processors: + - bloblang_v2: | + output = if input.drop == true { deleted() } else { input } + +tests: + - name: Root deletion filters messages out of the batch + target_processors: /pipeline/processors + input_batch: + - content: '{"drop":false,"id":"keep1"}' + - content: '{"drop":true,"id":"drop1"}' + - content: '{"drop":false,"id":"keep2"}' + - content: '{"drop":true,"id":"drop2"}' + output_batches: + - - json_equals: {"drop":false,"id":"keep1"} + - json_equals: {"drop":false,"id":"keep2"} diff --git a/config/test/bloblang/bloblang_v2_metadata_reset.yaml b/config/test/bloblang/bloblang_v2_metadata_reset.yaml new file mode 100644 index 000000000..5436ede26 --- /dev/null +++ b/config/test/bloblang/bloblang_v2_metadata_reset.yaml @@ -0,0 +1,17 @@ +pipeline: + processors: + - bloblang_v2: | + output = input + output@.stamped_by = "bloblang_v2" + +tests: + - name: Incoming metadata is replaced when mapping does not copy input@ + target_processors: /pipeline/processors + input_batch: + - content: '{"id":"baz"}' + metadata: + will_be_dropped: yes + output_batches: + - - json_equals: {"id":"baz"} + metadata_equals: + stamped_by: bloblang_v2 diff --git 
a/internal/bundle/package.go b/internal/bundle/package.go index 675264f71..1531a6fc2 100644 --- a/internal/bundle/package.go +++ b/internal/bundle/package.go @@ -32,6 +32,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/filepath/ifs" "github.com/redpanda-data/benthos/v4/internal/log" "github.com/redpanda-data/benthos/v4/internal/message" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) var ( @@ -58,6 +59,7 @@ type NewManagement interface { FS() ifs.FS Environment() *Environment BloblEnvironment() *bloblang.Environment + BloblV2Environment() *bloblangv2.Environment RegisterEndpoint(path, desc string, h http.HandlerFunc) diff --git a/internal/cli/common/manager.go b/internal/cli/common/manager.go index c38e51f56..7a55403af 100644 --- a/internal/cli/common/manager.go +++ b/internal/cli/common/manager.go @@ -104,6 +104,7 @@ func CreateManager( manager.OptSetTracer(trac), manager.OptSetStreamsMode(streamsMode), manager.OptSetBloblangEnvironment(cliOpts.BloblEnvironment), + manager.OptSetBloblV2Environment(cliOpts.BloblV2Environment), manager.OptSetEnvironment(cliOpts.Environment), }, mgrOpts...) 
diff --git a/internal/cli/common/opts.go b/internal/cli/common/opts.go index b046354ee..78bcbe8c0 100644 --- a/internal/cli/common/opts.go +++ b/internal/cli/common/opts.go @@ -17,6 +17,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/config" "github.com/redpanda-data/benthos/v4/internal/docs" "github.com/redpanda-data/benthos/v4/internal/log" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) // StreamInitFunc is an optional func to be called when a stream (or streams @@ -39,9 +40,10 @@ type CLIOpts struct { ConfigSearchPaths []string - Environment *bundle.Environment - BloblEnvironment *bloblang.Environment - SecretAccessFn func(context.Context, string) (string, bool) + Environment *bundle.Environment + BloblEnvironment *bloblang.Environment + BloblV2Environment *bloblangv2.Environment + SecretAccessFn func(context.Context, string) (string, bool) MainConfigSpecCtor func() docs.FieldSpecs // TODO: This becomes a service.Environment OnManagerInitialised func(mgr bundle.NewManagement, pConf *docs.ParsedConfig) error @@ -75,8 +77,9 @@ func NewCLIOpts(version, dateBuilt string) *CLIOpts { "/etc/benthos/config.yaml", "/etc/benthos.yaml", }, - Environment: bundle.GlobalEnvironment, - BloblEnvironment: bloblang.GlobalEnvironment(), + Environment: bundle.GlobalEnvironment, + BloblEnvironment: bloblang.GlobalEnvironment(), + BloblV2Environment: bloblangv2.GlobalEnvironment(), SecretAccessFn: func(ctx context.Context, key string) (string, bool) { return os.LookupEnv(key) }, diff --git a/internal/cli/lint.go b/internal/cli/lint.go index 630a92a7c..bccccd604 100644 --- a/internal/cli/lint.go +++ b/internal/cli/lint.go @@ -211,6 +211,7 @@ func LintAction(c *cli.Context, opts *common.CLIOpts, stderr io.Writer) error { lConf := docs.NewLintConfig(opts.Environment) lConf.BloblangEnv = bloblang.XWrapEnvironment(opts.BloblEnvironment) + lConf.BloblangV2Env = opts.BloblV2Environment lConf.RejectDeprecated = c.Bool("deprecated") lConf.RequireLabels = 
c.Bool("labels") skipEnvVarCheck := c.Bool("skip-env-var-check") diff --git a/internal/cli/list.go b/internal/cli/list.go index 4d04d5355..a19b8ac9c 100644 --- a/internal/cli/list.go +++ b/internal/cli/list.go @@ -75,7 +75,7 @@ func listComponents(c *cli.Context, opts *common.CLIOpts) { ofTypes[k] = struct{}{} } - schema := schema.New(opts.Version, opts.DateBuilt, opts.Environment, opts.BloblEnvironment) + schema := schema.New(opts.Version, opts.DateBuilt, opts.Environment, opts.BloblEnvironment, opts.BloblV2Environment) if status := c.String("status"); status != "" { schema.ReduceToStatus(status) } diff --git a/internal/cli/studio/pull_runner.go b/internal/cli/studio/pull_runner.go index 18f49ab8c..30c4b7e7e 100644 --- a/internal/cli/studio/pull_runner.go +++ b/internal/cli/studio/pull_runner.go @@ -27,6 +27,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/manager" "github.com/redpanda-data/benthos/v4/internal/stream" "github.com/redpanda-data/benthos/v4/public/bloblang" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) type noopStopper struct{} @@ -54,12 +55,13 @@ const defaultCloseDeadline = time.Second * 30 // reallocations, or config changes and attempt to reflect those changes in the // running stream. 
type PullRunner struct { - secretLookupFn func(context.Context, string) (string, bool) - bloblEnvironment *ibloblang.Environment - environment *bundle.Environment - confReaderSpec docs.FieldSpecs - confReader *config.Reader - sessionTracker *sessionTracker + secretLookupFn func(context.Context, string) (string, bool) + bloblEnvironment *ibloblang.Environment + bloblV2Environment *bloblangv2.Environment + environment *bundle.Environment + confReaderSpec docs.FieldSpecs + confReader *config.Reader + sessionTracker *sessionTracker // Controls disabled deployment rotations isDisabled bool @@ -109,6 +111,7 @@ func NewPullRunner(c *cli.Context, cliOpts *common.CLIOpts, token, secret string r := &PullRunner{ secretLookupFn: cliOpts.SecretAccessFn, bloblEnvironment: cliOpts.BloblEnvironment, + bloblV2Environment: cliOpts.BloblV2Environment, environment: cliOpts.Environment, confReaderSpec: cliOpts.MainConfigSpecCtor(), metricsFlushPeriod: time.Second * 30, @@ -278,6 +281,7 @@ func (r *PullRunner) bootstrapConfigReader(ctx context.Context) (bootstrapErr er lintConf := docs.NewLintConfig(r.environment) lintConf.BloblangEnv = bloblang.XWrapEnvironment(bloblEnv.Deactivated()) + lintConf.BloblangV2Env = r.bloblV2Environment confReaderTmp := config.NewReader(initMainFile, initResources, config.OptUseEnvLookupFunc(r.secretLookupFn), @@ -314,6 +318,7 @@ func (r *PullRunner) bootstrapConfigReader(ctx context.Context) (bootstrapErr er r.cliContext, r.cliOpts, r.logger, false, conf, manager.OptSetEnvironment(tmpEnv), manager.OptSetBloblangEnvironment(bloblEnv), + manager.OptSetBloblV2Environment(r.bloblV2Environment), manager.OptSetFS(sessFS)) if err != nil { return fmt.Errorf("failed to create manager from bootstrap config: %w", err) diff --git a/internal/cli/studio/sync_schema.go b/internal/cli/studio/sync_schema.go index 0ceb0b25e..908978fbf 100644 --- a/internal/cli/studio/sync_schema.go +++ b/internal/cli/studio/sync_schema.go @@ -63,7 +63,7 @@ page within the studio 
application.`[1:], } u.Path = path.Join(u.Path, "/", apiPathPrefix, fmt.Sprintf("/v1/token/%v/session/%v/schema", tokenID, sessionID)) - schema := schema.New(cliOpts.Version, cliOpts.DateBuilt, cliOpts.Environment, cliOpts.BloblEnvironment) + schema := schema.New(cliOpts.Version, cliOpts.DateBuilt, cliOpts.Environment, cliOpts.BloblEnvironment, cliOpts.BloblV2Environment) schema.Config = cliOpts.MainConfigSpecCtor() schema.Scrub() schemaBytes, err := json.Marshal(schema) diff --git a/internal/cli/studio/sync_schema_test.go b/internal/cli/studio/sync_schema_test.go index f087a5d67..b0208fefd 100644 --- a/internal/cli/studio/sync_schema_test.go +++ b/internal/cli/studio/sync_schema_test.go @@ -36,12 +36,22 @@ func TestSyncSchema(t *testing.T) { err = json.Unmarshal(body, &schema) require.NoError(t, err) - assert.ElementsMatch(t, slices.Collect(maps.Keys(schema)), []string{ + expected := []string{ "version", "date", "config", "buffers", "caches", "inputs", "outputs", "processors", "rate-limits", "metrics", "tracers", "scanners", "bloblang-functions", "bloblang-methods", - }) + } + // V2 plugin fields (bloblang-v2-functions / bloblang-v2-methods) + // only appear in the dump when the env has registered V2 plugins, + // which happens for any binary importing public/components/{pure,io}. 
+ if _, ok := schema["bloblang-v2-functions"]; ok { + expected = append(expected, "bloblang-v2-functions") + } + if _, ok := schema["bloblang-v2-methods"]; ok { + expected = append(expected, "bloblang-v2-methods") + } + assert.ElementsMatch(t, slices.Collect(maps.Keys(schema)), expected) var version string err = json.Unmarshal(schema["version"], &version) diff --git a/internal/cli/test/command.go b/internal/cli/test/command.go index e642e3fa8..c8c080bfa 100644 --- a/internal/cli/test/command.go +++ b/internal/cli/test/command.go @@ -107,6 +107,7 @@ func lintTarget(opts *common.CLIOpts, spec docs.FieldSpecs, path, testSuffix str lintConf := docs.NewLintConfig(opts.Environment) lintConf.BloblangEnv = bloblang.XWrapEnvironment(opts.BloblEnvironment) + lintConf.BloblangV2Env = opts.BloblV2Environment // This is necessary as each test case can provide a different set of // environment variables, so in order to test env vars properly we would diff --git a/internal/config/schema/schema.go b/internal/config/schema/schema.go index e09a245b3..f0edbde05 100644 --- a/internal/config/schema/schema.go +++ b/internal/config/schema/schema.go @@ -8,30 +8,33 @@ import ( "github.com/redpanda-data/benthos/v4/internal/bundle" "github.com/redpanda-data/benthos/v4/internal/config" "github.com/redpanda-data/benthos/v4/internal/docs" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) // Full represents the entirety of the Benthos instances configuration spec and // all plugins. 
type Full struct { - Version string `json:"version"` - Date string `json:"date"` - Config docs.FieldSpecs `json:"config,omitempty"` - Buffers []docs.ComponentSpec `json:"buffers,omitempty"` - Caches []docs.ComponentSpec `json:"caches,omitempty"` - Inputs []docs.ComponentSpec `json:"inputs,omitempty"` - Outputs []docs.ComponentSpec `json:"outputs,omitempty"` - Processors []docs.ComponentSpec `json:"processors,omitempty"` - RateLimits []docs.ComponentSpec `json:"rate-limits,omitempty"` - Metrics []docs.ComponentSpec `json:"metrics,omitempty"` - Tracers []docs.ComponentSpec `json:"tracers,omitempty"` - Scanners []docs.ComponentSpec `json:"scanners,omitempty"` - BloblangFunctions []query.FunctionSpec `json:"bloblang-functions,omitempty"` - BloblangMethods []query.MethodSpec `json:"bloblang-methods,omitempty"` + Version string `json:"version"` + Date string `json:"date"` + Config docs.FieldSpecs `json:"config,omitempty"` + Buffers []docs.ComponentSpec `json:"buffers,omitempty"` + Caches []docs.ComponentSpec `json:"caches,omitempty"` + Inputs []docs.ComponentSpec `json:"inputs,omitempty"` + Outputs []docs.ComponentSpec `json:"outputs,omitempty"` + Processors []docs.ComponentSpec `json:"processors,omitempty"` + RateLimits []docs.ComponentSpec `json:"rate-limits,omitempty"` + Metrics []docs.ComponentSpec `json:"metrics,omitempty"` + Tracers []docs.ComponentSpec `json:"tracers,omitempty"` + Scanners []docs.ComponentSpec `json:"scanners,omitempty"` + BloblangFunctions []query.FunctionSpec `json:"bloblang-functions,omitempty"` + BloblangMethods []query.MethodSpec `json:"bloblang-methods,omitempty"` + BloblangV2Functions []bloblangv2.PluginInfo `json:"bloblang-v2-functions,omitempty"` + BloblangV2Methods []bloblangv2.PluginInfo `json:"bloblang-v2-methods,omitempty"` } // New walks all registered Benthos components and creates a full schema // definition of it. 
-func New(version, date string, env *bundle.Environment, bEnv *bloblang.Environment) Full { +func New(version, date string, env *bundle.Environment, bEnv *bloblang.Environment, bV2Env *bloblangv2.Environment) Full { s := Full{ Version: version, Date: date, @@ -52,6 +55,14 @@ func New(version, date string, env *bundle.Environment, bEnv *bloblang.Environme bEnv.WalkMethods(func(name string, spec query.MethodSpec) { s.BloblangMethods = append(s.BloblangMethods, spec) }) + if bV2Env != nil { + bV2Env.WalkFunctions(func(_ string, view *bloblangv2.FunctionView) { + s.BloblangV2Functions = append(s.BloblangV2Functions, view.Info()) + }) + bV2Env.WalkMethods(func(_ string, view *bloblangv2.MethodView) { + s.BloblangV2Methods = append(s.BloblangV2Methods, view.Info()) + }) + } return s } @@ -93,6 +104,30 @@ func (f *Full) ReduceToStatus(status string) { } } f.BloblangMethods = newMethods + + // V2 plugin status reuses the same status string vocabulary; an empty + // status is equivalent to "stable" for filtering purposes. 
+ v2Match := func(specStatus string) bool { + if specStatus == "" { + return status == "stable" + } + return specStatus == status + } + var newV2Funcs []bloblangv2.PluginInfo + for _, s := range f.BloblangV2Functions { + if v2Match(s.Status) { + newV2Funcs = append(newV2Funcs, s) + } + } + f.BloblangV2Functions = newV2Funcs + + var newV2Methods []bloblangv2.PluginInfo + for _, s := range f.BloblangV2Methods { + if v2Match(s.Status) { + newV2Methods = append(newV2Methods, s) + } + } + f.BloblangV2Methods = newV2Methods } func justNames(components []docs.ComponentSpec) []string { @@ -125,21 +160,33 @@ func justNamesBloblMethods(fns []query.MethodSpec) []string { return names } +func justNamesBloblV2(specs []bloblangv2.PluginInfo) []string { + names := []string{} + for _, s := range specs { + if s.Status != "deprecated" { + names = append(names, s.Name) + } + } + return names +} + // Flattened returns a flattened representation of all registered plugin types // and names. func (f *Full) Flattened() map[string][]string { return map[string][]string{ - "buffers": justNames(f.Buffers), - "caches": justNames(f.Caches), - "inputs": justNames(f.Inputs), - "outputs": justNames(f.Outputs), - "processors": justNames(f.Processors), - "rate-limits": justNames(f.RateLimits), - "metrics": justNames(f.Metrics), - "tracers": justNames(f.Tracers), - "scanners": justNames(f.Scanners), - "bloblang-functions": justNamesBloblFuncs(f.BloblangFunctions), - "bloblang-methods": justNamesBloblMethods(f.BloblangMethods), + "buffers": justNames(f.Buffers), + "caches": justNames(f.Caches), + "inputs": justNames(f.Inputs), + "outputs": justNames(f.Outputs), + "processors": justNames(f.Processors), + "rate-limits": justNames(f.RateLimits), + "metrics": justNames(f.Metrics), + "tracers": justNames(f.Tracers), + "scanners": justNames(f.Scanners), + "bloblang-functions": justNamesBloblFuncs(f.BloblangFunctions), + "bloblang-methods": justNamesBloblMethods(f.BloblangMethods), + "bloblang-v2-functions": 
justNamesBloblV2(f.BloblangV2Functions), + "bloblang-v2-methods": justNamesBloblV2(f.BloblangV2Methods), } } @@ -168,6 +215,20 @@ func (f *Full) Scrub() { f.BloblangMethods[i].Categories = nil scrubParams(f.BloblangMethods[i].Params.Definitions) } + for i := range f.BloblangV2Functions { + f.BloblangV2Functions[i].Description = "" + scrubV2Params(f.BloblangV2Functions[i].Params) + } + for i := range f.BloblangV2Methods { + f.BloblangV2Methods[i].Description = "" + scrubV2Params(f.BloblangV2Methods[i].Params) + } +} + +func scrubV2Params(p []bloblangv2.PluginParamInfo) { + for i := range p { + p[i].Description = "" + } } func scrubParams(p []query.ParamDefinition) { diff --git a/internal/docs/bloblang.go b/internal/docs/bloblang.go index dc705e839..81c7807b9 100644 --- a/internal/docs/bloblang.go +++ b/internal/docs/bloblang.go @@ -3,7 +3,10 @@ package docs import ( + "errors" + "github.com/redpanda-data/benthos/v4/public/bloblang" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) // LintBloblangMapping is function for linting a config field expected to be a @@ -49,3 +52,31 @@ func LintBloblangField(ctx LintContext, line, col int, v any) []Lint { } return []Lint{NewLintError(line, LintBadBloblang, err)} } + +// LintBloblangV2Mapping is the linter for a config field expected to be a +// Bloblang V2 mapping. V2 parsing is side-effect free so the configured +// environment is used directly, no deactivated mode required. 
+func LintBloblangV2Mapping(ctx LintContext, line, col int, v any) []Lint { + str, ok := v.(string) + if !ok { + return nil + } + if str == "" { + return nil + } + env := ctx.conf.BloblangV2Env + if env == nil { + env = bloblangv2.GlobalEnvironment() + } + _, err := env.Parse(str) + if err == nil { + return nil + } + var pErr *bloblangv2.ParseError + if errors.As(err, &pErr) { + lint := NewLintError(line+pErr.Line-1, LintBadBloblang, pErr) + lint.Column = col + pErr.Column + return []Lint{lint} + } + return []Lint{NewLintError(line, LintBadBloblang, err)} +} diff --git a/internal/docs/field.go b/internal/docs/field.go index 49531e627..7d0116570 100644 --- a/internal/docs/field.go +++ b/internal/docs/field.go @@ -9,6 +9,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/value" "github.com/redpanda-data/benthos/v4/public/bloblang" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) // FieldType represents a field type. @@ -115,6 +116,9 @@ type FieldSpec struct { // Bloblang indicates that a string field is a Bloblang mapping. Bloblang bool `json:"bloblang,omitempty"` + // BloblangV2 indicates that a string field is a Bloblang V2 mapping. + BloblangV2 bool `json:"bloblang_v2,omitempty"` + // Examples is a slice of optional example values for a field. Examples []any `json:"examples,omitempty"` @@ -154,6 +158,12 @@ func (f FieldSpec) IsBloblang() FieldSpec { return f } +// IsBloblangV2 indicates that the field is a Bloblang V2 mapping. +func (f FieldSpec) IsBloblangV2() FieldSpec { + f.BloblangV2 = true + return f +} + // HasType returns a new FieldSpec that specifies a specific type. 
func (f FieldSpec) HasType(t FieldType) FieldSpec { f.Type = t @@ -464,6 +474,18 @@ func (f FieldSpec) GetLintFunc() LintFunc { fn = LintBloblangMapping } } + if f.BloblangV2 { + if fn != nil { + innerFn := fn + fn = func(ctx LintContext, line, col int, value any) []Lint { + lints := innerFn(ctx, line, col, value) + moreLints := LintBloblangV2Mapping(ctx, line, col, value) + return append(lints, moreLints...) + } + } else { + fn = LintBloblangV2Mapping + } + } return fn } @@ -494,6 +516,12 @@ func FieldBloblang(name, description string, examples ...any) FieldSpec { return newField(name, description, examples...).HasType(FieldTypeString).IsBloblang() } +// FieldBloblangV2 returns a field spec for a string typed field containing a +// Bloblang V2 mapping. +func FieldBloblangV2(name, description string, examples ...any) FieldSpec { + return newField(name, description, examples...).HasType(FieldTypeString).IsBloblangV2() +} + // FieldInt returns a field spec for a common int typed field. func FieldInt(name, description string, examples ...any) FieldSpec { return newField(name, description, examples...).HasType(FieldTypeInt) @@ -679,6 +707,10 @@ type LintConfig struct { // Provides an isolated context for Bloblang parsing. BloblangEnv *bloblang.Environment + // Provides the registry used to parse Bloblang V2 mapping fields. V2 + // parsing is side-effect free so this does not need a deactivated mode. + BloblangV2Env *bloblangv2.Environment + // Reject any deprecated components or fields as linting errors. RejectDeprecated bool @@ -692,8 +724,9 @@ type LintConfig struct { // NewLintConfig creates a default linting config. 
func NewLintConfig(prov Provider) LintConfig { return LintConfig{ - DocsProvider: prov, - BloblangEnv: bloblang.GlobalEnvironment().Deactivated(), + DocsProvider: prov, + BloblangEnv: bloblang.GlobalEnvironment().Deactivated(), + BloblangV2Env: bloblangv2.GlobalEnvironment(), } } diff --git a/internal/manager/mock/manager.go b/internal/manager/mock/manager.go index 93a83326e..5519291a0 100644 --- a/internal/manager/mock/manager.go +++ b/internal/manager/mock/manager.go @@ -24,6 +24,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/filepath/ifs" "github.com/redpanda-data/benthos/v4/internal/log" "github.com/redpanda-data/benthos/v4/internal/message" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) // Manager provides a mock benthos manager that components can use to test @@ -186,6 +187,11 @@ func (m *Manager) BloblEnvironment() *bloblang.Environment { return bloblang.GlobalEnvironment() } +// BloblV2Environment always returns the global Bloblang V2 environment. +func (m *Manager) BloblV2Environment() *bloblangv2.Environment { + return bloblangv2.GlobalEnvironment() +} + // ProbeCache returns true if a cache resource exists under the provided name. 
func (m *Manager) ProbeCache(name string) bool { m.lock.Lock() diff --git a/internal/manager/type.go b/internal/manager/type.go index 76b1f456e..ed771a56a 100644 --- a/internal/manager/type.go +++ b/internal/manager/type.go @@ -31,6 +31,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/log" "github.com/redpanda-data/benthos/v4/internal/manager/mock" "github.com/redpanda-data/benthos/v4/internal/message" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) const ( @@ -99,8 +100,9 @@ type Type struct { consumeTriggered *atomic.Bool // Collections of component constructors - env *bundle.Environment - bloblEnv *bloblang.Environment + env *bundle.Environment + bloblEnv *bloblang.Environment + bloblV2Env *bloblangv2.Environment logger log.Modular stats *metrics.Namespaced @@ -184,6 +186,14 @@ func OptSetBloblangEnvironment(env *bloblang.Environment) OptFunc { } } +// OptSetBloblV2Environment determines the environment from which the manager +// parses Bloblang V2 mappings. This option is for internal use only. +func OptSetBloblV2Environment(env *bloblangv2.Environment) OptFunc { + return func(t *Type) { + t.bloblV2Env = env + } +} + // OptSetStreamsMode marks the manager as being created for running streams mode // resources. This ensures that a label "stream" is added to metrics. func OptSetStreamsMode(b bool) OptFunc { @@ -216,8 +226,9 @@ func New(conf ResourceConfig, opts ...OptFunc) (*Type, error) { rateLimits: newLiveResources[ratelimit.V1](), // Environment defaults to global (everything that was imported). - env: bundle.GlobalEnvironment, - bloblEnv: bloblang.GlobalEnvironment(), + env: bundle.GlobalEnvironment, + bloblEnv: bloblang.GlobalEnvironment(), + bloblV2Env: bloblangv2.GlobalEnvironment(), logger: log.Noop(), stats: metrics.Noop(), @@ -556,6 +567,12 @@ func (t *Type) BloblEnvironment() *bloblang.Environment { return t.bloblEnv } +// BloblV2Environment returns the Bloblang V2 environment used by the manager +// to parse V2 mapping fields. 
This is for internal use only. +func (t *Type) BloblV2Environment() *bloblangv2.Environment { + return t.bloblV2Env +} + //------------------------------------------------------------------------------ // GetDocs returns a documentation spec for an implementation of a component. diff --git a/internal/stream/manager/api.go b/internal/stream/manager/api.go index 6c7146db6..1ed01e0e6 100644 --- a/internal/stream/manager/api.go +++ b/internal/stream/manager/api.go @@ -71,6 +71,7 @@ type lintErrors struct { func (m *Type) lintCtx() docs.LintContext { lConf := docs.NewLintConfig(m.manager.Environment()) lConf.BloblangEnv = bloblang.XWrapEnvironment(m.manager.BloblEnvironment()).Deactivated() + lConf.BloblangV2Env = m.manager.BloblV2Environment() return docs.NewLintContext(lConf) } diff --git a/public/service/component_config_linter.go b/public/service/component_config_linter.go index 22bb9bbe8..a2568fb5a 100644 --- a/public/service/component_config_linter.go +++ b/public/service/component_config_linter.go @@ -29,6 +29,7 @@ type ComponentConfigLinter struct { func (e *Environment) NewComponentConfigLinter() *ComponentConfigLinter { lintConf := docs.NewLintConfig(e.internal) lintConf.BloblangEnv = e.bloblangEnv.Deactivated() + lintConf.BloblangV2Env = e.getBloblangV2ParserEnv() return &ComponentConfigLinter{ env: e, lintConf: lintConf, diff --git a/public/service/config.go b/public/service/config.go index a89ba72bf..587992b4b 100644 --- a/public/service/config.go +++ b/public/service/config.go @@ -323,6 +323,7 @@ func (c *ConfigSpec) ParseYAML(yamlStr string, env *Environment) (*ParsedConfig, manager.NewResourceConfig(), manager.OptSetEnvironment(env.internal), manager.OptSetBloblangEnvironment(env.getBloblangParserEnv()), + manager.OptSetBloblV2Environment(env.getBloblangV2ParserEnv()), ) if err != nil { return nil, fmt.Errorf("failed to instantiate resources: %w", err) diff --git a/public/service/config_bloblangv2.go b/public/service/config_bloblangv2.go new file mode 
100644 index 000000000..38e1f60a2 --- /dev/null +++ b/public/service/config_bloblangv2.go @@ -0,0 +1,44 @@ +// Copyright 2026 Redpanda Data, Inc. + +package service + +import ( + "fmt" + "strings" + + "github.com/redpanda-data/benthos/v4/internal/docs" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// NewBloblangV2Field defines a new config field that describes a Bloblang V2 +// mapping string. A *bloblangv2.Executor can then be extracted from the parsed +// config via FieldBloblangV2, and the field is parsed at lint time so syntax +// errors surface during config load rather than at component construction. +// +// Bloblang V2 is a separate language from V1 with its own parser and plugin +// registry; see public/bloblangv2 for details. +func NewBloblangV2Field(name string) *ConfigField { + tf := docs.FieldBloblangV2(name, "") + return &ConfigField{field: tf} +} + +// FieldBloblangV2 accesses a field from a parsed config that was defined with +// NewBloblangV2Field and returns either a *bloblangv2.Executor or an error if +// the mapping was invalid. +func (p *ParsedConfig) FieldBloblangV2(path ...string) (*bloblangv2.Executor, error) { + v, exists := p.i.Field(path...) 
+ if !exists { + return nil, fmt.Errorf("field '%v' was not found in the config", strings.Join(path, ".")) + } + + str, ok := v.(string) + if !ok { + return nil, fmt.Errorf("expected field '%v' to be a string, got %T", strings.Join(path, "."), v) + } + + exec, err := p.mgr.BloblV2Environment().Parse(str) + if err != nil { + return nil, fmt.Errorf("failed to parse bloblang v2 mapping '%v': %v", strings.Join(path, "."), err) + } + return exec, nil +} diff --git a/public/service/environment.go b/public/service/environment.go index e63d5adaa..7664b600e 100644 --- a/public/service/environment.go +++ b/public/service/environment.go @@ -27,6 +27,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/filepath/ifs" "github.com/redpanda-data/benthos/v4/internal/template" "github.com/redpanda-data/benthos/v4/public/bloblang" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) // Environment is a collection of Benthos component plugins that can be used in @@ -35,15 +36,17 @@ import ( // authors do not need to create an Environment and can simply use the global // environment. type Environment struct { - internal *bundle.Environment - bloblangEnv *bloblang.Environment - fs ifs.FS + internal *bundle.Environment + bloblangEnv *bloblang.Environment + bloblangV2Env *bloblangv2.Environment + fs ifs.FS } var globalEnvironment = &Environment{ - internal: bundle.GlobalEnvironment, - bloblangEnv: bloblang.GlobalEnvironment(), - fs: ifs.OS(), + internal: bundle.GlobalEnvironment, + bloblangEnv: bloblang.GlobalEnvironment(), + bloblangV2Env: bloblangv2.GlobalEnvironment(), + fs: ifs.OS(), } // GlobalEnvironment returns a reference to the global environment, adding @@ -62,9 +65,10 @@ func NewEnvironment() *Environment { // NewEmptyEnvironment creates a new environment with zero registered plugins. 
func NewEmptyEnvironment() *Environment { return &Environment{ - internal: bundle.NewEnvironment(), - bloblangEnv: bloblang.NewEmptyEnvironment(), - fs: ifs.OS(), + internal: bundle.NewEnvironment(), + bloblangEnv: bloblang.NewEmptyEnvironment(), + bloblangV2Env: bloblangv2.NewEmptyEnvironment(), + fs: ifs.OS(), } } @@ -72,9 +76,10 @@ func NewEmptyEnvironment() *Environment { // that can be modified independently of the source. func (e *Environment) Clone() *Environment { return &Environment{ - internal: e.internal.Clone(), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.Clone(), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -82,9 +87,10 @@ func (e *Environment) Clone() *Environment { // plugin names excluded from the resulting environment. func (e *Environment) Without(names ...string) *Environment { return &Environment{ - internal: e.internal.Without(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.Without(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -92,9 +98,10 @@ func (e *Environment) Without(names ...string) *Environment { // plugin names included from the resulting environment. func (e *Environment) With(names ...string) *Environment { return &Environment{ - internal: e.internal.With(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.With(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -104,6 +111,13 @@ func (e *Environment) UseBloblangEnvironment(bEnv *bloblang.Environment) { e.bloblangEnv = bEnv } +// UseBloblangV2Environment configures the service environment to restrict +// components constructed with it to a specific Bloblang V2 environment, which +// controls the functions and methods available to V2 mapping fields. 
+func (e *Environment) UseBloblangV2Environment(bEnv *bloblangv2.Environment) { + e.bloblangV2Env = bEnv +} + // UseFS configures the service environment to use an instantiation of *FS as // its filesystem. This provides extra control over the file access of all // Benthos components within the stream. However, this functionality is opt-in @@ -132,6 +146,13 @@ func (e *Environment) getBloblangParserEnv() *ibloblang.Environment { return ibloblang.GlobalEnvironment() } +func (e *Environment) getBloblangV2ParserEnv() *bloblangv2.Environment { + if e.bloblangV2Env == nil { + return bloblangv2.GlobalEnvironment() + } + return e.bloblangV2Env +} + //------------------------------------------------------------------------------ // RegisterCustomResource registers a named custom resource type. The provided @@ -729,9 +750,10 @@ func XFormatConfigJSON() ([]byte, error) { // buffers, where only the specified plugins are included. func (e *Environment) WithBuffers(names ...string) *Environment { return &Environment{ - internal: e.internal.WithBuffers(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithBuffers(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -739,9 +761,10 @@ func (e *Environment) WithBuffers(names ...string) *Environment { // caches, where only the specified plugins are included. func (e *Environment) WithCaches(names ...string) *Environment { return &Environment{ - internal: e.internal.WithCaches(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithCaches(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -749,9 +772,10 @@ func (e *Environment) WithCaches(names ...string) *Environment { // inputs, where only the specified plugins are included. 
func (e *Environment) WithInputs(names ...string) *Environment { return &Environment{ - internal: e.internal.WithInputs(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithInputs(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -759,9 +783,10 @@ func (e *Environment) WithInputs(names ...string) *Environment { // outputs, where only the specified plugins are included. func (e *Environment) WithOutputs(names ...string) *Environment { return &Environment{ - internal: e.internal.WithOutputs(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithOutputs(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -769,9 +794,10 @@ func (e *Environment) WithOutputs(names ...string) *Environment { // of processors, where only the specified plugins are included. func (e *Environment) WithProcessors(names ...string) *Environment { return &Environment{ - internal: e.internal.WithProcessors(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithProcessors(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -779,9 +805,10 @@ func (e *Environment) WithProcessors(names ...string) *Environment { // of rate limits, where only the specified plugins are included. func (e *Environment) WithRateLimits(names ...string) *Environment { return &Environment{ - internal: e.internal.WithRateLimits(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithRateLimits(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -789,9 +816,10 @@ func (e *Environment) WithRateLimits(names ...string) *Environment { // metrics, where only the specified plugins are included. 
func (e *Environment) WithMetrics(names ...string) *Environment { return &Environment{ - internal: e.internal.WithMetrics(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithMetrics(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -799,9 +827,10 @@ func (e *Environment) WithMetrics(names ...string) *Environment { // tracers, where only the specified plugins are included. func (e *Environment) WithTracers(names ...string) *Environment { return &Environment{ - internal: e.internal.WithTracers(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithTracers(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } @@ -809,8 +838,9 @@ func (e *Environment) WithTracers(names ...string) *Environment { // of scanners, where only the specified plugins are included. func (e *Environment) WithScanners(names ...string) *Environment { return &Environment{ - internal: e.internal.WithScanners(names...), - bloblangEnv: e.bloblangEnv.Clone(), - fs: e.fs, + internal: e.internal.WithScanners(names...), + bloblangEnv: e.bloblangEnv.Clone(), + bloblangV2Env: e.bloblangV2Env.Clone(), + fs: e.fs, } } diff --git a/public/service/environment_schema.go b/public/service/environment_schema.go index 5637c1522..5df6bf760 100644 --- a/public/service/environment_schema.go +++ b/public/service/environment_schema.go @@ -15,7 +15,7 @@ type EnvironmentSchema struct { // GenerateSchema creates a new EnvironmentSchema. 
func (e *Environment) GenerateSchema(version, dateBuilt string) *EnvironmentSchema { - schema := schema.New(version, dateBuilt, e.internal, e.getBloblangParserEnv()) + schema := schema.New(version, dateBuilt, e.internal, e.getBloblangParserEnv(), e.getBloblangV2ParserEnv()) return &EnvironmentSchema{s: schema} } diff --git a/public/service/environment_test.go b/public/service/environment_test.go index febc6e242..5ec0ca6ed 100644 --- a/public/service/environment_test.go +++ b/public/service/environment_test.go @@ -4,6 +4,7 @@ package service_test import ( "context" + "encoding/json" "errors" "fmt" "io/fs" @@ -18,6 +19,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/filepath/ifs" "github.com/redpanda-data/benthos/v4/public/bloblang" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" "github.com/redpanda-data/benthos/v4/public/service" ) @@ -156,6 +158,334 @@ logger: assert.Equal(t, []string{"meow"}, received) } +func TestEnvironmentBloblangV2SchemaIncludesPlugins(t *testing.T) { + bEnv := bloblangv2.NewEmptyEnvironment() + require.NoError(t, bEnv.RegisterFunction("hoot", bloblangv2.NewPluginSpec().Description("owl noise"), + func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { return "hoot", nil }, nil + }, + )) + require.NoError(t, bEnv.RegisterMethod("yell", bloblangv2.NewPluginSpec(), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { return s + "!", nil }), nil + }, + )) + + env := service.NewEnvironment() + env.UseBloblangV2Environment(bEnv) + + flat := env.GenerateSchema("test", "now").XFlattened() + assert.Contains(t, flat["bloblang-v2-functions"], "hoot") + assert.Contains(t, flat["bloblang-v2-methods"], "yell") +} + +func TestConfigSchemaFromJSONV0RoundTripsBloblangV2(t *testing.T) { + // Build a schema dump on a "remote" environment that has registered V2 + // plugins, then load it as JSON on a fresh "local" 
environment that has + // no implementations of those plugins. Linting against the loaded + // schema should accept mappings that reference the remote plugins, and + // reject mappings that reference unknown ones — proving the stub + // registrations carried through. + remoteEnv := bloblangv2.NewEmptyEnvironment() + require.NoError(t, remoteEnv.RegisterFunction("hoot", + bloblangv2.NewPluginSpec().Description("owl noise"), + func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { return "hoot", nil }, nil + }, + )) + require.NoError(t, remoteEnv.RegisterMethod("yell", bloblangv2.NewPluginSpec(). + Param(bloblangv2.NewStringParam("suffix").Default("!")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + suf, _ := args.GetString("suffix") + return bloblangv2.StringMethod(func(s string) (any, error) { return s + suf, nil }), nil + }, + )) + + remoteSvcEnv := service.NewEmptyEnvironment() + remoteSvcEnv.UseBloblangV2Environment(remoteEnv) + dump, err := remoteSvcEnv.FullConfigSchema("v0", "now").MarshalJSONV0() + require.NoError(t, err) + + localSchema, err := service.ConfigSchemaFromJSONV0(dump) + require.NoError(t, err) + + // Re-marshal the loaded schema and confirm the V2 plugin descriptors + // survived the decode → register-stub → enumerate cycle. If the decode + // step had silently dropped them, MarshalJSONV0 on the loaded schema + // would emit an empty set. 
+ rehydrated, err := localSchema.MarshalJSONV0() + require.NoError(t, err) + + var redump struct { + BloblangV2Functions []bloblangv2.PluginInfo `json:"bloblang-v2-functions"` + BloblangV2Methods []bloblangv2.PluginInfo `json:"bloblang-v2-methods"` + } + require.NoError(t, json.Unmarshal(rehydrated, &redump)) + + require.Len(t, redump.BloblangV2Functions, 1) + assert.Equal(t, "hoot", redump.BloblangV2Functions[0].Name) + assert.Equal(t, "owl noise", redump.BloblangV2Functions[0].Description) + + require.Len(t, redump.BloblangV2Methods, 1) + assert.Equal(t, "yell", redump.BloblangV2Methods[0].Name) + require.Len(t, redump.BloblangV2Methods[0].Params, 1) + assert.Equal(t, "suffix", redump.BloblangV2Methods[0].Params[0].Name) + assert.Equal(t, "string", redump.BloblangV2Methods[0].Params[0].Kind) + assert.True(t, redump.BloblangV2Methods[0].Params[0].HasDefault) +} + +func TestEnvironmentBloblangV2ClonePropagation(t *testing.T) { + customEnv := bloblangv2.NewEmptyEnvironment() + require.NoError(t, customEnv.RegisterFunction("oink", bloblangv2.NewPluginSpec(), func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { return "oink", nil }, nil + })) + + procSpec := service.NewConfigSpec().Field(service.NewBloblangV2Field("mapping")) + procCtor := func(conf *service.ParsedConfig, _ *service.Resources) (service.Processor, error) { + exec, err := conf.FieldBloblangV2("mapping") + if err != nil { + return nil, err + } + return &v2MappingProc{exec: exec}, nil + } + + base := service.NewEnvironment() + base.UseBloblangV2Environment(customEnv) + require.NoError(t, base.RegisterProcessor("v2_clone_map", procSpec, procCtor)) + + // A clone must inherit the custom V2 env and resolve the plugin. + cloned := base.Clone() + assertEnvResolves(t, cloned, "v2_clone_map", "output = oink()", "oink") + + // The With* variants also propagate the V2 env (exercise Without as the + // most restrictive one — others share the same Clone-based machinery). 
+ without := base.Without("nonexistent_plugin") + assertEnvResolves(t, without, "v2_clone_map", "output = oink()", "oink") +} + +// assertEnvResolves runs a minimal stream that uses the v2_clone_map processor +// with the given mapping and asserts the consumer receives the expected output. +func assertEnvResolves(t *testing.T, env *service.Environment, procName, mapping, expected string) { + t.Helper() + + yamlConf := fmt.Sprintf(` +pipeline: + processors: + - %s: + mapping: '%s' + +input: + generate: + count: 1 + mapping: 'root = "hello"' + +output: + drop: {} + +logger: + level: OFF +`, procName, mapping) + + builder := env.NewStreamBuilder() + require.NoError(t, builder.SetYAML(yamlConf)) + + var received []string + require.NoError(t, builder.AddConsumerFunc(func(_ context.Context, m *service.Message) error { + b, err := m.AsBytes() + if err != nil { + return err + } + received = append(received, string(b)) + return nil + })) + + strm, err := builder.Build() + require.NoError(t, err) + + ctx, cancel := context.WithTimeout(t.Context(), 5*time.Second) + defer cancel() + require.NoError(t, strm.Run(ctx)) + + assert.Equal(t, []string{expected}, received) +} + +func TestEnvironmentBloblangV2LintFailsForBadMapping(t *testing.T) { + env := service.NewEnvironment() + require.NoError(t, env.RegisterProcessor( + "v2_lint_check", + service.NewConfigSpec().Field(service.NewBloblangV2Field("mapping")), + func(conf *service.ParsedConfig, _ *service.Resources) (service.Processor, error) { + _, err := conf.FieldBloblangV2("mapping") + return nil, err + }, + )) + + badConfig := ` +pipeline: + processors: + - v2_lint_check: + mapping: 'output = nope(' + +output: + drop: {} + +logger: + level: OFF +` + err := env.NewStreamBuilder().SetYAML(badConfig) + require.Error(t, err, "lint pass should reject malformed V2 mappings at SetYAML time") + assert.Contains(t, err.Error(), "expected") +} + +func TestEnvironmentBloblangV2LintRespectsCustomEnv(t *testing.T) { + bEnv := 
bloblangv2.NewEmptyEnvironment() + require.NoError(t, bEnv.RegisterFunction("squeak", bloblangv2.NewPluginSpec(), func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { return "squeak", nil }, nil + })) + + env := service.NewEnvironment() + env.UseBloblangV2Environment(bEnv) + require.NoError(t, env.RegisterProcessor( + "v2_lint_check", + service.NewConfigSpec().Field(service.NewBloblangV2Field("mapping")), + func(conf *service.ParsedConfig, _ *service.Resources) (service.Processor, error) { + _, err := conf.FieldBloblangV2("mapping") + return nil, err + }, + )) + + goodConfig := ` +pipeline: + processors: + - v2_lint_check: + mapping: 'output = squeak()' + +output: + drop: {} + +logger: + level: OFF +` + // The custom env knows squeak so SetYAML should accept the mapping. A + // fresh environment with the global V2 env would reject it during lint. + require.NoError(t, env.NewStreamBuilder().SetYAML(goodConfig)) + + plainEnv := service.NewEnvironment() + require.NoError(t, plainEnv.RegisterProcessor( + "v2_lint_check", + service.NewConfigSpec().Field(service.NewBloblangV2Field("mapping")), + func(conf *service.ParsedConfig, _ *service.Resources) (service.Processor, error) { + _, err := conf.FieldBloblangV2("mapping") + return nil, err + }, + )) + err := plainEnv.NewStreamBuilder().SetYAML(goodConfig) + require.Error(t, err, "lint should reject squeak() against the default V2 env") +} + +func TestEnvironmentBloblangV2Isolation(t *testing.T) { + bEnv := bloblangv2.NewEmptyEnvironment() + require.NoError(t, bEnv.RegisterFunction("woof", bloblangv2.NewPluginSpec(), func(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { + return "woof", nil + }, nil + })) + + // Register a processor plugin on the global environment that extracts a V2 + // mapping at construction time. 
Both environments below share this plugin, + // but each environment has its own Bloblang V2 registry — so the "woof" + // function only resolves on the environment that has the custom env set. + procSpec := service.NewConfigSpec().Field(service.NewBloblangV2Field("mapping")) + procCtor := func(conf *service.ParsedConfig, _ *service.Resources) (service.Processor, error) { + exec, err := conf.FieldBloblangV2("mapping") + if err != nil { + return nil, err + } + return &v2MappingProc{exec: exec}, nil + } + + envOne := service.NewEnvironment() + envOne.UseBloblangV2Environment(bEnv) + require.NoError(t, envOne.RegisterProcessor("v2_test_map", procSpec, procCtor)) + + envTwo := service.NewEnvironment() + require.NoError(t, envTwo.RegisterProcessor("v2_test_map", procSpec, procCtor)) + + mappingConfig := ` +pipeline: + processors: + - v2_test_map: + mapping: 'output = woof()' + +input: + generate: + count: 1 + mapping: 'root = "hello"' + +output: + drop: {} + +logger: + level: OFF +` + + // envTwo has the default global V2 environment, which does not have + // "woof". The lint pass at SetYAML time should reject the mapping. + builderTwo := envTwo.NewStreamBuilder() + err := builderTwo.SetYAML(mappingConfig) + require.Error(t, err) + assert.Contains(t, err.Error(), "woof") + + // envOne has a custom V2 env containing "woof". Run must succeed and the + // processor must rewrite messages to "woof". 
+ builderOne := envOne.NewStreamBuilder() + require.NoError(t, builderOne.SetYAML(mappingConfig)) + + var received []string + require.NoError(t, builderOne.AddConsumerFunc(func(_ context.Context, m *service.Message) error { + b, err := m.AsBytes() + if err != nil { + return err + } + received = append(received, string(b)) + return nil + })) + + strm, err := builderOne.Build() + require.NoError(t, err) + + ctx, cancel := context.WithTimeout(t.Context(), 5*time.Second) + defer cancel() + require.NoError(t, strm.Run(ctx)) + + assert.Equal(t, []string{"woof"}, received) +} + +type v2MappingProc struct { + exec *bloblangv2.Executor +} + +func (p *v2MappingProc) Process(_ context.Context, m *service.Message) (service.MessageBatch, error) { + in, err := m.AsStructured() + if err != nil { + b, _ := m.AsBytes() + in = b + } + out, err := p.exec.Query(in) + if err != nil { + return nil, err + } + nm := m.Copy() + if s, ok := out.(string); ok { + nm.SetBytes([]byte(s)) + } else { + nm.SetStructured(out) + } + return service.MessageBatch{nm}, nil +} + +func (p *v2MappingProc) Close(context.Context) error { return nil } + type testFS struct { ifs.FS override fstest.MapFS diff --git a/public/service/resource_builder.go b/public/service/resource_builder.go index 09b2f74ee..cf4e5e1b5 100644 --- a/public/service/resource_builder.go +++ b/public/service/resource_builder.go @@ -80,6 +80,7 @@ func (r *ResourceBuilder) getLintContext() docs.LintContext { conf := docs.NewLintConfig(r.env.internal) conf.DocsProvider = r.env.internal conf.BloblangEnv = r.env.bloblangEnv.Deactivated() + conf.BloblangV2Env = r.env.getBloblangV2ParserEnv() return docs.NewLintContext(conf) } @@ -404,6 +405,7 @@ func (r *ResourceBuilder) buildNotStarted() (*manager.Type, error) { manager.OptSetEngineVersion(engVer), manager.OptSetEnvironment(r.env.internal), manager.OptSetBloblangEnvironment(r.env.getBloblangParserEnv()), + manager.OptSetBloblV2Environment(r.env.getBloblangV2ParserEnv()), ) if err != nil { 
return nil, err @@ -436,6 +438,7 @@ func (r *ResourceBuilder) buildNotStarted() (*manager.Type, error) { manager.OptSetAPIReg(r.apiMut), manager.OptSetEnvironment(r.env.internal), manager.OptSetBloblangEnvironment(r.env.getBloblangParserEnv()), + manager.OptSetBloblV2Environment(r.env.getBloblangV2ParserEnv()), manager.OptSetFS(r.env.fs), } diff --git a/public/service/stream_builder.go b/public/service/stream_builder.go index 7cbf89181..acb9be738 100644 --- a/public/service/stream_builder.go +++ b/public/service/stream_builder.go @@ -108,6 +108,7 @@ func (s *StreamBuilder) getLintContext() docs.LintContext { conf := docs.NewLintConfig(s.env.internal) conf.DocsProvider = s.env.internal conf.BloblangEnv = s.env.bloblangEnv.Deactivated() + conf.BloblangV2Env = s.env.getBloblangV2ParserEnv() return docs.NewLintContext(conf) } @@ -872,6 +873,7 @@ func (s *StreamBuilder) buildWithEnv(env *bundle.Environment) (*Stream, error) { manager.OptSetLogger(logger), manager.OptSetEnvironment(env), manager.OptSetBloblangEnvironment(s.env.getBloblangParserEnv()), + manager.OptSetBloblV2Environment(s.env.getBloblangV2ParserEnv()), ) if err != nil { return nil, err @@ -917,6 +919,7 @@ func (s *StreamBuilder) buildWithEnv(env *bundle.Environment) (*Stream, error) { manager.OptSetTracer(tracer), manager.OptSetEnvironment(env), manager.OptSetBloblangEnvironment(s.env.getBloblangParserEnv()), + manager.OptSetBloblV2Environment(s.env.getBloblangV2ParserEnv()), manager.OptSetFS(s.env.fs), ) if err != nil { diff --git a/public/service/stream_config_linter.go b/public/service/stream_config_linter.go index 5efba6976..978b5cb17 100644 --- a/public/service/stream_config_linter.go +++ b/public/service/stream_config_linter.go @@ -30,6 +30,7 @@ type StreamConfigLinter struct { func (s *ConfigSchema) NewStreamConfigLinter() *StreamConfigLinter { lintConf := docs.NewLintConfig(s.env.internal) lintConf.BloblangEnv = s.env.bloblangEnv.Deactivated() + lintConf.BloblangV2Env = 
s.env.getBloblangV2ParserEnv() return &StreamConfigLinter{ env: s.env, spec: s.fields, diff --git a/public/service/stream_schema.go b/public/service/stream_schema.go index 3055ab438..84884fe15 100644 --- a/public/service/stream_schema.go +++ b/public/service/stream_schema.go @@ -20,6 +20,7 @@ import ( "github.com/redpanda-data/benthos/v4/internal/stream" "github.com/redpanda-data/benthos/v4/internal/template" "github.com/redpanda-data/benthos/v4/public/bloblang" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" ) // ConfigSchema contains the definitions of all config fields for the overall @@ -164,9 +165,10 @@ pathLoop: // against plugin definitions that they themselves haven't imported. func ConfigSchemaFromJSONV0(jBytes []byte) (*ConfigSchema, error) { emptyEnvironment := &Environment{ - internal: bundle.NewEnvironment(), - bloblangEnv: bloblang.NewEmptyEnvironment().WithDisabledImports(), - fs: ifs.OS(), // TODO: Isolate this as well? + internal: bundle.NewEnvironment(), + bloblangEnv: bloblang.NewEmptyEnvironment().WithDisabledImports(), + bloblangV2Env: bloblangv2.NewEmptyEnvironment(), + fs: ifs.OS(), // TODO: Isolate this as well? 
} var tmpSchema rawMessageSchema @@ -180,6 +182,9 @@ func ConfigSchemaFromJSONV0(jBytes []byte) (*ConfigSchema, error) { if err := expandBloblEnvWithSchema(&tmpSchema, emptyEnvironment.bloblangEnv); err != nil { return nil, err } + if err := expandBloblV2EnvWithSchema(&tmpSchema, emptyEnvironment.bloblangV2Env); err != nil { + return nil, err + } return &ConfigSchema{ version: tmpSchema.Version, dateBuilt: tmpSchema.Date, @@ -204,21 +209,34 @@ func (s *ConfigSchema) MarshalJSONV0() ([]byte, error) { methodDocs = append(methodDocs, spec) }) + var v2Functions []bloblangv2.PluginInfo + var v2Methods []bloblangv2.PluginInfo + if v2Env := s.env.getBloblangV2ParserEnv(); v2Env != nil { + v2Env.WalkFunctions(func(_ string, view *bloblangv2.FunctionView) { + v2Functions = append(v2Functions, view.Info()) + }) + v2Env.WalkMethods(func(_ string, view *bloblangv2.MethodView) { + v2Methods = append(v2Methods, view.Info()) + }) + } + iSchema := schema.Full{ - Version: s.version, - Date: s.dateBuilt, - Config: s.fields, - Buffers: s.env.internal.BufferDocs(), - Caches: s.env.internal.CacheDocs(), - Inputs: s.env.internal.InputDocs(), - Outputs: s.env.internal.OutputDocs(), - Processors: s.env.internal.ProcessorDocs(), - RateLimits: s.env.internal.RateLimitDocs(), - Metrics: s.env.internal.MetricsDocs(), - Tracers: s.env.internal.TracersDocs(), - Scanners: s.env.internal.ScannerDocs(), - BloblangFunctions: functionDocs, - BloblangMethods: methodDocs, + Version: s.version, + Date: s.dateBuilt, + Config: s.fields, + Buffers: s.env.internal.BufferDocs(), + Caches: s.env.internal.CacheDocs(), + Inputs: s.env.internal.InputDocs(), + Outputs: s.env.internal.OutputDocs(), + Processors: s.env.internal.ProcessorDocs(), + RateLimits: s.env.internal.RateLimitDocs(), + Metrics: s.env.internal.MetricsDocs(), + Tracers: s.env.internal.TracersDocs(), + Scanners: s.env.internal.ScannerDocs(), + BloblangFunctions: functionDocs, + BloblangMethods: methodDocs, + BloblangV2Functions: v2Functions, + 
BloblangV2Methods: v2Methods, } return json.Marshal(iSchema) @@ -265,20 +283,22 @@ func (s *ConfigSchema) Fields(fs ...*ConfigField) *ConfigSchema { //------------------------------------------------------------------------------ type rawMessageSchema struct { - Version string `json:"version"` - Date string `json:"date"` - Config docs.FieldSpecs `json:"config,omitempty"` - Buffers []json.RawMessage `json:"buffers,omitempty"` - Caches []json.RawMessage `json:"caches,omitempty"` - Inputs []json.RawMessage `json:"inputs,omitempty"` - Outputs []json.RawMessage `json:"outputs,omitempty"` - Processors []json.RawMessage `json:"processors,omitempty"` - RateLimits []json.RawMessage `json:"rate-limits,omitempty"` - Metrics []json.RawMessage `json:"metrics,omitempty"` - Tracers []json.RawMessage `json:"tracers,omitempty"` - Scanners []json.RawMessage `json:"scanners,omitempty"` - BloblangFunctions []json.RawMessage `json:"bloblang-functions,omitempty"` - BloblangMethods []json.RawMessage `json:"bloblang-methods,omitempty"` + Version string `json:"version"` + Date string `json:"date"` + Config docs.FieldSpecs `json:"config,omitempty"` + Buffers []json.RawMessage `json:"buffers,omitempty"` + Caches []json.RawMessage `json:"caches,omitempty"` + Inputs []json.RawMessage `json:"inputs,omitempty"` + Outputs []json.RawMessage `json:"outputs,omitempty"` + Processors []json.RawMessage `json:"processors,omitempty"` + RateLimits []json.RawMessage `json:"rate-limits,omitempty"` + Metrics []json.RawMessage `json:"metrics,omitempty"` + Tracers []json.RawMessage `json:"tracers,omitempty"` + Scanners []json.RawMessage `json:"scanners,omitempty"` + BloblangFunctions []json.RawMessage `json:"bloblang-functions,omitempty"` + BloblangMethods []json.RawMessage `json:"bloblang-methods,omitempty"` + BloblangV2Functions []bloblangv2.PluginInfo `json:"bloblang-v2-functions,omitempty"` + BloblangV2Methods []bloblangv2.PluginInfo `json:"bloblang-v2-methods,omitempty"` } func nameAndBloblSpec(data 
[]byte) (string, *bloblang.PluginSpec, error) { @@ -340,6 +360,39 @@ func expandBloblEnvWithSchema(schema *rawMessageSchema, bEnv *bloblang.Environme return nil } +// expandBloblV2EnvWithSchema registers stub V2 plugins onto bV2Env for every +// plugin described in the schema, allowing a remote linter to validate +// configs that reference V2 plugins it does not itself implement. The stubs +// preserve the original plugin's signature (name, params, arity) so parsing +// and arity checks succeed; constructor calls always return an error since +// no executable implementation exists. +func expandBloblV2EnvWithSchema(schema *rawMessageSchema, bV2Env *bloblangv2.Environment) error { + registerStubFn := func(info bloblangv2.PluginInfo) error { + spec := bloblangv2.NewPluginSpecFromInfo(info) + return bV2Env.RegisterFunction(info.Name, spec, func(_ *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return nil, fmt.Errorf("function %v not enabled", info.Name) + }) + } + for _, info := range schema.BloblangV2Functions { + if err := registerStubFn(info); err != nil { + return err + } + } + + registerStubMethod := func(info bloblangv2.PluginInfo) error { + spec := bloblangv2.NewPluginSpecFromInfo(info) + return bV2Env.RegisterMethod(info.Name, spec, func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return nil, fmt.Errorf("method %v not enabled", info.Name) + }) + } + for _, info := range schema.BloblangV2Methods { + if err := registerStubMethod(info); err != nil { + return err + } + } + return nil +} + var errComponentDisabled = errors.New("component not enabled") func expandEnvWithSchema(schema *rawMessageSchema, env *Environment) error { From fdc0bd51aefa1991404e221286b35d437535a0c3 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 5 May 2026 11:45:57 +0100 Subject: [PATCH 17/20] bloblang(v2): Add bloblang_v2 and bloblang_v2_file processors Adds internal/impl/pure/processor_bloblang_v2.go and processor_bloblang_v2_file.go, the runtime processors 
that execute V2 mappings inside a Benthos pipeline. The _v2 processor takes a mapping inline; the _file processor reads it from disk, which is how the public/service config migrator emits long mappings. Both register against the public/service registry and parse via the configured public/bloblangv2 environment, so they pick up plugin methods and functions registered elsewhere in the binary (including the V1 stdlib parity ports). --- internal/impl/pure/processor_bloblang_v2.go | 204 ++++++++++++++++++ .../impl/pure/processor_bloblang_v2_file.go | 165 ++++++++++++++ .../pure/processor_bloblang_v2_file_test.go | 140 ++++++++++++ .../impl/pure/processor_bloblang_v2_test.go | 203 +++++++++++++++++ 4 files changed, 712 insertions(+) create mode 100644 internal/impl/pure/processor_bloblang_v2.go create mode 100644 internal/impl/pure/processor_bloblang_v2_file.go create mode 100644 internal/impl/pure/processor_bloblang_v2_file_test.go create mode 100644 internal/impl/pure/processor_bloblang_v2_test.go diff --git a/internal/impl/pure/processor_bloblang_v2.go b/internal/impl/pure/processor_bloblang_v2.go new file mode 100644 index 000000000..e7bc271a6 --- /dev/null +++ b/internal/impl/pure/processor_bloblang_v2.go @@ -0,0 +1,204 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "context" + "errors" + + tracing "github.com/redpanda-data/benthos/v4/internal/tracing/v2" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + "github.com/redpanda-data/benthos/v4/public/service" +) + +func init() { + service.MustRegisterBatchProcessor("bloblang_v2", bloblangV2ProcConfig(), + func(conf *service.ParsedConfig, mgr *service.Resources) (service.BatchProcessor, error) { + return newBloblangV2FromParsed(conf, mgr) + }) +} + +func bloblangV2ProcConfig() *service.ConfigSpec { + return service.NewConfigSpec(). + Categories("Mapping", "Parsing"). + Field(service.NewBloblangV2Field("")). 
+ Summary("Executes a Bloblang V2 mapping on messages, producing a new document that replaces (or filters) the original message."). + Description(` +Bloblang V2 is a redesigned mapping language with explicit input/output +contexts and deterministic evaluation. See the V2 specification in +`+"`internal/bloblang2/spec`"+` for the full language reference. + +== Input and output semantics + +Each message is evaluated with `+"`input`"+` bound to the incoming document +and `+"`output`"+` starting as an empty object. The mapping is expected to +build up `+"`output`"+` (or assign it directly), for example: + +`+"```"+` +output.id = input.id +output.fans = input.fans.filter(f -> f.obsession > 0.5) +`+"```"+` + +If the mapping assigns `+"`output = deleted()`"+` the message is filtered out +of the batch. If the mapping fails the original message continues down the +pipeline but is marked with the error via standard processor error handling. + +== Metadata + +Metadata follows V2 semantics: `+"`input@`"+` exposes the incoming metadata +(immutable) and `+"`output@`"+` starts as an empty object on every +invocation. Whatever the mapping writes to `+"`output@`"+` becomes the +metadata of the produced message — metadata is not preserved implicitly. +To copy the incoming metadata through, write: + +`+"```"+` +output@ = input@ +`+"```"+` + +This differs from the V1 `+"`mapping`"+` processor, which preserves metadata +by default. +`). 
+ Example("Mapping with metadata preserved", ` +Given JSON documents containing an array of fans, reduce them to just the ID +and the fans with an obsession score above 0.5, while keeping the original +metadata on the resulting message:`, + ` +pipeline: + processors: + - bloblang_v2: | + output@ = input@ + output.id = input.id + output.fans = input.fans.filter(fan -> fan.obsession > 0.5) +`) +} + +func newBloblangV2FromParsed(conf *service.ParsedConfig, mgr *service.Resources) (*bloblangV2Proc, error) { + exec, err := conf.FieldBloblangV2() + if err != nil { + return nil, err + } + return &bloblangV2Proc{exec: exec, log: mgr.Logger()}, nil +} + +type bloblangV2Proc struct { + exec *bloblangv2.Executor + log *service.Logger +} + +func (p *bloblangV2Proc) ProcessBatch(_ context.Context, batch service.MessageBatch) ([]service.MessageBatch, error) { + newBatch := make(service.MessageBatch, 0, len(batch)) + batchSize := len(batch) + for i, msg := range batch { + input, err := messageInputValue(msg) + if err != nil { + newMsg := msg.Copy() + newMsg.SetError(err) + p.log.Errorf("%v", err) + newBatch = append(newBatch, newMsg) + continue + } + + inputMeta := map[string]any{} + _ = msg.MetaWalkMut(func(k string, v any) error { + inputMeta[k] = v + return nil + }) + + ctx := &messageContext{ + msg: msg, + input: input, + meta: inputMeta, + batchIndex: i, + batchSize: batchSize, + } + output, outputMeta, err := p.exec.QueryMessage(ctx) + if err != nil { + if errors.Is(err, bloblangv2.ErrRootDeleted) { + continue + } + newMsg := msg.Copy() + newMsg.SetError(err) + p.log.Errorf("%v", err) + newBatch = append(newBatch, newMsg) + continue + } + + newMsg := msg.Copy() + switch v := output.(type) { + case []byte: + newMsg.SetBytes(v) + case string: + newMsg.SetBytes([]byte(v)) + default: + newMsg.SetStructured(output) + } + + // V2 metadata semantics: output@ starts empty, so the produced + // metadata fully replaces the message metadata on each invocation. 
+ _ = newMsg.MetaWalkMut(func(k string, _ any) error { + newMsg.MetaDelete(k) + return nil + }) + for k, v := range outputMeta { + newMsg.MetaSetMut(k, v) + } + + newBatch = append(newBatch, newMsg) + } + if len(newBatch) == 0 { + return nil, nil + } + return []service.MessageBatch{newBatch}, nil +} + +func (p *bloblangV2Proc) Close(context.Context) error { + return nil +} + +// messageInputValue returns the value to bind to `input` in the mapping. +// Structured messages parse to their JSON-equivalent Go value; raw messages +// fall back to their byte contents. The V2 interpreter normalises json.Number +// values internally, so callers do not need to pre-process them. +func messageInputValue(msg *service.Message) (any, error) { + if v, err := msg.AsStructured(); err == nil { + return v, nil + } + b, err := msg.AsBytes() + if err != nil { + return nil, err + } + return b, nil +} + +// messageContext is the adapter that exposes a service.Message + its +// position within a batch through the bloblangv2.MessageContext +// surface. The bundled batch-3 stdlib (batch_index, content, error, +// tracing_id, ...) reads from this adapter. 
+type messageContext struct { + msg *service.Message + input any + meta map[string]any + batchIndex int + batchSize int +} + +func (c *messageContext) Input() any { return c.input } +func (c *messageContext) Metadata() map[string]any { return c.meta } +func (c *messageContext) BatchIndex() int { return c.batchIndex } +func (c *messageContext) BatchSize() int { return c.batchSize } +func (c *messageContext) Error() error { return c.msg.GetError() } +func (c *messageContext) TraceID() string { return tracing.GetTraceID(c.msg) } +func (c *messageContext) Span() any { + if s := tracing.GetSpan(c.msg); s != nil { + return s + } + return nil +} + +func (c *messageContext) Bytes() []byte { + b, err := c.msg.AsBytes() + if err != nil { + return nil + } + return b +} diff --git a/internal/impl/pure/processor_bloblang_v2_file.go b/internal/impl/pure/processor_bloblang_v2_file.go new file mode 100644 index 000000000..f53e0f3a3 --- /dev/null +++ b/internal/impl/pure/processor_bloblang_v2_file.go @@ -0,0 +1,165 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "context" + "errors" + "fmt" + "io" + + "github.com/redpanda-data/benthos/v4/internal/component/interop" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + "github.com/redpanda-data/benthos/v4/public/service" +) + +func init() { + service.MustRegisterBatchProcessor("bloblang_v2_file", bloblangV2FileProcConfig(), + func(conf *service.ParsedConfig, mgr *service.Resources) (service.BatchProcessor, error) { + return newBloblangV2FileFromParsed(conf, mgr) + }) +} + +func bloblangV2FileProcConfig() *service.ConfigSpec { + return service.NewConfigSpec(). + Categories("Mapping", "Parsing"). + Field(service.NewStringField(""). + Description("Path to a Bloblang V2 mapping file. The file is read once at processor construction; subsequent file changes are picked up only when the config is reloaded.")). + Summary("Executes a Bloblang V2 mapping loaded from a file on disk."). 
+ Description(` +Counterpart to the inline `+"`bloblang_v2`"+` processor for cases where the +mapping lives in its own file. The file is loaded and compiled once when the +processor is constructed; the resulting executor is reused for every message, +so there is no per-message file-system overhead. + +This is the V2 equivalent of writing `+"`bloblang: 'from \"path\"'`"+` against +the V1 processor. The migrator rewrites such configs to this processor when +upgrading to V2. + +Paths are resolved through the host filesystem (typically relative to the +working directory the process started in). + +== Imports + +The file is parsed as a self-contained V2 mapping. V2 `+"`import`"+` +statements within the file are not currently resolved by this processor — +mappings that need imports should keep them inline via `+"`bloblang_v2`"+` +or wait for follow-up support here. +`). + Example("File-backed mapping", ` +Given a mapping file `+"`./mappings/extract.blobl`"+` containing: + +`+"```"+` +output.id = input.id +output.fans = input.fans.filter(f -> f.obsession > 0.5) +`+"```"+` + +The pipeline references it by path:`, + ` +pipeline: + processors: + - bloblang_v2_file: ./mappings/extract.blobl +`) +} + +func newBloblangV2FileFromParsed(conf *service.ParsedConfig, mgr *service.Resources) (*bloblangV2FileProc, error) { + path, err := conf.FieldString() + if err != nil { + return nil, err + } + if path == "" { + return nil, errors.New("bloblang_v2_file: path is required") + } + + f, err := mgr.FS().Open(path) + if err != nil { + return nil, fmt.Errorf("bloblang_v2_file: opening %q: %w", path, err) + } + defer f.Close() + + srcBytes, err := io.ReadAll(f) + if err != nil { + return nil, fmt.Errorf("bloblang_v2_file: reading %q: %w", path, err) + } + + exec, err := interop.UnwrapManagement(mgr).BloblV2Environment().Parse(string(srcBytes)) + if err != nil { + return nil, fmt.Errorf("bloblang_v2_file: parsing %q: %w", path, err) + } + + return &bloblangV2FileProc{exec: exec, log: 
mgr.Logger(), path: path}, nil +} + +type bloblangV2FileProc struct { + exec *bloblangv2.Executor + log *service.Logger + path string +} + +func (p *bloblangV2FileProc) ProcessBatch(_ context.Context, batch service.MessageBatch) ([]service.MessageBatch, error) { + newBatch := make(service.MessageBatch, 0, len(batch)) + batchSize := len(batch) + for i, msg := range batch { + input, err := messageInputValue(msg) + if err != nil { + newMsg := msg.Copy() + newMsg.SetError(err) + p.log.Errorf("%v", err) + newBatch = append(newBatch, newMsg) + continue + } + + inputMeta := map[string]any{} + _ = msg.MetaWalkMut(func(k string, v any) error { + inputMeta[k] = v + return nil + }) + + ctx := &messageContext{ + msg: msg, + input: input, + meta: inputMeta, + batchIndex: i, + batchSize: batchSize, + } + output, outputMeta, err := p.exec.QueryMessage(ctx) + if err != nil { + if errors.Is(err, bloblangv2.ErrRootDeleted) { + continue + } + newMsg := msg.Copy() + newMsg.SetError(err) + p.log.Errorf("%v", err) + newBatch = append(newBatch, newMsg) + continue + } + + newMsg := msg.Copy() + switch v := output.(type) { + case []byte: + newMsg.SetBytes(v) + case string: + newMsg.SetBytes([]byte(v)) + default: + newMsg.SetStructured(output) + } + + _ = newMsg.MetaWalkMut(func(k string, _ any) error { + newMsg.MetaDelete(k) + return nil + }) + for k, v := range outputMeta { + newMsg.MetaSetMut(k, v) + } + + newBatch = append(newBatch, newMsg) + } + if len(newBatch) == 0 { + return nil, nil + } + return []service.MessageBatch{newBatch}, nil +} + +func (p *bloblangV2FileProc) Close(context.Context) error { + return nil +} diff --git a/internal/impl/pure/processor_bloblang_v2_file_test.go b/internal/impl/pure/processor_bloblang_v2_file_test.go new file mode 100644 index 000000000..d179e9d52 --- /dev/null +++ b/internal/impl/pure/processor_bloblang_v2_file_test.go @@ -0,0 +1,140 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package pure + +import ( + "os" + "path/filepath" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/service" +) + +func parseBloblangV2FileProc(t *testing.T, path string) (*bloblangV2FileProc, error) { + t.Helper() + conf, err := bloblangV2FileProcConfig().ParseYAML(path, nil) + require.NoError(t, err) + return newBloblangV2FileFromParsed(conf, service.MockResources()) +} + +func TestBloblangV2FileProcessorReadsAndExecutes(t *testing.T) { + dir := t.TempDir() + mappingPath := filepath.Join(dir, "mapping.blobl") + require.NoError(t, os.WriteFile(mappingPath, []byte( + `output.upper_name = input.name.uppercase()`+"\n", + ), 0o644)) + + proc, err := parseBloblangV2FileProc(t, mappingPath) + require.NoError(t, err) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + msg := service.NewMessage(nil) + msg.SetStructured(map[string]any{"name": "blob"}) + + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{msg}) + require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 1) + + got, err := batches[0][0].AsStructured() + require.NoError(t, err) + assert.Equal(t, map[string]any{"upper_name": "BLOB"}, got) +} + +func TestBloblangV2FileProcessorPreservesPath(t *testing.T) { + dir := t.TempDir() + mappingPath := filepath.Join(dir, "mapping.blobl") + require.NoError(t, os.WriteFile(mappingPath, []byte(`output = input`), 0o644)) + + proc, err := parseBloblangV2FileProc(t, mappingPath) + require.NoError(t, err) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + assert.Equal(t, mappingPath, proc.path) +} + +func TestBloblangV2FileProcessorMissingFile(t *testing.T) { + _, err := parseBloblangV2FileProc(t, "/does/not/exist.blobl") + require.Error(t, err) + assert.Contains(t, err.Error(), "bloblang_v2_file") + assert.Contains(t, err.Error(), "opening") +} + +func TestBloblangV2FileProcessorEmptyPath(t *testing.T) { + _, err := 
parseBloblangV2FileProc(t, `""`) + require.Error(t, err) + assert.Contains(t, err.Error(), "path is required") +} + +func TestBloblangV2FileProcessorInvalidMapping(t *testing.T) { + dir := t.TempDir() + mappingPath := filepath.Join(dir, "broken.blobl") + require.NoError(t, os.WriteFile(mappingPath, []byte(`@@@ not bloblang @@@`), 0o644)) + + _, err := parseBloblangV2FileProc(t, mappingPath) + require.Error(t, err) + assert.Contains(t, err.Error(), "parsing") +} + +// TestBloblangV2FileProcessorLintsClean exercises the schema-level +// linter against a stream config containing the new processor, +// ensuring the registration is wired correctly into the global env +// and the field shape matches what users will write. +func TestBloblangV2FileProcessorLintsClean(t *testing.T) { + dir := t.TempDir() + mappingPath := filepath.Join(dir, "mapping.blobl") + require.NoError(t, os.WriteFile(mappingPath, []byte(`output = input`), 0o644)) + + yamlConfig := []byte(` +input: + generate: + count: 1 + interval: "" + mapping: 'root = {"id": "abc"}' +pipeline: + processors: + - bloblang_v2_file: ` + mappingPath + ` +output: + drop: {} +`) + + schema := service.GlobalEnvironment().FullConfigSchema("", "") + linter := schema.NewStreamConfigLinter() + lints, err := linter.LintYAML(yamlConfig) + require.NoError(t, err) + for _, l := range lints { + t.Errorf("unexpected lint: %+v", l) + } +} + +func TestBloblangV2FileProcessorMetadataOverwrite(t *testing.T) { + dir := t.TempDir() + mappingPath := filepath.Join(dir, "meta.blobl") + require.NoError(t, os.WriteFile(mappingPath, []byte(` +output = input +output@.stamp = "added" +`), 0o644)) + + proc, err := parseBloblangV2FileProc(t, mappingPath) + require.NoError(t, err) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + msg := service.NewMessage(nil) + msg.SetStructured(map[string]any{"id": "x"}) + msg.MetaSetMut("original", "should-be-dropped") + + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{msg}) + 
require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 1) + + stamp, ok := batches[0][0].MetaGet("stamp") + require.True(t, ok) + assert.Equal(t, "added", stamp) + + _, exists := batches[0][0].MetaGet("original") + assert.False(t, exists, "V2 metadata semantics should drop input metadata not assigned in output@") +} diff --git a/internal/impl/pure/processor_bloblang_v2_test.go b/internal/impl/pure/processor_bloblang_v2_test.go new file mode 100644 index 000000000..943a2fba7 --- /dev/null +++ b/internal/impl/pure/processor_bloblang_v2_test.go @@ -0,0 +1,203 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "errors" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/service" +) + +func parseBloblangV2Proc(t *testing.T, mapping string) *bloblangV2Proc { + t.Helper() + conf, err := bloblangV2ProcConfig().ParseYAML(mapping, nil) + require.NoError(t, err) + proc, err := newBloblangV2FromParsed(conf, service.MockResources()) + require.NoError(t, err) + return proc +} + +func TestBloblangV2ProcessorStructuredMapping(t *testing.T) { + proc := parseBloblangV2Proc(t, `| + output.id = input.id + output.fans = input.fans.filter(fan -> fan.obsession > 0.5) +`) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + msg := service.NewMessage(nil) + msg.SetStructured(map[string]any{ + "id": "foo", + "fans": []any{map[string]any{"obsession": 0.8}, map[string]any{"obsession": 0.2}}, + }) + + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{msg}) + require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 1) + + got, err := batches[0][0].AsStructured() + require.NoError(t, err) + assert.Equal(t, map[string]any{ + "id": "foo", + "fans": []any{map[string]any{"obsession": 0.8}}, + }, got) +} + +func TestBloblangV2ProcessorRootDeletionFilters(t *testing.T) { + proc := parseBloblangV2Proc(t, `| + output = if 
input.drop == true { deleted() } else { input } +`) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + keep := service.NewMessage(nil) + keep.SetStructured(map[string]any{"drop": false, "id": "a"}) + + drop := service.NewMessage(nil) + drop.SetStructured(map[string]any{"drop": true, "id": "b"}) + + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{keep, drop}) + require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 1, "dropped message should be filtered out") + + got, err := batches[0][0].AsStructured() + require.NoError(t, err) + assert.Equal(t, map[string]any{"drop": false, "id": "a"}, got) +} + +func TestBloblangV2ProcessorAllDeletedReturnsEmpty(t *testing.T) { + proc := parseBloblangV2Proc(t, `output = deleted()`) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + msg := service.NewMessage([]byte(`{"x":1}`)) + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{msg}) + require.NoError(t, err) + assert.Empty(t, batches, "batch with every message deleted should collapse to nothing") +} + +func TestBloblangV2ProcessorMetadataReplacement(t *testing.T) { + // V2 semantics: output@ starts empty on each invocation. A mapping that + // writes only one key should leave the produced message with only that + // one key, regardless of what was on the incoming message. 
+ proc := parseBloblangV2Proc(t, `| + output = input + output@.kept = "yes" +`) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + msg := service.NewMessage(nil) + msg.SetStructured(map[string]any{"v": 1}) + msg.MetaSetMut("will_be_dropped", "original") + + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{msg}) + require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 1) + + out := batches[0][0] + _, exists := out.MetaGetMut("will_be_dropped") + assert.False(t, exists, "incoming metadata should not leak when mapping does not copy it") + + v, ok := out.MetaGetMut("kept") + require.True(t, ok) + assert.Equal(t, "yes", v) +} + +func TestBloblangV2ProcessorMetadataCopyThrough(t *testing.T) { + proc := parseBloblangV2Proc(t, `| + output = input + output@ = input@ + output@.added = "new" +`) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + msg := service.NewMessage(nil) + msg.SetStructured(map[string]any{"v": 1}) + msg.MetaSetMut("kept_from_input", "original") + + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{msg}) + require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 1) + + out := batches[0][0] + v, ok := out.MetaGetMut("kept_from_input") + require.True(t, ok) + assert.Equal(t, "original", v) + + v, ok = out.MetaGetMut("added") + require.True(t, ok) + assert.Equal(t, "new", v) +} + +func TestBloblangV2ProcessorBatchPositionAndContent(t *testing.T) { + proc := parseBloblangV2Proc(t, `| + output.idx = batch_index() + output.size = batch_size() + output.raw = content().string() +`) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + msgs := service.MessageBatch{ + service.NewMessage([]byte(`{"v":1}`)), + service.NewMessage([]byte(`{"v":2}`)), + service.NewMessage([]byte(`{"v":3}`)), + } + + batches, err := proc.ProcessBatch(t.Context(), msgs) + require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 3) + + for i, m := range batches[0] { 
+ got, err := m.AsStructured() + require.NoError(t, err) + out := got.(map[string]any) + assert.Equal(t, int64(i), out["idx"], "batch_index for message %d", i) + assert.Equal(t, int64(3), out["size"], "batch_size") + assert.NotEmpty(t, out["raw"], "content() should expose raw bytes") + } +} + +func TestBloblangV2ProcessorErrorIntrospection(t *testing.T) { + proc := parseBloblangV2Proc(t, `| + output.failed = errored() + output.err = error() +`) + t.Cleanup(func() { _ = proc.Close(t.Context()) }) + + clean := service.NewMessage(nil) + clean.SetStructured(map[string]any{"x": 1}) + + bad := service.NewMessage(nil) + bad.SetStructured(map[string]any{"x": 2}) + bad.SetError(errors.New("kapow")) + + batches, err := proc.ProcessBatch(t.Context(), service.MessageBatch{clean, bad}) + require.NoError(t, err) + require.Len(t, batches, 1) + require.Len(t, batches[0], 2) + + cleanOut, err := batches[0][0].AsStructured() + require.NoError(t, err) + assert.Equal(t, map[string]any{"failed": false, "err": nil}, cleanOut) + + badOut, err := batches[0][1].AsStructured() + require.NoError(t, err) + badMap := badOut.(map[string]any) + assert.Equal(t, true, badMap["failed"]) + errObj, ok := badMap["err"].(map[string]any) + require.True(t, ok, "error() should resolve to a structured object") + assert.Equal(t, "kapow", errObj["what"]) +} + +func TestBloblangV2ProcessorParseErrorAtConstruction(t *testing.T) { + conf, err := bloblangV2ProcConfig().ParseYAML(`output = nope(`, nil) + require.NoError(t, err) + _, err = newBloblangV2FromParsed(conf, service.MockResources()) + assert.Error(t, err) +} From 2be746f9b61647d00eddab644d316f0017e84e60 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 28 Apr 2026 14:16:52 +0100 Subject: [PATCH 18/20] bloblang(v2): Port V1 standard library methods and functions Adds V2 implementations of V1 stdlib methods and functions, registered against the global public/bloblangv2 environment via init() side-effects. 
Pure (deterministic) helpers live under internal/impl/pure/bloblangv2_*.go; impure (random, time-based) helpers live under internal/impl/io/bloblangv2_*.go. Coverage: - pure: arrays, crypto (AES, hashes), encoding (JSON, YAML, CSV, base64, JSON schema, URLs), numbers (bitwise, log family, min, max), objects, parsing (parse_json, format_json, escapes), regex (replace, replace_many, find/find_all variants), string (case, trim, replace, hash, filepath, uuid_v5), time (deterministic timestamp formatting and arithmetic). - io: ids (uuid_v7, ksuid, nanoid), time (timestamp_unix variants that touch the wall clock). Each port mirrors the V1 method or function signature so the V1 -> V2 translator can rewrite call sites with no semantic change. The internal/bloblang2/PARITY.md table tracks remaining gaps. --- internal/impl/io/bloblangv2_ids.go | 100 +++++++ internal/impl/io/bloblangv2_ids_test.go | 88 ++++++ internal/impl/io/bloblangv2_time.go | 62 ++++ internal/impl/io/bloblangv2_time_test.go | 49 ++++ internal/impl/pure/bloblangv2_arrays.go | 271 ++++++++++++++++++ internal/impl/pure/bloblangv2_arrays_test.go | 172 +++++++++++ internal/impl/pure/bloblangv2_crypto.go | 162 +++++++++++ internal/impl/pure/bloblangv2_crypto_test.go | 72 +++++ internal/impl/pure/bloblangv2_encoding.go | 253 ++++++++++++++++ .../impl/pure/bloblangv2_encoding_test.go | 93 ++++++ internal/impl/pure/bloblangv2_numbers.go | 194 +++++++++++++ internal/impl/pure/bloblangv2_numbers_test.go | 102 +++++++ internal/impl/pure/bloblangv2_objects.go | 240 ++++++++++++++++ internal/impl/pure/bloblangv2_objects_test.go | 139 +++++++++ internal/impl/pure/bloblangv2_parsing.go | 201 +++++++++++++ internal/impl/pure/bloblangv2_parsing_test.go | 95 ++++++ internal/impl/pure/bloblangv2_regex.go | 128 +++++++++ internal/impl/pure/bloblangv2_regex_test.go | 57 ++++ internal/impl/pure/bloblangv2_string.go | 250 ++++++++++++++++ internal/impl/pure/bloblangv2_string_test.go | 178 ++++++++++++ 
internal/impl/pure/bloblangv2_time.go | 113 ++++++++ internal/impl/pure/bloblangv2_time_test.go | 81 ++++++ 22 files changed, 3100 insertions(+) create mode 100644 internal/impl/io/bloblangv2_ids.go create mode 100644 internal/impl/io/bloblangv2_ids_test.go create mode 100644 internal/impl/io/bloblangv2_time.go create mode 100644 internal/impl/io/bloblangv2_time_test.go create mode 100644 internal/impl/pure/bloblangv2_arrays.go create mode 100644 internal/impl/pure/bloblangv2_arrays_test.go create mode 100644 internal/impl/pure/bloblangv2_crypto.go create mode 100644 internal/impl/pure/bloblangv2_crypto_test.go create mode 100644 internal/impl/pure/bloblangv2_encoding.go create mode 100644 internal/impl/pure/bloblangv2_encoding_test.go create mode 100644 internal/impl/pure/bloblangv2_numbers.go create mode 100644 internal/impl/pure/bloblangv2_numbers_test.go create mode 100644 internal/impl/pure/bloblangv2_objects.go create mode 100644 internal/impl/pure/bloblangv2_objects_test.go create mode 100644 internal/impl/pure/bloblangv2_parsing.go create mode 100644 internal/impl/pure/bloblangv2_parsing_test.go create mode 100644 internal/impl/pure/bloblangv2_regex.go create mode 100644 internal/impl/pure/bloblangv2_regex_test.go create mode 100644 internal/impl/pure/bloblangv2_string.go create mode 100644 internal/impl/pure/bloblangv2_string_test.go create mode 100644 internal/impl/pure/bloblangv2_time.go create mode 100644 internal/impl/pure/bloblangv2_time_test.go diff --git a/internal/impl/io/bloblangv2_ids.go b/internal/impl/io/bloblangv2_ids.go new file mode 100644 index 000000000..09ab30cba --- /dev/null +++ b/internal/impl/io/bloblangv2_ids.go @@ -0,0 +1,100 @@ +// Copyright 2026 Redpanda Data, Inc. + +package io + +import ( + "errors" + "fmt" + "time" + + "github.com/gofrs/uuid/v5" + gonanoid "github.com/matoous/go-nanoid/v2" + "github.com/segmentio/ksuid" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 ID-generating functions. 
All non-deterministic (random or +// time-based), so they live in internal/impl/io. + +func init() { + bloblangv2.MustRegisterFunction("ksuid", + bloblangv2.NewPluginSpec(). + Category("General"). + Description("Generates a K-Sortable Unique Identifier (KSUID) whose timestamp prefix orders IDs to one-second resolution."). + Impure(), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { + return ksuid.New().String(), nil + }, nil + }, + ) + + bloblangv2.MustRegisterFunction("nanoid", + bloblangv2.NewPluginSpec(). + Category("General"). + Description("Generates a URL-safe unique identifier using Nano ID. Customise the length (default 21) and supply an alphabet to control the character set; alphabet requires length to also be supplied."). + Param(bloblangv2.NewInt64Param("length").Description("Optional length.").Optional()). + Param(bloblangv2.NewStringParam("alphabet").Description("Optional custom alphabet. When supplied, length must also be set.").Optional()). + Impure(), + nanoidV2Ctor, + ) + + bloblangv2.MustRegisterFunction("uuid_v7", + bloblangv2.NewPluginSpec(). + Category("General"). + Description("Generates a time-ordered UUID v7. Optionally pass a timestamp to back-date the time component."). + Param(bloblangv2.NewAnyParam("time").Description("Optional timestamp to use for the time-ordered portion of the UUID.").Optional()). 
+ Impure(), + uuidV7V2Ctor, + ) +} + +func nanoidV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + lenArg, err := args.GetOptionalInt64("length") + if err != nil { + return nil, err + } + alphabetArg, err := args.GetOptionalString("alphabet") + if err != nil { + return nil, err + } + if alphabetArg != nil && lenArg == nil { + return nil, errors.New("field length must be specified when an alphabet is specified") + } + return func() (any, error) { + if alphabetArg != nil { + return gonanoid.Generate(*alphabetArg, int(*lenArg)) + } + if lenArg != nil { + return gonanoid.New(int(*lenArg)) + } + return gonanoid.New() + }, nil +} + +func uuidV7V2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + rawTime, _ := args.Get("time") + var ts *time.Time + if rawTime != nil { + t, ok := rawTime.(time.Time) + if !ok { + return nil, fmt.Errorf("expected timestamp argument, got %T", rawTime) + } + ts = &t + } + return func() (any, error) { + if ts == nil { + u7, err := uuid.NewV7() + if err != nil { + return nil, fmt.Errorf("unable to generate uuid v7: %w", err) + } + return u7.String(), nil + } + u7, err := uuid.NewV7AtTime(*ts) + if err != nil { + return nil, fmt.Errorf("unable to generate uuid v7 at time %s: %w", *ts, err) + } + return u7.String(), nil + }, nil +} diff --git a/internal/impl/io/bloblangv2_ids_test.go b/internal/impl/io/bloblangv2_ids_test.go new file mode 100644 index 000000000..c13b64bf0 --- /dev/null +++ b/internal/impl/io/bloblangv2_ids_test.go @@ -0,0 +1,88 @@ +// Copyright 2026 Redpanda Data, Inc. + +package io_test + +import ( + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/io" +) + +func TestBloblangV2KsuidShape(t *testing.T) { + got := runBloblangV2(t, `output = ksuid()`, nil).(string) + // KSUIDs are 27 base62 characters. 
+ if len(got) != 27 { + t.Fatalf("ksuid length=%d, want 27 (got %q)", len(got), got) + } +} + +func TestBloblangV2KsuidIsRandom(t *testing.T) { + a := runBloblangV2(t, `output = ksuid()`, nil).(string) + b := runBloblangV2(t, `output = ksuid()`, nil).(string) + assert.NotEqual(t, a, b) +} + +func TestBloblangV2NanoidDefault(t *testing.T) { + got := runBloblangV2(t, `output = nanoid()`, nil).(string) + // Default Nano ID length is 21. + if len(got) != 21 { + t.Fatalf("nanoid length=%d, want 21 (got %q)", len(got), got) + } +} + +func TestBloblangV2NanoidCustomLength(t *testing.T) { + got := runBloblangV2(t, `output = nanoid(10)`, nil).(string) + assert.Len(t, got, 10) +} + +func TestBloblangV2NanoidCustomAlphabet(t *testing.T) { + got := runBloblangV2(t, `output = nanoid(8, "abc")`, nil).(string) + assert.Len(t, got, 8) + for _, r := range got { + assert.Contains(t, "abc", string(r)) + } +} + +func TestBloblangV2NanoidAlphabetWithoutLengthErrors(t *testing.T) { + // Named args bypass V2's static-arg folding so the constructor runs at + // query time. The validation error surfaces from Query, not Parse. + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = nanoid(alphabet: "abc")`) + require.NoError(t, err) + _, err = exec.Query(nil) + require.Error(t, err) + assert.Contains(t, err.Error(), "length") +} + +func TestBloblangV2UUIDV7Shape(t *testing.T) { + got := runBloblangV2(t, `output = uuid_v7()`, nil).(string) + // Standard UUID string form is 36 characters. + if len(got) != 36 { + t.Fatalf("uuid_v7 length=%d, want 36 (got %q)", len(got), got) + } + // The UUID version digit (the first nibble of the third group) is "7". 
+ if !strings.HasPrefix(strings.Split(got, "-")[2], "7") { + t.Fatalf("uuid_v7 third group should start with '7': %q", got) + } +} + +func TestBloblangV2UUIDV7IsRandom(t *testing.T) { + a := runBloblangV2(t, `output = uuid_v7()`, nil).(string) + b := runBloblangV2(t, `output = uuid_v7()`, nil).(string) + assert.NotEqual(t, a, b) +} + +func TestBloblangV2UUIDV7WithTimestamp(t *testing.T) { + // Pass a parsed timestamp to back-date the UUID. V2 ts_parse uses + // strftime-style format strings, not Go's reference-time format. + got := runBloblangV2(t, + `output = uuid_v7("2024-01-01T00:00:00Z".ts_parse("%Y-%m-%dT%H:%M:%SZ"))`, + nil, + ).(string) + assert.Len(t, got, 36) +} diff --git a/internal/impl/io/bloblangv2_time.go b/internal/impl/io/bloblangv2_time.go new file mode 100644 index 000000000..f04ec315f --- /dev/null +++ b/internal/impl/io/bloblangv2_time.go @@ -0,0 +1,62 @@ +// Copyright 2026 Redpanda Data, Inc. + +package io + +import ( + "time" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 timestamp_unix* functions. These read the wall clock and +// are therefore impure; they live in internal/impl/io. + +func init() { + bloblangv2.MustRegisterFunction("timestamp_unix", + bloblangv2.NewPluginSpec(). + Category("Environment"). + Description("Returns the current Unix timestamp in seconds since epoch."). + Impure(), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { + return time.Now().Unix(), nil + }, nil + }, + ) + + bloblangv2.MustRegisterFunction("timestamp_unix_milli", + bloblangv2.NewPluginSpec(). + Category("Environment"). + Description("Returns the current Unix timestamp in milliseconds since epoch."). + Impure(), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { + return time.Now().UnixMilli(), nil + }, nil + }, + ) + + bloblangv2.MustRegisterFunction("timestamp_unix_micro", + bloblangv2.NewPluginSpec(). 
+ Category("Environment"). + Description("Returns the current Unix timestamp in microseconds since epoch."). + Impure(), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { + return time.Now().UnixMicro(), nil + }, nil + }, + ) + + bloblangv2.MustRegisterFunction("timestamp_unix_nano", + bloblangv2.NewPluginSpec(). + Category("Environment"). + Description("Returns the current Unix timestamp in nanoseconds since epoch."). + Impure(), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { + return time.Now().UnixNano(), nil + }, nil + }, + ) +} diff --git a/internal/impl/io/bloblangv2_time_test.go b/internal/impl/io/bloblangv2_time_test.go new file mode 100644 index 000000000..da4d9e5bc --- /dev/null +++ b/internal/impl/io/bloblangv2_time_test.go @@ -0,0 +1,49 @@ +// Copyright 2026 Redpanda Data, Inc. + +package io_test + +import ( + "testing" + "time" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/io" +) + +func runBloblangV2(t *testing.T, mapping string, input any) any { + t.Helper() + exec, err := bloblangv2.GlobalEnvironment().Parse(mapping) + require.NoError(t, err) + out, err := exec.Query(input) + require.NoError(t, err) + return out +} + +func TestBloblangV2TimestampUnix(t *testing.T) { + now := time.Now().Unix() + got := runBloblangV2(t, `output = timestamp_unix()`, nil).(int64) + // Allow 5s tolerance for test flakiness. 
+ assert.InDelta(t, now, got, 5) +} + +func TestBloblangV2TimestampUnixMilli(t *testing.T) { + now := time.Now().UnixMilli() + got := runBloblangV2(t, `output = timestamp_unix_milli()`, nil).(int64) + assert.InDelta(t, now, got, 5000) +} + +func TestBloblangV2TimestampUnixMicro(t *testing.T) { + now := time.Now().UnixMicro() + got := runBloblangV2(t, `output = timestamp_unix_micro()`, nil).(int64) + assert.InDelta(t, now, got, 5_000_000) +} + +func TestBloblangV2TimestampUnixNano(t *testing.T) { + now := time.Now().UnixNano() + got := runBloblangV2(t, `output = timestamp_unix_nano()`, nil).(int64) + assert.InDelta(t, now, got, 5_000_000_000) +} diff --git a/internal/impl/pure/bloblangv2_arrays.go b/internal/impl/pure/bloblangv2_arrays.go new file mode 100644 index 000000000..192f69dd9 --- /dev/null +++ b/internal/impl/pure/bloblangv2_arrays.go @@ -0,0 +1,271 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "errors" + "fmt" + + "github.com/Jeffail/gabs/v2" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 array / sequence methods, including the lambda-based +// find_by, find_all_by, and map_each_key. + +func init() { + bloblangv2.MustRegisterMethod("enumerated", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description(`Alias for V2 enumerate(): transforms an array into [{"index": i, "value": v}, ...]. Retained for V1 parity.`), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.ArrayMethod(func(arr []any) (any, error) { + out := make([]any, len(arr)) + for i, v := range arr { + out[i] = map[string]any{"index": int64(i), "value": v} + } + return out, nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("find_all", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Returns the indexes of every element in the receiver array equal to the value argument. Empty array if none match."). 
+ Param(bloblangv2.NewAnyParam("value").Description("The value to search for.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + target, err := args.Get("value") + if err != nil { + return nil, err + } + return bloblangv2.ArrayMethod(func(arr []any) (any, error) { + out := []any{} + for i, elem := range arr { + if valuesLooselyEqual(elem, target) { + out = append(out, int64(i)) + } + } + return out, nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("index", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Returns the element at the given index. Negative indexes count from the end (-1 is the last element). Errors if the index is out of bounds."). + Param(bloblangv2.NewInt64Param("index").Description("The index to extract.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + idx, err := args.GetInt64("index") + if err != nil { + return nil, err + } + return func(v any) (any, error) { + switch arr := v.(type) { + case []any: + i := int(idx) + if i < 0 { + i = len(arr) + i + } + if i < 0 || i >= len(arr) { + return nil, fmt.Errorf("index '%v' was out of bounds for array size: %v", idx, len(arr)) + } + return arr[i], nil + case []byte: + i := int(idx) + if i < 0 { + i = len(arr) + i + } + if i < 0 || i >= len(arr) { + return nil, fmt.Errorf("index '%v' was out of bounds for byte array size: %v", idx, len(arr)) + } + return int64(arr[i]), nil + } + return nil, fmt.Errorf("expected array or bytes receiver, got %T", v) + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("not_empty", + bloblangv2.NewPluginSpec(). + Category("Coercion"). + Description("Returns the receiver unchanged if it is a non-empty string, array, or object, otherwise errors. 
Useful for asserting required values."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return func(v any) (any, error) { + switch t := v.(type) { + case string: + if t == "" { + return nil, errors.New("string value is empty") + } + case []any: + if len(t) == 0 { + return nil, errors.New("array value is empty") + } + case map[string]any: + if len(t) == 0 { + return nil, errors.New("object value is empty") + } + default: + return nil, fmt.Errorf("expected string, array, or object receiver, got %T", v) + } + return v, nil + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("collapse", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description(`Flattens a nested object / array into a single-level object whose keys are dot-paths. Empty arrays and objects are excluded by default; pass include_empty: true to keep them.`). + Param(bloblangv2.NewBoolParam("include_empty").Description("Include empty objects and arrays in the result.").Default(false)), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + includeEmpty, err := args.GetBool("include_empty") + if err != nil { + return nil, err + } + return func(v any) (any, error) { + g := gabs.Wrap(v) + if includeEmpty { + return g.FlattenIncludeEmpty() + } + return g.Flatten() + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("key_values", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description(`Converts an object into an array of {"key": ..., "value": ...} entries. Order is unspecified — sort with sort_by(p -> p.key) if needed.`), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.ObjectMethod(func(m map[string]any) (any, error) { + out := make([]any, 0, len(m)) + for k, v := range m { + out = append(out, map[string]any{"key": k, "value": v}) + } + return out, nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("find_by", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). 
+ Description("Returns the index of the first element of the receiver array for which the predicate returns true, or -1 if none match."). + Param(bloblangv2.NewLambdaParam("query").Description("A predicate evaluated against each element. Must return a bool.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + pred, err := args.GetLambda("query") + if err != nil { + return nil, err + } + return bloblangv2.ArrayMethod(func(arr []any) (any, error) { + for i, elem := range arr { + out, err := pred(elem) + if err != nil { + return nil, fmt.Errorf("predicate failed for index %d: %w", i, err) + } + b, ok := out.(bool) + if !ok { + return nil, fmt.Errorf("predicate returned non-bool value for index %d: %T", i, out) + } + if b { + return int64(i), nil + } + } + return int64(-1), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("find_all_by", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Returns the indexes of every element of the receiver array for which the predicate returns true. Empty array if none match."). + Param(bloblangv2.NewLambdaParam("query").Description("A predicate evaluated against each element. Must return a bool.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + pred, err := args.GetLambda("query") + if err != nil { + return nil, err + } + return bloblangv2.ArrayMethod(func(arr []any) (any, error) { + out := []any{} + for i, elem := range arr { + res, err := pred(elem) + if err != nil { + return nil, fmt.Errorf("predicate failed for index %d: %w", i, err) + } + b, ok := res.(bool) + if !ok { + return nil, fmt.Errorf("predicate returned non-bool value for index %d: %T", i, res) + } + if b { + out = append(out, int64(i)) + } + } + return out, nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("map_each_key", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Returns a new object with each key transformed by the lambda. 
The lambda receives the original key as a string and must return a new string key."). + Param(bloblangv2.NewLambdaParam("query").Description("A lambda that returns the new key for each entry.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + fn, err := args.GetLambda("query") + if err != nil { + return nil, err + } + return bloblangv2.ObjectMethod(func(m map[string]any) (any, error) { + out := make(map[string]any, len(m)) + for k, v := range m { + newKey, err := fn(k) + if err != nil { + return nil, fmt.Errorf("key %q: %w", k, err) + } + ns, ok := newKey.(string) + if !ok { + return nil, fmt.Errorf("key %q: lambda must return a string, got %T", k, newKey) + } + out[ns] = v + } + return out, nil + }), nil + }, + ) +} + +// valuesLooselyEqual mirrors V1's value.ICompare for find_all: treats numerics +// across families (int / float) as equal when the represented values match. +func valuesLooselyEqual(a, b any) bool { + if a == nil || b == nil { + return a == b + } + if af, aOk := looseAsFloat(a); aOk { + if bf, bOk := looseAsFloat(b); bOk { + return af == bf + } + } + return fmt.Sprintf("%v", a) == fmt.Sprintf("%v", b) && fmt.Sprintf("%T", a) == fmt.Sprintf("%T", b) +} + +func looseAsFloat(v any) (float64, bool) { + switch n := v.(type) { + case float64: + return n, true + case float32: + return float64(n), true + case int64: + return float64(n), true + case int32: + return float64(n), true + case uint64: + return float64(n), true + case uint32: + return float64(n), true + } + return 0, false +} diff --git a/internal/impl/pure/bloblangv2_arrays_test.go b/internal/impl/pure/bloblangv2_arrays_test.go new file mode 100644 index 000000000..b4f098897 --- /dev/null +++ b/internal/impl/pure/bloblangv2_arrays_test.go @@ -0,0 +1,172 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package pure_test + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestBloblangV2Enumerated(t *testing.T) { + got := runBloblangV2(t, `output = input.enumerated()`, []any{"a", "b"}) + assert.Equal(t, []any{ + map[string]any{"index": int64(0), "value": "a"}, + map[string]any{"index": int64(1), "value": "b"}, + }, got) +} + +func TestBloblangV2FindAll(t *testing.T) { + got := runBloblangV2(t, `output = input.find_all("bar")`, []any{"foo", "bar", "baz", "bar"}) + assert.Equal(t, []any{int64(1), int64(3)}, got) +} + +func TestBloblangV2FindAllNumericLooseEquality(t *testing.T) { + got := runBloblangV2(t, `output = input.find_all(20)`, []any{10.3, 20.0, "huh", int64(20)}) + assert.Equal(t, []any{int64(1), int64(3)}, got) +} + +func TestBloblangV2FindAllEmpty(t *testing.T) { + got := runBloblangV2(t, `output = input.find_all("nope")`, []any{"a", "b"}) + assert.Equal(t, []any{}, got) +} + +func TestBloblangV2Index(t *testing.T) { + got := runBloblangV2(t, `output = input.index(0)`, []any{"first", "second"}) + assert.Equal(t, "first", got) + + got = runBloblangV2(t, `output = input.index(-1)`, []any{"first", "second"}) + assert.Equal(t, "second", got) +} + +func TestBloblangV2IndexOutOfBounds(t *testing.T) { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.index(10)`) + require.NoError(t, err) + _, err = exec.Query([]any{"a"}) + require.Error(t, err) + assert.Contains(t, err.Error(), "out of bounds") +} + +func TestBloblangV2NotEmpty(t *testing.T) { + got := runBloblangV2(t, `output = input.not_empty()`, "hello") + assert.Equal(t, "hello", got) + + got = runBloblangV2(t, `output = input.not_empty()`, []any{"a"}) + assert.Equal(t, []any{"a"}, got) +} + +func TestBloblangV2NotEmptyErrors(t *testing.T) { + cases := []struct { + input any + want string + }{ + {"", 
"string value is empty"}, + {[]any{}, "array value is empty"}, + {map[string]any{}, "object value is empty"}, + } + for _, tc := range cases { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.not_empty()`) + require.NoError(t, err) + _, err = exec.Query(tc.input) + require.Error(t, err) + assert.Contains(t, err.Error(), tc.want) + } +} + +func TestBloblangV2Collapse(t *testing.T) { + got := runBloblangV2(t, + `output = input.collapse()`, + map[string]any{"foo": []any{ + map[string]any{"bar": "1"}, + map[string]any{"bar": map[string]any{}}, + map[string]any{"bar": "2"}, + }}, + ) + assert.Equal(t, map[string]any{ + "foo.0.bar": "1", + "foo.2.bar": "2", + }, got) +} + +func TestBloblangV2CollapseIncludeEmpty(t *testing.T) { + got := runBloblangV2(t, + `output = input.collapse(include_empty: true)`, + map[string]any{"foo": map[string]any{"bar": map[string]any{}}}, + ).(map[string]any) + // gabs represents preserved empty containers as struct{} sentinels. + _, ok := got["foo.bar"] + assert.True(t, ok, "expected foo.bar key to be preserved with include_empty: %v", got) +} + +func TestBloblangV2KeyValues(t *testing.T) { + got := runBloblangV2(t, + `output = input.key_values().sort_by(p -> p.key)`, + map[string]any{"bar": int64(1), "baz": int64(2)}, + ).([]any) + assert.Equal(t, []any{ + map[string]any{"key": "bar", "value": int64(1)}, + map[string]any{"key": "baz", "value": int64(2)}, + }, got) +} + +func TestBloblangV2FindBy(t *testing.T) { + got := runBloblangV2(t, + `output = input.find_by(v -> v != "bar")`, + []any{"bar", "foo", "baz"}, + ) + assert.Equal(t, int64(1), got) +} + +func TestBloblangV2FindByObjectPredicate(t *testing.T) { + got := runBloblangV2(t, + `output = input.find_by(u -> u.age >= 18)`, + []any{ + map[string]any{"name": "Alice", "age": int64(15)}, + map[string]any{"name": "Bob", "age": int64(22)}, + map[string]any{"name": "Carol", "age": int64(19)}, + }, + ) + assert.Equal(t, int64(1), got) +} + +func 
TestBloblangV2FindByNoMatch(t *testing.T) { + got := runBloblangV2(t, + `output = input.find_by(v -> v == "missing")`, + []any{"a", "b", "c"}, + ) + assert.Equal(t, int64(-1), got) +} + +func TestBloblangV2FindAllBy(t *testing.T) { + got := runBloblangV2(t, + `output = input.find_all_by(log -> log.level == "error")`, + []any{ + map[string]any{"level": "info"}, + map[string]any{"level": "error"}, + map[string]any{"level": "warn"}, + map[string]any{"level": "error"}, + }, + ) + assert.Equal(t, []any{int64(1), int64(3)}, got) +} + +func TestBloblangV2MapEachKey(t *testing.T) { + got := runBloblangV2(t, + `output = input.map_each_key(k -> k.uppercase())`, + map[string]any{"keya": "hello", "keyb": "world"}, + ) + assert.Equal(t, map[string]any{"KEYA": "hello", "KEYB": "world"}, got) +} + +func TestBloblangV2MapEachKeyMustReturnString(t *testing.T) { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.map_each_key(k -> 42)`) + require.NoError(t, err) + _, err = exec.Query(map[string]any{"a": "v"}) + require.Error(t, err) + assert.Contains(t, err.Error(), "string") +} diff --git a/internal/impl/pure/bloblangv2_crypto.go b/internal/impl/pure/bloblangv2_crypto.go new file mode 100644 index 000000000..2e7180579 --- /dev/null +++ b/internal/impl/pure/bloblangv2_crypto.go @@ -0,0 +1,162 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "crypto/aes" + "crypto/cipher" + "errors" + "fmt" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of the V1 AES methods. Both are deterministic given (scheme, key, +// iv, plaintext) so they sit with the pure plugins. + +func init() { + bloblangv2.MustRegisterMethod("encrypt_aes", + bloblangv2.NewPluginSpec(). + Category("Encoding"). + Description("Encrypts a string or bytes value using AES under the named scheme and returns the ciphertext as bytes. Schemes: ctr, gcm, ofb, cbc."). + Param(bloblangv2.NewStringParam("scheme").Description("AES scheme: ctr, gcm, ofb, or cbc.")). 
+ Param(bloblangv2.NewStringParam("key").Description("Encryption key. Length must match an AES variant: 16, 24, or 32 bytes.")). + Param(bloblangv2.NewStringParam("iv").Description("Initialization vector / nonce.")), + aesV2Ctor(true), + ) + + bloblangv2.MustRegisterMethod("decrypt_aes", + bloblangv2.NewPluginSpec(). + Category("Encoding"). + Description("Decrypts a string or bytes ciphertext using AES under the named scheme and returns the plaintext as bytes. Schemes: ctr, gcm, ofb, cbc."). + Param(bloblangv2.NewStringParam("scheme").Description("AES scheme: ctr, gcm, ofb, or cbc.")). + Param(bloblangv2.NewStringParam("key").Description("Decryption key. Length must match an AES variant: 16, 24, or 32 bytes.")). + Param(bloblangv2.NewStringParam("iv").Description("Initialization vector / nonce.")), + aesV2Ctor(false), + ) +} + +func aesV2Ctor(encrypt bool) bloblangv2.MethodConstructor { + return func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + scheme, err := args.GetString("scheme") + if err != nil { + return nil, err + } + key, err := args.GetString("key") + if err != nil { + return nil, err + } + iv, err := args.GetString("iv") + if err != nil { + return nil, err + } + + block, err := aes.NewCipher([]byte(key)) + if err != nil { + return nil, err + } + ivBytes := []byte(iv) + switch scheme { + case "ctr", "ofb", "cbc": + if len(ivBytes) != block.BlockSize() { + return nil, errors.New("the iv length must match the AES block size") + } + } + + var fn func([]byte) ([]byte, error) + if encrypt { + fn, err = buildAESEncrypt(scheme, block, ivBytes) + } else { + fn, err = buildAESDecrypt(scheme, block, ivBytes) + } + if err != nil { + return nil, err + } + + return func(v any) (any, error) { + switch t := v.(type) { + case string: + return fn([]byte(t)) + case []byte: + return fn(t) + } + return nil, fmt.Errorf("expected string or bytes receiver, got %T", v) + }, nil + } +} + +func buildAESEncrypt(scheme string, block cipher.Block, iv []byte) 
(func([]byte) ([]byte, error), error) { + switch scheme { + case "ctr": + return func(b []byte) ([]byte, error) { + out := make([]byte, len(b)) + cipher.NewCTR(block, iv).XORKeyStream(out, b) + return out, nil + }, nil + case "gcm": + return func(b []byte) ([]byte, error) { + s, err := cipher.NewGCM(block) + if err != nil { + return nil, fmt.Errorf("creating gcm failed: %w", err) + } + return s.Seal(nil, iv, b, nil), nil + }, nil + case "ofb": + return func(b []byte) ([]byte, error) { + out := make([]byte, len(b)) + //nolint:staticcheck // SA1019: cipher.NewOFB has been deprecated since Go 1.24 + cipher.NewOFB(block, iv).XORKeyStream(out, b) + return out, nil + }, nil + case "cbc": + return func(b []byte) ([]byte, error) { + if len(b)%aes.BlockSize != 0 { + return nil, errors.New("plaintext is not a multiple of the block size") + } + out := make([]byte, len(b)) + cipher.NewCBCEncrypter(block, iv).CryptBlocks(out, b) + return out, nil + }, nil + } + return nil, fmt.Errorf("unrecognized encryption scheme: %v", scheme) +} + +func buildAESDecrypt(scheme string, block cipher.Block, iv []byte) (func([]byte) ([]byte, error), error) { + switch scheme { + case "ctr": + return func(b []byte) ([]byte, error) { + out := make([]byte, len(b)) + cipher.NewCTR(block, iv).XORKeyStream(out, b) + return out, nil + }, nil + case "gcm": + return func(b []byte) ([]byte, error) { + s, err := cipher.NewGCM(block) + if err != nil { + return nil, fmt.Errorf("creating gcm failed: %w", err) + } + out, err := s.Open(nil, iv, b, nil) + if err != nil { + return nil, fmt.Errorf("gcm decrypting failed: %w", err) + } + return out, nil + }, nil + case "ofb": + return func(b []byte) ([]byte, error) { + out := make([]byte, len(b)) + //nolint:staticcheck // SA1019: cipher.NewOFB has been deprecated since Go 1.24 + cipher.NewOFB(block, iv).XORKeyStream(out, b) + return out, nil + }, nil + case "cbc": + return func(b []byte) ([]byte, error) { + if len(b)%aes.BlockSize != 0 { + return nil, 
errors.New("ciphertext is not a multiple of the block size") + } + out := make([]byte, len(b)) + cipher.NewCBCDecrypter(block, iv).CryptBlocks(out, b) + return out, nil + }, nil + } + return nil, fmt.Errorf("unrecognized decryption scheme: %v", scheme) +} diff --git a/internal/impl/pure/bloblangv2_crypto_test.go b/internal/impl/pure/bloblangv2_crypto_test.go new file mode 100644 index 000000000..2db664ebe --- /dev/null +++ b/internal/impl/pure/bloblangv2_crypto_test.go @@ -0,0 +1,72 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure_test + +import ( + "testing" + + "github.com/stretchr/testify/assert" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +// 16 bytes for AES-128, 16 bytes for IV (block size). +const ( + aesTestKey = "0123456789abcdef" + aesTestIV = "fedcba9876543210" +) + +func TestBloblangV2AESCTRRoundTrip(t *testing.T) { + enc := runBloblangV2(t, + `output = input.encrypt_aes("ctr", "`+aesTestKey+`", "`+aesTestIV+`")`, + "hello world", + ).([]byte) + + dec := runBloblangV2(t, + `output = input.decrypt_aes("ctr", "`+aesTestKey+`", "`+aesTestIV+`").string()`, + enc, + ) + assert.Equal(t, "hello world", dec) +} + +func TestBloblangV2AESGCMRoundTrip(t *testing.T) { + // GCM uses a 12-byte nonce. + const nonce = "0123456789ab" + enc := runBloblangV2(t, + `output = input.encrypt_aes("gcm", "`+aesTestKey+`", "`+nonce+`")`, + "secret payload", + ).([]byte) + + dec := runBloblangV2(t, + `output = input.decrypt_aes("gcm", "`+aesTestKey+`", "`+nonce+`").string()`, + enc, + ) + assert.Equal(t, "secret payload", dec) +} + +func TestBloblangV2AESCBCRoundTrip(t *testing.T) { + // CBC requires 16-byte aligned plaintext. 
+ const plain = "0123456789abcdef0123456789abcdef" + enc := runBloblangV2(t, + `output = input.encrypt_aes("cbc", "`+aesTestKey+`", "`+aesTestIV+`")`, + plain, + ).([]byte) + + dec := runBloblangV2(t, + `output = input.decrypt_aes("cbc", "`+aesTestKey+`", "`+aesTestIV+`").string()`, + enc, + ) + assert.Equal(t, plain, dec) +} + +func TestBloblangV2AESUnknownScheme(t *testing.T) { + // Static literal args mean V2 runs the constructor at parse time, so the + // scheme validation surfaces as a parse error. + _, err := bloblangv2.GlobalEnvironment().Parse( + `output = input.encrypt_aes("does_not_exist", "` + aesTestKey + `", "` + aesTestIV + `")`, + ) + assert.Error(t, err) + assert.Contains(t, err.Error(), "unrecognized encryption scheme") +} diff --git a/internal/impl/pure/bloblangv2_encoding.go b/internal/impl/pure/bloblangv2_encoding.go new file mode 100644 index 000000000..d22388aed --- /dev/null +++ b/internal/impl/pure/bloblangv2_encoding.go @@ -0,0 +1,253 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "crypto/hmac" + "crypto/md5" + "crypto/sha1" + "crypto/sha256" + "crypto/sha512" + "fmt" + "hash" + "hash/crc32" + "hash/fnv" + "net/url" + "strconv" + + "github.com/OneOfOne/xxhash" + "github.com/gofrs/uuid/v5" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 encoding-adjacent methods that are deterministic given their +// inputs (no env access, no time, no randomness). See PARITY.md. + +func init() { + bloblangv2.MustRegisterMethod("hash", + bloblangv2.NewPluginSpec(). + Category("Encoding"). + Description("Hashes a string or bytes using the named algorithm and returns the digest as bytes. Available algorithms: hmac_sha1, hmac_sha256, hmac_sha512, md5, sha1, sha256, sha512, xxhash64, crc32, fnv32. The hmac_* algorithms require the key argument; crc32 supports an optional polynomial."). + Param(bloblangv2.NewStringParam("algorithm").Description("The hashing algorithm to use.")). 
+ Param(bloblangv2.NewStringParam("key").Description("Key for HMAC variants.").Default("")). + Param(bloblangv2.NewStringParam("polynomial").Description(`crc32 polynomial: "IEEE", "Castagnoli", or "Koopman".`).Default("IEEE")), + hashV2Ctor, + ) + + bloblangv2.MustRegisterMethod("uuid_v5", + bloblangv2.NewPluginSpec(). + Category("Encoding"). + Description(`Returns a deterministic UUID v5 derived from the receiver string and a namespace. The namespace may be one of "dns", "url", "oid", "x500", or any RFC-9562 UUID. Empty / unset uses the nil namespace.`). + Param(bloblangv2.NewStringParam("ns").Description("Namespace name or UUID.").Default("")), + uuidV5V2Ctor, + ) + + bloblangv2.MustRegisterMethod("compress", + bloblangv2.NewPluginSpec(). + Category("Encoding"). + Description("Compresses the receiver bytes using the named algorithm and returns the compressed bytes. Supported algorithms: flate, gzip, pgzip, lz4, snappy, zlib, zstd."). + Param(bloblangv2.NewStringParam("algorithm").Description("The compression algorithm.")). + Param(bloblangv2.NewInt64Param("level").Description("Compression level (-1 selects the algorithm default).").Default(int64(-1))), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + algStr, err := args.GetString("algorithm") + if err != nil { + return nil, err + } + level, err := args.GetInt64("level") + if err != nil { + return nil, err + } + algFn, err := strToCompressFunc(algStr) + if err != nil { + return nil, err + } + return bloblangv2.BytesMethod(func(data []byte) (any, error) { + return algFn(int(level), data) + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("decompress", + bloblangv2.NewPluginSpec(). + Category("Encoding"). + Description("Decompresses the receiver bytes using the named algorithm and returns the decompressed bytes. Supported algorithms: gzip, pgzip, zlib, bzip2, flate, snappy, lz4, zstd."). 
+ Param(bloblangv2.NewStringParam("algorithm").Description("The decompression algorithm.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + algStr, err := args.GetString("algorithm") + if err != nil { + return nil, err + } + algFn, err := strToDecompressFunc(algStr) + if err != nil { + return nil, err + } + return bloblangv2.BytesMethod(func(data []byte) (any, error) { + return algFn(data) + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("parse_form_url_encoded", + bloblangv2.NewPluginSpec(). + Category("Parsing"). + Description("Parses a url-encoded query string (e.g. an x-www-form-urlencoded request body) and returns an object. Repeated keys are surfaced as arrays."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + values, err := url.ParseQuery(s) + if err != nil { + return nil, fmt.Errorf("failed to parse value as url-encoded data: %w", err) + } + return urlValuesToMap(values), nil + }), nil + }, + ) +} + +func hashV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + algo, err := args.GetString("algorithm") + if err != nil { + return nil, err + } + key, err := args.GetString("key") + if err != nil { + return nil, err + } + poly, err := args.GetString("polynomial") + if err != nil { + return nil, err + } + + hashFn, err := buildHashFn(algo, []byte(key), poly) + if err != nil { + return nil, err + } + return func(v any) (any, error) { + switch t := v.(type) { + case string: + return hashFn([]byte(t)) + case []byte: + return hashFn(t) + } + return nil, fmt.Errorf("expected string or bytes receiver, got %T", v) + }, nil +} + +func buildHashFn(algo string, key []byte, poly string) (func([]byte) ([]byte, error), error) { + requireKey := func() error { + if len(key) == 0 { + return fmt.Errorf("hash algorithm %v requires a key argument", algo) + } + return nil + } + switch algo { + case "hmac_sha1", "hmac-sha1": + if err := requireKey(); err != nil 
{ + return nil, err + } + return func(b []byte) ([]byte, error) { + h := hmac.New(sha1.New, key) + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "hmac_sha256", "hmac-sha256": + if err := requireKey(); err != nil { + return nil, err + } + return func(b []byte) ([]byte, error) { + h := hmac.New(sha256.New, key) + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "hmac_sha512", "hmac-sha512": + if err := requireKey(); err != nil { + return nil, err + } + return func(b []byte) ([]byte, error) { + h := hmac.New(sha512.New, key) + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "md5": + return func(b []byte) ([]byte, error) { + h := md5.New() + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "sha1": + return func(b []byte) ([]byte, error) { + h := sha1.New() + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "sha256": + return func(b []byte) ([]byte, error) { + h := sha256.New() + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "sha512": + return func(b []byte) ([]byte, error) { + h := sha512.New() + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "xxhash64": + return func(b []byte) ([]byte, error) { + h := xxhash.New64() + _, _ = h.Write(b) + return []byte(strconv.FormatUint(h.Sum64(), 10)), nil + }, nil + case "crc32": + return func(b []byte) ([]byte, error) { + var h hash.Hash + switch poly { + case "IEEE": + h = crc32.NewIEEE() + case "Castagnoli": + h = crc32.New(crc32.MakeTable(crc32.Castagnoli)) + case "Koopman": + h = crc32.New(crc32.MakeTable(crc32.Koopman)) + default: + return nil, fmt.Errorf("unsupported crc32 polynomial %q", poly) + } + _, _ = h.Write(b) + return h.Sum(nil), nil + }, nil + case "fnv32": + return func(b []byte) ([]byte, error) { + h := fnv.New32() + _, _ = h.Write(b) + return []byte(strconv.FormatUint(uint64(h.Sum32()), 10)), nil + }, nil + } + return nil, fmt.Errorf("unrecognized hash type: %v", algo) +} + +func uuidV5V2Ctor(args *bloblangv2.ParsedParams) 
(bloblangv2.Method, error) { + ns, err := args.GetString("ns") + if err != nil { + return nil, err + } + var nsUUID uuid.UUID + switch ns { + case "": + nsUUID = uuid.Nil + case "dns", "DNS": + nsUUID = uuid.NamespaceDNS + case "url", "URL": + nsUUID = uuid.NamespaceURL + case "oid", "OID": + nsUUID = uuid.NamespaceOID + case "x500", "X500": + nsUUID = uuid.NamespaceX500 + default: + nsUUID, err = uuid.FromString(ns) + if err != nil { + return nil, fmt.Errorf("invalid ns uuid: %q", ns) + } + } + return bloblangv2.StringMethod(func(s string) (any, error) { + return uuid.NewV5(nsUUID, s).String(), nil + }), nil +} diff --git a/internal/impl/pure/bloblangv2_encoding_test.go b/internal/impl/pure/bloblangv2_encoding_test.go new file mode 100644 index 000000000..f3a43ee00 --- /dev/null +++ b/internal/impl/pure/bloblangv2_encoding_test.go @@ -0,0 +1,93 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure_test + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestBloblangV2HashSHA256(t *testing.T) { + got := runBloblangV2(t, + `output = input.hash("sha256").encode("hex")`, + "hello world", + ) + assert.Equal(t, "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9", got) +} + +func TestBloblangV2HashHMACSHA1(t *testing.T) { + got := runBloblangV2(t, + `output = input.hash("hmac_sha1", "static-key").encode("hex")`, + "hello world", + ) + assert.Equal(t, "d87e5f068fa08fe90bb95bc7c8344cb809179d76", got) +} + +func TestBloblangV2HashHMACRequiresKey(t *testing.T) { + // Static args trigger parse-time construction; HMAC missing a key + // surfaces as a parse error. 
+ _, err := bloblangv2.GlobalEnvironment().Parse(`output = input.hash("hmac_sha256")`) + require.Error(t, err) + assert.Contains(t, err.Error(), "key") +} + +func TestBloblangV2HashUnknownAlgorithm(t *testing.T) { + // V2 caches the constructor for static literal args at parse time, so an + // unknown algorithm surfaces as a parse error rather than a runtime one. + _, err := bloblangv2.GlobalEnvironment().Parse(`output = input.hash("does_not_exist")`) + require.Error(t, err) + assert.Contains(t, err.Error(), "unrecognized hash type") +} + +func TestBloblangV2UUIDV5Deterministic(t *testing.T) { + // Same name + namespace must produce the same UUID twice in a row. + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.uuid_v5("dns")`) + require.NoError(t, err) + a, err := exec.Query("example.com") + require.NoError(t, err) + b, err := exec.Query("example.com") + require.NoError(t, err) + assert.Equal(t, a, b) + + // Different namespace must change the result. + exec2, err := bloblangv2.GlobalEnvironment().Parse(`output = input.uuid_v5("url")`) + require.NoError(t, err) + c, err := exec2.Query("example.com") + require.NoError(t, err) + assert.NotEqual(t, a, c) +} + +func TestBloblangV2UUIDV5DefaultNamespace(t *testing.T) { + // With the default empty namespace the result is the nil-namespace UUID. 
+ got := runBloblangV2(t, `output = input.uuid_v5()`, "example") + assert.Equal(t, "feb54431-301b-52bb-a6dd-e1e93e81bb9e", got) +} + +func TestBloblangV2CompressDecompressRoundTrip(t *testing.T) { + for _, algo := range []string{"gzip", "zlib", "flate", "snappy", "lz4"} { + t.Run(algo, func(t *testing.T) { + mapping := `output = input.bytes().compress("` + algo + `").decompress("` + algo + `").string()` + got := runBloblangV2(t, mapping, "hello world I love space") + assert.Equal(t, "hello world I love space", got) + }) + } +} + +func TestBloblangV2ParseFormURLEncoded(t *testing.T) { + got := runBloblangV2(t, + `output = input.parse_form_url_encoded()`, + "noise=meow&animal=cat&fur=orange&fur=fluffy", + ) + want := map[string]any{ + "noise": "meow", + "animal": "cat", + "fur": []any{"orange", "fluffy"}, + } + assert.Equal(t, want, got) +} diff --git a/internal/impl/pure/bloblangv2_numbers.go b/internal/impl/pure/bloblangv2_numbers.go new file mode 100644 index 000000000..518951b9a --- /dev/null +++ b/internal/impl/pure/bloblangv2_numbers.go @@ -0,0 +1,194 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "fmt" + "math" + "strconv" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 numeric methods. All deterministic; live with the existing +// pure plugins. + +func init() { + bloblangv2.MustRegisterMethod("log", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Calculates the natural logarithm (base e) of the receiver number."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.Float64Method(func(f float64) (any, error) { + return math.Log(f), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("log10", + bloblangv2.NewPluginSpec(). + Category("Numbers"). 
+ Description("Calculates the base-10 logarithm of the receiver number."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.Float64Method(func(f float64) (any, error) { + return math.Log10(f), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("bitwise_and", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Performs a bitwise AND between the receiver integer and the value argument."). + Param(bloblangv2.NewInt64Param("value").Description("The value to AND with.")), + bitwiseV2Ctor(func(a, b int64) int64 { return a & b }), + ) + + bloblangv2.MustRegisterMethod("bitwise_or", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Performs a bitwise OR between the receiver integer and the value argument."). + Param(bloblangv2.NewInt64Param("value").Description("The value to OR with.")), + bitwiseV2Ctor(func(a, b int64) int64 { return a | b }), + ) + + bloblangv2.MustRegisterMethod("bitwise_xor", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Performs a bitwise XOR between the receiver integer and the value argument."). + Param(bloblangv2.NewInt64Param("value").Description("The value to XOR with.")), + bitwiseV2Ctor(func(a, b int64) int64 { return a ^ b }), + ) + + bloblangv2.MustRegisterMethod("number", + bloblangv2.NewPluginSpec(). + Category("Coercion"). + Description("Attempts to coerce the receiver into a float64. Accepts numbers, numeric strings, and bools. If coercion fails and a default is supplied the default is returned instead."). + Param(bloblangv2.NewFloat64Param("default").Description("Value returned when coercion fails.").Optional()), + numberCoerceV2Ctor, + ) + + bloblangv2.MustRegisterMethod("pow", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Returns the receiver raised to the specified exponent."). 
+ Param(bloblangv2.NewFloat64Param("exponent").Description("The exponent to raise the receiver to.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + exp, err := args.GetFloat64("exponent") + if err != nil { + return nil, err + } + return bloblangv2.Float64Method(func(base float64) (any, error) { + return math.Pow(base, exp), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("sin", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Calculates the sine of the receiver, interpreted as an angle in radians."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.Float64Method(func(f float64) (any, error) { + return math.Sin(f), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("cos", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Calculates the cosine of the receiver, interpreted as an angle in radians."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.Float64Method(func(f float64) (any, error) { + return math.Cos(f), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("tan", + bloblangv2.NewPluginSpec(). + Category("Numbers"). + Description("Calculates the tangent of the receiver, interpreted as an angle in radians."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.Float64Method(func(f float64) (any, error) { + return math.Tan(f), nil + }), nil + }, + ) + + bloblangv2.MustRegisterFunction("pi", + bloblangv2.NewPluginSpec(). + Category("Numbers"). 
+ Description("Returns the value of the mathematical constant Pi."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Function, error) { + return func() (any, error) { + return math.Pi, nil + }, nil + }, + ) +} + +func bitwiseV2Ctor(op func(a, b int64) int64) bloblangv2.MethodConstructor { + return func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + rhs, err := args.GetInt64("value") + if err != nil { + return nil, err + } + return bloblangv2.Int64Method(func(lhs int64) (any, error) { + return op(lhs, rhs), nil + }), nil + } +} + +func numberCoerceV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + defaultPtr, err := args.GetOptionalFloat64("default") + if err != nil { + return nil, err + } + return func(v any) (any, error) { + f, ok := coerceToFloat(v) + if ok { + return f, nil + } + if defaultPtr != nil { + return *defaultPtr, nil + } + return nil, fmt.Errorf("could not coerce %T to a number", v) + }, nil +} + +func coerceToFloat(v any) (float64, bool) { + switch n := v.(type) { + case float64: + return n, true + case float32: + return float64(n), true + case int64: + return float64(n), true + case int32: + return float64(n), true + case uint64: + return float64(n), true + case uint32: + return float64(n), true + case bool: + if n { + return 1, true + } + return 0, true + case string: + f, err := strconv.ParseFloat(n, 64) + if err == nil { + return f, true + } + return 0, false + case []byte: + f, err := strconv.ParseFloat(string(n), 64) + if err == nil { + return f, true + } + return 0, false + } + return 0, false +} diff --git a/internal/impl/pure/bloblangv2_numbers_test.go b/internal/impl/pure/bloblangv2_numbers_test.go new file mode 100644 index 000000000..de80d835a --- /dev/null +++ b/internal/impl/pure/bloblangv2_numbers_test.go @@ -0,0 +1,102 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package pure_test + +import ( + "math" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestBloblangV2Bitwise(t *testing.T) { + cases := []struct { + name string + mapping string + input any + want any + }{ + {name: "and", mapping: `output = input.bitwise_and(6)`, input: int64(12), want: int64(4)}, + {name: "or", mapping: `output = input.bitwise_or(6)`, input: int64(12), want: int64(14)}, + {name: "xor", mapping: `output = input.bitwise_xor(6)`, input: int64(12), want: int64(10)}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := runBloblangV2(t, tc.mapping, tc.input) + assert.Equal(t, tc.want, got) + }) + } +} + +func TestBloblangV2MathPlugins(t *testing.T) { + t.Run("pi", func(t *testing.T) { + got := runBloblangV2(t, `output = pi()`, nil) + assert.InDelta(t, math.Pi, got, 1e-12) + }) + t.Run("pow", func(t *testing.T) { + got := runBloblangV2(t, `output = input.pow(3.0)`, float64(2)) + assert.InDelta(t, 8.0, got, 1e-12) + }) + t.Run("sin", func(t *testing.T) { + got := runBloblangV2(t, `output = input.sin()`, float64(0)) + assert.InDelta(t, 0.0, got, 1e-12) + }) + t.Run("cos zero", func(t *testing.T) { + got := runBloblangV2(t, `output = input.cos()`, float64(0)) + assert.InDelta(t, 1.0, got, 1e-12) + }) + t.Run("tan pi/4", func(t *testing.T) { + got := runBloblangV2(t, `output = input.tan()`, math.Pi/4) + assert.InDelta(t, 1.0, got, 1e-12) + }) + t.Run("trig composition", func(t *testing.T) { + // sin^2 + cos^2 = 1 + got := runBloblangV2(t, `output = input.sin().pow(2.0) + input.cos().pow(2.0)`, math.Pi/3) + assert.InDelta(t, 1.0, got, 1e-12) + }) +} + +func TestBloblangV2Logarithms(t *testing.T) { + got := runBloblangV2(t, `output = input.log()`, math.E) + if got.(float64) < 0.99 || got.(float64) > 1.01 { + t.Fatalf("log(e) = %v, want ~1.0", got) + } 
+ + got = runBloblangV2(t, `output = input.log10()`, int64(1000)) + if got.(float64) < 2.99 || got.(float64) > 3.01 { + t.Fatalf("log10(1000) = %v, want ~3.0", got) + } +} + +func TestBloblangV2NumberCoercion(t *testing.T) { + cases := []struct { + name string + mapping string + input any + want float64 + }{ + {name: "string", mapping: `output = input.number()`, input: "3.14", want: 3.14}, + {name: "int", mapping: `output = input.number()`, input: int64(7), want: 7}, + {name: "bool true", mapping: `output = input.number()`, input: true, want: 1}, + {name: "bool false", mapping: `output = input.number()`, input: false, want: 0}, + {name: "default used", mapping: `output = input.number(42.0)`, input: "not a number", want: 42}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := runBloblangV2(t, tc.mapping, tc.input) + assert.InDelta(t, tc.want, got, 1e-9) + }) + } +} + +func TestBloblangV2NumberFailsWithoutDefault(t *testing.T) { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.number()`) + require.NoError(t, err) + _, err = exec.Query("not a number") + require.Error(t, err) +} diff --git a/internal/impl/pure/bloblangv2_objects.go b/internal/impl/pure/bloblangv2_objects.go new file mode 100644 index 000000000..88a31c915 --- /dev/null +++ b/internal/impl/pure/bloblangv2_objects.go @@ -0,0 +1,240 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "errors" + "fmt" + + "github.com/Jeffail/gabs/v2" + + "github.com/redpanda-data/benthos/v4/internal/value" + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 object methods that don't require lambda arguments. + +func init() { + bloblangv2.MustRegisterMethod("array", + bloblangv2.NewPluginSpec(). + Category("Coercion"). 
+ Description("Returns the receiver wrapped in a single-element array, unless it is already an array, in which case it is returned unchanged."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return func(v any) (any, error) { + if _, ok := v.([]any); ok { + return v, nil + } + return []any{v}, nil + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("exists", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Returns true when the dot-path argument resolves to a field that is present on the receiver object — even when its value is null. Returns false otherwise."). + Param(bloblangv2.NewStringParam("path").Description("A dot-separated path to the field.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + pathStr, err := args.GetString("path") + if err != nil { + return nil, err + } + path := gabs.DotPathToSlice(pathStr) + return func(v any) (any, error) { + return gabs.Wrap(v).Exists(path...), nil + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("get", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Extracts a field value from the receiver object identified by a dot-path. Returns null when the path does not resolve."). + Param(bloblangv2.NewStringParam("path").Description("A dot-separated path to the field.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + pathStr, err := args.GetString("path") + if err != nil { + return nil, err + } + path := gabs.DotPathToSlice(pathStr) + return func(v any) (any, error) { + return gabs.Wrap(v).S(path...).Data(), nil + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("explode", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description(`Expands a nested array or object at the given path into multiple documents while preserving the surrounding structure. With an array target the result is an array of documents; with an object target the result is an object keyed by the nested keys.`). 
+ Param(bloblangv2.NewStringParam("path").Description("A dot-separated path to the field to explode.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + pathRaw, err := args.GetString("path") + if err != nil { + return nil, err + } + path := gabs.DotPathToSlice(pathRaw) + return func(v any) (any, error) { + rootMap, ok := v.(map[string]any) + if !ok { + return nil, fmt.Errorf("expected object receiver, got %T", v) + } + + target := gabs.Wrap(v).Search(path...) + copyFrom := mapCloneWithoutPath(rootMap, path) + + switch t := target.Data().(type) { + case []any: + result := make([]any, len(t)) + for i, ele := range t { + g := gabs.Wrap(value.IClone(copyFrom)) + _, _ = g.Set(ele, path...) + result[i] = g.Data() + } + return result, nil + case map[string]any: + result := make(map[string]any, len(t)) + for k, ele := range t { + g := gabs.Wrap(value.IClone(copyFrom)) + _, _ = g.Set(ele, path...) + result[k] = g.Data() + } + return result, nil + } + return nil, fmt.Errorf("expected array or object value at path %q, found: %T", pathRaw, target.Data()) + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("assign", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Combines the receiver with the with argument. For objects, source keys overwrite destination keys on conflict. For arrays the with value is concatenated. Use merge() instead for non-overwriting behaviour."). 
+ Param(bloblangv2.NewAnyParam("with").Description("Object or array to assign onto the receiver.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + source, err := args.Get("with") + if err != nil { + return nil, err + } + return func(v any) (any, error) { + source := value.IClone(source) + if root, isArray := v.([]any); isArray { + if rhs, isAlsoArray := source.([]any); isAlsoArray { + return append(root, rhs...), nil + } + return append(root, source), nil + } + if _, isObject := v.(map[string]any); !isObject { + return nil, fmt.Errorf("expected object or array receiver, got %T", v) + } + root := gabs.New() + if err := root.MergeFn(gabs.Wrap(v), assignerOverwrite); err != nil { + return nil, err + } + if err := root.MergeFn(gabs.Wrap(source), assignerOverwrite); err != nil { + return nil, err + } + return root.Data(), nil + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("with", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). + Description("Returns the receiver object reduced to the listed dot-path keys. Paths missing on the receiver are ignored. Use this as the keep-only counterpart to without."). + Param(bloblangv2.NewAnyParam("paths").Description("Array of dot-separated paths to retain.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + pathsArg, err := args.Get("paths") + if err != nil { + return nil, err + } + paths, ok := pathsArg.([]any) + if !ok { + return nil, fmt.Errorf("expected an array of paths, got %T", pathsArg) + } + includeList := make([][]string, 0, len(paths)) + for i, p := range paths { + s, err := value.IGetString(p) + if err != nil { + return nil, fmt.Errorf("paths[%d]: %w", i, err) + } + includeList = append(includeList, gabs.DotPathToSlice(s)) + } + return bloblangv2.ObjectMethod(func(in map[string]any) (any, error) { + return mapWith(in, includeList), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("zip", + bloblangv2.NewPluginSpec(). + Category("Object & Array"). 
+ Description("Zips the receiver array with one or more argument arrays element-wise. Every array must share the same length. Each output element is an array starting with the receiver value followed by the matching element from each argument."). + Param(bloblangv2.NewAnyParam("others").Description("Array of arrays to zip with the receiver.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + othersArg, err := args.Get("others") + if err != nil { + return nil, err + } + outer, ok := othersArg.([]any) + if !ok { + return nil, fmt.Errorf("expected an array of arrays, got %T", othersArg) + } + if len(outer) == 0 { + return nil, errors.New("zip requires at least one argument array") + } + argSlices := make([][]any, len(outer)) + for i, a := range outer { + inner, ok := a.([]any) + if !ok { + return nil, fmt.Errorf("others[%d]: expected array, got %T", i, a) + } + if i > 0 && len(inner) != len(argSlices[0]) { + return nil, errors.New("zip arrays must match in length") + } + argSlices[i] = inner + } + return bloblangv2.ArrayMethod(func(in []any) (any, error) { + if len(in) != len(argSlices[0]) { + return nil, errors.New("zip arrays must match in length") + } + out := make([]any, 0, len(in)) + for offset, v := range in { + tuple := make([]any, 0, len(argSlices)+1) + tuple = append(tuple, v) + for _, slice := range argSlices { + tuple = append(tuple, slice[offset]) + } + out = append(out, tuple) + } + return out, nil + }), nil + }, + ) +} + +func assignerOverwrite(_, source any) any { return source } + +// mapCloneWithoutPath returns a deep clone of m with the value at path +// removed. Used by explode to derive the carrier document for each exploded +// child. 
+func mapCloneWithoutPath(m map[string]any, path []string) any { + if len(path) == 0 { + return value.IClone(m) + } + cloned, ok := value.IClone(m).(map[string]any) + if !ok { + return value.IClone(m) + } + cur := cloned + for i := 0; i < len(path)-1; i++ { + next, ok := cur[path[i]].(map[string]any) + if !ok { + return cloned + } + cur = next + } + delete(cur, path[len(path)-1]) + return cloned +} diff --git a/internal/impl/pure/bloblangv2_objects_test.go b/internal/impl/pure/bloblangv2_objects_test.go new file mode 100644 index 000000000..f79dad836 --- /dev/null +++ b/internal/impl/pure/bloblangv2_objects_test.go @@ -0,0 +1,139 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure_test + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestBloblangV2Array(t *testing.T) { + got := runBloblangV2(t, `output = input.array()`, "hello") + assert.Equal(t, []any{"hello"}, got) + + got = runBloblangV2(t, `output = input.array()`, []any{"already"}) + assert.Equal(t, []any{"already"}, got) +} + +func TestBloblangV2Exists(t *testing.T) { + doc := map[string]any{"foo": map[string]any{"bar": map[string]any{"baz": "yep"}}} + got := runBloblangV2(t, `output = input.exists("foo.bar.baz")`, doc) + assert.Equal(t, true, got) + + got = runBloblangV2(t, `output = input.exists("foo.bar.qux")`, doc) + assert.Equal(t, false, got) +} + +func TestBloblangV2ExistsTrueForNullValue(t *testing.T) { + doc := map[string]any{"data": map[string]any{"optional": nil}} + got := runBloblangV2(t, `output = input.exists("data.optional")`, doc) + assert.Equal(t, true, got) +} + +func TestBloblangV2Get(t *testing.T) { + doc := map[string]any{"foo": map[string]any{"bar": "from bar"}} + got := runBloblangV2(t, `output = input.get("foo.bar")`, doc) + assert.Equal(t, "from bar", got) + + got = runBloblangV2(t, 
`output = input.get("foo.missing")`, doc) + assert.Nil(t, got) +} + +func TestBloblangV2ExplodeOnArray(t *testing.T) { + doc := map[string]any{"id": int64(1), "value": []any{"foo", "bar", "baz"}} + got := runBloblangV2(t, `output = input.explode("value")`, doc) + assert.Equal(t, []any{ + map[string]any{"id": int64(1), "value": "foo"}, + map[string]any{"id": int64(1), "value": "bar"}, + map[string]any{"id": int64(1), "value": "baz"}, + }, got) +} + +func TestBloblangV2ExplodeOnObject(t *testing.T) { + doc := map[string]any{ + "id": int64(1), + "value": map[string]any{"foo": int64(2), "bar": int64(3)}, + } + got := runBloblangV2(t, `output = input.explode("value")`, doc) + expected := map[string]any{ + "foo": map[string]any{"id": int64(1), "value": int64(2)}, + "bar": map[string]any{"id": int64(1), "value": int64(3)}, + } + assert.Equal(t, expected, got) +} + +func TestBloblangV2Assign(t *testing.T) { + got := runBloblangV2(t, + `output = input.assign({"likes": "foos", "second_name": "barer"})`, + map[string]any{"first_name": "fooer", "likes": "bars"}, + ) + assert.Equal(t, map[string]any{ + "first_name": "fooer", + "likes": "foos", + "second_name": "barer", + }, got) +} + +func TestBloblangV2AssignArray(t *testing.T) { + got := runBloblangV2(t, + `output = input.assign(["c", "d"])`, + []any{"a", "b"}, + ) + assert.Equal(t, []any{"a", "b", "c", "d"}, got) +} + +func TestBloblangV2WithKeepsListedPaths(t *testing.T) { + got := runBloblangV2(t, + `output = input.with(["inner.a", "inner.c", "d"])`, + map[string]any{ + "inner": map[string]any{"a": "first", "b": "second", "c": "third"}, + "d": "fourth", + "e": "fifth", + }, + ) + assert.Equal(t, map[string]any{ + "d": "fourth", + "inner": map[string]any{"a": "first", "c": "third"}, + }, got) +} + +func TestBloblangV2WithMissingPathsIgnored(t *testing.T) { + got := runBloblangV2(t, + `output = input.with(["a", "missing"])`, + map[string]any{"a": int64(1), "b": int64(2)}, + ) + assert.Equal(t, map[string]any{"a": int64(1)}, 
got) +} + +func TestBloblangV2ZipArrays(t *testing.T) { + got := runBloblangV2(t, + `output = input.foo.zip([input.bar, input.baz])`, + map[string]any{ + "foo": []any{"a", "b", "c"}, + "bar": []any{int64(1), int64(2), int64(3)}, + "baz": []any{int64(4), int64(5), int64(6)}, + }, + ) + want := []any{ + []any{"a", int64(1), int64(4)}, + []any{"b", int64(2), int64(5)}, + []any{"c", int64(3), int64(6)}, + } + assert.Equal(t, want, got) +} + +func TestBloblangV2ZipMismatchedLengthsErrors(t *testing.T) { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.foo.zip([input.bar])`) + require.NoError(t, err) + _, qerr := exec.Query(map[string]any{ + "foo": []any{"a", "b"}, + "bar": []any{int64(1), int64(2), int64(3)}, + }) + assert.Error(t, qerr) +} diff --git a/internal/impl/pure/bloblangv2_parsing.go b/internal/impl/pure/bloblangv2_parsing.go new file mode 100644 index 000000000..38cd61664 --- /dev/null +++ b/internal/impl/pure/bloblangv2_parsing.go @@ -0,0 +1,201 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "bytes" + "encoding/csv" + "errors" + "fmt" + "strings" + + jsonschema "github.com/xeipuuv/gojsonschema" + "gopkg.in/yaml.v3" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 parsing methods that operate purely on the receiver value. + +func init() { + bloblangv2.MustRegisterMethod("parse_yaml", + bloblangv2.NewPluginSpec(). + Category("Parsing"). + Description("Attempts to parse the receiver string as a single YAML document and returns the result."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return parseYAMLV2, nil + }, + ) + + bloblangv2.MustRegisterMethod("format_yaml", + bloblangv2.NewPluginSpec(). + Category("Parsing"). 
+ Description("Serialises the receiver value into a YAML byte array."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return func(v any) (any, error) { + return yaml.Marshal(v) + }, nil + }, + ) + + bloblangv2.MustRegisterMethod("parse_csv", + bloblangv2.NewPluginSpec(). + Category("Parsing"). + Description("Attempts to parse the receiver string as RFC 4180 CSV. With a header row the result is an array of objects keyed by column; without one it is an array of row arrays."). + Param(bloblangv2.NewBoolParam("parse_header_row").Description("Treat the first row as a header. When true the result is an array of objects keyed by column.").Default(true)). + Param(bloblangv2.NewStringParam("delimiter").Description("Single-character field delimiter.").Default(",")). + Param(bloblangv2.NewBoolParam("lazy_quotes").Description(`If true, allow a quote inside an unquoted field and a non-doubled quote in a quoted field.`).Default(false)), + parseCSVV2Ctor, + ) + + bloblangv2.MustRegisterMethod("json_schema", + bloblangv2.NewPluginSpec(). + Category("Parsing"). + Description("Validates the receiver value against a JSON schema and returns it unchanged on success, or an error describing the validation failure."). + Param(bloblangv2.NewStringParam("schema").Description("A JSON schema document.")), + jsonSchemaV2Ctor, + ) +} + +func parseYAMLV2(v any) (any, error) { + var data []byte + switch t := v.(type) { + case string: + data = []byte(t) + case []byte: + data = t + default: + return nil, fmt.Errorf("expected string or bytes receiver, got %T", v) + } + var out any + if err := yaml.Unmarshal(data, &out); err != nil { + return nil, fmt.Errorf("failed to parse value as YAML: %w", err) + } + return normaliseYAMLNumbers(out), nil +} + +// normaliseYAMLNumbers walks a yaml.Unmarshal result and rewrites any Go-int +// values to int64 to match the V2 numeric type discipline. 
yaml.v3 decodes + // integers into a platform-sized int when the value fits; the V2 interpreter + // only handles int64, float64, and so on. + func normaliseYAMLNumbers(v any) any { + switch t := v.(type) { + case int: + return int64(t) + case map[string]any: + for k, vv := range t { + t[k] = normaliseYAMLNumbers(vv) + } + return t + case []any: + for i, vv := range t { + t[i] = normaliseYAMLNumbers(vv) + } + return t + default: + return v + } + } + + func parseCSVV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + parseHeaderRow, err := args.GetBool("parse_header_row") + if err != nil { + return nil, err + } + delimStr, err := args.GetString("delimiter") + if err != nil { + return nil, err + } + delimRunes := []rune(delimStr) + if len(delimRunes) != 1 { + return nil, errors.New("delimiter value must be exactly one character") + } + delimiter := delimRunes[0] + lazyQuotes, err := args.GetBool("lazy_quotes") + if err != nil { + return nil, err + } + + return func(v any) (any, error) { + var data []byte + switch t := v.(type) { + case string: + data = []byte(t) + case []byte: + data = t + default: + return nil, fmt.Errorf("expected string or bytes receiver, got %T", v) + } + r := csv.NewReader(bytes.NewReader(data)) + r.Comma = delimiter + r.LazyQuotes = lazyQuotes + records, err := r.ReadAll() + if err != nil { + return nil, err + } + if len(records) == 0 { + return nil, errors.New("zero records were parsed") + } + if parseHeaderRow { + headers := records[0] + if len(headers) == 0 { + return nil, errors.New("no headers found on first row") + } + out := make([]any, 0, len(records)-1) + for j, rec := range records[1:] { + if len(headers) != len(rec) { + return nil, fmt.Errorf("record on line %d: record mismatch with headers", j) + } + obj := make(map[string]any, len(rec)) + for i, cell := range rec { + obj[headers[i]] = cell + } + out = append(out, obj) + } + return out, nil + } + out := make([]any, 0, len(records)) + for _, rec := range records { + row := make([]any, len(rec)) + 
for i, cell := range rec { + row[i] = cell + } + out = append(out, row) + } + return out, nil + }, nil +} + +func jsonSchemaV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + schemaStr, err := args.GetString("schema") + if err != nil { + return nil, err + } + schema, err := jsonschema.NewSchema(jsonschema.NewStringLoader(schemaStr)) + if err != nil { + return nil, fmt.Errorf("failed to parse json schema definition: %w", err) + } + return func(v any) (any, error) { + result, err := schema.Validate(jsonschema.NewGoLoader(v)) + if err != nil { + return nil, err + } + if result.Valid() { + return v, nil + } + var b strings.Builder + for i, desc := range result.Errors() { + if i > 0 { + b.WriteByte('\n') + } + description := strings.ToLower(desc.Description()) + if property := desc.Details()["property"]; property != nil { + description = property.(string) + strings.TrimPrefix(description, strings.ToLower(property.(string))) + } + b.WriteString(desc.Field()) + b.WriteByte(' ') + b.WriteString(description) + } + return nil, errors.New(b.String()) + }, nil +} diff --git a/internal/impl/pure/bloblangv2_parsing_test.go b/internal/impl/pure/bloblangv2_parsing_test.go new file mode 100644 index 000000000..5c9acc375 --- /dev/null +++ b/internal/impl/pure/bloblangv2_parsing_test.go @@ -0,0 +1,95 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package pure_test + +import ( + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestBloblangV2ParseYAML(t *testing.T) { + got := runBloblangV2(t, `output = input.parse_yaml()`, "foo: bar\nbaz: 42\n") + assert.Equal(t, map[string]any{"foo": "bar", "baz": int64(42)}, got) +} + +func TestBloblangV2FormatYAML(t *testing.T) { + got := runBloblangV2(t, + `output = input.format_yaml().string()`, + map[string]any{"foo": "bar"}, + ) + assert.Equal(t, "foo: bar\n", got) +} + +func TestBloblangV2ParseCSVHeaderRow(t *testing.T) { + got := runBloblangV2(t, + `output = input.parse_csv()`, + "name,age\nalice,30\nbob,40", + ) + expected := []any{ + map[string]any{"name": "alice", "age": "30"}, + map[string]any{"name": "bob", "age": "40"}, + } + assert.Equal(t, expected, got) +} + +func TestBloblangV2ParseCSVNoHeader(t *testing.T) { + got := runBloblangV2(t, + `output = input.parse_csv(parse_header_row: false)`, + "a,b\nc,d", + ) + expected := []any{ + []any{"a", "b"}, + []any{"c", "d"}, + } + assert.Equal(t, expected, got) +} + +func TestBloblangV2ParseCSVCustomDelimiter(t *testing.T) { + got := runBloblangV2(t, + `output = input.parse_csv(delimiter: "|")`, + "foo|bar\n1|2", + ) + expected := []any{ + map[string]any{"foo": "1", "bar": "2"}, + } + assert.Equal(t, expected, got) +} + +func TestBloblangV2ParseCSVRejectsMultiCharDelimiter(t *testing.T) { + // Named args bypass V2's static-arg folding so the constructor runs per + // query rather than at parse time. The error therefore surfaces from + // Query, not Parse. 
+ exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.parse_csv(delimiter: "::")`) + require.NoError(t, err) + _, err = exec.Query("a,b\n1,2") + require.Error(t, err) + assert.Contains(t, err.Error(), "exactly one character") +} + +func TestBloblangV2JSONSchemaPasses(t *testing.T) { + schema := `{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"}}}` + mapping := `output = input.json_schema("` + schema + `")` + got := runBloblangV2(t, mapping, map[string]any{"name": "alice"}) + assert.Equal(t, map[string]any{"name": "alice"}, got) +} + +func TestBloblangV2JSONSchemaFails(t *testing.T) { + schema := `{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"}},\"required\":[\"name\"]}` + mapping := `output = input.json_schema("` + schema + `")` + exec, err := bloblangv2.GlobalEnvironment().Parse(mapping) + require.NoError(t, err) + _, err = exec.Query(map[string]any{"age": int64(30)}) + require.Error(t, err) + assert.True(t, + strings.Contains(err.Error(), "name") || + strings.Contains(err.Error(), "required"), + "expected error mentioning name or required, got: %v", err, + ) +} diff --git a/internal/impl/pure/bloblangv2_regex.go b/internal/impl/pure/bloblangv2_regex.go new file mode 100644 index 000000000..2a75cab10 --- /dev/null +++ b/internal/impl/pure/bloblangv2_regex.go @@ -0,0 +1,128 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "fmt" + "regexp" + "strconv" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 regex methods that operate on a string receiver. + +func init() { + bloblangv2.MustRegisterMethod("re_replace", + bloblangv2.NewPluginSpec(). + Category("Regex"). + Description("Alias for re_replace_all — replaces every regex match with the given value, supporting capture-group references like $1, $2."). + Param(bloblangv2.NewStringParam("pattern").Description("Regular expression pattern.")). 
+ Param(bloblangv2.NewStringParam("value").Description("Replacement string. Supports $1, $2, ... capture-group references.")), + reReplaceAllV2Ctor, + ) + + bloblangv2.MustRegisterMethod("re_find_object", + bloblangv2.NewPluginSpec(). + Category("Regex"). + Description(`Finds the first regex match and returns it as an object whose keys are named capture groups (or numeric indices when the group has no name). Key "0" is the full match.`). + Param(bloblangv2.NewStringParam("pattern").Description("Regular expression pattern.")), + reFindObjectV2Ctor(false), + ) + + bloblangv2.MustRegisterMethod("re_find_all_object", + bloblangv2.NewPluginSpec(). + Category("Regex"). + Description("Finds every regex match and returns an array of objects keyed by named capture groups (or numeric indices). Each object's `0` key is the full match."). + Param(bloblangv2.NewStringParam("pattern").Description("Regular expression pattern.")), + reFindObjectV2Ctor(true), + ) + + bloblangv2.MustRegisterMethod("re_find_all_submatch", + bloblangv2.NewPluginSpec(). + Category("Regex"). + Description("Finds every regex match and returns an array of arrays — each inner array is the full match followed by its capture groups."). 
+ Param(bloblangv2.NewStringParam("pattern").Description("Regular expression pattern.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + re, err := compileRegexArg(args) + if err != nil { + return nil, err + } + return bloblangv2.StringMethod(func(s string) (any, error) { + groups := re.FindAllStringSubmatch(s, -1) + out := make([]any, 0, len(groups)) + for _, m := range groups { + row := make([]any, len(m)) + for i, v := range m { + row[i] = v + } + out = append(out, row) + } + return out, nil + }), nil + }, + ) +} + +func reReplaceAllV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + re, err := compileRegexArg(args) + if err != nil { + return nil, err + } + with, err := args.GetString("value") + if err != nil { + return nil, err + } + return bloblangv2.StringMethod(func(s string) (any, error) { + return re.ReplaceAllString(s, with), nil + }), nil +} + +func reFindObjectV2Ctor(all bool) bloblangv2.MethodConstructor { + return func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + re, err := compileRegexArg(args) + if err != nil { + return nil, err + } + groups := re.SubexpNames() + for i, k := range groups { + if k == "" { + groups[i] = strconv.Itoa(i) + } + } + if all { + return bloblangv2.StringMethod(func(s string) (any, error) { + matches := re.FindAllStringSubmatch(s, -1) + out := make([]any, 0, len(matches)) + for _, m := range matches { + obj := make(map[string]any, len(groups)) + for i, v := range m { + obj[groups[i]] = v + } + out = append(out, obj) + } + return out, nil + }), nil + } + return bloblangv2.StringMethod(func(s string) (any, error) { + match := re.FindStringSubmatch(s) + obj := make(map[string]any, len(groups)) + for i, v := range match { + obj[groups[i]] = v + } + return obj, nil + }), nil + } +} + +func compileRegexArg(args *bloblangv2.ParsedParams) (*regexp.Regexp, error) { + pat, err := args.GetString("pattern") + if err != nil { + return nil, err + } + re, err := regexp.Compile(pat) + 
if err != nil { + return nil, fmt.Errorf("invalid regex pattern: %w", err) + } + return re, nil +} diff --git a/internal/impl/pure/bloblangv2_regex_test.go b/internal/impl/pure/bloblangv2_regex_test.go new file mode 100644 index 000000000..cd43b4279 --- /dev/null +++ b/internal/impl/pure/bloblangv2_regex_test.go @@ -0,0 +1,57 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure_test + +import ( + "testing" + + "github.com/stretchr/testify/assert" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestBloblangV2ReReplace(t *testing.T) { + got := runBloblangV2(t, + `output = input.re_replace("ADD ([0-9]+)", "+($1)")`, + "foo ADD 70 ADD 1", + ) + assert.Equal(t, "foo +(70) +(1)", got) +} + +func TestBloblangV2ReFindObject(t *testing.T) { + got := runBloblangV2(t, + `output = input.re_find_object("a(?P<foo>x*)b")`, + "-axxb-ab-", + ) + assert.Equal(t, map[string]any{"0": "axxb", "foo": "xx"}, got) +} + +func TestBloblangV2ReFindAllObject(t *testing.T) { + got := runBloblangV2(t, + `output = input.re_find_all_object("a(?P<foo>x*)b")`, + "-axxb-ab-", + ) + assert.Equal(t, []any{ + map[string]any{"0": "axxb", "foo": "xx"}, + map[string]any{"0": "ab", "foo": ""}, + }, got) +} + +func TestBloblangV2ReFindAllSubmatch(t *testing.T) { + got := runBloblangV2(t, + `output = input.re_find_all_submatch("a(x*)b")`, + "-axxb-ab-", + ) + assert.Equal(t, []any{ + []any{"axxb", "xx"}, + []any{"ab", ""}, + }, got) +} + +func TestBloblangV2ReFindAllSubmatchEmpty(t *testing.T) { + got := runBloblangV2(t, + `output = input.re_find_all_submatch("a(x*)b")`, + "nothing matches here", + ) + assert.Equal(t, []any{}, got) +} diff --git a/internal/impl/pure/bloblangv2_string.go b/internal/impl/pure/bloblangv2_string.go new file mode 100644 index 000000000..43b048cb5 --- /dev/null +++ b/internal/impl/pure/bloblangv2_string.go @@ -0,0 +1,250 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package pure + +import ( + "fmt" + "html" + "net/url" + "path/filepath" + "strconv" + "strings" + + "golang.org/x/text/cases" + "golang.org/x/text/language" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 string methods that don't depend on outside state. The +// public/bloblangv2 typed wrappers are strict about receiver types — callers +// whose upstream value isn't already a string should chain .string() first. +// See PARITY.md for the broader plan. + +func init() { + bloblangv2.MustRegisterMethod("capitalize", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description("Converts the first letter of each word in a string to uppercase (title case)."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + titler := cases.Title(language.English) + return bloblangv2.StringMethod(func(s string) (any, error) { + return titler.String(s), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("escape_html", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description(`Escapes special HTML characters ("<", ">", "&", "'", "\"") to make a string safe for HTML output.`), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return html.EscapeString(s), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("unescape_html", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description("Converts HTML entities back to their original characters. Handles named, decimal, and hexadecimal entities."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return html.UnescapeString(s), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("escape_url_query", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description(`Encodes a string for safe use in URL query parameters. 
Converts spaces to "+" and special characters to percent-encoded values.`), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return url.QueryEscape(s), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("unescape_url_query", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description(`Decodes URL query parameter encoding, converting "+" to spaces and percent-encoded characters to their original values.`), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return url.QueryUnescape(s) + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("quote", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description("Wraps a string in double quotes and escapes special characters (newlines, tabs, etc.) using Go escape sequences."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return strconv.Quote(s), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("unquote", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description(`Removes surrounding quotes and interprets escape sequences (\n, \t, etc.) to their literal characters.`), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + return strconv.Unquote(s) + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("replace", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description(`Replaces all occurrences of a substring with another string. Equivalent to replace_all, retained for V1 parity.`). + Param(bloblangv2.NewStringParam("old").Description("A string to match against.")). + Param(bloblangv2.NewStringParam("new").Description("A string to replace with.")), + replaceAllV2Ctor, + ) + + bloblangv2.MustRegisterMethod("replace_many", + bloblangv2.NewPluginSpec(). + Category("Strings"). 
+ Description("Performs multiple find-and-replace operations in sequence using an array of [old, new] pairs (alternating elements)."). + Param(bloblangv2.NewAnyParam("values").Description("An array of strings — each even-indexed entry is replaced with the following odd-indexed entry.")), + replaceManyV2Ctor, + ) + + bloblangv2.MustRegisterMethod("replace_all_many", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description("Performs multiple find-and-replace operations in sequence. Equivalent to replace_many, retained for V1 parity."). + Param(bloblangv2.NewAnyParam("values").Description("An array of strings — each even-indexed entry is replaced with the following odd-indexed entry.")), + replaceManyV2Ctor, + ) + + bloblangv2.MustRegisterMethod("filepath_join", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description("Combines an array of path components into a single OS-specific file path."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.ArrayMethod(func(arr []any) (any, error) { + parts := make([]string, 0, len(arr)) + for i, ele := range arr { + s, ok := ele.(string) + if !ok { + return nil, fmt.Errorf("path element %d: expected string, got %T", i, ele) + } + parts = append(parts, s) + } + return filepath.Join(parts...), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("filepath_split", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description("Separates a file path into a [directory, filename] pair."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + dir, file := filepath.Split(s) + return []any{dir, file}, nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("parse_url", + bloblangv2.NewPluginSpec(). + Category("Parsing"). 
+ Description("Parses a URL string into a structured result with fields scheme, host, path, raw_query, fragment, etc., and an optional user object."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + u, err := url.Parse(s) + if err != nil { + return nil, err + } + out := map[string]any{ + "scheme": u.Scheme, + "opaque": u.Opaque, + "host": u.Host, + "path": u.Path, + "raw_path": u.RawPath, + "raw_query": u.RawQuery, + "fragment": u.Fragment, + "raw_fragment": u.RawFragment, + } + if u.User != nil { + user := map[string]any{"name": u.User.Username()} + if pass, ok := u.User.Password(); ok { + user["password"] = pass + } + out["user"] = user + } + return out, nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("format", + bloblangv2.NewPluginSpec(). + Category("Strings"). + Description(`Formats the receiver string with Go's printf-style verbs (%s, %d, %v, ...) using the supplied argument array. V2 takes a single array argument because variadic parameters are not part of the V2 spec.`). 
+ Param(bloblangv2.NewAnyParam("args").Description("Array of arguments to substitute into the format verbs.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + raw, err := args.Get("args") + if err != nil { + return nil, err + } + vals, ok := raw.([]any) + if !ok { + return nil, fmt.Errorf("expected an array of format arguments, got %T", raw) + } + return bloblangv2.StringMethod(func(format string) (any, error) { + return fmt.Sprintf(format, vals...), nil + }), nil + }, + ) +} + +func replaceAllV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + oldStr, err := args.GetString("old") + if err != nil { + return nil, err + } + newStr, err := args.GetString("new") + if err != nil { + return nil, err + } + return bloblangv2.StringMethod(func(s string) (any, error) { + return strings.ReplaceAll(s, oldStr, newStr), nil + }), nil +} + +func replaceManyV2Ctor(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + raw, err := args.Get("values") + if err != nil { + return nil, err + } + items, ok := raw.([]any) + if !ok { + return nil, fmt.Errorf("expected array argument, got %T", raw) + } + if len(items)%2 != 0 { + return nil, fmt.Errorf("invalid arg, replacements should be in [old, new] pairs and must therefore be even: %v", items) + } + pairs := make([]string, 0, len(items)) + for i, ele := range items { + s, ok := ele.(string) + if !ok { + return nil, fmt.Errorf("replacement value at index %d: expected string, got %T", i, ele) + } + pairs = append(pairs, s) + } + rep := strings.NewReplacer(pairs...) + return bloblangv2.StringMethod(func(s string) (any, error) { + return rep.Replace(s), nil + }), nil +} diff --git a/internal/impl/pure/bloblangv2_string_test.go b/internal/impl/pure/bloblangv2_string_test.go new file mode 100644 index 000000000..8e8475093 --- /dev/null +++ b/internal/impl/pure/bloblangv2_string_test.go @@ -0,0 +1,178 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package pure_test + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func runBloblangV2(t *testing.T, mapping string, input any) any { + t.Helper() + exec, err := bloblangv2.GlobalEnvironment().Parse(mapping) + require.NoError(t, err) + out, err := exec.Query(input) + require.NoError(t, err) + return out +} + +func TestBloblangV2StringPlugins(t *testing.T) { + cases := []struct { + name string + mapping string + input any + want any + }{ + { + name: "capitalize", + mapping: `output = input.capitalize()`, + input: "the foo bar", + want: "The Foo Bar", + }, + { + name: "escape_html", + mapping: `output = input.escape_html()`, + input: "foo & bar", + want: "foo &amp; bar", + }, + { + name: "unescape_html", + mapping: `output = input.unescape_html()`, + input: "foo &amp; bar", + want: "foo & bar", + }, + { + name: "escape_url_query", + mapping: `output = input.escape_url_query()`, + input: "foo & bar", + want: "foo+%26+bar", + }, + { + name: "unescape_url_query", + mapping: `output = input.unescape_url_query()`, + input: "foo+%26+bar", + want: "foo & bar", + }, + { + name: "quote", + mapping: `output = input.quote()`, + input: "foo\nbar", + want: `"foo\nbar"`, + }, + { + name: "unquote", + mapping: `output = input.unquote()`, + input: `"foo\nbar"`, + want: "foo\nbar", + }, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := runBloblangV2(t, tc.mapping, tc.input) + assert.Equal(t, tc.want, got) + }) + } +} + +func TestBloblangV2StringPluginsExtended(t *testing.T) { + cases := []struct { + name string + mapping string + input any + want any + }{ + { + name: "replace", + mapping: `output = input.replace("foo", "bar")`, + input: "foo and foo", + want: "bar and bar", + }, + { + name: "replace_many", + mapping: `output = input.replace_many(["", "", "", ""])`, + input: 
"hi", + want: "hi", + }, + { + name: "replace_all_many", + mapping: `output = input.replace_all_many(["a", "A", "b", "B"])`, + input: "abab", + want: "ABAB", + }, + { + name: "filepath_split", + mapping: `output = input.filepath_split()`, + input: "/etc/hosts", + want: []any{"/etc/", "hosts"}, + }, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + got := runBloblangV2(t, tc.mapping, tc.input) + assert.Equal(t, tc.want, got) + }) + } +} + +func TestBloblangV2FilepathJoin(t *testing.T) { + got := runBloblangV2(t, `output = input.filepath_join()`, []any{"/etc", "hosts"}) + assert.Equal(t, "/etc/hosts", got) +} + +func TestBloblangV2ParseURL(t *testing.T) { + got := runBloblangV2(t, + `output = input.parse_url()`, + "amqp://foo:bar@127.0.0.1:5672/path?q=1#frag", + ) + m, ok := got.(map[string]any) + require.True(t, ok) + assert.Equal(t, "amqp", m["scheme"]) + assert.Equal(t, "127.0.0.1:5672", m["host"]) + assert.Equal(t, "/path", m["path"]) + assert.Equal(t, "q=1", m["raw_query"]) + assert.Equal(t, "frag", m["fragment"]) + user, ok := m["user"].(map[string]any) + require.True(t, ok) + assert.Equal(t, "foo", user["name"]) + assert.Equal(t, "bar", user["password"]) +} + +func TestBloblangV2ReplaceManyOddArgs(t *testing.T) { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.replace_many(["a", "b", "c"])`) + require.NoError(t, err) + _, err = exec.Query("hello") + require.Error(t, err) +} + +func TestBloblangV2StringPluginsRejectNonString(t *testing.T) { + // V2 typed wrappers are strict — non-string receivers should error + // rather than be silently coerced. Mirrors the public/bloblangv2 + // StringMethod contract documented on its godoc. 
+ _, err := bloblangv2.GlobalEnvironment().Parse(`output = input.capitalize()`) + require.NoError(t, err) + exec, _ := bloblangv2.GlobalEnvironment().Parse(`output = input.capitalize()`) + _, err = exec.Query(int64(42)) + require.Error(t, err) + assert.Contains(t, err.Error(), "string") +} + +func TestBloblangV2FormatPrintf(t *testing.T) { + got := runBloblangV2(t, + `output = "%s(%v): %v".format([input.name, input.age, input.fingers])`, + map[string]any{"name": "lance", "age": int64(37), "fingers": int64(13)}, + ) + assert.Equal(t, "lance(37): 13", got) +} + +func TestBloblangV2FormatRejectsNonArray(t *testing.T) { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = "%s".format(input)`) + require.NoError(t, err) + _, qerr := exec.Query("not-an-array") + require.Error(t, qerr) +} diff --git a/internal/impl/pure/bloblangv2_time.go b/internal/impl/pure/bloblangv2_time.go new file mode 100644 index 000000000..a82bbaa0f --- /dev/null +++ b/internal/impl/pure/bloblangv2_time.go @@ -0,0 +1,113 @@ +// Copyright 2026 Redpanda Data, Inc. + +package pure + +import ( + "fmt" + "time" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" +) + +// V2 ports of V1 pure timestamp/duration helpers. The V2 stdlib already +// covers ts_format, ts_parse, ts_unix*, ts_add, ts_from_unix*; this file +// fills in the V1 plugin-registered remainders that don't read the wall +// clock or randomness. Wall-clock helpers stay in internal/impl/io. + +func init() { + bloblangv2.MustRegisterMethod("parse_duration", + bloblangv2.NewPluginSpec(). + Category("Time"). + Description("Parses a Go duration string (e.g. 
\"1h30m\") and returns the duration as int64 nanoseconds."), + func(_ *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + return bloblangv2.StringMethod(func(s string) (any, error) { + d, err := time.ParseDuration(s) + if err != nil { + return nil, err + } + return d.Nanoseconds(), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("ts_round", + bloblangv2.NewPluginSpec(). + Category("Time"). + Description("Rounds a timestamp to the nearest multiple of the supplied duration in nanoseconds. Halfway values round up."). + Param(bloblangv2.NewInt64Param("duration").Description("Duration in nanoseconds.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + d, err := args.GetInt64("duration") + if err != nil { + return nil, err + } + dur := time.Duration(d) + return bloblangv2.TimestampMethod(func(t time.Time) (any, error) { + return t.Round(dur), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("ts_tz", + bloblangv2.NewPluginSpec(). + Category("Time"). + Description("Returns the receiver timestamp expressed in a different timezone (the moment in time is preserved). Use \"UTC\", \"Local\", or an IANA Time Zone name like \"America/New_York\"."). + Param(bloblangv2.NewStringParam("tz").Description("Target timezone name.")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + tzName, err := args.GetString("tz") + if err != nil { + return nil, err + } + loc, err := time.LoadLocation(tzName) + if err != nil { + return nil, fmt.Errorf("failed to parse timezone %q: %w", tzName, err) + } + return bloblangv2.TimestampMethod(func(t time.Time) (any, error) { + return t.In(loc), nil + }), nil + }, + ) + + bloblangv2.MustRegisterMethod("ts_sub", + bloblangv2.NewPluginSpec(). + Category("Time"). + Description("Returns the duration in nanoseconds between the receiver timestamp and the t2 argument (receiver - t2). Positive when the receiver is after t2; negative otherwise. Use .abs() for absolute duration."). 
+ Param(bloblangv2.NewAnyParam("t2").Description("Timestamp to subtract from the receiver. Accepts a time.Time, an RFC 3339 string, or a unix timestamp in seconds (int64 or float64).")), + func(args *bloblangv2.ParsedParams) (bloblangv2.Method, error) { + rawT2, err := args.Get("t2") + if err != nil { + return nil, err + } + t2, err := coerceTimestamp(rawT2) + if err != nil { + return nil, fmt.Errorf("t2: %w", err) + } + return bloblangv2.TimestampMethod(func(t time.Time) (any, error) { + return t.Sub(t2).Nanoseconds(), nil + }), nil + }, + ) +} + +// coerceTimestamp accepts the common timestamp surface forms and returns +// a time.Time. RFC 3339 strings and unix-seconds numerics are honoured; +// already-parsed time.Time passes through. +func coerceTimestamp(v any) (time.Time, error) { + switch n := v.(type) { + case time.Time: + return n, nil + case string: + t, err := time.Parse(time.RFC3339Nano, n) + if err != nil { + return time.Time{}, fmt.Errorf("expected RFC 3339 timestamp, got %q", n) + } + return t, nil + case int64: + return time.Unix(n, 0).UTC(), nil + case int: + return time.Unix(int64(n), 0).UTC(), nil + case float64: + whole, frac := int64(n), n-float64(int64(n)) + return time.Unix(whole, int64(frac*1e9)).UTC(), nil + } + return time.Time{}, fmt.Errorf("expected timestamp value, got %T", v) +} diff --git a/internal/impl/pure/bloblangv2_time_test.go b/internal/impl/pure/bloblangv2_time_test.go new file mode 100644 index 000000000..1dffbfb5d --- /dev/null +++ b/internal/impl/pure/bloblangv2_time_test.go @@ -0,0 +1,81 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package pure_test + +import ( + "testing" + "time" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/redpanda-data/benthos/v4/public/bloblangv2" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestBloblangV2ParseDuration(t *testing.T) { + got := runBloblangV2(t, `output = input.parse_duration()`, "1h30m") + assert.Equal(t, (1*time.Hour + 30*time.Minute).Nanoseconds(), got) +} + +func TestBloblangV2ParseDurationInvalid(t *testing.T) { + exec, err := bloblangv2.GlobalEnvironment().Parse(`output = input.parse_duration()`) + require.NoError(t, err) + _, qerr := exec.Query("not a duration") + assert.Error(t, qerr) +} + +func TestBloblangV2TsRoundToNearestHour(t *testing.T) { + ts := time.Date(2020, 8, 14, 5, 54, 23, 0, time.UTC) + got := runBloblangV2(t, `output = input.ts_round("1h".parse_duration())`, ts) + want := time.Date(2020, 8, 14, 6, 0, 0, 0, time.UTC) + assert.Equal(t, want, got) +} + +func TestBloblangV2TsTZConvertsTimezone(t *testing.T) { + ts := time.Date(2021, 2, 3, 16, 5, 6, 0, time.UTC) + got := runBloblangV2(t, `output = input.ts_tz("America/New_York")`, ts) + gotTime, ok := got.(time.Time) + require.True(t, ok, "expected time.Time, got %T", got) + // The instant in time is preserved. + assert.True(t, gotTime.Equal(ts)) + // And the wall-clock representation moves to the new zone. + assert.Equal(t, "America/New_York", gotTime.Location().String()) +} + +func TestBloblangV2TsTZUnknownZoneErrors(t *testing.T) { + // V2 folds plugin constructors at parse time when the args are + // literals, so an unknown zone surfaces as a parse error rather + // than a runtime error. 
+ _, err := bloblangv2.GlobalEnvironment().Parse(`output = input.ts_tz("Not/A_Zone")`) + assert.Error(t, err) +} + +func TestBloblangV2TsSubReturnsNanoseconds(t *testing.T) { + end := time.Date(2020, 8, 14, 11, 30, 0, 0, time.UTC) + got := runBloblangV2(t, + `output = input.ts_sub("2020-08-14T10:00:00Z")`, + end, + ) + assert.Equal(t, (90 * time.Minute).Nanoseconds(), got) +} + +func TestBloblangV2TsSubAcceptsTimestamp(t *testing.T) { + end := time.Date(2020, 8, 14, 11, 30, 0, 0, time.UTC) + start := time.Date(2020, 8, 14, 10, 0, 0, 0, time.UTC) + got := runBloblangV2(t, + `output = input.end_time.ts_sub(input.start_time)`, + map[string]any{"end_time": end, "start_time": start}, + ) + assert.Equal(t, (90 * time.Minute).Nanoseconds(), got) +} + +func TestBloblangV2TsSubNegativeWhenReceiverEarlier(t *testing.T) { + earlier := time.Date(2020, 8, 13, 5, 54, 23, 0, time.UTC) + got := runBloblangV2(t, + `output = input.ts_sub("2020-08-14T05:54:23Z")`, + earlier, + ) + assert.Equal(t, -(24 * time.Hour).Nanoseconds(), got) +} From 5574232a3301105f60fba8c0cdeb745899a04590 Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Tue, 5 May 2026 15:47:27 +0100 Subject: [PATCH 19/20] bloblang(v2): Add public/service config migrator Adds public/service/migrator/, a config-level migrator that walks parsed Benthos YAML configurations, locates bloblang fields and the bloblang processor (including instances nested inside switch, branch, processor_resources, and cache_resources), and applies the public/bloblangv2 mapping migrator to each. Includes a lazy import resolver so transitively-imported V1 mapping files are translated once and rewritten in place, a from-only rule that guards against silently broken rewrites, support for emitting the bloblang_v2_file processor for file-backed mappings, a structured report API surfacing per-field outcomes, and pluggable rules for plugin-specific rewrites. 
Integration tests exercise multi-config migration, diamond imports, mixed V1/V2 configs, and partial-failure behaviour. --- public/service/migrator/builtins.go | 102 +++ public/service/migrator/component.go | 72 ++ public/service/migrator/context.go | 94 +++ public/service/migrator/doc.go | 42 ++ public/service/migrator/example_test.go | 56 ++ public/service/migrator/imports_test.go | 308 +++++++++ public/service/migrator/integration_test.go | 703 ++++++++++++++++++++ public/service/migrator/migrator.go | 138 ++++ public/service/migrator/migrator_test.go | 654 ++++++++++++++++++ public/service/migrator/options.go | 58 ++ public/service/migrator/report.go | 144 ++++ public/service/migrator/rule.go | 61 ++ public/service/migrator/walker.go | 176 +++++ 13 files changed, 2608 insertions(+) create mode 100644 public/service/migrator/builtins.go create mode 100644 public/service/migrator/component.go create mode 100644 public/service/migrator/context.go create mode 100644 public/service/migrator/doc.go create mode 100644 public/service/migrator/example_test.go create mode 100644 public/service/migrator/imports_test.go create mode 100644 public/service/migrator/integration_test.go create mode 100644 public/service/migrator/migrator.go create mode 100644 public/service/migrator/migrator_test.go create mode 100644 public/service/migrator/options.go create mode 100644 public/service/migrator/report.go create mode 100644 public/service/migrator/rule.go create mode 100644 public/service/migrator/walker.go diff --git a/public/service/migrator/builtins.go b/public/service/migrator/builtins.go new file mode 100644 index 000000000..255bfc964 --- /dev/null +++ b/public/service/migrator/builtins.go @@ -0,0 +1,102 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// targetBloblangV2 is the V2 processor name registered in +// internal/impl/pure/processor_bloblang_v2.go. 
It uses an underscore +// (matching Benthos's snake_case plugin naming convention) — not +// "bloblangv2". +const targetBloblangV2 = "bloblang_v2" + +// targetBloblangV2File is the V2 file-backed processor name +// registered in internal/impl/pure/processor_bloblang_v2_file.go. +// V1 `from "path"` bodies migrate to this processor with the path +// rewritten via Options.BloblangV2ImportPathRewriter. +const targetBloblangV2File = "bloblang_v2_file" + +// builtInRules returns the rules registered automatically on every +// Migrator constructed via New. They migrate the three V1 mapping +// processors to bloblang_v2 (or bloblang_v2_file when the body is a +// single `from "path"` statement), threading each mapping body +// through the bundled Bloblang V1->V2 translator. +func builtInRules() map[Target]Rule { + return map[Target]Rule{ + {ComponentType: "processor", Name: "bloblang"}: bloblangProcessorRule(bloblmig.ModeMapping), + {ComponentType: "processor", Name: "mapping"}: bloblangProcessorRule(bloblmig.ModeMapping), + {ComponentType: "processor", Name: "mutation"}: bloblangProcessorRule(bloblmig.ModeMutation), + } +} + +// bloblangProcessorRule is the shared rule body for the three V1 +// mapping processors. The only per-processor variation is the +// translator Mode: `mapping` and `bloblang` use ModeMapping (the +// translator prepends `output = input` so unwritten fields pass +// through unchanged), while `mutation` uses ModeMutation (no prelude +// — V2's empty `output` matches V1's `mutation` semantics). +// +// Bodies that consist of a single `from "path"` statement are +// rewritten to bloblang_v2_file, preserving the user's file-factoring +// intent. The bloblang migrator still walks the file's V1 content +// (via the closure walker) so the referenced file gets translated to +// V2 and surfaced in Report.BloblangV2Files. 
+func bloblangProcessorRule(mode bloblmig.Mode) Rule { + return func(ctx *Context, c *Component) Result { + body, ok := c.BodyString() + if !ok { + return ctx.Unsupported("expected scalar string body for " + c.Name + " processor") + } + if v1Path, ok := bloblmig.IsFromOnly(body); ok { + // Migrate the body through the bloblang migrator so the + // closure walker pulls in the referenced file and emits + // its V2 translation in Report.V2Files. The migrated body + // itself is not used; we replace the processor with the + // file-backed form pointing at the rewritten path. + _, rep, err := ctx.MigrateBloblang(body, mode) + if err != nil { + return ctx.Unsupported(err.Error()) + } + // Even when MigrateBloblang returns success, the report + // can still carry a RuleFromStatement Error change when + // the target file couldn't be resolved AND coverage + // gating didn't escalate it (e.g. callers running with + // MinCoverage=0). Without this guard the rule would emit + // `bloblang_v2_file: ` pointing at a file that + // won't exist in Report.BloblangV2Files — a silently + // broken migrated config. + if reportHasUnresolvedFrom(rep) { + return ctx.Unsupported("from target " + v1Path + " could not be resolved") + } + v2Path := v1Path + if rewriter := ctx.bloblangOpts.V2ImportPathRewriter; rewriter != nil { + v2Path = rewriter(v1Path) + } + return ctx.ReplaceWithBloblangReport(targetBloblangV2File, v2Path, rep) + } + v2Source, rep, err := ctx.MigrateBloblang(body, mode) + if err != nil { + return ctx.Unsupported(err.Error()) + } + return ctx.ReplaceWithBloblangReport(targetBloblangV2, v2Source, rep) + } +} + +// reportHasUnresolvedFrom reports whether rep contains an +// Error-severity RuleFromStatement change — the marker the bloblang +// migrator emits when a from-only target couldn't be resolved. +// Errors are never filtered by Verbose so this signal is reliable +// regardless of the caller's BloblangOptions. 
+func reportHasUnresolvedFrom(rep *bloblmig.Report) bool { + if rep == nil { + return false + } + for _, c := range rep.Changes { + if c.RuleID == bloblmig.RuleFromStatement && c.Severity == bloblmig.SeverityError { + return true + } + } + return false +} diff --git a/public/service/migrator/component.go b/public/service/migrator/component.go new file mode 100644 index 000000000..17263c70e --- /dev/null +++ b/public/service/migrator/component.go @@ -0,0 +1,72 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + "fmt" + + "gopkg.in/yaml.v3" +) + +// Component describes a matched plugin instance. It is passed to a +// Rule as a read-only view; mutation happens through the Result +// returned by the rule (Context.Replace). +type Component struct { + // Type is the core component family, e.g. "processor". + Type string + // Name is the plugin name within Type, e.g. "bloblang". + Name string + // Path is the dotted location of this component within the config + // (e.g. "pipeline.processors.0"). + Path string + // Label is the YAML `label` value of the component, or "" if absent. + Label string + // LineStart and LineEnd are the 1-indexed line span of the + // component container in the source YAML. + LineStart, LineEnd int + + // container is the YAML mapping node holding the plugin's + // `name: body` pair (and optional sibling fields like `label`). + // Owned by the migrator; rules MUST NOT mutate it directly. + container *yaml.Node +} + +// BodyString returns the plugin's body as a string when the body is a +// YAML scalar (e.g. the `bloblang`, `mapping`, `mutation` and +// `bloblang_v2` processors all take a scalar string). Returns ok=false +// if the body is structured. +func (c *Component) BodyString() (string, bool) { + v := c.bodyNode() + if v == nil || v.Kind != yaml.ScalarNode { + return "", false + } + return v.Value, true +} + +// BodyAny decodes the plugin's body into an arbitrary Go value. 
+func (c *Component) BodyAny() (any, error) { + v := c.bodyNode() + if v == nil { + return nil, fmt.Errorf("plugin %q has no body", c.Name) + } + var out any + if err := v.Decode(&out); err != nil { + return nil, err + } + return out, nil +} + +// bodyNode returns the YAML value node for the plugin's body, or nil +// if not present. The component container is a mapping with key/value +// pairs; we look for the entry whose key matches Name. +func (c *Component) bodyNode() *yaml.Node { + if c.container == nil { + return nil + } + for i := 0; i+1 < len(c.container.Content); i += 2 { + if c.container.Content[i].Value == c.Name { + return c.container.Content[i+1] + } + } + return nil +} diff --git a/public/service/migrator/context.go b/public/service/migrator/context.go new file mode 100644 index 000000000..82692e9a8 --- /dev/null +++ b/public/service/migrator/context.go @@ -0,0 +1,94 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// Context is the helper handle a Rule receives. It exposes the +// migrator's bundled Bloblang V1->V2 translator (so rules can rewrite +// embedded mapping bodies the same way the built-in rules do) plus +// the Result constructors. +type Context struct { + bloblang *bloblmig.Migrator + bloblangOpts bloblmig.Options +} + +// Bloblang returns the Bloblang V1->V2 migrator wired into this +// Migrator (either the default or the one supplied via +// Options.BloblangMigrator). Use it from a custom rule when the +// plugin's config contains a Bloblang V1 mapping that should be +// translated to V2 alongside the plugin rename. +func (c *Context) Bloblang() *bloblmig.Migrator { + return c.bloblang +} + +// MigrateBloblang is a convenience that runs the bundled Bloblang +// V1->V2 migrator with the supplied mode and the per-call options +// configured on this Migrator (Verbose, MinCoverage, ...). 
It returns +// the V2 source and the Bloblang report; the report is attached to +// the outer Change when the result is used in a Replace. +func (c *Context) MigrateBloblang(v1Source string, mode bloblmig.Mode) (string, *bloblmig.Report, error) { + opts := c.bloblangOpts + opts.Mode = mode + rep, err := c.bloblang.Migrate(v1Source, opts) + if err != nil { + return "", nil, err + } + return rep.V2Mapping, rep, nil +} + +// Replace returns a Result that swaps the matched plugin for a new +// one. newName is the plugin name to register under (e.g. +// "bloblang_v2") and newBody is the scalar body for the new plugin. +// +// For structured replacements (where the new plugin's config is not a +// scalar string), use ReplaceStructured. +func (c *Context) Replace(newName, newBody string) Result { + return Result{ + kind: resultReplace, + replacement: replacement{name: newName, body: newBody}, + } +} + +// ReplaceWithBloblangReport is the same as Replace but additionally +// attaches a Bloblang V1->V2 *Report. The outer Migrator surfaces it +// on the per-component Change record. +func (c *Context) ReplaceWithBloblangReport(newName, newBody string, report *bloblmig.Report) Result { + return Result{ + kind: resultReplace, + replacement: replacement{ + name: newName, + body: newBody, + bloblangReport: report, + }, + } +} + +// ReplaceStructured returns a Result that swaps the matched plugin +// for a new one whose body is a structured Go value (encoded to YAML +// when the migration is rendered). Use Replace for the common case of +// a scalar string body. +func (c *Context) ReplaceStructured(newName string, body any) Result { + return Result{ + kind: resultReplace, + replacement: replacement{name: newName, body: body}, + } +} + +// Skip declines to migrate the matched plugin and falls through to +// the next rule (or leaves the component untouched if no other rule +// matches). reason, if non-empty, is recorded as an Info-severity +// note on the Report. 
+func (c *Context) Skip(reason string) Result { + return Result{kind: resultSkip, reason: reason} +} + +// Unsupported declares that the matched plugin cannot be migrated and +// records an Error-severity Change. The component is left untouched +// so the rewritten YAML remains parseable; the user is alerted via +// the Report. +func (c *Context) Unsupported(reason string) Result { + return Result{kind: resultUnsupported, reason: reason} +} diff --git a/public/service/migrator/doc.go b/public/service/migrator/doc.go new file mode 100644 index 000000000..0bf799caa --- /dev/null +++ b/public/service/migrator/doc.go @@ -0,0 +1,42 @@ +// Copyright 2026 Redpanda Data, Inc. + +// Package migrator rewrites Benthos stream configs by replacing one +// plugin instance with another, optionally translating any embedded +// configuration (e.g. Bloblang V1 mappings) along the way. +// +// The package ships with built-in rules that rewrite the V1 +// `bloblang`, `mapping` and `mutation` processors to the V2 +// `bloblang_v2` processor, threading each mapping body through +// public/bloblangv2/migrator. Downstream repositories with their own +// plugins can register additional rules keyed by (component type, +// plugin name). +// +// # Usage +// +// mig := migrator.New() +// report, err := mig.Migrate(streamYAML, migrator.Options{}) +// if err != nil { +// return err +// } +// fmt.Println(report.OutputYAML) +// +// # Custom rules +// +// Register a rule keyed by the (ComponentType, Name) of the plugin to +// replace. The rule receives a Context and a Component describing the +// matched node, and returns a Result describing the outcome. Custom +// rules win on collision with the built-ins (the downstream rule fully +// replaces the built-in for that key). 
+// +// mig.RegisterRule(migrator.Target{ComponentType: "processor", Name: "old_widget"}, +// func(ctx *migrator.Context, c *migrator.Component) migrator.Result { +// body, _ := c.BodyString() +// return ctx.Replace("new_widget", body) +// }) +// +// # Stability +// +// Public types and methods follow semantic versioning. The walker +// implementation is private to the package and may evolve +// independently of the public surface. +package migrator diff --git a/public/service/migrator/example_test.go b/public/service/migrator/example_test.go new file mode 100644 index 000000000..d5c83855f --- /dev/null +++ b/public/service/migrator/example_test.go @@ -0,0 +1,56 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator_test + +import ( + "fmt" + "strings" + + "github.com/redpanda-data/benthos/v4/public/service/migrator" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +// ExampleMigrate demonstrates the package-level Migrate helper +// rewriting a stream config's `bloblang` processor as `bloblang_v2`, +// with the embedded mapping translated through the bundled Bloblang +// V1->V2 migrator. +func ExampleMigrate() { + in := ` +pipeline: + processors: + - bloblang: | + root.id = this.id +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + fmt.Println("migrate:", err) + return + } + fmt.Println(strings.TrimSpace(rep.OutputYAML)) + + // Output: + // pipeline: + // processors: + // - bloblang_v2: | + // output = input + // output.id = input?.id +} + +// ExampleMigrator_RegisterRule demonstrates a downstream registering +// a rule for a fictional `old_widget` processor that has been ported +// to a new `new_widget` plugin in V2. 
+func ExampleMigrator_RegisterRule() { + mig := migrator.New() + mig.RegisterRule( + migrator.Target{ComponentType: "processor", Name: "old_widget"}, + func(ctx *migrator.Context, c *migrator.Component) migrator.Result { + body, ok := c.BodyString() + if !ok { + return ctx.Unsupported("expected scalar body") + } + return ctx.Replace("new_widget", body) + }, + ) + _ = mig +} diff --git a/public/service/migrator/imports_test.go b/public/service/migrator/imports_test.go new file mode 100644 index 000000000..6eda73e27 --- /dev/null +++ b/public/service/migrator/imports_test.go @@ -0,0 +1,308 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator_test + +import ( + "strings" + "testing" + + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" + "github.com/redpanda-data/benthos/v4/public/service/migrator" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +// TestBloblangFileResolverForwarded — a top-level BloblangFileResolver +// is consulted when a migrated bloblang processor body contains an +// import. The resolved file appears in Report.BloblangV2Files keyed by +// canonical key. 
+func TestBloblangFileResolverForwarded(t *testing.T) { + helpers := `map double { root = this * 2 }` + in := ` +pipeline: + processors: + - mapping: | + import "./helpers.blobl" + root.x = 21.apply("double") +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + if importPath == "./helpers.blobl" && parentKey == "" { + return "/abs/helpers.blobl", helpers, true + } + return "", "", false + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected processor migrated to bloblang_v2:\n%s", rep.OutputYAML) + } + if rep.BloblangV2Files == nil { + t.Fatalf("expected BloblangV2Files to be populated") + } + if _, ok := rep.BloblangV2Files["/abs/helpers.blobl"]; !ok { + t.Fatalf("expected canonical-keyed import in BloblangV2Files, got: %v", v2FileKeys(rep.BloblangV2Files)) + } +} + +// TestBloblangV2ImportPathRewriterForwarded — a top-level rewriter is +// applied to the import statements emitted into the V2 mapping body, +// while BloblangV2Files keys remain canonical. 
+func TestBloblangV2ImportPathRewriterForwarded(t *testing.T) { + helpers := `map double { root = this * 2 }` + in := ` +pipeline: + processors: + - mapping: | + import "./helpers.blobl" + root.x = 21.apply("double") +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + return "/abs/helpers.blobl", helpers, true + }, + BloblangV2ImportPathRewriter: func(p string) string { + return strings.TrimSuffix(p, ".blobl") + ".v5.blobl" + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, `import "./helpers.v5.blobl"`) { + t.Fatalf("expected rewritten import in migrated body, got:\n%s", rep.OutputYAML) + } + if _, ok := rep.BloblangV2Files["/abs/helpers.blobl"]; !ok { + t.Fatalf("expected canonical-keyed BloblangV2Files (rewriter affects emitted source only), got: %v", v2FileKeys(rep.BloblangV2Files)) + } +} + +// TestBloblangV2FilesAggregatedAcrossComponents — two processors with +// the same import contribute one entry to BloblangV2Files (deduped by +// canonical key). 
+func TestBloblangV2FilesAggregatedAcrossComponents(t *testing.T) { + helpers := `map double { root = this * 2 }` + in := ` +pipeline: + processors: + - mapping: | + import "./helpers.blobl" + root.x = 21.apply("double") + - mutation: | + import "./helpers.blobl" + root.y = 42.apply("double") +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + return "/abs/helpers.blobl", helpers, true + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if got := len(rep.BloblangV2Files); got != 1 { + t.Fatalf("expected exactly 1 import file (deduped across components), got %d: %v", got, v2FileKeys(rep.BloblangV2Files)) + } +} + +// TestBloblangFileResolverHonoursExplicitBloblangOptions — when the +// caller pre-populates BloblangOptions.FileResolver directly (instead +// of using the top-level hook), it still works. Useful for callers +// who construct a fully custom *bloblmig.Options. +func TestBloblangFileResolverHonoursExplicitBloblangOptions(t *testing.T) { + helpers := `map double { root = this * 2 }` + in := ` +pipeline: + processors: + - bloblang: | + import "./helpers.blobl" + root.x = 21.apply("double") +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangOptions: bloblmig.Options{ + FileResolver: func(parentKey, importPath string) (string, string, bool) { + return "/abs/helpers.blobl", helpers, true + }, + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if _, ok := rep.BloblangV2Files["/abs/helpers.blobl"]; !ok { + t.Fatalf("expected import resolved via BloblangOptions.FileResolver, got: %v", v2FileKeys(rep.BloblangV2Files)) + } +} + +// TestTopLevelResolverShadowsBloblangOptions — if both the top-level +// hook and BloblangOptions.FileResolver are set, the top-level hook +// wins. Documents the precedence so callers know which to set. 
+func TestTopLevelResolverShadowsBloblangOptions(t *testing.T) { + in := ` +pipeline: + processors: + - mapping: | + import "./helpers.blobl" + root.x = "ok" +` + var topLevelCalled, embeddedCalled bool + _, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + topLevelCalled = true + return "/abs/helpers.blobl", `map noop { root = this }`, true + }, + BloblangOptions: bloblmig.Options{ + FileResolver: func(parentKey, importPath string) (string, string, bool) { + embeddedCalled = true + return "/should-not-happen", "", true + }, + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !topLevelCalled { + t.Fatalf("expected top-level BloblangFileResolver to be called") + } + if embeddedCalled { + t.Fatalf("BloblangOptions.FileResolver should be shadowed by the top-level hook") + } +} + +// TestFromOnlyBodyRewritesToBloblangV2File — a `mapping` processor +// whose body is a single `from "path"` statement is rewritten to the +// new `bloblang_v2_file` processor. The referenced file is migrated +// V1->V2 and surfaces in Report.BloblangV2Files. 
+func TestFromOnlyBodyRewritesToBloblangV2File(t *testing.T) { + helpers := `root.id = this.id +root.upper_name = this.name.uppercase() +` + in := ` +pipeline: + processors: + - mapping: 'from "./helpers.blobl"' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + if importPath == "./helpers.blobl" { + return "/abs/helpers.blobl", helpers, true + } + return "", "", false + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2_file:") { + t.Fatalf("expected from-only mapping to migrate to bloblang_v2_file:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "./helpers.blobl") { + t.Fatalf("expected file path preserved (no rewriter set), got:\n%s", rep.OutputYAML) + } + if rep.BloblangV2Files == nil { + t.Fatalf("expected BloblangV2Files to be populated") + } + v2 := rep.BloblangV2Files["/abs/helpers.blobl"] + if !strings.Contains(v2, "output.id") { + t.Fatalf("expected helpers.blobl translated to V2, got:\n%s", v2) + } +} + +// TestFromOnlyBodyAppliesPathRewriter — when a rewriter is set, the +// path emitted into the bloblang_v2_file processor reflects the V2 +// path (e.g. helpers.blobl -> helpers.v5.blobl). 
+func TestFromOnlyBodyAppliesPathRewriter(t *testing.T) { + helpers := `root.id = this.id` + in := ` +pipeline: + processors: + - bloblang: 'from "./helpers.blobl"' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + return "/abs/helpers.blobl", helpers, true + }, + BloblangV2ImportPathRewriter: func(p string) string { + return strings.TrimSuffix(p, ".blobl") + ".v5.blobl" + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "./helpers.v5.blobl") { + t.Fatalf("expected rewritten path in bloblang_v2_file processor:\n%s", rep.OutputYAML) + } +} + +// TestFromOnlyBodyMutationProcessor — same handling for the mutation +// processor (uses ModeMutation but the from-only rewrite is identical). +func TestFromOnlyBodyMutationProcessor(t *testing.T) { + helpers := `root.id = this.id` + in := ` +pipeline: + processors: + - mutation: 'from "./helpers.blobl"' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + return "/abs/helpers.blobl", helpers, true + }, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2_file:") { + t.Fatalf("expected mutation/from-only to rewrite to bloblang_v2_file:\n%s", rep.OutputYAML) + } +} + +// TestFromOnlyBodyWithUnresolvedTargetEmitsUnsupported — if the +// resolver can't satisfy the from path, the rule MUST NOT rewrite to +// bloblang_v2_file (the resulting config would point at a file that +// won't be in Report.BloblangV2Files). Instead the rule emits +// Unsupported and leaves the V1 processor untouched. +// +// This test exercises the foundation-bug case where MigrateBloblang +// returns success because BloblangOptions.MinCoverage is near zero, +// which would otherwise let the rule blindly emit a broken rewrite. 
+func TestFromOnlyBodyWithUnresolvedTargetEmitsUnsupported(t *testing.T) { + in := ` +pipeline: + processors: + - mapping: 'from "./missing.blobl"' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{ + BloblangFileResolver: func(parentKey, importPath string) (string, string, bool) { + return "", "", false + }, + // Effectively disable the bloblang coverage gate (the + // migrator's applyDefaults clobbers 0 to 0.75, so we set a + // positive-but-tiny floor). This makes MigrateBloblang + // return success-with-an-unsupported-change instead of an + // error — the bug-shaped case the rule must guard. + BloblangOptions: bloblmig.Options{MinCoverage: 0.01}, + }) + if err != nil { + t.Fatalf("migrate should not fail outright: %v", err) + } + if strings.Contains(rep.OutputYAML, "bloblang_v2_file:") { + t.Fatalf("rule must not emit bloblang_v2_file when from target is unresolved:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "mapping:") { + t.Fatalf("expected V1 mapping processor preserved on Unsupported, got:\n%s", rep.OutputYAML) + } + if len(rep.Changes) != 1 || rep.Changes[0].Outcome != migrator.OutcomeUnsupported { + t.Fatalf("expected exactly one Unsupported change, got %+v", rep.Changes) + } + if !strings.Contains(rep.Changes[0].Reason, "missing.blobl") { + t.Fatalf("expected change reason to name the unresolved path, got %q", rep.Changes[0].Reason) + } +} + +func v2FileKeys(m map[string]string) []string { + out := make([]string, 0, len(m)) + for k := range m { + out = append(out, k) + } + return out +} diff --git a/public/service/migrator/integration_test.go b/public/service/migrator/integration_test.go new file mode 100644 index 000000000..3f341d7fd --- /dev/null +++ b/public/service/migrator/integration_test.go @@ -0,0 +1,703 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package migrator_test + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "github.com/redpanda-data/benthos/v4/public/service" + "github.com/redpanda-data/benthos/v4/public/service/migrator" + + _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +// TestEndToEndMultiConfigWithMixedImports is the full-shape integration +// test for the connect CLI's migrate-v5 use case: +// +// - multiple YAML config files in their own directory +// - each containing several bloblang/mapping/mutation processors +// - bodies span all three patterns: inline mappings with no +// imports, inline mappings with `import "path"` for named maps, +// and from-only bodies (`from "path"`) +// - import paths are relative to each YAML's directory (which is +// not the test's CWD) +// - imported `.blobl` files transitively import other `.blobl` +// files (relative to their own directory, not the YAML's) +// +// The resolver mirrors what the connect CLI will install: anchor the +// main source's imports to the YAML's directory; anchor transitive +// imports to the parent file's directory (parentKey carries this). +func TestEndToEndMultiConfigWithMixedImports(t *testing.T) { + root := t.TempDir() + configsDir := filepath.Join(root, "configs") + mappingsDir := filepath.Join(root, "mappings") + require := func(err error) { + t.Helper() + if err != nil { + t.Fatalf("unexpected error: %v", err) + } + } + require(os.MkdirAll(configsDir, 0o755)) + require(os.MkdirAll(mappingsDir, 0o755)) + + // Mapping files. helpers.blobl declares one map; identifiers.blobl + // imports helpers.blobl from a sibling directory and adds another + // map; main.blobl is a whole-mapping body intended to be `from`'d + // in to a processor body, with its own `import` of identifiers. 
+ writeFile := func(path, content string) { + require(os.WriteFile(path, []byte(content), 0o644)) + } + + writeFile(filepath.Join(mappingsDir, "helpers.blobl"), `map upper { + root = this.uppercase() +} +`) + writeFile(filepath.Join(mappingsDir, "identifiers.blobl"), `import "./helpers.blobl" + +map normalise { + root = this.upper.apply() +} +`) + writeFile(filepath.Join(mappingsDir, "main.blobl"), `import "./identifiers.blobl" + +root.id = this.id.normalise.apply() +root.tag = this.tag +`) + + // Two YAML configs in configs/. Each references the mappings + // directory via "../mappings/", a path that is meaningless + // from the test's CWD — only valid when resolved relative to the + // YAML's own directory. + app1Path := filepath.Join(configsDir, "app1.yaml") + app1YAML := ` +pipeline: + processors: + - mutation: 'root.flag = true' + - mapping: | + import "../mappings/identifiers.blobl" + root.id = this.id.normalise.apply() + root.kind = "user" + - bloblang: 'from "../mappings/main.blobl"' +` + writeFile(app1Path, app1YAML) + + app2Path := filepath.Join(configsDir, "app2.yaml") + app2YAML := ` +pipeline: + processors: + - mapping: | + import "../mappings/helpers.blobl" + root.upper_name = this.name.upper.apply() + - mutation: 'from "../mappings/main.blobl"' +` + writeFile(app2Path, app2YAML) + + // resolverFor returns a resolver that anchors importPath to the + // supplied configDir for top-level imports (parentKey == "") and + // to the parent file's directory for transitive imports. This is + // the resolution policy the connect CLI will install. 
+ resolverFor := func(configDir string) func(string, string) (string, string, bool) { + return func(parentKey, importPath string) (string, string, bool) { + base := configDir + if parentKey != "" { + base = filepath.Dir(parentKey) + } + abs, err := filepath.Abs(filepath.Join(base, importPath)) + if err != nil { + return "", "", false + } + content, err := os.ReadFile(abs) + if err != nil { + return "", "", false + } + return abs, string(content), true + } + } + + rewriter := func(p string) string { + return strings.TrimSuffix(p, ".blobl") + ".v5.blobl" + } + + // Migrate each YAML independently — the connect CLI iterates + // targets the same way. + type result struct { + path string + rep *migrator.Report + } + results := make([]result, 0, 2) + for _, yamlPath := range []string{app1Path, app2Path} { + yamlBytes, err := os.ReadFile(yamlPath) + require(err) + rep, err := migrator.Migrate(yamlBytes, migrator.Options{ + BloblangFileResolver: resolverFor(filepath.Dir(yamlPath)), + BloblangV2ImportPathRewriter: rewriter, + Verbose: true, + }) + require(err) + results = append(results, result{path: yamlPath, rep: rep}) + } + + app1Rep := results[0].rep + app2Rep := results[1].rep + + // app1 expectations: 3 processors all rewritten. 
+ if got := strings.Count(app1Rep.OutputYAML, "bloblang_v2:"); got != 2 { + t.Errorf("app1: expected 2 inline bloblang_v2 processors, got %d in:\n%s", got, app1Rep.OutputYAML) + } + if !strings.Contains(app1Rep.OutputYAML, "bloblang_v2_file:") { + t.Errorf("app1: expected from-only body to migrate to bloblang_v2_file:\n%s", app1Rep.OutputYAML) + } + if !strings.Contains(app1Rep.OutputYAML, "../mappings/main.v5.blobl") { + t.Errorf("app1: expected rewritten file path on bloblang_v2_file processor:\n%s", app1Rep.OutputYAML) + } + if !strings.Contains(app1Rep.OutputYAML, "../mappings/identifiers.v5.blobl") { + t.Errorf("app1: expected rewritten import path inside V2 mapping body:\n%s", app1Rep.OutputYAML) + } + if app1Rep.Coverage.Rewritten != 3 || app1Rep.Coverage.Unsupported != 0 { + t.Errorf("app1: unexpected coverage %+v", app1Rep.Coverage) + } + + // app2 expectations: 2 processors, 1 inline + 1 file-backed. + if got := strings.Count(app2Rep.OutputYAML, "bloblang_v2:"); got != 1 { + t.Errorf("app2: expected 1 inline bloblang_v2 processor, got %d in:\n%s", got, app2Rep.OutputYAML) + } + if !strings.Contains(app2Rep.OutputYAML, "bloblang_v2_file:") { + t.Errorf("app2: expected from-only body to migrate to bloblang_v2_file:\n%s", app2Rep.OutputYAML) + } + if !strings.Contains(app2Rep.OutputYAML, "../mappings/main.v5.blobl") { + t.Errorf("app2: expected rewritten file path on bloblang_v2_file processor:\n%s", app2Rep.OutputYAML) + } + if !strings.Contains(app2Rep.OutputYAML, "../mappings/helpers.v5.blobl") { + t.Errorf("app2: expected rewritten import path inside V2 mapping body:\n%s", app2Rep.OutputYAML) + } + + // Each report's BloblangV2Files should contain canonical-keyed V2 + // translations for every .blobl file the YAML reaches transitively. + // app1 reaches: identifiers (via import), main + identifiers + helpers (via from -> import -> import). + // app2 reaches: helpers (via import), main + identifiers + helpers (via from -> import -> import). 
+ // In both cases the closure is {helpers, identifiers, main}. + wantClosure := []string{ + filepath.Join(mappingsDir, "helpers.blobl"), + filepath.Join(mappingsDir, "identifiers.blobl"), + filepath.Join(mappingsDir, "main.blobl"), + } + for i, res := range results { + for _, want := range wantClosure { + absWant, err := filepath.Abs(want) + require(err) + if _, ok := res.rep.BloblangV2Files[absWant]; !ok { + t.Errorf("yaml[%d] %s: expected BloblangV2Files to contain %q, got keys: %v", + i, res.path, absWant, v2FileKeysSorted(res.rep.BloblangV2Files)) + } + } + } + + // The migrated identifiers.v5.blobl should contain the rewritten + // import to helpers.v5.blobl — confirming the rewriter applies + // inside transitively-translated files too. + identifiersV2 := app1Rep.BloblangV2Files[filepath.Join(mappingsDir, "identifiers.blobl")] + if !strings.Contains(identifiersV2, `import "./helpers.v5.blobl"`) { + t.Errorf("identifiers.v5.blobl should rewrite its import to helpers.v5.blobl, got:\n%s", identifiersV2) + } + + // main.blobl is a regular V1 mapping (import + root assignments), + // not a from-only file — so its V2 translation preserves the + // import structure with paths rewritten, plus assignments + // translated to the output.* form. + mainV2 := app1Rep.BloblangV2Files[filepath.Join(mappingsDir, "main.blobl")] + if !strings.Contains(mainV2, `import "./identifiers.v5.blobl"`) { + t.Errorf("main.v5.blobl should rewrite its import to identifiers.v5.blobl, got:\n%s", mainV2) + } + if !strings.Contains(mainV2, "output.id") || !strings.Contains(mainV2, "output.tag") { + t.Errorf("main.v5.blobl should translate root.* assignments to output.*, got:\n%s", mainV2) + } + + // Sanity: the test's working directory is not the configs dir. + // All path resolution went through the resolver's anchoring on + // each YAML's directory; CWD never entered the picture. 
+ cwd, err := os.Getwd() + require(err) + if filepath.Clean(cwd) == filepath.Clean(configsDir) { + t.Fatalf("test setup invariant: cwd should differ from configsDir to prove resolver-based resolution") + } +} + +func v2FileKeysSorted(m map[string]string) []string { + out := make([]string, 0, len(m)) + for k := range m { + out = append(out, k) + } + return out +} + +// integrationEnv is a shared helper for the integration tests that +// follow. Each test gets its own tmpdir with configs/ and mappings/ +// subdirectories; the resolver mirrors the connect CLI's policy. +type integrationEnv struct { + t *testing.T + root string + configsDir string + mappingsDir string +} + +func newIntegrationEnv(t *testing.T) *integrationEnv { + t.Helper() + root := t.TempDir() + configsDir := filepath.Join(root, "configs") + mappingsDir := filepath.Join(root, "mappings") + if err := os.MkdirAll(configsDir, 0o755); err != nil { + t.Fatalf("mkdir configs: %v", err) + } + if err := os.MkdirAll(mappingsDir, 0o755); err != nil { + t.Fatalf("mkdir mappings: %v", err) + } + return &integrationEnv{t: t, root: root, configsDir: configsDir, mappingsDir: mappingsDir} +} + +func (e *integrationEnv) writeMapping(name, content string) string { + e.t.Helper() + p := filepath.Join(e.mappingsDir, name) + if err := os.WriteFile(p, []byte(content), 0o644); err != nil { + e.t.Fatalf("write mapping %s: %v", name, err) + } + return p +} + +func (e *integrationEnv) writeConfig(name, content string) string { + e.t.Helper() + p := filepath.Join(e.configsDir, name) + if err := os.WriteFile(p, []byte(content), 0o644); err != nil { + e.t.Fatalf("write config %s: %v", name, err) + } + return p +} + +// resolverFor anchors importPath to configDir for top-level imports +// (parentKey == "") and to the parent file's directory for +// transitive imports. Mirrors the resolver the connect CLI installs. 
+func (e *integrationEnv) resolverFor(configDir string) func(string, string) (string, string, bool) { + return func(parentKey, importPath string) (string, string, bool) { + base := configDir + if parentKey != "" { + base = filepath.Dir(parentKey) + } + abs, err := filepath.Abs(filepath.Join(base, importPath)) + if err != nil { + return "", "", false + } + content, err := os.ReadFile(abs) + if err != nil { + return "", "", false + } + return abs, string(content), true + } +} + +func v5SuffixRewriter(p string) string { + return strings.TrimSuffix(p, ".blobl") + ".v5.blobl" +} + +// TestEndToEndResourceFile confirms the migrator handles top-level +// resource definitions (processor_resources, cache_resources) the +// same way it handles pipeline.processors. Resource files are a +// distinct connect config shape — users factor shared processors +// into them — so this exercises the walker through the manager +// fields rather than the stream pipeline fields. +func TestEndToEndResourceFile(t *testing.T) { + env := newIntegrationEnv(t) + + env.writeMapping("enrich.blobl", `map enricher { + root = this.uppercase() +} +`) + env.writeMapping("tag.blobl", `root.tagged = true +root.kind = "event" +`) + + yamlPath := env.writeConfig("resources.yaml", ` +processor_resources: + - label: enrich_user + mapping: | + import "../mappings/enrich.blobl" + root.id = this.id.apply("enricher") + - label: tag_event + bloblang: 'from "../mappings/tag.blobl"' + +cache_resources: + - label: my_cache + memory: {} +`) + + yamlBytes, err := os.ReadFile(yamlPath) + if err != nil { + t.Fatalf("read yaml: %v", err) + } + rep, err := migrator.Migrate(yamlBytes, migrator.Options{ + BloblangFileResolver: env.resolverFor(filepath.Dir(yamlPath)), + BloblangV2ImportPathRewriter: v5SuffixRewriter, + Verbose: true, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Errorf("expected mapping resource to migrate to bloblang_v2:\n%s", 
rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2_file:") { + t.Errorf("expected from-only resource to migrate to bloblang_v2_file:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "../mappings/tag.v5.blobl") { + t.Errorf("expected rewritten file path on bloblang_v2_file resource:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "../mappings/enrich.v5.blobl") { + t.Errorf("expected rewritten import path inside V2 mapping body:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "label: enrich_user") || !strings.Contains(rep.OutputYAML, "label: tag_event") { + t.Errorf("expected resource labels preserved:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "label: my_cache") || !strings.Contains(rep.OutputYAML, "memory:") { + t.Errorf("expected cache_resources entry preserved untouched:\n%s", rep.OutputYAML) + } + + for _, want := range []string{"enrich.blobl", "tag.blobl"} { + abs := filepath.Join(env.mappingsDir, want) + if _, ok := rep.BloblangV2Files[abs]; !ok { + t.Errorf("expected BloblangV2Files to contain %q, got: %v", abs, v2FileKeysSorted(rep.BloblangV2Files)) + } + } + if rep.Coverage.Rewritten != 2 { + t.Errorf("expected 2 rewrites, got %+v", rep.Coverage) + } +} + +// TestEndToEndMixedV1V2Processors covers the realistic mid-migration +// state: a single config containing both V1 (bloblang/mapping/ +// mutation) and V2 (bloblang_v2) processors. V1 entries migrate; +// V2 entries are left strictly alone (no rule registered for the +// bloblang_v2 target) and the resulting YAML lints cleanly under +// StreamConfigLinter. +func TestEndToEndMixedV1V2Processors(t *testing.T) { + env := newIntegrationEnv(t) + + // Self-contained bodies (no imports) so the lint check below + // doesn't have to chase relative-path V2 import resolution. The + // test's value is showing V1 entries migrate while V2 entries + // stay put — import handling is exercised in the other + // integration tests. 
+ yamlPath := env.writeConfig("mixed.yaml", ` +pipeline: + processors: + - mapping: | + root.x = this.x.uppercase() + - bloblang_v2: | + output = input + output.kind = "v2-already" + - mutation: | + root.flag = true +`) + + yamlBytes, err := os.ReadFile(yamlPath) + if err != nil { + t.Fatalf("read yaml: %v", err) + } + rep, err := migrator.Migrate(yamlBytes, migrator.Options{ + BloblangFileResolver: env.resolverFor(filepath.Dir(yamlPath)), + BloblangV2ImportPathRewriter: v5SuffixRewriter, + Verbose: true, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + + // Two V1 entries become bloblang_v2; the existing bloblang_v2 + // stays bloblang_v2. So the output should contain three + // bloblang_v2 keys total, no V1 processor names. + if got := strings.Count(rep.OutputYAML, "bloblang_v2:"); got != 3 { + t.Errorf("expected 3 bloblang_v2 entries in output (2 migrated + 1 untouched), got %d:\n%s", got, rep.OutputYAML) + } + for _, v1Name := range []string{"mapping:", "mutation:"} { + if strings.Contains(rep.OutputYAML, "- "+v1Name) { + t.Errorf("V1 processor name %q leaked into output:\n%s", v1Name, rep.OutputYAML) + } + } + // Confirm the untouched V2 processor's body still says "v2-already". + if !strings.Contains(rep.OutputYAML, `output.kind = "v2-already"`) { + t.Errorf("expected pre-existing bloblang_v2 body untouched, got:\n%s", rep.OutputYAML) + } + if rep.Coverage.Rewritten != 2 { + t.Errorf("expected 2 rewrites (V2 entry shouldn't match a rule), got %+v", rep.Coverage) + } + + // Output should lint cleanly under StreamConfigLinter. 
+ schema := service.GlobalEnvironment().FullConfigSchema("", "") + linter := schema.NewStreamConfigLinter() + lints, err := linter.LintYAML([]byte(rep.OutputYAML)) + if err != nil { + t.Fatalf("lint output: %v", err) + } + for _, l := range lints { + t.Errorf("unexpected lint on migrated config: %+v", l) + } +} + +// TestEndToEndImportsInsideSwitchAndBranch confirms the walker +// descends into control-flow processor types (switch, branch) and +// that bloblang processors nested inside them have their imports +// resolved against the YAML's directory (parentKey stays "" for any +// depth of nesting in the same YAML body). Also confirms that +// bloblang STRING fields like branch.request_map are NOT touched. +func TestEndToEndImportsInsideSwitchAndBranch(t *testing.T) { + env := newIntegrationEnv(t) + + env.writeMapping("users.blobl", `map normalise { + root = this.uppercase() +} +`) + env.writeMapping("fallback.blobl", `root.fallback = true +root.kind = "fallback" +`) + + yamlPath := env.writeConfig("nested.yaml", ` +pipeline: + processors: + - switch: + - check: this.kind == "user" + processors: + - mapping: | + import "../mappings/users.blobl" + root.id = this.id.apply("normalise") + - processors: + - branch: + request_map: 'root = this.payload' + processors: + - bloblang: 'from "../mappings/fallback.blobl"' + result_map: 'root.enriched = this' +`) + + yamlBytes, err := os.ReadFile(yamlPath) + if err != nil { + t.Fatalf("read yaml: %v", err) + } + rep, err := migrator.Migrate(yamlBytes, migrator.Options{ + BloblangFileResolver: env.resolverFor(filepath.Dir(yamlPath)), + BloblangV2ImportPathRewriter: v5SuffixRewriter, + Verbose: true, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Errorf("expected nested mapping inside switch to migrate:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2_file:") { + t.Errorf("expected nested from-only inside branch to migrate to 
bloblang_v2_file:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "../mappings/users.v5.blobl") { + t.Errorf("expected rewritten import in switch-nested mapping body:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "../mappings/fallback.v5.blobl") { + t.Errorf("expected rewritten path on branch-nested bloblang_v2_file:\n%s", rep.OutputYAML) + } + + // branch.request_map / branch.result_map are bloblang STRING + // fields, not component instances. The walker should not have + // migrated them; the original V1 strings should survive verbatim. + if !strings.Contains(rep.OutputYAML, "root = this.payload") { + t.Errorf("expected branch.request_map (string field) to be untouched:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "root.enriched = this") { + t.Errorf("expected branch.result_map (string field) to be untouched:\n%s", rep.OutputYAML) + } + + for _, want := range []string{"users.blobl", "fallback.blobl"} { + abs := filepath.Join(env.mappingsDir, want) + if _, ok := rep.BloblangV2Files[abs]; !ok { + t.Errorf("expected BloblangV2Files to contain %q, got: %v", abs, v2FileKeysSorted(rep.BloblangV2Files)) + } + } + if rep.Coverage.Rewritten != 2 { + t.Errorf("expected 2 nested processors rewritten, got %+v", rep.Coverage) + } +} + +// TestEndToEndDiamondImports exercises closure-walker dedup under a +// non-tree shape: A imports B and C, both B and C import D. D should +// appear exactly once in BloblangV2Files (deduped by canonical key) +// and the resolver should fire once per unique import site, not once +// per visit-via-some-path. 
+func TestEndToEndDiamondImports(t *testing.T) { + env := newIntegrationEnv(t) + + env.writeMapping("d.blobl", `map d_helper { + root = this.uppercase() +} +`) + env.writeMapping("b.blobl", `import "./d.blobl" + +map from_b_helper { + root = this.apply("d_helper") +} +`) + env.writeMapping("c.blobl", `import "./d.blobl" + +map from_c_helper { + root = this.apply("d_helper") +} +`) + env.writeMapping("a.blobl", `import "./b.blobl" +import "./c.blobl" + +map from_a { + root = { + "from_b": this.apply("from_b_helper"), + "from_c": this.apply("from_c_helper") + } +} +`) + + yamlPath := env.writeConfig("diamond.yaml", ` +pipeline: + processors: + - mapping: | + import "../mappings/a.blobl" + root = this.x.apply("from_a") +`) + + // Wrap the resolver with a per-site call counter. + innerResolver := env.resolverFor(filepath.Dir(yamlPath)) + siteCalls := map[string]int{} + resolver := func(parentKey, importPath string) (string, string, bool) { + key := parentKey + "::" + importPath + siteCalls[key]++ + return innerResolver(parentKey, importPath) + } + + yamlBytes, err := os.ReadFile(yamlPath) + if err != nil { + t.Fatalf("read yaml: %v", err) + } + rep, err := migrator.Migrate(yamlBytes, migrator.Options{ + BloblangFileResolver: resolver, + BloblangV2ImportPathRewriter: v5SuffixRewriter, + }) + if err != nil { + t.Fatalf("migrate: %v", err) + } + + // The closure should contain exactly four files (A, B, C, D), + // not five (D should not be duplicated). 
+ if got := len(rep.BloblangV2Files); got != 4 { + t.Errorf("expected 4 V2 files in closure (A, B, C, D), got %d: %v", got, v2FileKeysSorted(rep.BloblangV2Files)) + } + for _, want := range []string{"a.blobl", "b.blobl", "c.blobl", "d.blobl"} { + abs := filepath.Join(env.mappingsDir, want) + if _, ok := rep.BloblangV2Files[abs]; !ok { + t.Errorf("expected BloblangV2Files to contain %q, got: %v", abs, v2FileKeysSorted(rep.BloblangV2Files)) + } + } + + // Resolver firing pattern: each unique (parentKey, importPath) + // site fires once. Five sites total — main->a, a->b, a->c, b->d, + // c->d — and crucially the (b->d) and (c->d) calls are both + // counted (different parentKey) but the closure walker dedupes + // the resulting canonical so D only translates once. + if got := len(siteCalls); got != 5 { + t.Errorf("expected 5 unique resolver sites (main->a, a->b, a->c, b->d, c->d), got %d: %v", got, siteCalls) + } + for site, n := range siteCalls { + if n != 1 { + t.Errorf("resolver fired %d times for site %q (should be once per unique site)", n, site) + } + } + + // Both b.v5.blobl and c.v5.blobl should rewrite their D import + // to d.v5.blobl — confirming the rewriter applies in + // transitively-translated files. + bV2 := rep.BloblangV2Files[filepath.Join(env.mappingsDir, "b.blobl")] + cV2 := rep.BloblangV2Files[filepath.Join(env.mappingsDir, "c.blobl")] + for name, content := range map[string]string{"b": bV2, "c": cV2} { + if !strings.Contains(content, `import "./d.v5.blobl"`) { + t.Errorf("%s.v5.blobl should rewrite its d import to d.v5.blobl, got:\n%s", name, content) + } + } +} + +// TestEndToEndPartialFailure covers the realistic case where one +// config contains a mix of resolvable and unresolvable imports. +// The migrator must rewrite the resolvable processor cleanly, +// flag the unresolvable one as Unsupported (leaving the V1 in +// place), and surface both outcomes in the report — without +// aborting the whole migration. 
+func TestEndToEndPartialFailure(t *testing.T) { + env := newIntegrationEnv(t) + + env.writeMapping("exists.blobl", `map noop { + root = this +} +`) + // Note: missing.blobl is NOT created on disk. + + yamlPath := env.writeConfig("partial.yaml", ` +pipeline: + processors: + - mapping: | + import "../mappings/exists.blobl" + root = this.apply("noop") + - bloblang: 'from "../mappings/missing.blobl"' +`) + + yamlBytes, err := os.ReadFile(yamlPath) + if err != nil { + t.Fatalf("read yaml: %v", err) + } + rep, err := migrator.Migrate(yamlBytes, migrator.Options{ + BloblangFileResolver: env.resolverFor(filepath.Dir(yamlPath)), + BloblangV2ImportPathRewriter: v5SuffixRewriter, + Verbose: true, + }) + if err != nil { + t.Fatalf("migrate should not fail outright on a partial-failure config: %v", err) + } + + // First processor migrated successfully. + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Errorf("expected first (resolvable) processor migrated:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "../mappings/exists.v5.blobl") { + t.Errorf("expected rewritten import for resolved file:\n%s", rep.OutputYAML) + } + + // Second processor left in place — original V1 should survive. + if strings.Contains(rep.OutputYAML, "bloblang_v2_file:") { + t.Errorf("rule must not emit bloblang_v2_file when from target is unresolved:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "from \"../mappings/missing.blobl\"") { + t.Errorf("expected V1 from body preserved on Unsupported:\n%s", rep.OutputYAML) + } + + // BloblangV2Files holds exactly the resolved file's translation. 
+ if got := len(rep.BloblangV2Files); got != 1 { + t.Errorf("expected exactly 1 entry in BloblangV2Files (the resolved file), got %d: %v", got, v2FileKeysSorted(rep.BloblangV2Files)) + } + existsAbs := filepath.Join(env.mappingsDir, "exists.blobl") + if _, ok := rep.BloblangV2Files[existsAbs]; !ok { + t.Errorf("expected BloblangV2Files to contain %q, got: %v", existsAbs, v2FileKeysSorted(rep.BloblangV2Files)) + } + missingAbs := filepath.Join(env.mappingsDir, "missing.blobl") + if _, ok := rep.BloblangV2Files[missingAbs]; ok { + t.Errorf("BloblangV2Files should NOT contain unresolved %q", missingAbs) + } + + // Coverage / changes: 1 rewritten + 1 unsupported. + if rep.Coverage.Rewritten != 1 { + t.Errorf("expected 1 rewrite, got %+v", rep.Coverage) + } + if rep.Coverage.Unsupported != 1 { + t.Errorf("expected 1 unsupported, got %+v", rep.Coverage) + } +} diff --git a/public/service/migrator/migrator.go b/public/service/migrator/migrator.go new file mode 100644 index 000000000..ded4dc696 --- /dev/null +++ b/public/service/migrator/migrator.go @@ -0,0 +1,138 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// Migrator rewrites Benthos stream configs by replacing one plugin +// instance with another. Construct one with New, register any custom +// rules with RegisterRule, then call Migrate. The Migrator is not safe +// for concurrent registration but is safe for concurrent Migrate +// calls once registration is complete. +// +// The built-in rules — bloblang -> bloblang_v2, mapping -> +// bloblang_v2, mutation -> bloblang_v2 — are always active. Custom +// rules layer on top and shadow built-ins on Target collision. +type Migrator struct { + rules map[Target]Rule +} + +// New creates a Migrator with the built-in plugin migration rules +// registered. Custom rules can be layered on top with RegisterRule. 
+func New() *Migrator { + m := &Migrator{rules: map[Target]Rule{}} + for t, r := range builtInRules() { + m.rules[t] = r + } + return m +} + +// RegisterRule registers a custom rule for the given Target. If a +// rule is already registered for the same Target the new rule +// replaces it (so downstream rules can override the built-ins). +func (m *Migrator) RegisterRule(target Target, rule Rule) { + m.rules[target] = rule +} + +// Migrate rewrites the supplied stream config YAML by applying every +// registered rule whose Target matches a component instance found in +// the config. Returns a *Report on success. +// +// Returns *CoverageError when the resulting Coverage.Ratio falls +// below opts.MinCoverage; the Report is reachable via the error. +func (m *Migrator) Migrate(yamlBytes []byte, opts Options) (*Report, error) { + bm := opts.BloblangMigrator + if bm == nil { + bm = bloblmig.New() + } + + // Hoist the top-level resolver hooks into BloblangOptions so they + // reach the bloblang migrator on every per-component call. + bloblangOpts := opts.BloblangOptions + if opts.BloblangFileResolver != nil { + bloblangOpts.FileResolver = opts.BloblangFileResolver + } + if opts.BloblangV2ImportPathRewriter != nil { + bloblangOpts.V2ImportPathRewriter = opts.BloblangV2ImportPathRewriter + } + + ctx := &Context{ + bloblang: bm, + bloblangOpts: bloblangOpts, + } + + out, changes, err := walk(yamlBytes, m.rules, ctx, opts.Verbose) + if err != nil { + return nil, err + } + + cov := computeCoverage(changes) + rep := &Report{ + OutputYAML: out, + Changes: changes, + Coverage: cov, + BloblangV2Files: aggregateBloblangV2Files(changes), + } + if opts.MinCoverage > 0 && cov.Ratio < opts.MinCoverage { + return nil, &CoverageError{ + Coverage: cov, + Min: opts.MinCoverage, + Report: rep, + } + } + return rep, nil +} + +// aggregateBloblangV2Files unions every component's bloblang +// Report.V2Files into a single map keyed by canonical key. 
Conflicting +// canonical keys (same key produced by different components) keep the +// first content seen — but in practice canonical keys identify +// fully-resolved files so duplicates carry the same content. +func aggregateBloblangV2Files(changes []Change) map[string]string { + var out map[string]string + for _, ch := range changes { + if ch.BloblangReport == nil { + continue + } + for canonical, content := range ch.BloblangReport.V2Files { + if out == nil { + out = map[string]string{} + } + if _, exists := out[canonical]; !exists { + out[canonical] = content + } + } + } + return out +} + +// Migrate is a package-level convenience that builds a default +// Migrator (built-in rules only) and runs it against the supplied +// YAML. Equivalent to `New().Migrate(src, opts)`. +func Migrate(yamlBytes []byte, opts Options) (*Report, error) { + return New().Migrate(yamlBytes, opts) +} + +func computeCoverage(changes []Change) Coverage { + var c Coverage + for _, ch := range changes { + c.Matched++ + switch ch.Outcome { + case OutcomeRewritten: + c.Rewritten++ + case OutcomeSkipped: + c.Skipped++ + case OutcomeUnsupported: + c.Unsupported++ + } + } + denom := c.Rewritten + c.Unsupported + if denom == 0 { + c.Ratio = 1 + } else { + c.Ratio = float64(c.Rewritten) / float64(denom) + } + return c +} diff --git a/public/service/migrator/migrator_test.go b/public/service/migrator/migrator_test.go new file mode 100644 index 000000000..1acc0869c --- /dev/null +++ b/public/service/migrator/migrator_test.go @@ -0,0 +1,654 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator_test + +import ( + "strings" + "testing" + + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" + "github.com/redpanda-data/benthos/v4/public/service/migrator" + + // Register the bundled processors so the schema resolves + // `bloblang`, `mapping`, `mutation` and `bloblang_v2` during + // walking. 
+ _ "github.com/redpanda-data/benthos/v4/public/components/pure" +) + +func TestMigrateBloblangProcessor(t *testing.T) { + in := ` +pipeline: + processors: + - bloblang: | + root.id = this.id +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected bloblang_v2 in output:\n%s", rep.OutputYAML) + } + if strings.Contains(rep.OutputYAML, "bloblang:") { + t.Fatalf("V1 bloblang key leaked into output:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "output.id") { + t.Fatalf("expected V2 output.id rewrite, got:\n%s", rep.OutputYAML) + } + if rep.Coverage.Rewritten != 1 || rep.Coverage.Matched != 1 { + t.Fatalf("unexpected coverage: %+v", rep.Coverage) + } + if len(rep.Changes) != 1 { + t.Fatalf("expected 1 change, got %d: %+v", len(rep.Changes), rep.Changes) + } + if rep.Changes[0].Outcome != migrator.OutcomeRewritten { + t.Fatalf("expected rewritten outcome, got %v", rep.Changes[0].Outcome) + } + if rep.Changes[0].NewName != "bloblang_v2" { + t.Fatalf("expected NewName bloblang_v2, got %q", rep.Changes[0].NewName) + } + if rep.Changes[0].BloblangReport == nil { + t.Fatalf("expected attached BloblangReport on change") + } +} + +func TestMigrateMappingProcessor(t *testing.T) { + in := ` +pipeline: + processors: + - mapping: | + root.id = this.id +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected bloblang_v2 in output:\n%s", rep.OutputYAML) + } + if strings.Contains(rep.OutputYAML, "mapping:") { + t.Fatalf("V1 mapping key leaked into output:\n%s", rep.OutputYAML) + } +} + +func TestMigrateMutationProcessor(t *testing.T) { + in := ` +pipeline: + processors: + - mutation: | + root.id = this.id +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + 
t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected bloblang_v2 in output:\n%s", rep.OutputYAML) + } + if strings.Contains(rep.OutputYAML, "mutation:") { + t.Fatalf("V1 mutation key leaked into output:\n%s", rep.OutputYAML) + } +} + +func TestMigratePreservesLabel(t *testing.T) { + in := ` +pipeline: + processors: + - label: my_proc + bloblang: | + root = this +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "label: my_proc") { + t.Fatalf("expected label preserved:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected bloblang_v2 alongside label:\n%s", rep.OutputYAML) + } + if rep.Changes[0].Label != "my_proc" { + t.Fatalf("expected label captured on change, got %q", rep.Changes[0].Label) + } +} + +func TestMigrateMultipleProcessors(t *testing.T) { + in := ` +pipeline: + processors: + - bloblang: 'root.a = this.a' + - mutation: 'root.b = this.b' + - mapping: 'root.c = this.c' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if got := strings.Count(rep.OutputYAML, "bloblang_v2:"); got != 3 { + t.Fatalf("expected 3 bloblang_v2 keys, got %d in:\n%s", got, rep.OutputYAML) + } + if rep.Coverage.Rewritten != 3 { + t.Fatalf("expected 3 rewritten, got %+v", rep.Coverage) + } +} + +func TestMigrateNestedInsideSwitch(t *testing.T) { + in := ` +pipeline: + processors: + - switch: + - check: this.kind == "user" + processors: + - mapping: | + root.id = this.user_id + - processors: + - mutation: | + root.fallback = true +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if got := strings.Count(rep.OutputYAML, "bloblang_v2:"); got != 2 { + t.Fatalf("expected 2 bloblang_v2 keys, got %d in:\n%s", got, rep.OutputYAML) + } +} + 
+func TestMigrateNoOpWhenNoMatch(t *testing.T) { + in := ` +pipeline: + processors: + - log: + message: hello +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if len(rep.Changes) != 0 { + t.Fatalf("expected no changes, got %+v", rep.Changes) + } + if rep.Coverage.Matched != 0 { + t.Fatalf("expected no matches, got %+v", rep.Coverage) + } +} + +func TestRegisterCustomRuleOverridesBuiltin(t *testing.T) { + mig := migrator.New() + mig.RegisterRule(migrator.Target{ComponentType: "processor", Name: "bloblang"}, + func(ctx *migrator.Context, c *migrator.Component) migrator.Result { + return ctx.Unsupported("custom rule says no") + }) + + in := ` +pipeline: + processors: + - bloblang: 'root = this' +` + rep, err := mig.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang:") { + t.Fatalf("expected V1 bloblang preserved (rule was unsupported):\n%s", rep.OutputYAML) + } + if strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("custom rule should have blocked the rewrite:\n%s", rep.OutputYAML) + } + if len(rep.Changes) != 1 || rep.Changes[0].Outcome != migrator.OutcomeUnsupported { + t.Fatalf("expected one unsupported change, got %+v", rep.Changes) + } + if rep.Coverage.Unsupported != 1 || rep.Coverage.Ratio != 0 { + t.Fatalf("unexpected coverage: %+v", rep.Coverage) + } +} + +func TestRegisterCustomRuleNewTarget(t *testing.T) { + mig := migrator.New() + mig.RegisterRule(migrator.Target{ComponentType: "processor", Name: "log"}, + func(ctx *migrator.Context, c *migrator.Component) migrator.Result { + return ctx.ReplaceStructured("log", map[string]any{"message": "rewritten"}) + }) + + in := ` +pipeline: + processors: + - log: + message: hello +` + rep, err := mig.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "rewritten") { + 
t.Fatalf("expected rewritten message, got:\n%s", rep.OutputYAML) + } +} + +func TestMinCoverageGate(t *testing.T) { + mig := migrator.New() + mig.RegisterRule(migrator.Target{ComponentType: "processor", Name: "bloblang"}, + func(ctx *migrator.Context, c *migrator.Component) migrator.Result { + return ctx.Unsupported("nope") + }) + + in := ` +pipeline: + processors: + - bloblang: 'root = this' +` + _, err := mig.Migrate([]byte(in), migrator.Options{MinCoverage: 0.5}) + if err == nil { + t.Fatalf("expected coverage error") + } + cerr, ok := err.(*migrator.CoverageError) + if !ok { + t.Fatalf("expected *CoverageError, got %T: %v", err, err) + } + if cerr.Report == nil { + t.Fatalf("CoverageError should expose the Report") + } +} + +// TestModeMappingPrependsOutputInput verifies that the `mapping` +// processor is migrated using ModeMapping — the bloblang translator +// prepends `output = input` so unwritten fields pass through, matching +// V1 mapping semantics. +func TestModeMappingPrependsOutputInput(t *testing.T) { + in := ` +pipeline: + processors: + - mapping: 'root.id = this.id' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "output = input") { + t.Fatalf("ModeMapping should prepend `output = input`, got:\n%s", rep.OutputYAML) + } +} + +// TestModeMappingProcessorBloblangAlsoPrepends — the `bloblang` +// processor shares semantics with `mapping`, so it must also use +// ModeMapping. 
+func TestModeMappingProcessorBloblangAlsoPrepends(t *testing.T) { + in := ` +pipeline: + processors: + - bloblang: 'root.id = this.id' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "output = input") { + t.Fatalf("bloblang processor should use ModeMapping (prepend `output = input`), got:\n%s", rep.OutputYAML) + } +} + +// TestModeMutationDoesNotPrepend verifies that the `mutation` +// processor uses ModeMutation — V2's empty `output` aligns with V1's +// `mutation` semantics, so no prelude should be inserted. +func TestModeMutationDoesNotPrepend(t *testing.T) { + in := ` +pipeline: + processors: + - mutation: 'root.id = this.id' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if strings.Contains(rep.OutputYAML, "output = input") { + t.Fatalf("ModeMutation should NOT prepend `output = input`, got:\n%s", rep.OutputYAML) + } +} + +// TestMigrateInsideInputProcessors verifies the walker descends into +// input.processors. It also asserts that the `generate` input's +// `mapping` field — a Bloblang STRING field, not a plugin instance — +// is left untouched (string-field migration is out of scope for this +// component-level migrator). 
+func TestMigrateInsideInputProcessors(t *testing.T) { + in := ` +input: + generate: + mapping: 'root = "hello"' + interval: 1s + processors: + - mapping: 'root.upper = this.uppercase()' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected input.processors mapping to be migrated:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, `mapping: 'root = "hello"'`) && + !strings.Contains(rep.OutputYAML, `mapping: root = "hello"`) { + t.Fatalf("generate.mapping (string field) should be untouched, got:\n%s", rep.OutputYAML) + } + if rep.Coverage.Rewritten != 1 { + t.Fatalf("expected exactly 1 rewrite (only the processor, not the string field), got %+v", rep.Coverage) + } +} + +// TestMigrateInsideOutputProcessors verifies the walker descends into +// output.processors. +func TestMigrateInsideOutputProcessors(t *testing.T) { + in := ` +output: + drop: {} + processors: + - mutation: 'root.flag = true' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected output.processors mutation to be migrated:\n%s", rep.OutputYAML) + } +} + +// TestMigrateInsideBranchProcessors verifies nested processors inside +// a `branch` processor get migrated, and that branch.request_map / +// result_map (Bloblang STRING fields) are left untouched. 
+func TestMigrateInsideBranchProcessors(t *testing.T) { + in := ` +pipeline: + processors: + - branch: + request_map: 'root = this.payload' + processors: + - mapping: 'root.id = this.id' + result_map: 'root.enriched = this' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected branch.processors mapping to be migrated:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "request_map") || !strings.Contains(rep.OutputYAML, "result_map") { + t.Fatalf("branch string fields should be preserved:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "root = this.payload") { + t.Fatalf("branch.request_map (string field) should be untouched, got:\n%s", rep.OutputYAML) + } + if rep.Coverage.Rewritten != 1 { + t.Fatalf("expected exactly 1 rewrite (only the nested processor), got %+v", rep.Coverage) + } +} + +// TestMigrateResourcesFile verifies the walker handles top-level +// resource definitions (cache_resources, processor_resources, etc.) — +// not just the stream pipeline. +func TestMigrateResourcesFile(t *testing.T) { + in := ` +processor_resources: + - label: my_resource + bloblang: 'root.x = this.x' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if !strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("expected processor_resources entry to be migrated:\n%s", rep.OutputYAML) + } + if !strings.Contains(rep.OutputYAML, "label: my_resource") { + t.Fatalf("expected resource label preserved:\n%s", rep.OutputYAML) + } + if rep.Coverage.Rewritten != 1 { + t.Fatalf("expected 1 rewrite, got %+v", rep.Coverage) + } +} + +// TestMigratePreservesComments verifies that YAML comments adjacent +// to a migrated component survive the rewrite. This is a load-bearing +// guarantee for users running the migrator on hand-curated configs. 
+func TestMigratePreservesComments(t *testing.T) { + in := ` +pipeline: + processors: + # head comment on the processor + - bloblang: 'root.id = this.id' # inline comment + # comment between processors + - mutation: 'root.flag = true' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + for _, want := range []string{ + "head comment on the processor", + "inline comment", + "comment between processors", + } { + if !strings.Contains(rep.OutputYAML, want) { + t.Fatalf("comment %q lost in migration:\n%s", want, rep.OutputYAML) + } + } +} + +// TestInvalidBloblangBodyMarkedUnsupported verifies that a syntax +// error in the V1 mapping body produces an Unsupported Change rather +// than failing the whole Migrate call. The original component is left +// untouched so the rewritten YAML remains valid for the user to fix. +func TestInvalidBloblangBodyMarkedUnsupported(t *testing.T) { + in := ` +pipeline: + processors: + - bloblang: '@@@ not valid bloblang @@@' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate should not fail outright on bad body: %v", err) + } + if len(rep.Changes) != 1 { + t.Fatalf("expected 1 change, got %d: %+v", len(rep.Changes), rep.Changes) + } + if rep.Changes[0].Outcome != migrator.OutcomeUnsupported { + t.Fatalf("expected OutcomeUnsupported, got %v", rep.Changes[0].Outcome) + } + if rep.Changes[0].Severity != migrator.SeverityError { + t.Fatalf("expected SeverityError, got %v", rep.Changes[0].Severity) + } + if !strings.Contains(rep.OutputYAML, "bloblang:") { + t.Fatalf("V1 plugin should be left in place on Unsupported:\n%s", rep.OutputYAML) + } + if strings.Contains(rep.OutputYAML, "bloblang_v2:") { + t.Fatalf("V1 plugin should NOT be rewritten on Unsupported:\n%s", rep.OutputYAML) + } +} + +// TestVerboseEmitsSkipChange verifies that a Skip(reason) result is +// silent in the default report and emitted as an info Change in +// 
verbose mode. +func TestVerboseEmitsSkipChange(t *testing.T) { + rule := func(ctx *migrator.Context, c *migrator.Component) migrator.Result { + return ctx.Skip("just because") + } + + build := func() *migrator.Migrator { + mig := migrator.New() + mig.RegisterRule(migrator.Target{ComponentType: "processor", Name: "bloblang"}, rule) + return mig + } + + in := ` +pipeline: + processors: + - bloblang: 'root = this' +` + + quiet, err := build().Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("quiet migrate: %v", err) + } + if len(quiet.Changes) != 0 { + t.Fatalf("non-verbose Skip should not emit Changes, got %+v", quiet.Changes) + } + if quiet.Coverage.Skipped != 0 { + t.Fatalf("non-verbose Skip should not count toward Coverage.Skipped, got %+v", quiet.Coverage) + } + + loud, err := build().Migrate([]byte(in), migrator.Options{Verbose: true}) + if err != nil { + t.Fatalf("verbose migrate: %v", err) + } + if len(loud.Changes) != 1 { + t.Fatalf("verbose Skip should emit one Change, got %+v", loud.Changes) + } + if loud.Changes[0].Outcome != migrator.OutcomeSkipped { + t.Fatalf("expected OutcomeSkipped, got %v", loud.Changes[0].Outcome) + } + if loud.Changes[0].Reason != "just because" { + t.Fatalf("Skip reason lost, got %q", loud.Changes[0].Reason) + } + if loud.Changes[0].Severity != migrator.SeverityInfo { + t.Fatalf("Skip should be SeverityInfo, got %v", loud.Changes[0].Severity) + } +} + +// TestInvalidYAMLReturnsError verifies that malformed YAML is +// rejected with an error rather than panicking or silently producing +// empty output. 
+func TestInvalidYAMLReturnsError(t *testing.T) { + _, err := migrator.Migrate([]byte("not: valid: yaml: ::\n - oops"), migrator.Options{}) + if err == nil { + t.Fatalf("expected error for invalid YAML") + } +} + +// TestZeroResultLeavesComponentUntouched verifies the defensive path +// in buildChange: if a custom rule returns the zero Result (no kind +// set), the component is left untouched and no Change is recorded. +func TestZeroResultLeavesComponentUntouched(t *testing.T) { + mig := migrator.New() + mig.RegisterRule( + migrator.Target{ComponentType: "processor", Name: "log"}, + func(ctx *migrator.Context, c *migrator.Component) migrator.Result { + return migrator.Result{} + }, + ) + + in := ` +pipeline: + processors: + - log: + message: hello +` + rep, err := mig.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if len(rep.Changes) != 0 { + t.Fatalf("zero Result should not emit Changes, got %+v", rep.Changes) + } + if !strings.Contains(rep.OutputYAML, "message: hello") { + t.Fatalf("config should be unchanged:\n%s", rep.OutputYAML) + } +} + +// TestUnchangedConfigWhenNoRulesMatch verifies the byte stability of +// configs that have nothing to migrate. A round-trip through the +// migrator should yield a YAML document equivalent to the input. 
+func TestUnchangedConfigWhenNoRulesMatch(t *testing.T) { + in := ` +input: + generate: + mapping: 'root = "hello"' + interval: 1s +output: + drop: {} +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if len(rep.Changes) != 0 { + t.Fatalf("expected zero Changes, got %+v", rep.Changes) + } + for _, want := range []string{ + `mapping: 'root = "hello"'`, + "interval: 1s", + "drop: {}", + } { + if !strings.Contains(rep.OutputYAML, want) { + t.Fatalf("expected %q in unchanged output, got:\n%s", want, rep.OutputYAML) + } + } +} + +// TestBloblangMigratorOptionThreadedThrough verifies that a custom +// *bloblmig.Migrator supplied via Options.BloblangMigrator is the one +// that translates embedded mapping bodies, confirming the built-in +// rules consult the per-call sub-migrator rather than constructing +// their own. +func TestBloblangMigratorOptionThreadedThrough(t *testing.T) { + bloblMig := bloblmig.New() + var fired int + bloblMig.RegisterMethodRule("widget_encode", func(ctx *bloblmig.Context, m *bloblmig.V1MethodCall) bloblmig.Result { + fired++ + return ctx.Replace(&bloblmig.V2MethodCallExpr{ + Receiver: ctx.Translate(m.Receiver), + Method: "widget_encode_v2", + }) + }) + + in := ` +pipeline: + processors: + - bloblang: 'root.encoded = this.payload.widget_encode()' +` + rep, err := migrator.Migrate([]byte(in), migrator.Options{BloblangMigrator: bloblMig}) + if err != nil { + t.Fatalf("migrate: %v", err) + } + if fired != 1 { + t.Fatalf("expected custom bloblang rule to fire exactly once, got %d", fired) + } + if !strings.Contains(rep.OutputYAML, "widget_encode_v2") { + t.Fatalf("expected V2 rewrite from custom rule, got:\n%s", rep.OutputYAML) + } +} + +// TestComponentAccessors covers Component accessors used by +// custom rules: BodyString for scalar bodies, BodyAny for structured.
+func TestComponentAccessors(t *testing.T) { + mig := migrator.New() + var sawBodyString, sawBodyAny bool + mig.RegisterRule( + migrator.Target{ComponentType: "processor", Name: "log"}, + func(ctx *migrator.Context, c *migrator.Component) migrator.Result { + if _, ok := c.BodyString(); ok { + sawBodyString = true + } + if v, err := c.BodyAny(); err == nil && v != nil { + sawBodyAny = true + } + return ctx.Skip("introspect only") + }, + ) + in := ` +pipeline: + processors: + - log: + message: hello +` + if _, err := mig.Migrate([]byte(in), migrator.Options{Verbose: true}); err != nil { + t.Fatalf("migrate: %v", err) + } + if sawBodyString { + t.Fatalf("log body is structured; BodyString should not have reported ok") + } + if !sawBodyAny { + t.Fatalf("BodyAny should have decoded the structured body") + } +} diff --git a/public/service/migrator/options.go b/public/service/migrator/options.go new file mode 100644 index 000000000..e964e7e69 --- /dev/null +++ b/public/service/migrator/options.go @@ -0,0 +1,58 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// Options controls a single Migrate call. Per-instance configuration +// (registered rules) lives on the Migrator; per-call configuration +// (verbosity, coverage threshold, embedded-bloblang migrator) lives +// here. +type Options struct { + // BloblangMigrator is the Bloblang V1->V2 migrator the built-in + // processor rules (and any custom rules that consult + // Context.Bloblang) thread embedded mapping bodies through. If nil + // a fresh migrator with built-in rules only is used. Supply a + // custom one to register plugin-specific Bloblang method/function + // rules ahead of the call. + BloblangMigrator *bloblmig.Migrator + + // BloblangOptions is forwarded to the Bloblang V1->V2 migrator on + // each call. 
The Mode field is overridden per built-in rule + // (ModeMapping for `bloblang`/`mapping`, ModeMutation for + // `mutation`); other fields (Verbose, MinCoverage, Files, + // TreatWarningsAsErrors) pass through unchanged. + // + // Note: BloblangFileResolver and BloblangV2ImportPathRewriter + // below are forwarded into BloblangOptions on each call. They are + // hoisted to the top level of Options because they are the typical + // hooks a CLI caller wants to set; setting them directly on + // BloblangOptions also works but is less discoverable. + BloblangOptions bloblmig.Options + + // BloblangFileResolver is forwarded to BloblangOptions.FileResolver + // for every component the migrator translates. This is the single + // hook a caller needs to enable transitive import migration — + // path discovery, the closure walk, translation and emission all + // happen inside the bloblang migrator. See bloblmig.FileResolver + // for the contract. + BloblangFileResolver bloblmig.FileResolver + + // BloblangV2ImportPathRewriter is forwarded to + // BloblangOptions.V2ImportPathRewriter for every component. See + // bloblmig.V2ImportPathRewriter for the contract. + BloblangV2ImportPathRewriter bloblmig.V2ImportPathRewriter + + // MinCoverage is the minimum aggregate coverage ratio required + // across all migrated plugin instances before Migrate returns + // successfully. The ratio is computed as (Rewritten) / + // (Rewritten + Unsupported); plugins skipped or untouched do not + // affect it. Default 0 (no gate). + MinCoverage float64 + + // Verbose emits Info-severity Changes (e.g. Skip notes). Without + // it, only Warning and Error Changes are recorded. + Verbose bool +} diff --git a/public/service/migrator/report.go b/public/service/migrator/report.go new file mode 100644 index 000000000..ce99a30f7 --- /dev/null +++ b/public/service/migrator/report.go @@ -0,0 +1,144 @@ +// Copyright 2026 Redpanda Data, Inc. 
+ +package migrator + +import ( + "fmt" + + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// Severity classifies a Change record. Info means the rewrite was +// purely mechanical; Warning flags a divergence the user should +// audit; Error signals an Unsupported plugin that produced no +// equivalent output (the plugin is left untouched). +type Severity int + +// Severity values. +const ( + SeverityInfo Severity = iota + SeverityWarning + SeverityError +) + +// String satisfies fmt.Stringer. +func (s Severity) String() string { + switch s { + case SeverityInfo: + return "info" + case SeverityWarning: + return "warning" + case SeverityError: + return "error" + } + return fmt.Sprintf("severity(%d)", s) +} + +// Outcome classifies the disposition of a single matched component. +type Outcome int + +// Outcome values. +const ( + // OutcomeRewritten — a rule matched and replaced the plugin. + OutcomeRewritten Outcome = iota + // OutcomeSkipped — a rule matched but declined to rewrite. + OutcomeSkipped + // OutcomeUnsupported — a rule matched but flagged the plugin as + // untranslatable; the plugin is left in place. + OutcomeUnsupported +) + +// String satisfies fmt.Stringer. +func (o Outcome) String() string { + switch o { + case OutcomeRewritten: + return "rewritten" + case OutcomeSkipped: + return "skipped" + case OutcomeUnsupported: + return "unsupported" + } + return fmt.Sprintf("outcome(%d)", o) +} + +// Change records the disposition of one matched component. +type Change struct { + // Target identifies the (ComponentType, Name) the rule was + // registered against. + Target Target + // Path is the dotted location of the component within the config. + Path string + // Label, if non-empty, is the YAML `label` of the component. + Label string + // LineStart, LineEnd is the 1-indexed line span of the component + // in the source YAML. + LineStart, LineEnd int + // Outcome is the disposition of the component. 
+ Outcome Outcome + // Severity classifies the Change for filtering / CI gating. + Severity Severity + // NewName is the plugin name the rule rewrote the component into, + // or "" if the component was not rewritten. + NewName string + // Reason carries the explanation supplied by Skip / Unsupported, + // or a short summary of the rewrite. + Reason string + // BloblangReport, if non-nil, is the V1->V2 translation report + // for the embedded Bloblang body that was rewritten. Inspect it + // for per-mapping coverage and warnings. + BloblangReport *bloblmig.Report +} + +// Coverage summarises the migrator's progress over the input config. +// Only matched components are counted; components without a registered +// rule are ignored. +type Coverage struct { + // Matched is the number of components for which a rule fired. + Matched int + // Rewritten is the number of components with OutcomeRewritten. + Rewritten int + // Skipped is the number of components with OutcomeSkipped. + Skipped int + // Unsupported is the number of components with OutcomeUnsupported. + Unsupported int + // Ratio is Rewritten / (Rewritten + Unsupported), or 1 when there + // are no Rewritten or Unsupported components. + Ratio float64 +} + +// Report is the result of a successful Migrate call. +type Report struct { + // OutputYAML is the rewritten config. When no rule fires, this + // equals the input. + OutputYAML string + // Changes records every component a rule fired against, in the + // order the migrator visited them. + Changes []Change + // Coverage aggregates Changes into a coverage ratio. + Coverage Coverage + // BloblangV2Files is the union of every component's bloblang + // Report.V2Files, keyed by canonical key. Populated when callers + // supply a BloblangFileResolver and any migrated mapping body + // contained imports. The caller is expected to write each entry to + // disk (typically with BloblangV2ImportPathRewriter applied to the + // canonical key to derive the on-disk path). 
+ BloblangV2Files map[string]string +} + +// CoverageError is returned by Migrate when the resulting +// Coverage.Ratio falls below Options.MinCoverage. The Report is +// reachable through the error. +type CoverageError struct { + Coverage Coverage + Min float64 + Report *Report +} + +// Error satisfies the error interface. +func (e *CoverageError) Error() string { + return fmt.Sprintf( + "migrator: coverage %.2f is below threshold %.2f (rewritten=%d unsupported=%d skipped=%d)", + e.Coverage.Ratio, e.Min, + e.Coverage.Rewritten, e.Coverage.Unsupported, e.Coverage.Skipped, + ) +} diff --git a/public/service/migrator/rule.go b/public/service/migrator/rule.go new file mode 100644 index 000000000..5441225fd --- /dev/null +++ b/public/service/migrator/rule.go @@ -0,0 +1,61 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + bloblmig "github.com/redpanda-data/benthos/v4/public/bloblangv2/migrator" +) + +// Target identifies the plugin a Rule applies to. ComponentType is the +// core component family (e.g. "processor", "input", "output", "cache", +// "buffer", "rate_limit", "metrics", "tracer", "scanner") and Name is +// the plugin name registered for that type (e.g. "bloblang", +// "mutation", "mapping"). +type Target struct { + ComponentType string + Name string +} + +// Rule is the callback shape for a custom plugin migration. Rules are +// registered with Migrator.RegisterRule, keyed by Target. The callback +// receives a Context (helpers + Result constructors) and a Component +// describing the matched plugin instance, and returns a Result +// describing the outcome. +// +// Custom rules win on collision with the built-ins (the downstream +// rule fully replaces the built-in for that Target). +type Rule func(ctx *Context, c *Component) Result + +// resultKind is the discriminant for Result. +type resultKind int + +const ( + resultUnset resultKind = iota + resultReplace + resultSkip + resultUnsupported +) + +// Result is the outcome of a Rule. 
Construct via Context.Replace, +// Context.Skip, or Context.Unsupported — the zero value is invalid. +type Result struct { + kind resultKind + + // replacement holds the new plugin name and body for resultReplace. + replacement replacement + + // reason carries the explanation for resultSkip / resultUnsupported. + reason string +} + +// replacement is the payload of a resultReplace Result. +type replacement struct { + name string + body any // string for scalar bodies, structured Go value otherwise. + + // bloblangReport, when non-nil, is the report produced by the + // bundled Bloblang V1->V2 translation that produced this body. The + // outer migrator surfaces it on the per-component Change record so + // callers can inspect mapping-level coverage and warnings. + bloblangReport *bloblmig.Report +} diff --git a/public/service/migrator/walker.go b/public/service/migrator/walker.go new file mode 100644 index 000000000..1aba455c1 --- /dev/null +++ b/public/service/migrator/walker.go @@ -0,0 +1,176 @@ +// Copyright 2026 Redpanda Data, Inc. + +package migrator + +import ( + "errors" + "fmt" + + "gopkg.in/yaml.v3" + + "github.com/redpanda-data/benthos/v4/internal/bundle" + "github.com/redpanda-data/benthos/v4/internal/config" + "github.com/redpanda-data/benthos/v4/internal/docs" +) + +// walk parses the input YAML, traverses every plugin instance in the +// stream config, applies any matching rule, and returns the rewritten +// document plus the per-component changes. The input bytes are not +// mutated; the returned tree is a fresh allocation. 
+func walk(yamlBytes []byte, rules map[Target]Rule, ctx *Context, verbose bool) (string, []Change, error) { + root, err := docs.UnmarshalYAML(yamlBytes) + if err != nil { + return "", nil, fmt.Errorf("parse config: %w", err) + } + + spec := config.Spec() + provider := bundle.GlobalEnvironment + + var changes []Change + + walkConf := docs.WalkComponentConfig{ + Provider: provider, + Func: func(wc docs.WalkedComponent) error { + coreType, ok := wc.Field.Type.IsCoreComponent() + if !ok { + return nil + } + target := Target{ + ComponentType: string(coreType), + Name: wc.Name, + } + rule, found := rules[target] + if !found { + return nil + } + + container, ok := wc.Value.(*yaml.Node) + if !ok { + return nil + } + + comp := &Component{ + Type: target.ComponentType, + Name: target.Name, + Path: wc.Path, + Label: wc.Label, + LineStart: wc.LineStart, + LineEnd: wc.LineEnd, + container: container, + } + res := rule(ctx, comp) + ch := buildChange(target, comp, res, verbose) + if ch != nil { + changes = append(changes, *ch) + } + if res.kind == resultReplace { + if err := applyReplacement(container, target.Name, res.replacement); err != nil { + return fmt.Errorf("%s/%s: %w", target.ComponentType, target.Name, err) + } + } + return nil + }, + } + + if err := spec.WalkComponentsYAML(walkConf, root); err != nil { + return "", nil, err + } + + out, err := docs.MarshalYAML(*root) + if err != nil { + return "", nil, fmt.Errorf("marshal config: %w", err) + } + return string(out), changes, nil +} + +// applyReplacement mutates the container mapping node in place, +// renaming the plugin's key and replacing its value with the supplied +// body. The body may be a string (assigned to the existing scalar +// node, preserving its style) or an arbitrary Go value (encoded into +// a fresh yaml.Node). 
+func applyReplacement(container *yaml.Node, oldName string, r replacement) error { + if container == nil || container.Kind != yaml.MappingNode { + return errors.New("container is not a mapping node") + } + for i := 0; i+1 < len(container.Content); i += 2 { + if container.Content[i].Value != oldName { + continue + } + container.Content[i].Value = r.name + valueNode := container.Content[i+1] + switch body := r.body.(type) { + case string: + if valueNode.Kind != yaml.ScalarNode { + newScalar := &yaml.Node{Kind: yaml.ScalarNode, Value: body} + preserveScalarStyle(newScalar, body) + container.Content[i+1] = newScalar + return nil + } + valueNode.Value = body + valueNode.Tag = "" + preserveScalarStyle(valueNode, body) + return nil + default: + var encoded yaml.Node + if err := encoded.Encode(body); err != nil { + return fmt.Errorf("encode replacement body: %w", err) + } + container.Content[i+1] = &encoded + return nil + } + } + return fmt.Errorf("plugin %q not found in container", oldName) +} + +// preserveScalarStyle picks a sensible scalar style for a string body. +// Multi-line bodies render best as literal-block scalars (`|`), single +// lines fall back to whatever yaml.v3 chooses (usually plain or +// double-quoted depending on content). +func preserveScalarStyle(n *yaml.Node, body string) { + for _, r := range body { + if r == '\n' { + n.Style = yaml.LiteralStyle + return + } + } + n.Style = 0 +} + +// buildChange materialises the Change record for a rule outcome. +// Returns nil for Skip results when verbose is false, since silent +// skips are noise in non-verbose reports. 
+func buildChange(target Target, c *Component, res Result, verbose bool) *Change { + ch := &Change{ + Target: target, + Path: c.Path, + Label: c.Label, + LineStart: c.LineStart, + LineEnd: c.LineEnd, + } + switch res.kind { + case resultReplace: + ch.Outcome = OutcomeRewritten + ch.Severity = SeverityInfo + ch.NewName = res.replacement.name + ch.Reason = fmt.Sprintf("rewrote %s/%s -> %s", target.ComponentType, target.Name, res.replacement.name) + ch.BloblangReport = res.replacement.bloblangReport + return ch + case resultUnsupported: + ch.Outcome = OutcomeUnsupported + ch.Severity = SeverityError + ch.Reason = res.reason + return ch + case resultSkip: + if !verbose && res.reason == "" { + return nil + } + ch.Outcome = OutcomeSkipped + ch.Severity = SeverityInfo + ch.Reason = res.reason + if !verbose { + return nil + } + return ch + } + return nil +} From 7bc0e52725137cd1880495c7c81ddaec25976bec Mon Sep 17 00:00:00 2001 From: Ashley Jeffs Date: Fri, 24 Apr 2026 20:23:07 +0100 Subject: [PATCH 20/20] bloblang(v2): Update go.mod and split unit tests into fast/full Adds the google/uuid dependency required by the V2 standard library (uuid_v5). Restructures taskfiles/test.yml so the default unit and unit-race tasks pass -short and skip long-running corpus and benchmark tests, keeping the per-PR loop under a minute. Adds a new unit-full task (alias ut-full) with a longer timeout that runs the full suite, including the migrator corpus, fuzz seeds, and benchmark smoke tests. 
--- go.mod | 1 + taskfiles/test.yml | 17 +++++++++++++---- 2 files changed, 14 insertions(+), 4 deletions(-) diff --git a/go.mod b/go.mod index 0f0367d59..027e5f6b1 100644 --- a/go.mod +++ b/go.mod @@ -13,6 +13,7 @@ require ( github.com/fatih/color v1.19.0 github.com/fsnotify/fsnotify v1.10.1 github.com/golang-jwt/jwt/v5 v5.3.1 + github.com/google/uuid v1.6.0 github.com/gorilla/handlers v1.5.2 github.com/gorilla/mux v1.8.1 github.com/gorilla/websocket v1.5.3 diff --git a/taskfiles/test.yml b/taskfiles/test.yml index 359a22146..9f4da8494 100644 --- a/taskfiles/test.yml +++ b/taskfiles/test.yml @@ -2,22 +2,31 @@ version: '3' tasks: unit: - desc: Run unit tests + desc: Run unit tests (fast — skips tests marked as long-running via testing.Short) aliases: - ut vars: TIMEOUT: '{{if .CI}}3m{{else}}1m{{end}}' cmds: - - go test {{.GO_FLAGS}} -ldflags "{{.LD_FLAGS}}" -timeout {{.TIMEOUT}} -shuffle=on ./... + - go test {{.GO_FLAGS}} -ldflags "{{.LD_FLAGS}}" -timeout {{.TIMEOUT}} -short -shuffle=on ./... unit-race: - desc: Run unit tests with race detection + desc: Run unit tests with race detection (fast — skips long-running tests) aliases: - ut-race vars: TIMEOUT: '{{if .CI}}3m{{else}}1m{{end}}' cmds: - - go test {{.GO_FLAGS}} -ldflags "{{.LD_FLAGS}}" -timeout {{.TIMEOUT}} -shuffle=on -race ./... + - go test {{.GO_FLAGS}} -ldflags "{{.LD_FLAGS}}" -timeout {{.TIMEOUT}} -short -shuffle=on -race ./... + + unit-full: + desc: Run unit tests including long-running corpus and benchmark tests + aliases: + - ut-full + vars: + TIMEOUT: '{{if .CI}}15m{{else}}10m{{end}}' + cmds: + - go test {{.GO_FLAGS}} -ldflags "{{.LD_FLAGS}}" -timeout {{.TIMEOUT}} -shuffle=on ./... template: desc: Run template tests