Corpus parity: formatter-enum expansion, raw:false flag, try/catch for binary chains by kevinelliott · Pull Request #6 · airframesio/acars-decoder

kevinelliott · 2026-06-10T06:02:41Z

Summary

The new TS corpus runner (acars-decoder-typescript@031f096) deep-equals 289 production-extracted samples against the generated decoder. First run failed 41 — all real parity gaps the unit tests never caught. This PR fixes every one at the spec/emitter level; the corpus now passes 289/289 byte-for-byte.

Three root-cause classes

1. Formatter enum too narrow (10 samples, 44-family)

The v1 enum forced specs to express month/day/eta/off/on/in/fuel-remaining as generic timestamp/fuel items, flattening distinct legacy items (MSG_MON, MSG_DAY, ON/"Landing Time", ' FUEL_REM') into TIMESTAMP/FOB.

Schema + IR: formatter_call.type adds eta/out/off/on/in/day/month/departure_day/arrival_day/fuel_remaining/remaining_fields
TS emitter: methodMap additions + remaining_fields emission (if (data.length > N) ResultFormatter.unknownArr(result, data.slice(N)) — closes the bulk-port agents' documented addRemainingFields caveat)
Specs 44/{ON,IN,OFF,ETA,POS} updated to the precise types; 44/POS drops its fuel item (legacy sets raw.fuel_in_tons with no formatted item)

2. Success-conditional descriptions (30 samples: arinc_702, 16/AUTPOS, H1/Paren)

Legacy defaultResult()-style plugins leave description: "Unknown" on failed decodes and set the real description only on success. Specs now declare "Unknown"; the hatches' success paths already set the real description (verified).

3. Uncaught throw on malformed binary input (1 sample, H1/OHMA)

Legacy wrapped base64→inflate chains in try/catch → graceful fail. The TS emitter now wraps the decode body in try/catch (→ failUnknown) whenever parse steps include a binary kind. Rust/C unaffected (their helpers return empty buffers rather than throwing).

Plus: field-level `raw: false` flag

Intermediate fields (e.g. 44/POS's flight_level_raw) that feed later fields but were never stored in raw by the legacy plugin can now suppress the auto-raw emit.

Runtime (TS): `remainingFuel` tolerates undefined/NaN

Mirrors the legacy parseFuel isNaN guard and the currentFuel precedent.

Verification

All 68 specs validate
acars-decoder-typescript regenerated against this branch: 696 tests pass (407 unit + 289 corpus), 0 failures
Follow-up: mirror the methodMap additions into emit-rust / emit-c (their generated trees aren't dispatcher-wired yet, so no behavioral impact today)

Why this matters

This is the corpus doing exactly its job: the 407 unit tests passed throughout while 41 samples diverged. Byte-parity claims for Rust/C in Stage 3 now have a proven, executable definition.

🤖 Generated with Claude Code

… success-conditional descriptions The new TS corpus runner (acars-decoder-typescript, 289 samples) failed 41 samples on first run, splitting into three root-cause classes. This commit fixes all three at the spec/emitter level: Class B — formatter enum too narrow (10 samples, 44-family): The v1 enum forced specs to map month/day/eta/off/on/in/fuel-remaining through generic `timestamp`/`fuel`, flattening distinct legacy items (MSG_MON, MSG_DAY, ON "Landing Time", ' FUEL_REM') into TIMESTAMP/FOB items. The unit tests never caught it; the corpus deep-equal did. - schema: formatter_call.type adds eta/out/off/on/in/day/month/ departure_day/arrival_day/fuel_remaining/remaining_fields - ir.ts: FormatterCall union extended to match - emit-typescript: methodMap additions + `remaining_fields` emission (`if (data.length > N) ResultFormatter.unknownArr(result, data.slice(N))` — the Label_44_Base.addRemainingFields pattern, closing the bulk-port agent's documented trailing-fields caveat) - specs 44/{ON,IN,OFF,ETA,POS}: items now use the precise types; 44/POS drops its fuel item entirely (legacy sets raw.fuel_in_tons with NO formatted item — the field's auto-raw emit covers it) Class A — success-conditional descriptions (30 samples; arinc_702, 16/AUTPOS, H1/Paren): Legacy plugins built on defaultResult() leave description "Unknown" on failed decodes and set the real description only on success (in-body or via Arinc702Helper). The generated wrapper set the spec description unconditionally. Fixed by setting those three specs' descriptions to "Unknown"; the existing hatches already set the real description on their success paths (verified), so successful samples are unaffected. Class C — uncaught throw in binary chains (1 sample, H1/OHMA): Legacy wrapped base64→inflate→decode chains in try/catch and failed gracefully; the generated chain threw. emit-typescript now wraps the decode body in try/catch (→ failUnknown) whenever the parse steps include a binary kind (ascii85/base64/deflate/text_decode/hex_decode). Rust/C are unaffected: their helpers return empty buffers rather than throwing. Runtime (TS): remainingFuel now tolerates undefined/NaN (mirrors the legacy parseFuel isNaN guard and the currentFuel precedent) so when-gated spec fields can call it unconditionally. All 68 specs validate; TS target regenerates clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fields like Label_44_POS's flight_level_raw exist only to feed a later field/formatter; the legacy plugin never stored them in result.raw. The auto-raw emit was adding a divergent raw key (caught by the last 2 corpus failures). 'raw: false' suppresses the auto-emit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-06-10T06:02:50Z

Warning

Review limit reached

@kevinelliott, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 59 minutes and 33 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 54dffeeb-b406-40fb-89ad-7af0edbd6d59

📥 Commits

Reviewing files that changed from the base of the PR and between e7c66e1 and 83908ca.

📒 Files selected for processing (13)

codegen/src/emit-typescript.ts
codegen/src/ir.ts
codegen/src/parse-spec.ts
runtimes/typescript/utils/result_formatter.ts
schema/ads-v1.schema.json
spec/labels/16/AUTPOS.yaml
spec/labels/44/ETA.yaml
spec/labels/44/IN.yaml
spec/labels/44/OFF.yaml
spec/labels/44/ON.yaml
spec/labels/44/POS.yaml
spec/labels/H1/Paren.yaml
spec/wildcards/arinc_702.yaml

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/corpus-parity

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83908caabd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-10T06:05:25Z

+    | "eta"
+    | "out"
+    | "off"
+    | "on"
+    | "in"


Update non-TS emitters for new formatter types

When ads-gen is run with --target rust or --target c, these shared formatter types are now accepted by the schema/IR and used by the label 44 specs, but codegen/src/emit-rust.ts and codegen/src/emit-c.ts still only map the old formatter names and otherwise emit TODO formatter comments. Because those same emitters suppress auto-raw for fields referenced by formatters, generated Rust/C decoders for these specs will silently omit the formatted items and the canonical raw values for fields such as ETA/out/off/on/in/month/day/fuel_remaining rather than matching the updated TypeScript output.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-06-10T06:05:25Z

+    # Intermediate value feeding `altitude`; legacy never stored it in raw.
+    raw: false


Honor raw:false outside the TypeScript emitter

For --target rust and --target c, this new raw: false flag is lowered into the shared IR but the Rust/C emitters' skipAutoRaw checks still only look at consumedByFormatter, so regenerating Label_44_POS in those targets will continue to write the intermediate flight_level_raw into result.raw. This makes the new spec flag target-dependent and preserves the parity bug the flag is intended to fix whenever the non-TS codegen paths are used.

Useful? React with 👍 / 👎.

kevinelliott and others added 2 commits June 9, 2026 22:58

chatgpt-codex-connector Bot reviewed Jun 10, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Corpus parity: formatter-enum expansion, raw:false flag, try/catch for binary chains#6

Corpus parity: formatter-enum expansion, raw:false flag, try/catch for binary chains#6
kevinelliott wants to merge 2 commits into
mainfrom
fix/corpus-parity

kevinelliott commented Jun 10, 2026

Uh oh!

coderabbitai Bot commented Jun 10, 2026

Review limit reached

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Jun 10, 2026

Uh oh!

chatgpt-codex-connector Bot Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		# Intermediate value feeding `altitude`; legacy never stored it in raw.
		raw: false

Conversation

kevinelliott commented Jun 10, 2026

Summary

Three root-cause classes

1. Formatter enum too narrow (10 samples, 44-family)

2. Success-conditional descriptions (30 samples: arinc_702, 16/AUTPOS, H1/Paren)

3. Uncaught throw on malformed binary input (1 sample, H1/OHMA)

Plus: field-level raw: false flag

Runtime (TS): remainingFuel tolerates undefined/NaN

Verification

Why this matters

Uh oh!

coderabbitai Bot commented Jun 10, 2026

Review limit reached

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Plus: field-level `raw: false` flag

Runtime (TS): `remainingFuel` tolerates undefined/NaN