
Fix BuiltinCasing crash on GHC.Prim.Addr# (#7716) #7719

Open
Unisay wants to merge 10 commits into master from yura/fix-builtin-casing-addr-joinpoint

Conversation

@Unisay
Contributor

@Unisay Unisay commented Apr 14, 2026

Summary

  • Add unreachable second constructor to BuiltinData to prevent GHC's case-of-single-constructor optimization from exposing PlutusCore.Data.Data in join point types
  • Add COMPLETE pragma so existing pattern matches remain exhaustive
  • Add failing test (BuiltinCasing/Lib.hs) that reproduces the crash

Closes #7716

The problem

data BuiltinData = BuiltinData ~Data has one constructor. GHC's simplifier always unwraps single-constructor types, producing case bd of { BuiltinData d -> ... } in Core. This leaks d :: Data into join point type signatures. When the plugin compiles with BuiltinCasing, it tries to compile Data as a regular ADT, follows B ByteString to BS Addr# Int, and crashes — Addr# has no Plutus Core equivalent.
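The failure mode described above can be illustrated with a toy model. This is only a sketch, not the plugin's code: `TyDesc`, `toyTypes`, and `compileType` are hypothetical names, the toy `Data` omits the recursive constructors of the real type, and `ByteString` is reduced to the `BS Addr# Int` shape quoted in this description. It shows how a walker that compiles an ADT field by field ends up at `Addr#` once `Data` is visible, and how it never gets there when `BuiltinData` is treated as opaque.

```haskell
-- Toy model only: none of these names come from the plugin.
data TyDesc
  = Builtin                     -- treated as having a direct counterpart
  | Adt [(String, [String])]    -- constructors with the names of their field types
  | Unsupported                 -- primitive with no counterpart

-- A hand-written type environment (assumption: simplified shapes).
toyTypes :: [(String, TyDesc)]
toyTypes =
  [ ("Integer", Builtin)
  , ("BuiltinData", Builtin)    -- opaque: the walker never looks inside
  , ("Data", Adt [("I", ["Integer"]), ("B", ["ByteString"])])
  , ("ByteString", Adt [("BS", ["Addr#", "Int"])])
  , ("Int", Adt [("I#", ["Int#"])])
  , ("Addr#", Unsupported)
  , ("Int#", Unsupported)
  ]

-- Walk a type by name, visiting every constructor field in order.
-- The Either monad stops at the first unsupported type.
compileType :: String -> Either String ()
compileType name = case lookup name toyTypes of
  Nothing          -> Left ("unknown type: " ++ name)
  Just Builtin     -> Right ()
  Just Unsupported -> Left ("no Plutus Core equivalent for " ++ name)
  Just (Adt cons)  -> mapM_ compileType (concatMap snd cons)
```

In this model `compileType "BuiltinData"` succeeds, while `compileType "Data"` follows `B` to `ByteString`, then `BS` to `Addr#`, and fails there.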

The fix

Add a second constructor BuiltinDataUnreachable that is never constructed. GHC won't case-simplify multi-constructor types, so Data stays behind the BuiltinData wrapper. A COMPLETE pragma marks the original constructor as exhaustive.

Details in Note [Opaque builtin types] in PlutusTx.Builtins.Internal.
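As a rough sketch of the shape of this change (not the actual contents of PlutusTx.Builtins.Internal): `Data` below is a simplified stand-in for PlutusCore.Data.Data, `unwrap` is a hypothetical helper, and `StrictData` is enabled only as an assumption so that the lazy `~` field annotation is accepted in a standalone file.

```haskell
{-# LANGUAGE StrictData #-}
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Simplified stand-in for PlutusCore.Data.Data (assumption: the real
-- type has more constructors).
data Data = I Integer | B String

-- Two constructors, so the type is no longer single-constructor.
data BuiltinData
  = BuiltinData ~Data
  | BuiltinDataUnreachable  -- never constructed anywhere

-- Tell the pattern-match checker that matching BuiltinData alone
-- is to be treated as exhaustive.
{-# COMPLETE BuiltinData #-}

-- Hypothetical helper: matches only the original constructor.
unwrap :: BuiltinData -> Data
unwrap (BuiltinData d) = d
```

The trade-off in this sketch is that exhaustiveness is asserted rather than checked: if a `BuiltinDataUnreachable` value were ever constructed, `unwrap` would fail at runtime with a pattern-match error.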

Alternatives I tried

  • Hiding the constructor via export lists — GHC includes all constructors in .hi files regardless of what the module exports
  • newtype BuiltinData = UnsafeBuiltinData Any — newtypes are transparent in Core (coercions), so the plugin just crashes on Kind: forall k. k instead
  • Mapping Data to the builtin type in the plugin — fixes the Addr# crash but exposes a second error (Cannot construct BuiltinData); would need deeper changes to the expression compiler

@Unisay Unisay self-assigned this Apr 16, 2026
@Unisay Unisay requested review from a team and zliu41 April 16, 2026 08:07
@zliu41
Member

zliu41 commented Apr 16, 2026

I'll need to understand what happened a bit more. There may be simpler solutions.

The issue description says "But the plugin's own simplifier pass (mkSimplPass in Plugin/Common.hs) can produce join points" - that's false. The plugin cannot produce join points.

In general I'd recommend against copy-pasting large amounts of AI text. I generally find it verbose, with a low signal-to-noise ratio, and unpleasant to read. It's better to use AI to understand the issue, then write the description on your own.

In this particular case, what would be a useful issue description is: "for this Haskell function, without builtin casing, it generates this GHC Core, which the plugin can compile. But with builtin casing, the GHC Core becomes this, which is problematic". It's very useful to include GHC Core, and anything else is unnecessary.

@github-actions
Contributor

github-actions Bot commented Apr 17, 2026

Execution Budget Golden Diff

90dc9d6 (master) vs 6d710a2

output

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/geq1.golden.eval

Metric Old New Δ%
CPU 337_547_895 341_387_895 +1.14%
Memory 988_745 1_012_745 +2.43%
Flat Size 949 939 -1.05%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/geq2.golden.eval

Metric Old New Δ%
CPU 355_097_594 359_033_594 +1.11%
Memory 1_055_015 1_079_615 +2.33%
Flat Size 1_000 990 -1.00%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/geq3.golden.eval

Metric Old New Δ%
CPU 368_720_887 372_656_887 +1.07%
Memory 1_100_329 1_124_929 +2.24%
Flat Size 1_000 990 -1.00%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/geq4.golden.eval

Metric Old New Δ%
CPU 331_212_553 335_148_553 +1.19%
Memory 946_336 970_936 +2.60%
Flat Size 956 946 -1.05%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/geq5.golden.eval

Metric Old New Δ%
CPU 349_612_383 353_548_383 +1.13%
Memory 1_021_500 1_046_100 +2.41%
Flat Size 956 946 -1.05%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/gt1.golden.eval

Metric Old New Δ%
CPU 388_148_300 391_988_300 +0.99%
Memory 1_166_660 1_190_660 +2.06%
Flat Size 1_324 1_314 -0.76%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/gt2.golden.eval

Metric Old New Δ%
CPU 355_465_594 359_401_594 +1.11%
Memory 1_057_315 1_081_915 +2.33%
Flat Size 1_375 1_365 -0.73%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/gt3.golden.eval

Metric Old New Δ%
CPU 420_042_992 423_978_992 +0.94%
Memory 1_282_209 1_306_809 +1.92%
Flat Size 1_375 1_365 -0.73%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/gt4.golden.eval

Metric Old New Δ%
CPU 331_580_553 335_516_553 +1.19%
Memory 948_636 973_236 +2.59%
Flat Size 1_331 1_321 -0.75%

plutus-tx-plugin/test-ledger-api/Spec/Data/Budget/9.12/gt5.golden.eval

Metric Old New Δ%
CPU 373_819_011 377_755_011 +1.05%
Memory 1_110_324 1_134_924 +2.22%
Flat Size 1_331 1_321 -0.75%

plutus-tx-plugin/test/Budget/9.12/map2.golden.eval

Metric Old New Δ%
CPU 67_827_382 68_307_382 +0.71%
Memory 197_790 200_790 +1.52%
Flat Size 458 453 -1.09%

plutus-tx-plugin/test/Budget/9.12/map3.golden.eval

Metric Old New Δ%
CPU 111_907_732 112_771_732 +0.77%
Memory 333_684 339_084 +1.62%
Flat Size 705 695 -1.42%

This comment will get updated when changes are made.

@Unisay
Contributor Author

Unisay commented Apr 17, 2026

Thanks for the feedback — you're right, I was wrong about mkSimplPass producing join points. Fixed the issue and PR description.

Here's the minimal Core showing the problem. Test function:

useTwiceData :: BuiltinData -> BuiltinUnit
useTwiceData bd =
  case toBuiltinData (firstOf items) of
    _ -> case toBuiltinData (firstOf items) of
      _ -> unitval
  where
    items = unsafeFromBuiltinData bd
    firstOf = caseList' Nothing (\(h :: BuiltinData) _t -> Just h)

GHC Core produced (without BuiltinCasing, stored in the .hi file):

useTwiceData :: BuiltinData -> BuiltinUnit
useTwiceData
  = \ (bd :: BuiltinData) ->
      case bd of bd1 { BuiltinData ipv ->        -- ← BuiltinData unwrapped
      case unsafeDataAsList of g1 { __DEFAULT ->
      case g1 bd1 of nt { BuiltinList ipv1 ->
      join {
        $j :: Data -> BuiltinUnit                 -- ← Data in join point type
        $j _
          = case caseList'
                   @BuiltinData
                   @(Maybe BuiltinData)
                   (Nothing @BuiltinData)
                   (\ (x :: BuiltinData) (eta :: BuiltinList BuiltinData) ->
                      case x of x1 { BuiltinData ipv3 ->
                      case eta of { BuiltinList ipv4 ->
                      Just @BuiltinData x1
                      }})
                   nt
            of {
              Nothing -> case mkConstr (IS 1#) (mkNilData unitval) of
                           { BuiltinData ipv3 -> unitval };
              Just arg -> case mkConstr (IS 0#) (mkCons @BuiltinData arg ...) of
                            { BuiltinData ipv3 -> unitval }
            }
      } in ...

The plugin (with BuiltinCasing) reads this Core via the interface file and tries to compile the type Data that appears naked in $j :: Data -> BuiltinUnit. It hits B ByteString -> BS Addr# Int and crashes.

Without the second constructor, GHC's simplifier unwraps BuiltinData via case-of-single-constructor at every use site, and $j's captured free variables pull the unwrapped Data into the join point's type signature. With two constructors, GHC can't apply case-of-single-constructor, so Data stays hidden behind the BuiltinData wrapper and the plugin never sees it.
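One way to compare the two situations in isolation is to dump the simplified Core for a small module that contains both a single-constructor wrapper and a two-constructor one. The module below is a hypothetical sketch: `Payload`, `Wrapped`, `Guarded`, `sizeW`, and `sizeG` are made-up names that only mimic the shape of `BuiltinData`, and the Core GHC emits for it depends on the GHC version and optimization level, so no particular output is claimed here.

```haskell
-- Dump simplified Core with most of the noise suppressed.
-- Compile with optimization (e.g. -O) to see the simplifier's output.
{-# OPTIONS_GHC -ddump-simpl -dsuppress-idinfo -dsuppress-coercions -dsuppress-module-prefixes -dsuppress-uniques #-}

-- Hypothetical payload type standing in for Data.
data Payload = PInt Integer | PPair Payload Payload

-- Single-constructor wrapper, analogous in shape to the old BuiltinData.
data Wrapped = Wrapped Payload

-- Two-constructor wrapper, analogous in shape to the fix.
data Guarded = Guarded Payload | GuardedUnreachable

-- Count the leaves of a payload.
size :: Payload -> Integer
size (PInt _)    = 1
size (PPair a b) = size a + size b

sizeW :: Wrapped -> Integer
sizeW (Wrapped p) = size p

sizeG :: Guarded -> Integer
sizeG (Guarded p)        = size p
sizeG GuardedUnreachable = 0
```

Reading the dumped Core for `sizeW` next to `sizeG` is one way to check where the payload type shows up in binder and join point types for a given GHC version.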

Unisay added 10 commits April 20, 2026 15:24
…type

GHC optimizer can produce join points with naked PlutusCore.Data.Data
in the type signature. The plugin with BuiltinCasing tries to compile
Data as a regular ADT, walks B ByteString -> BS Addr# and crashes.

Minimal trigger: caseList' applied twice to the same value in a module
without BuiltinCasing, then compiled from a BuiltinCasing module.

GHC's simplifier unconditionally unwraps single-constructor types.
For `data BuiltinData = BuiltinData ~Data`, this exposes `Data` in
join point type signatures. The plugin with BuiltinCasing then tries
to compile Data as a regular ADT, walks B ByteString -> BS Addr#,
and crashes.

Fix: add a second (unreachable) constructor to BuiltinData so GHC
cannot apply case-of-single-constructor. A COMPLETE pragma ensures
existing pattern matches remain exhaustive without warnings.

Rename failsToCompile -> caseListTwice (it compiles now).

Update golden files affected by the BuiltinData second constructor.

When a local variable (e.g. from a where-clause) has no unfolding,
show the stage violation help message instead of a generic
FreeVariableError.

Same pattern as caseListTwice but with BuiltinByteString.
Currently does not crash (ByteString is handled differently by the
simplifier), but kept as a regression test for future GHC changes.

BuiltinByteString and BuiltinString don't have UnsafeFromData instances,
so tests use BuiltinList and Builtins.Internal.caseList' directly
instead of Data.List.caseList'.

useTwiceData (was caseListTwice), useTwiceByteString, useTwiceString.
ByteString and String tests pass opaque types directly as arguments
instead of wrapping in BuiltinList.

The BuiltinData second constructor affects PIR output and budget values
slightly (CPU +0.7-1.5%, memory +1.5%, but AST/Flat size decreases).
Also updates stage violation error messages and renames from fourmolu.
@Unisay Unisay force-pushed the yura/fix-builtin-casing-addr-joinpoint branch from 0fd4f71 to 6d710a2 Compare April 20, 2026 15:09
Development

Successfully merging this pull request may close these issues.

BuiltinCasing crashes on GHC.Prim.Addr# when a join point exposes Data in its type