## Summary

Deterministic WASM trap in `cpython!py_gl_call` with `code=None, causes=[]` — no structured GenVM error is produced, only a `Fingerprint { module_instances: {"cpython": ..., "softfloat": ...} }` snapshot. Repros every time for one specific (contract, calldata) pair in production Studio.

Because the error carries no structured `code`, the consensus worker classifies it as retryable and loops forever — one poisoned tx produced ~6,500 identical errors before we cancelled it manually.
## Environment

- Executor: `v0.2.12` (prd) — note: current release is `v0.2.16`, so this may already be fixed. First question: does this still repro on v0.2.16? If yes, details below. If no, we'll just upgrade.
- Studio backend: `yeagerai/simulator-jsonrpc@sha256:7217dfadd020…` (prd, 2026-04-15)
- Reference: Sentry GENLAYER-STUDIO-11X, stuck tx hash `0x451098a355fe114f89575d720595cf87ed34a065e425a0aa3c56adbd14e6b1d5` (now CANCELED in prd DB), contract `0xf074a62BBfd331e62221a159853D536EA2ca9733`.
## Error output

```text
ERROR backend.consensus.worker:_transaction_context:633 GenVM internal error during transaction 0x451098…:
  code=None,
  causes=[],
  is_fatal=False,
  is_leader=True,
  message=GenVM internal error,
  detail=Fingerprint {
    module_instances: {
      "cpython": ModuleFingerprint { memories: [MemoryFingerprint([17, 168, 94, 202, 147, 39, 85, 220, 86, 96, 211, 83, 178, 29, 159, 37, 93, 53, 82, 163, 229, 19, 40, 219, 75, 163, 203, 76, 203, 98, 94, 248])] },
      "softfloat": ModuleFingerprint { memories: [MemoryFingerprint([252, 158, 79, 163, 68, 141, 165, 13, 198, 255, 81, 74, 186, 5, 104, 186, 4, 82, 118, 245, 141, 60, 96, 253, 244, 197, 195, 210, 139, 22, 172, 200])] }
    }
  }

Caused by:
  0: error while executing at wasm backtrace:
       0: 0x1039b - cpython!py_gl_call
       1: 0xa3f111 - cpython!cfunction_call
       2: 0x8fead3 - cpython!_PyObject_MakeTpCall
       3: 0x8ff81b - cpython!PyObject_Vectorcall
       4: 0x9172b6 - cpython!_PyEval_EvalFrameDefault
       5: 0x906093 - cpython!PyEval_EvalCode
       6: 0xad3d5b - cpython!run_eval_code_obj
       7: 0xad3bfc - cpython!run_mod.llvm.8421269000175133780
       8: 0xad1c83 - cpython!_PyRun_SimpleFileObject
```
The trap is inside `py_gl_call` (the cpython module's host-call dispatcher), before any Lua code in `llm.lua` runs — that's why no structured cause bubbles up.
## Repro contract

```python
# { "Depends": "py-genlayer:1jb45aa8ynh2a9c9xn3b7qqh8sm5q93hwfp7jqmwsfhh8jpz09h6" }
from genlayer import *

import base64
import json


class MarketSignalOracle(gl.Contract):
    market_name: str
    last_evaluation: str
    evaluations_by_candidate: str
    recent_candidate_ids: str

    def __init__(self, market_name: str):
        self.market_name = market_name
        self.last_evaluation = json.dumps(
            {
                "candidateId": "",
                "verdict": "ignore",
                "confidence": 0.0,
                "reasoningSummary": "No candidate has been evaluated yet.",
                "alertDecision": False,
                "tags": [],
                "metadata": {
                    "market": market_name,
                    "symbol": "",
                    "version": "v1",
                    "dominantSignal": "none",
                    "riskBias": "neutral",
                },
            },
            sort_keys=True,
        )
        self.evaluations_by_candidate = json.dumps({}, sort_keys=True)
        self.recent_candidate_ids = json.dumps([])

    @gl.public.write
    def evaluate_candidate(self, candidate_id: str, symbol: str, captured_at: str,
                           raw_signal_json: str, metrics_json: str,
                           history_context_json: str) -> None:
        raw_signal = json.loads(raw_signal_json)
        metrics = json.loads(metrics_json)
        history_context = json.loads(history_context_json)

        prompt = f"""
You are a GenLayer market signal evaluator for crypto derivatives.
Evaluate one anomaly candidate for {self.market_name}.
Candidate ID: {candidate_id}
Symbol: {symbol}
Captured at: {captured_at}
Raw signal JSON: {json.dumps(raw_signal, sort_keys=True)}
Metrics JSON: {json.dumps(metrics, sort_keys=True)}
History context JSON: {json.dumps(history_context, sort_keys=True)}
"""

        def leader_fn():
            return gl.nondet.exec_prompt(prompt, response_format="json")

        def validator_fn(leader_result) -> bool:
            return isinstance(leader_result, gl.vm.Return)

        evaluation = gl.vm.run_nondet_unsafe(leader_fn, validator_fn)
        self.last_evaluation = json.dumps(evaluation, sort_keys=True)
```

(The full original contract is longer, but the minimal repro above has the same two `gl.*` calls on the hot path.)
Constructor arg: `market_name="BTCUSDT"`.
## Repro calldata (for `evaluate_candidate`)

Arguments, decoded from the prd calldata:

| Arg | Value |
| --- | --- |
| `candidate_id` | `"base64-test-20260415-01"` |
| `symbol` | `"BTCUSDT"` |
| `captured_at` | `"2026-04-15T14:35:00Z"` |
| `raw_signal_json` | `'{"score":84,"severity":"high","triggeredRules":["funding_dislocation","open_interest_spike"]}'` |
| `metrics_json` | `'{"price":75139.6,"priceChange1h":1.9,"priceChange24h":4.2,"volume24h":123456789,"volumeChangeShort":22.4,"openInterest":99887766,"openInterestChange1h":14.8,"fundingRate":0.00045,"fundingRateDelta":0.0002,"longLiquidations":2500000,"shortLiquidations":1200000}'` |
| `history_context_json` | (base64 of a JSON object with `snapshotWindow:"6h"` and a `recentEvents` array) |

Total calldata size: ~770 bytes. Not abnormally large — so this is unlikely to be a memory/stack blow-up from sheer size.
The candidate ID `base64-test-20260415-01` suggests the user was specifically probing base64-encoded JSON string arguments. That might be a hint about what's upsetting the host.
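For anyone reproducing locally: the exact prd payload wasn't captured, so here is a stand-in `history_context_json` built the same way. Only the top-level shape (`snapshotWindow` plus a `recentEvents` array) matches the decoded calldata; all field values inside `recentEvents` are invented for illustration.

```python
import base64
import json

# Hypothetical payload: only the shape (snapshotWindow + recentEvents) matches
# the decoded prd calldata; the event contents below are made up.
history_context = {
    "snapshotWindow": "6h",
    "recentEvents": [
        {"candidateId": "prev-20260415-00", "verdict": "ignore", "score": 41},
    ],
}

# The argument is passed as base64-encoded JSON, matching what the user
# appears to have been probing.
history_context_json = base64.b64encode(
    json.dumps(history_context, sort_keys=True).encode("utf-8")
).decode("ascii")

# Sanity check: the encoding round-trips cleanly outside the VM, so a plain
# base64-decode/JSON-parse is not what's tripping the host.
assert json.loads(base64.b64decode(history_context_json)) == history_context
```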
## What we've confirmed

- Deterministic. Same tx ran ~6,500 times with the same fingerprint.
- Not a user-code error. The trap is in `py_gl_call`, before user Python code can throw. The Python stack frames above it (`PyEval_EvalFrameDefault`, etc.) are just the dispatch path into the host call.
- No structured cause. Because of that, `causes=[]` and `code=None`, which defeats the consensus worker's classification.
- Two `gl.*` calls on the hot path — we can't tell from the studio-side logs which of them traps. We're not capturing executor stderr in Sentry, only the consensus-worker wrapper. If you have a way to run this locally with executor-side logging, that will nail down which call.
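To make the classification gap concrete, here is a hypothetical sketch of the failure mode — not the actual worker code, and `FATAL_CODES` is an invented name. The point is that an error with `code=None` never matches any known non-retryable code, so it falls through to the retry path:

```python
# Hypothetical sketch (not the real consensus worker): classification keyed
# on the structured error code. An invented set of known-fatal codes.
FATAL_CODES = {"INTERNAL_ERROR", "VALIDATION_FAILED"}

def classify(code):
    """Decide whether to abort or retry a failed tx based on its error code."""
    if code in FATAL_CODES:
        return "abort"
    # code=None lands here: it matches nothing fatal, so the tx is
    # re-queued indefinitely -- the ~6,500-error loop.
    return "retry"
```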
## Asks

- Is this fixed in v0.2.16? Please try the repro above. If yes, we'll just bump our prd executor.
- If it still repros on head: we need to know which of the two `gl.*` calls traps, and why it doesn't produce a structured error (which is what prevents the Python side from classifying it as non-retryable).
- Independent of this bug, a structured error for this class of trap would be very useful — even `INTERNAL_ERROR` with a one-line message is enough for the studio side to stop hammering poisoned txs.

Cross-ref: we'll also add a max-retry counter on our side (studio `consensus/worker.py`) so one poisoned tx can't produce thousands of events regardless of the GenVM-side root cause.
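A minimal sketch of that studio-side guard (assumed shape only — `MAX_RETRIES`, `should_retry`, and the dict-based counter are illustrative, not the real `consensus/worker.py` code):

```python
# Cap retries per tx so an unstructured error (code=None) can't loop forever.
# MAX_RETRIES and the dict-based counter are assumptions for illustration.
MAX_RETRIES = 5

def should_retry(tx_hash: str, retry_counts: dict) -> bool:
    """Increment the per-tx failure count; stop retrying once the cap is hit."""
    retry_counts[tx_hash] = retry_counts.get(tx_hash, 0) + 1
    return retry_counts[tx_hash] < MAX_RETRIES

counts: dict = {}
results = [should_retry("0x451098", counts) for _ in range(6)]
print(results)  # [True, True, True, True, False, False]
```

With this in place, the stuck tx above would have produced at most five errors instead of ~6,500, independent of whatever GenVM ends up returning.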