Skip to content

refactor src/codegen#1923

Merged
mohamedbasuony merged 15 commits into
hyperledger-solang:mainfrom
Islam-Imad:refactor/codegen
Jun 23, 2026
Merged

refactor src/codegen#1923
mohamedbasuony merged 15 commits into
hyperledger-solang:mainfrom
Islam-Imad:refactor/codegen

Conversation

@Islam-Imad

@Islam-Imad Islam-Imad commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Problem

Target-specific policy was scattered across shared codegen files as inline if ns.target == … / match ns.target { … } branches.


Solution: TargetCodegen trait

src/codegen/interface.rs introduces a single TargetCodegen trait. SorobanTarget, SolanaTarget, and PolkadotTarget each implement it.


1. Selector hash algorithm — selector_hash_algorithm()

A target property (constant that differs per target) replaces a branch. Solana uses Sha256; every other target uses Keccak256. (c3ae3d24, expression.rs)

-    let hash_algorithm = if ns.target == Target::Solana {
-        ast::Builtin::Sha256
-    } else {
-        ast::Builtin::Keccak256
-    };
+    let hash_algorithm = target.selector_hash_algorithm();

2. Function dispatch — function_dispatch()

A standalone match ns.target factory in dispatch/mod.rs is replaced by a trait method; each target owns its dispatcher. (b9cbddb7, dispatch/mod.rsmod.rs)

-    match &ns.target {
-        Target::Solana => vec![solana::function_dispatch(contract_no, all_cfg, ns, opt)],
-        Target::Polkadot { .. } | Target::EVM => {
-            polkadot::function_dispatch(contract_no, all_cfg, ns, opt)
-        }
-        Target::Soroban => soroban::function_dispatch(contract_no, all_cfg, ns, opt),
-    }
+    for mut dispatch_cfg in target.function_dispatch(contract_no, &mut all_cfg, ns, opt) {
+        optimize_and_check_cfg(&mut dispatch_cfg, ns, ASTFunction::None, opt);
+        all_cfg.push(dispatch_cfg);
+    }

3. Pre-CFG validation — validate_contract()

Soroban's ABI type checks ran as a guarded block inside the shared program loop. (b9cbddb7, mod.rs)

-    if ns.target == Target::Soroban {
-        soroban::validate_accessor_abi_types(contract_no, ns, diagnostics);
-        soroban::validate_event_abi_types(contract_no, ns, diagnostics);
-    }
+    target.validate_contract(contract_no, ns);

4. Post-CFG validation — validate_cfgs()

Soroban's post-CFG ABI check ran inline after all CFGs were built. (b9cbddb7, mod.rs)

-    if ns.target == Target::Soroban {
-        soroban::validate_abi_types(&all_cfg, ns);
-        if ns.diagnostics.any_errors() {
-            return;
-        }
-    }
+    target.validate_cfgs(&all_cfg, ns);

5. Post-program processing — post_process_program()

Solana's account collection and management ran as a guarded block at the end of codegen(). (b9cbddb7, mod.rs)

-    if ns.target == Target::Solana {
-        for contract_no in 0..ns.contracts.len() {
-            if ns.contracts[contract_no].instantiable {
-                let diag = collect_accounts_from_contract(contract_no, ns);
-                ns.diagnostics.extend(diag);
-            }
-        }
-        for contract_no in 0..ns.contracts.len() {
-            if ns.contracts[contract_no].instantiable {
-                manage_contract_accounts(contract_no, ns);
-            }
-        }
-    }
+    target.post_process_program(ns, opt);

6. Initial storage slot — initial_storage_slot()

Solana reserves the first 16 bytes for account metadata; all other targets begin at slot 0. (6dbe54b8, mod.rs)

-    let mut slot = if ns.target == Target::Solana {
-        BigInt::from(SOLANA_FIRST_OFFSET)
-    } else {
-        BigInt::zero()
-    };
+    let mut slot = target.initial_storage_slot();

7. Storage slot alignment — align_storage_slot()

Solana aligns each field to its natural alignment; other targets pack without padding. (6dbe54b8, mod.rs)

-    if ns.target == Target::Solana {
-        let alignment = ty.align_of(ns);
-        let offset = slot.clone() % alignment;
-        if offset > BigInt::zero() {
-            slot += alignment - offset;
-        }
-    }
+    slot = target.align_storage_slot(slot, &ty, ns);

8. Storage array length — lower_storage_array_length()

Solana and Soroban store the length inline; Polkadot/EVM load it from a separate storage slot. (3fada021, expression.rs)

-    if ns.target == Target::Solana || ns.target == Target::Soroban {
-        Expression::StorageArrayLength { loc: *loc, ty: ty.clone(), array: Box::new(array), elem_ty: elem_ty.clone() }
-    } else {
-        load_storage(loc, &ns.storage_type(), array, cfg, vartab, None, ns)
-    }
+    target.lower_storage_array_length(loc, ty, array, elem_ty, cfg, vartab, ns)

9. Storage array push — storage_array_push()

Solana and storage_bytes use a flat array_push; Polkadot/Soroban use the slot-hashing path. (9930674b, expression.rs)

-    if ns.target == Target::Solana || args[0].ty().is_storage_bytes() {
-        array_push(loc, args, cfg, contract_no, func, ns, vartab, opt)
-    } else {
-        storage_slots_array_push(loc, args, cfg, contract_no, func, ns, vartab, opt)
-    }
+    target.storage_array_push(loc, args, cfg, contract_no, func, ns, vartab, opt)

10. Storage array pop — storage_array_pop()

Same split as push: Solana/bytes use array_pop; Polkadot/Soroban use storage_slots_array_pop. (9930674b, expression.rs)

-    if ns.target == Target::Solana || args[0].ty().is_storage_bytes() {
-        array_pop(loc, args, cfg, contract_no, func, ns, vartab, opt)
-    } else {
-        storage_slots_array_pop(loc, args, cfg, contract_no, func, ns, vartab, opt)
-    }
+    target.storage_array_pop(loc, args, &ty[0], cfg, contract_no, func, ns, vartab, opt)

11. Storage array entry offset — storage_array_entry_offset()

Soroban encodes the index and uses a Subscript; Polkadot/EVM hash the slot with keccak256. (6bb4269b, storage.rs)

-    let array_offset = if ns.target == Target::Soroban {
-        let index_encoded = soroban_encode_arg(index, cfg, vartab, ns);
-        Expression::Subscript { loc: *loc, ty: elem_ty.clone(), expr: Box::new(var_expr.clone()), index: Box::new(index_encoded), .. }
-    } else {
-        // keccak256-hash the slot + linear offset
-        ..
-    };
+    let array_offset = target.storage_array_entry_offset(loc, &var_expr, index, &elem_ty, &slot_ty, cfg, vartab, ns);

12. Storage struct member access — lower_storage_struct_member()

Three-way split: Solana uses pre-computed storage_offsets; Polkadot/EVM sum field slots; Soroban encodes the offset index and calls VecPushBack. (6dbe54b8, expression.rs)

-    let offset = if ns.target == Target::Solana {
-        struct_ty.definition(ns).storage_offsets[*field_no].clone()
-    } else {
-        struct_ty.definition(ns).fields[..*field_no]
-            .iter()
-            .filter(|field| !field.infinite_size)
-            .map(|field| field.ty.storage_slots(ns))
-            .sum()
-    };
-    if ns.target == Target::Soroban {
-        let offset_encoded = soroban_encode_arg(offset, cfg, vartab, ns);
-        cfg.add(vartab, Instr::Call { .. HostFunctions::VecPushBack .. });
-        var
-    } else {
-        Expression::Add { left: Box::new(expression(var, ..)), right: Box::new(NumberLiteral(offset)), .. }
-    }
+    let var_expr = expression(var, cfg, contract_no, func, ns, vartab, opt, target);
+    target.lower_storage_struct_member(loc, var_expr, struct_ty, *field_no, ns, cfg, vartab)

13. Load expression — lower_load()

Soroban lazily decodes handle references on load; all other targets return the Load expression as-is. (c3ae3d24, expression.rs)

-    if ns.target == Target::Soroban {
-        if let Type::Ref(inner) = expr.ty() {
-            if matches!(inner.as_ref(), Type::SorobanHandle(_)) {
-                let load_handle = Expression::Load { loc: *loc, ty: inner.as_ref().clone(), expr: expr.clone() };
-                return soroban_decode_arg(load_handle, cfg, vartab, ns, None);
-            }
-        }
-    }
-    Expression::Load { loc: *loc, ty: ty.clone(), expr }
+    let load = Expression::Load { loc: *loc, ty: ty.clone(), expr };
+    target.lower_load(load, cfg, vartab, ns)

14. Load-storage result — lower_load_storage()

After reading from storage, Soroban decodes any SorobanHandle reference; others pass through. (6dbe54b8, storage.rs)

-    let data = if ns.target == Target::Soroban
-        && matches!(dest.ty(), Type::Ref(inner) if matches!(inner.as_ref(), Type::SorobanHandle(_)))
-    {
-        soroban_encode_arg(Expression::Variable { .. }, cfg, vartab, ns)
-    } else {
-        Expression::Variable { loc: Loc::Codegen, ty: ty.clone(), var_no: pos }
-    };
+    target.lower_load_storage(var, cfg, vartab, ns)

15. Prepare storage value — prepare_storage_value()

Before writing to storage, Soroban encodes the value; all other targets write it unchanged. (c3ae3d24, expression.rs)

-    if ns.target == Target::Soroban {
-        value = soroban_encode_arg(value, cfg, vartab, ns);
-    }
-    cfg.add(vartab, Instr::SetStorage { value, .. });
+    let value = target.prepare_storage_value(value, dest, cfg, vartab, ns);
+    cfg.add(vartab, Instr::SetStorage { value, .. });

16. Default storage value — default_storage_value()

For Soroban, String/DynamicBytes/dynamic-array variables are initialised with soroban_vec_new; other targets skip initialisation and use continue. (c3ae3d24, mod.rs)

-    let soroban_init_with_vec = ns.target == Target::Soroban && match &var.ty { .. };
-    let mut value = if let Some(init) = &var.initializer {
-        expression(init, ..)
-    } else if soroban_init_with_vec {
-        soroban::soroban_vec_new(&var.loc, &var.ty, &mut cfg, &mut vartab)
-    } else {
-        continue;
-    };
+    let Some(mut value) = target.default_storage_value(&var.loc, &var.ty, &mut cfg, &mut vartab, ns)
+        .or_else(|| var.initializer.as_ref().map(|init| expression(init, ..)))
+    else { continue; };

17. ABI encode — abi_encode()

Soroban's early-return in the shared encoder is replaced by a per-target method. (ddac492c, encoding/mod.rs)

-    if ns.target == Target::Soroban {
-        let ret = soroban_encode(loc, args, ns, vartab, cfg, packed);
-        return (ret.0, ret.1);
-    }
-    let mut encoder = create_encoder(ns, packed);
-    // … shared buffer-encoder path …
+    target.abi_encode(loc, args, ns, vartab, cfg, false).0

18. ABI decode — abi_decode()

Same pattern as encode: Soroban's early-return is replaced by a trait method. (ddac492c, encoding/mod.rs)

-    if ns.target == Target::Soroban {
-        return soroban_decode(loc, buffer, types, ns, vartab, cfg, buffer_size_expr);
-    }
-    // … shared buffer-decoder path …
+    target.abi_decode(loc, &data, tys, ns, vartab, cfg, None)

19. Event emitter — event_emitter()

A standalone new_event_emitter factory with a full match ns.target is replaced by a trait method. (e45cbd83, events/mod.rs)

-    match ns.target {
-        Target::Polkadot { .. } | Target::EVM => Box::new(PolkadotEventEmitter { args, ns, event_no }),
-        Target::Solana  => Box::new(SolanaEventEmitter { loc: *loc, args, ns, event_no }),
-        Target::Soroban => Box::new(SorobanEventEmitter { args, ns, event_no }),
-    }
+    let emitter = target.event_emitter(loc, *event_no, args, ns);
+    emitter.emit(contract_no, func, cfg, vartab, opt, target);

20. Default gas — default_gas_builtin()

default_gas(ns) was a free function that branched on ns.target; now each target returns its constant directly. (6dbe54b8, expression.rs)

-    pub fn default_gas(ns: &Namespace) -> Expression {
-        Expression::NumberLiteral {
-            value: if ns.target == Target::EVM { BigInt::from(i64::MAX) } else { BigInt::zero() },
-            ..
-        }
-    }
+    value: target.default_gas_builtin(),

21. Print expression — lower_print_expr()

Polkadot prepends a print:/,\n prefix; EVM and other targets pass the expression through unchanged. (6dbe54b8, expression.rs)

-    let to_print = if ns.target.is_polkadot() {
-        add_prefix_and_delimiter_to_print(expr)
-    } else {
-        expr
-    };
+    let to_print = target.lower_print_expr(expr);

22. Mapping subscript — lower_mapping_subscript()

Polkadot hashes (array, index) with keccak256 to form the storage key; EVM/Solana/Soroban use a direct Subscript. (6dbe54b8, expression.rs)

-    return match ns.target {
-        Target::Solana | Target::Soroban | Target::EVM => Expression::Subscript {
-            loc: *loc, ty: elem_ty.clone(), array_ty: array_ty.clone(),
-            expr: Box::new(array), index: Box::new(index),
-        },
-        Target::Polkadot { .. } => Expression::Keccak256 {
-            loc: *loc, ty: array_ty.clone(), exprs: vec![array, index],
-        },
-    };
+    return target.lower_mapping_subscript(loc, elem_ty, array_ty, array, index);

23. Target-specific builtins — lower_builtin()

All per-target builtin arms (EVM Gasprice, PayableSend/PayableTransfer, and Soroban RequireAuth/AuthAsCurrContract/ExtendTtl) are consolidated behind a single hook. (6dbe54b8, 57e4f941, d740e636, expression.rs)

Before — three separate inline arms scattered across expression.rs:

-    // EVM-only gasprice (expression.rs:1073)
-    ast::Expression::Builtin { kind: ast::Builtin::Gasprice, args: expr, .. }
-        if expr.len() == 1 && ns.target == Target::EVM =>
-    { builtin_evm_gasprice(loc, expr, cfg, contract_no, func, ns, vartab, opt) }
-
-    // Polkadot/EVM payable (expression.rs:1032)
-    ast::Expression::Builtin { kind: ast::Builtin::PayableSend, .. }
-        => payable_send(args, cfg, contract_no, func, ns, vartab, loc, opt),
-    ast::Expression::Builtin { kind: ast::Builtin::PayableTransfer, .. }
-        => payable_transfer(args, cfg, contract_no, func, ns, vartab, loc, opt),
-
-    // Soroban auth/ttl (expression.rs:2427, 2493, 2879)
-    ast::Builtin::RequireAuth => { /* host call */ }
-    ast::Builtin::AuthAsCurrContract => { /* encode auth context + host call */ }
-    ast::Builtin::ExtendTtl => { /* append storage type + builtin */ }

After — one hook that every target opts into:

+    if let Some(e) = target.lower_builtin(loc, *kind, args, cfg, contract_no, func, ns, vartab, opt) {
+        return e;
+    }

Add the TargetCodegen strategy trait (interface.rs) plus a make_target
factory (targets/mod.rs) that the orchestrator drives instead of branching
on ns.target. Re-export EventEmitter from interface.rs as the second
boundary trait (visibility bumped to pub(crate)).

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
Move soroban.rs, dispatch/soroban.rs, encoding/soroban_encoding.rs and
events/soroban.rs under targets/soroban/ as history-preserving renames,
and move the SorobanTarget impl alongside them. Repoint imports and bump
the cross-boundary helpers to pub(crate). Pure relocation: golden CFG
tests pass with zero CHECK edits.

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
Signed-off-by: Islam-Imad <islamimad404@gmail.com>
…tCodegen

Convert five scattered target branches in the shared lowering to TargetCodegen
methods, each with a behaviour-preserving default:

  - selector_hash_algorithm()        (was Solana/else Sha256/Keccak256 ternary)
  - storage_array_length_is_inline()  (was Solana||Soroban inline-length branch)
  - lower_load()                      (was Soroban handle lazy-decode in Load arm)
  - prepare_storage_value()           (was Soroban encode before SetStorage/Store)
  - default_storage_value()           (was Soroban soroban_init_with_vec block)

Soroban overrides the three behavioural hooks; Solana overrides the two property
methods; Polkadot inherits the defaults. Golden CFG (zero // CHECK: edits) and
soroban suites pass; lower_load verified IR-identical via before/after emit-cfg
diff on handle-loading contracts.

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
Signed-off-by: Islam-Imad <islamimad404@gmail.com>
Signed-off-by: Islam-Imad <islamimad404@gmail.com>
Signed-off-by: Islam-Imad <islamimad404@gmail.com>
@Islam-Imad Islam-Imad marked this pull request as draft June 18, 2026 18:17
…gh TargetCodegen

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
…ltin

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
Signed-off-by: Islam-Imad <islamimad404@gmail.com>
…ze; seal TargetCodegen

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
…fset

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
@Islam-Imad Islam-Imad changed the title [WIP] Refactor/codegen [WIP] refactor src/codegen Jun 20, 2026
Signed-off-by: Islam-Imad <islamimad404@gmail.com>
@Islam-Imad Islam-Imad changed the title [WIP] refactor src/codegen refactor src/codegen Jun 20, 2026
@Islam-Imad Islam-Imad marked this pull request as ready for review June 20, 2026 16:50
Signed-off-by: Islam-Imad <islamimad404@gmail.com>
@mohamedbasuony mohamedbasuony self-requested a review June 21, 2026 09:54
@mohamedbasuony

Copy link
Copy Markdown
Contributor

The direction is right. Replacing the scattered if ns.target == … branches with a TargetCodegen trait (one impl per target) is the correct fix, and it's executed cleanly.

Two things to tighten before merge:

  • is_evm: bool just relocates the branch inside PolkadotTarget rather than eliminating it. A dedicated EvmTarget would be cleaner, and this is the most error-prone area to verify — anywhere EVM previously hit a _ => default needs a close look.

  • Finish the PR description — it's cut off mid-list at "### 1." The per-method mapping (each trait method to the branch it replaced) is exactly what reviewers need.
    Minor: the trait is getting large (~22 mixed-concern methods); worth committing to a future split. Otherwise, in favor of the approach — keep the batches coming.

@Islam-Imad

Copy link
Copy Markdown
Contributor Author

cc @mohamedbasuony

@mohamedbasuony

Copy link
Copy Markdown
Contributor

Please fix the issues causing the failed build tests for Linux x86-64.

Signed-off-by: Islam-Imad <islamimad404@gmail.com>
@mohamedbasuony mohamedbasuony merged commit 475e3c8 into hyperledger-solang:main Jun 23, 2026
7 of 8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants