@7layermagik 7layermagik commented Jan 14, 2026

Background: SIMD-186 and Account Loading

SIMD-186 changed how account data size limits are validated during transaction loading. When active, Mithril uses loadAndValidateTxAcctsSimd186() instead of the legacy loadAndValidateTxAccts() function.

The key difference: SIMD-186 requires pre-calculating total loaded account sizes (including program data accounts) before building the transaction accounts, to enforce the new size limits correctly.


Problem: Double-Cloning of Accounts

In pkg/replay/accounts.go, the loadAndValidateTxAcctsSimd186 function (lines 210-322) processes accounts in two passes:

Pass 1: Size Accumulation (lines 226-236)

for i, pubkey := range acctKeys {
    acct, err := slotCtx.GetAccount(pubkey)  // Clone #1
    // ...
    err = accumulator.collectAcct(acct)
    // ...
}

Pass 2: Build TransactionAccounts (lines 261-279)

for idx, acctMeta := range txAcctMetas {
    // ... special cases handled ...
    } else {
        acct, err = slotCtx.GetAccount(acctMeta.PublicKey)  // Clone #2 (redundant!)
    }
    acctsForTx = append(acctsForTx, *acct)
    // ...
}

Each slotCtx.GetAccount() call clones the account:

  • Account.Clone() allocates make([]byte, len(Data))
  • Copies the entire data slice
  • For program accounts (up to 10 MB of data), that means two ~10 MB allocations per transaction that references them

Additional O(n) Lookup Issue

The original code also used:

slices.Contains(programIdIdxs, uint64(idx))

This is O(n) per account in the transaction. For a tx with 10 accounts and 3 instructions, this performs up to 30 comparisons per tx just for program index checks.


Solution: Slice-Based Memoization

1. Cache accounts during Pass 1

// NEW: Memoize accounts loaded in Pass 1
acctCache := make([]*accounts.Account, len(acctKeys))

for i, pubkey := range acctKeys {
    acct, err := slotCtx.GetAccount(pubkey)
    // ...
    acctCache[i] = acct  // Store for Pass 2
    err = accumulator.collectAcct(acct)
    // ...
}

2. Replace O(n) slice scan with O(1) boolean mask

// OLD: O(n) per lookup
var programIdIdxs []uint64
// ... append indices ...
slices.Contains(programIdIdxs, uint64(idx))  // O(n)

// NEW: O(1) lookup
isProgramIdx := make([]bool, len(acctKeys))
for instrIdx, instr := range tx.Message.Instructions {
    i := int(instr.ProgramIDIndex)
    if i >= 0 && i < len(isProgramIdx) {
        isProgramIdx[i] = true
    }
    // ...
}
// Usage: isProgramIdx[idx]  // O(1)

3. Reuse cached accounts in Pass 2

for idx, acctMeta := range txAcctMetas {
    var acct *accounts.Account
    cached := acctCache[idx]  // Reuse from Pass 1 (same index ordering)

    if acctMeta.PublicKey == sealevel.SysvarInstructionsAddr {
        acct = instrsAcct  // Special case unchanged
    } else if /* dummy account conditions */ isProgramIdx[idx] /* O(1) now */ && ... {
        acct = &accounts.Account{Key: acctMeta.PublicKey, Owner: cached.Owner, ...}
    } else {
        acct = cached  // Use cached, no clone!
    }
    // ...
}

4. Reuse cache in program validation

for instrIdx, instr := range instrs {
    // Use ProgramIDIndex for direct cache lookup
    programIdx := int(tx.Message.Instructions[instrIdx].ProgramIDIndex)
    programAcct := acctCache[programIdx]  // No clone!
    
    // Fallback only if nil (shouldn't happen for valid txs)
    if programAcct == nil {
        programAcct, err = slotCtx.GetAccount(instr.ProgramId)
        // ...
    }
    // ... validation unchanged ...
}

Why Slice Over Map?

| Approach | Allocation | Lookup | Memory |
| --- | --- | --- | --- |
| `map[solana.PublicKey]*Account` | Map bucket allocation | Hash + probe | 32-byte keys + pointers |
| `[]*Account` slice | Single slice | Direct index | Just pointers |

Key insight: tx.Message.AccountKeys and tx.AccountMetaList() use identical index ordering. The nth key in AccountKeys corresponds to the nth entry in AccountMetaList(). This makes positional indexing both correct and optimal.


Performance Impact

| Metric | Before | After | Reduction |
| --- | --- | --- | --- |
| `GetAccount` calls per tx | 2N | N | 50% |
| Account clones per tx | 2N | N | 50% |
| Data allocations per tx | 2N × len(Data) | N × len(Data) | 50% |
| Program index lookup | O(n) per account | O(1) | O(n) → O(1) |

Concrete example:

  • Block with 1,000 transactions
  • Average 5 accounts per transaction
  • Before: 10,000 account clones
  • After: 5,000 account clones
  • Saved: 5,000 allocations + copies per block

For accounts with large data (programs, token accounts with extensions), this significantly reduces GC pressure.


Edge Cases Preserved

| Case | Handling |
| --- | --- |
| SysvarInstructions | Uses `instrsAcct` directly (special sysvar) |
| Dummy program accounts | Creates dummy struct, uses cached `Owner` |
| Out-of-range `ProgramIDIndex` | Fallback to `GetAccount` (shouldn't occur in valid replay) |
| Cache miss in validation | Falls back to `GetAccount` + `GetAccountFromAccountsDb` |

Files Changed

| File | Lines | Change |
| --- | --- | --- |
| `pkg/replay/accounts.go` | 210-322 | Add `acctCache` slice, `isProgramIdx` mask, reuse in Pass 2 and validation |

Verification

  1. Build: go build ./pkg/replay/
  2. Unit tests: go test ./pkg/replay/...
  3. Mainnet replay: Verify bank hashes match (no behavioral change)
  4. Metrics: Monitor alloc MB/s and Δgc in 100-slot summary — should decrease

🤖 Generated with Claude Code

…ateTxAccts

When SIMD-186 is active, loadAndValidateTxAcctsSimd186 was loading each
account twice: once for size accumulation (Pass 1) and once for building
TransactionAccounts (Pass 2). Each GetAccount call clones the account,
causing 2x allocations and data copies per account per transaction.

Changes:
- Add acctCache slice to store accounts from Pass 1
- Reuse cached accounts in Pass 2 instead of re-cloning
- Replace programIdIdxs slice with isProgramIdx boolean mask for O(1) lookup
  (eliminates slices.Contains linear scan in hot loop)
- Reuse cache in program validation loop via tx.Message.Instructions index

Impact: ~50% reduction in account allocations/copies per transaction,
reduced GC pressure during high-throughput replay.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@7layermagik 7layermagik force-pushed the perf/simd186-account-memoization branch from d45ea13 to 56ab9ea Compare January 18, 2026 04:27
@7layermagik 7layermagik changed the base branch from smcio/update-stake-cache-in-per to dev January 18, 2026 04:27
Adds defensive bounds check to prevent panic if ProgramIDIndex is
out of range. Falls back to GetAccount lookup for out-of-bounds
indices (shouldn't happen for valid mainnet transactions).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@7layermagik 7layermagik deleted the perf/simd186-account-memoization branch January 28, 2026 12:06