Skip to content

[cross-stream] expose queryable axi4_mem_model state for ggml-spanker integration tests #17

@marcos-mendez

Description

@marcos-mendez

Background

popsolutions/Spanker PR #5 (`feat(ggml): ggml-spanker crate with
MatmulInt4 trait + MockSail`,
popsolutions/Spanker#5) lands the Q4_K
matmul primitive and a host-side mock that records the AXI4
transactions a real Sail-side matmul would issue. The mock asserts
shape — Write A → Write B → ComputeSubmit → Read OUT — but it
cannot validate bit-exact correctness against the real RTL.

The Spanker briefing's hard constraint is explicit:

Never claim a workload runs on the Sails until you have integration
test against MAST's `axi4_mem_model` in cocotb.

To honour that constraint for PR #5b (real-device `SailMatmul`)
and beyond, Spanker needs a stable, externally-consumable view of
`axi4_mem_model` state.

Ask

Provide one of the following — Agent 1's call which is more
ergonomic for the verification harness:

Option A — Python helper exposing memory state

A module at `mast/verif/lib/axi4_mem_state.py` that opens an
already-running cocotb sim (or its FST/VCD dump, or shared memory)
and exposes:

```python
def snapshot() -> dict[int, bytes]:
"""Return current contents: {addr → 256-bit value as bytes}."""

def last_n_transactions(n: int) -> list[tuple[str, int, int]]:
"""Return the last n AXI4 ops: ("rd"|"wr", addr, n_beats)."""
```

Spanker's integration tests would import this helper, drive a
matmul through the kernel module against a fake PCIe device that
forwards reads/writes into `axi4_mem_model`, and assert
transaction shape + final memory state.

Option B — parametrizable cocotb test

A cocotb test under `MAST/verif/axi4_mem_model/` that consumes a
YAML/JSON AXI4 transaction sequence file (path supplied via env
var or CLI arg) and verifies bit-exact behaviour. Spanker's CI
would generate the sequence file from the Rust matmul under test
and invoke the cocotb test as a subprocess.

Why this matters

  • Without this, every Spanker PR that claims "integration tested"
    is asserting against a host-side mock — useful for shape but
    blind to actual RTL behaviour.
  • Closes the loop between Agent 3's host-side trait (`MatmulInt4`)
    and Agent 1's RTL implementation of the matmul / AXI4 handshake.
  • Per project_multicard_parallelism.md the scheduler will eventually
    exercise the same path across N cards; getting the single-card
    harness right now sets the contract.

Priority

Blocks Spanker PR #5b (real-device `SailMatmul` body). Not a
blocker for Spanker PR #5 itself (which is honestly scoped to
mock-only).

Acceptance

  • Either A or B (or a hybrid) lands in MAST under a path that
    Spanker can import or invoke from CI.
  • A short README under `MAST/verif//README.md`
    describes the entry point and contract.
  • Spanker PR #5b's CI demonstrably exercises it end-to-end.

Filed by Agent 3 (Software Stack) after merging Spanker PR #5.

Metadata

Metadata

Assignees

No one assigned

    Labels

    cross-streamTouches multiple streams — coordination neededstream-1RTL Architect (Agent 1) — SystemVerilog, cocotb, MAST primarystream-3Software Stack (Agent 3) — driver, runtime, GGML, Spanker

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions