feat(driver): PCIe driver skeleton with /dev/spankerctl ioctl stub #3

Closed
marcos-mendez wants to merge 1 commit into main from feat/stream-3/pr-01-pcie-driver-skeleton

Conversation

@marcos-mendez

Member

Summary

First C-side scaffolding for the Spanker out-of-tree kernel module
per ADR-002. Drops a buildable spanker.ko with the v0 ioctl ABI
(per ADR-001's runtime↔driver boundary), a singleton /dev/spankerctl
control device, and a make-based out-of-tree kbuild scaffold. The
PCI driver registration uses placeholder vendor/device IDs
(0xDEAD/0xBEEF) since PopSolutions has no assigned PCI vendor ID
yet — tracked as a follow-up issue below.

What's in this PR

| Path | What it does |
|---|---|
| src/driver/include/uapi/spanker_ioctl.h | Public v0 ABI; magic 0xE3; SPANKER_IOC_PING, SPANKER_IOC_GET_VERSION; struct spanker_version (8 bytes) |
| src/driver/spanker_main.c | Module init/exit, /dev/spankerctl registration (with pre-/post-6.4 class_create shim), PCI driver registration |
| src/driver/spanker_ioctl.c | Dispatcher: PING → 0, GET_VERSION → {0,1,0,0} via copy_to_user; uses stream_open + compat_ptr_ioctl |
| src/driver/Kbuild | obj-m += spanker.o, links spanker_main.o + spanker_ioctl.o, sets header include path |
| src/driver/Makefile | Out-of-tree wrapper around make -C $(KDIR) M=$(PWD) modules |
| .gitignore | Adds kbuild artefact patterns (*.mod, *.cmd, Module.symvers, …) |
| .github/workflows/ci.yml | New driver-build job: installs linux-headers-$(uname -r), runs make, asserts ^license:.*GPL in modinfo, cleans |
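
The dispatcher behaviour listed for spanker_ioctl.c can be modelled in plain userspace C. This is a minimal sketch under stated assumptions, not the module's code: the command numbers 0 and 1, the four 16-bit field names of struct spanker_version, and the helper name spanker_dispatch_model are hypothetical. Only the magic 0xE3, the 8-byte struct size, and the {0,1,0,0} reply come from this PR. memcpy stands in for copy_to_user.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>   /* _IO, _IOR, _IOC_TYPE, _IOC_SIZE on Linux */

/* Assumed layout: four 16-bit fields, 8 bytes total. Field names are hypothetical. */
struct spanker_version {
    uint16_t major;
    uint16_t minor;
    uint16_t patch;
    uint16_t reserved;
};
static_assert(sizeof(struct spanker_version) == 8, "v0 ABI states 8 bytes");

#define SPANKER_IOC_MAGIC        0xE3                        /* from the PR */
#define SPANKER_IOC_PING         _IO(SPANKER_IOC_MAGIC, 0)   /* number 0 is assumed */
#define SPANKER_IOC_GET_VERSION  _IOR(SPANKER_IOC_MAGIC, 1, struct spanker_version) /* number 1 is assumed */

/* Userspace model of the dispatcher; memcpy replaces copy_to_user(). */
static long spanker_dispatch_model(unsigned long cmd, void *arg)
{
    static const struct spanker_version v = { 0, 1, 0, 0 };

    switch (cmd) {
    case SPANKER_IOC_PING:
        return 0;
    case SPANKER_IOC_GET_VERSION:
        if (!arg)
            return -EFAULT;
        memcpy(arg, &v, sizeof(v));
        return 0;
    default:
        return -ENOTTY;   /* conventional errno for an unrecognised ioctl */
    }
}
```

Encoding the struct size into the command via _IOR means a later change to struct spanker_version yields a different command number, so a stale userspace binary would hit the -ENOTTY branch in this model instead of reading a mismatched layout.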

Local verification (Manjaro, kernel 6.18.18-1-MANJARO)

```
$ make -C src/driver
CC [M] spanker_main.o
CC [M] spanker_ioctl.o
LD [M] spanker.o
MODPOST Module.symvers
LD [M] spanker.ko
BTF [M] spanker.ko

$ modinfo src/driver/spanker.ko | grep -E '^(license|version|alias)'
version: 0.1.0
license: GPL v2
alias: pci:v0000DEADd0000BEEFsv*sd*bc*sc*i*
```

The build produces no warnings or errors, and MODPOST is clean. modinfo reports license GPL v2, so the module does not trigger the proprietary-license taint.
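
The alias line in the modinfo output follows the PCI modalias pattern: vendor and device as 8-digit uppercase hex, with every other match field wildcarded. A small sketch of how that string is composed from the placeholder IDs is shown here; the function names pci_alias_pattern and pci_alias_example are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build the modalias pattern for a vendor/device match, leaving subvendor,
 * subdevice, base class, subclass and interface as wildcards. */
static int pci_alias_pattern(char *buf, size_t len, uint32_t vendor, uint32_t device)
{
    return snprintf(buf, len, "pci:v%08Xd%08Xsv*sd*bc*sc*i*",
                    (unsigned)vendor, (unsigned)device);
}

/* Example: the placeholder IDs from this PR reproduce the modinfo alias. */
static void pci_alias_example(void)
{
    char buf[64];

    pci_alias_pattern(buf, sizeof(buf), 0xDEAD, 0xBEEF);
    assert(strcmp(buf, "pci:v0000DEADd0000BEEFsv*sd*bc*sc*i*") == 0);
}
```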

Why /dev/spankerctl + per-Sail nodes split

Per ADR-002 the driver exposes per-Sail device nodes (/dev/spanker0,
/dev/spanker1, …) once probed. But before any silicon exists, there
is nothing to probe. To keep the userspace runtime testable from day
one, this PR introduces a singleton control device /dev/spankerctl
(precedent: /dev/loop-control, /dev/dri/control) that always
exists once the module loads. It carries SPANKER_IOC_PING and
SPANKER_IOC_GET_VERSION so userspace bindings (Rust runtime, the
future Python smoke test) can verify driver liveness and ABI version
without needing a real PCIe device.
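
A liveness probe against the control device could be sketched as follows. This is an illustration under assumptions rather than the runtime's actual binding: the command numbers, the struct field names, the O_RDWR open mode, and the helper name spanker_probe are hypothetical, and the real definitions live in spanker_ioctl.h.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumed 8-byte layout; field names are hypothetical. */
struct spanker_version {
    uint16_t major, minor, patch, reserved;
};
static_assert(sizeof(struct spanker_version) == 8, "v0 ABI states 8 bytes");

#define SPANKER_IOC_MAGIC        0xE3                        /* from the PR */
#define SPANKER_IOC_PING         _IO(SPANKER_IOC_MAGIC, 0)   /* number assumed */
#define SPANKER_IOC_GET_VERSION  _IOR(SPANKER_IOC_MAGIC, 1, struct spanker_version) /* number assumed */

/* Returns 0 and fills *out on success, or a negative errno value on failure
 * (for example -ENOENT when the module is not loaded and the node is absent). */
static int spanker_probe(const char *path, struct spanker_version *out)
{
    int rc = 0;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -errno;

    if (ioctl(fd, SPANKER_IOC_PING) < 0)
        rc = -errno;                      /* driver did not answer PING */
    else if (ioctl(fd, SPANKER_IOC_GET_VERSION, out) < 0)
        rc = -errno;                      /* version query failed */

    close(fd);
    return rc;
}
```

A caller would pass "/dev/spankerctl" as the path; on a machine without the module loaded the probe is expected to return -ENOENT, which gives a test harness a clean "driver absent" signal.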

Test plan

  • CI: docs-lint — SPDX header check (verified locally with the
    same grep -q "SPDX-License-Identifier" loop)
  • CI: driver-build (NEW) — kbuild compile + modinfo GPL
    assertion. Verified locally on kernel 6.18.18.
  • DCO sign-off present
  • Reviewer (Agent R) confirms ABI shape is acceptable for v0 (any
    breaking change here means breaking PR #2, the ADR-002 out-of-tree
    kernel module driver model ADR, and the Rust runtime), then
    merges via `gh pr merge --squash`

A userspace pytest smoke test (insmod + open /dev/spankerctl +
issue PING / GET_VERSION) is intentionally deferred to PR #1b
— it requires root and insmod permissions in CI and is independently
reviewable. Per feedback_testing.md, this PR's test surface is the
kbuild compile + modinfo assertion in the new driver-build job.

Follow-up issues to open after merge

  1. PCI vendor ID reservation — currently 0xDEAD/0xBEEF
    placeholder; must be replaced before silicon. `human-attention`
    label (cooperative-board territory; not for any dev agent).
  2. Reconcile SPANKER_IOC_MAGIC 0xE3 with mainline
    Documentation/userspace-api/ioctl/ioctl-number.rst before any
    in-tree submission attempt (Generation B per ADR-002).
  3. PR #1b — userspace pytest smoke test + DKMS recipe + manual
    make install recipe under docs/install.md.
  4. ADR-003 — public interface contracts (ioctl SemVer policy,
    deprecation windows, ABI testing).
  5. Kernel-LTS support matrix. Initial: 6.6 LTS + 6.12 LTS. 6.1 LTS
    is deferred until the `LINUX_VERSION_CODE` shim for the older
    class_create(THIS_MODULE, name) signature has been built and
    tested on a pre-6.4 kernel; the shim is in place but has only
    been smoke-tested on 6.18 so far.
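
The version boundary behind the class_create shim can be illustrated with a small userspace model. In the module the choice is made at compile time with the preprocessor; the function below only mirrors that comparison so the LTS matrix can be checked by hand. EX_KERNEL_VERSION and class_create_takes_owner are hypothetical names, and the class name in the comment is an assumption.

```c
#include <assert.h>

/* Same packing as the kernel's KERNEL_VERSION() macro in <linux/version.h>. */
#define EX_KERNEL_VERSION(a, b, c) \
    (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

static_assert(EX_KERNEL_VERSION(6, 4, 0) == 0x060400, "packing is major.minor.patch");

/*
 * In-kernel shape of the shim (not compiled here; class name is assumed):
 *
 *   #if LINUX_VERSION_CODE >= KERNEL_VERSION(6, 4, 0)
 *       cls = class_create("spanker");
 *   #else
 *       cls = class_create(THIS_MODULE, "spanker");
 *   #endif
 */

/* Returns 1 when the given kernel still expects the owner argument. */
static int class_create_takes_owner(unsigned int version_code)
{
    return version_code < EX_KERNEL_VERSION(6, 4, 0);
}
```

Under this model 6.6 and 6.12 take the same single-argument branch as the 6.18 kernel used for local verification, while 6.1 takes the two-argument branch.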

Authored by Agent 3 (Software Stack).

First C-side scaffolding for the Spanker out-of-tree kernel module
per ADR-002. The skeleton lands:

- src/driver/Kbuild + Makefile — out-of-tree kbuild scaffold; the
  module builds with `make -C src/driver` against the running
  kernel (verified locally on 6.18.18-1-MANJARO).
- src/driver/include/uapi/spanker_ioctl.h — public v0 ABI shared
  between the kernel module and the future Rust runtime per
  ADR-001. Defines SPANKER_IOC_MAGIC (0xE3, placeholder pending
  reconciliation against mainline ioctl-number.rst), SPANKER_IOC_PING,
  SPANKER_IOC_GET_VERSION, and struct spanker_version. ABI is unstable
  while major == 0; SPANKER_IOC_GET_VERSION lets userspace fail
  cleanly when major mismatches.
- src/driver/spanker_main.c — module init/exit, character device
  /dev/spankerctl (singleton control device, always present once the
  module loads), and PCI driver registration with placeholder
  vendor/device IDs (0xDEAD/0xBEEF) since PopSolutions has no
  assigned PCI vendor ID yet. A LINUX_VERSION_CODE check selects the
  pre-6.4 vs post-6.4 class_create signature.
- src/driver/spanker_ioctl.c — ioctl dispatcher; SPANKER_IOC_PING
  returns 0 and SPANKER_IOC_GET_VERSION fills struct spanker_version
  with {0, 1, 0, 0}. Uses stream_open and compat_ptr_ioctl so 32-bit
  userspace on a 64-bit kernel works without a separate compat path.
- .gitignore — adds the kbuild artefact patterns.
- .github/workflows/ci.yml — new driver-build job that installs
  linux-headers-$(uname -r) on ubuntu-24.04, runs `make -C src/driver`,
  asserts the GPL license string in modinfo, and `make clean`s.
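
The "fail cleanly when major mismatches" behaviour mentioned for SPANKER_IOC_GET_VERSION could look roughly like this on the userspace side. It is a sketch only: the field names, the SPANKER_ABI_MAJOR_EXPECTED constant, the helper name spanker_check_abi, and the choice of -EPROTO as the error code are all assumptions, not part of the v0 header.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed 8-byte layout; field names are hypothetical. */
struct spanker_version {
    uint16_t major, minor, patch, reserved;
};
static_assert(sizeof(struct spanker_version) == 8, "v0 ABI states 8 bytes");

/* Major version this (hypothetical) client was built against. */
#define SPANKER_ABI_MAJOR_EXPECTED 0u

/* Returns 0 when the driver's major matches, -EPROTO otherwise. */
static int spanker_check_abi(const struct spanker_version *v)
{
    if (v->major != SPANKER_ABI_MAJOR_EXPECTED)
        return -EPROTO;   /* error code chosen for illustration */
    return 0;
}
```

Because the ABI is declared unstable while major == 0, a stricter client might also compare the minor field during the v0 period; that policy is left open here and would belong in the planned ADR-003.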

Local verification on Manjaro 6.18.18:
  $ make -C src/driver
    CC [M]  spanker_main.o
    CC [M]  spanker_ioctl.o
    LD [M]  spanker.o
    MODPOST Module.symvers
    LD [M]  spanker.ko
    BTF [M] spanker.ko
  $ modinfo src/driver/spanker.ko | grep -E '^(license|version|alias)'
    version:        0.1.0
    license:        GPL v2
    alias:          pci:v0000DEADd0000BEEFsv*sd*bc*sc*i*

A userspace pytest smoke test (insmod + open /dev/spankerctl + issue
SPANKER_IOC_PING / SPANKER_IOC_GET_VERSION) is intentionally deferred
to a follow-up PR — it requires root + insmod permissions in CI and
is independently reviewable. Per feedback_testing.md, this PR's test
is the kbuild compile + modinfo assertion in the new driver-build CI
job; the userspace harness ships in PR #1b.

Follow-up issues to open after merge:
- Reserve a real PCI vendor ID (currently 0xDEAD/0xBEEF placeholder).
- Reconcile SPANKER_IOC_MAGIC 0xE3 with
  Documentation/userspace-api/ioctl/ioctl-number.rst before any
  in-tree submission attempt (Generation B).
- PR #1b — userspace pytest smoke test + DKMS recipe.
- ADR-003 — public interface contracts (ioctl SemVer policy).

Authored by Agent 3 (Software Stack).

Signed-off-by: Marcos <m@pop.coop>
@marcos-mendez added the stream-3 and review-pending labels May 6, 2026
@marcos-mendez

Member Author

Review (Agent R, 2026-05-06)

C-side scaffolding for spanker.ko per ADR-002. +372 lines across 7 files. Both CI jobs green (Docs SPDX + Driver kbuild). Module builds clean, no warnings, GPL-2.0 license string verified in modinfo.

Findings

| Severity | Count | Notes |
|---|---|---|
| CRITICAL | 0 | |
| HIGH | 0 | |
| MEDIUM | 0 | |
| LOW | 0 | |

Quality signals:

  • /dev/spankerctl singleton control device — clean precedent (/dev/loop-control, /dev/dri/control). Lets userspace test driver liveness pre-silicon when there's no /dev/spankerN to probe.
  • Pre-/post-6.4 class_create shim — handles the kernel API change cleanly. Will need maintenance per kernel LTS bump (acknowledged in ADR-002 negative consequences).
  • Placeholder PCI vendor/device IDs (0xDEAD/0xBEEF) — explicitly called out as needing replacement when PopSolutions gets a real PCI vendor ID. Not a blocker pre-silicon.
  • stream_open + compat_ptr_ioctl — correct for a non-seekable, ABI-stable interface.
  • kbuild + ABI header in include/uapi/ — standard and consumable from Rust via bindgen later.

Verdict

APPROVE — merging.

— Agent R

marcos-mendez pushed a commit that referenced this pull request May 6, 2026
C kernel-side scaffolding per ADR-002. Out-of-tree GPL-2.0-only kbuild
module with /dev/spankerctl singleton control device, v0 ioctl ABI
(SPANKER_IOC_PING, SPANKER_IOC_GET_VERSION).

Placeholder vendor/device IDs (0xDEAD/0xBEEF) until PopSolutions has
a real PCI vendor ID.

Pre-/post-6.4 class_create compatibility shim.

Authored by Agent 3 (Software Stack)
Reviewed-by: Agent R (Reviewer)

Signed-off-by: Marcos <m@pop.coop>
@marcos-mendez

Member Author

Merged manually as squash on main.

marcos-mendez pushed a commit that referenced this pull request May 6, 2026
Rust workspace bootstrap + spanker-runtime library crate per ADR-001.
SpankerControl handle over /dev/spankerctl with ping + version.

CI workflow conflict resolved: driver-build + runtime-build jobs coexist
alongside docs-lint (both PR #3 and PR #4 added job blocks).

NOTE: ADR-001 currently says edition 2024 + MSRV 1.85 (Agent R bumped
during PR #1 review). This PR uses edition 2021 + 1.75 — more aligned
with mission's Global South availability framing. ADR-001 amendment to
follow as a separate PR.

Authored by Agent 3 (Software Stack)
Reviewed-by: Agent R (Reviewer)

Signed-off-by: Marcos <m@pop.coop>
marcos-mendez added a commit that referenced this pull request May 6, 2026
Wires bindgen 0.69 (last MSRV-1.75-compatible release line) into
ggml-spanker via a new build.rs over wrapper.h, which #include's
src/driver/include/uapi/spanker_ioctl.h. The generated bindings
land in a private mod ffi { include!(...) } and are cross-checked
against the runtime crate's hand-mirrored layout in a new
bindgen_uapi_constants_match_runtime_mirror test.

Addresses the review findings on PR #5:

  #1 (HIGH)   Error gains #[non_exhaustive] for v0 semver protection.
  #2 (HIGH)   OutputTooSmall is now enforced — out.len() < expected
              triggers the variant. New tests prove it fires.
  #3 (MEDIUM) .gitmodules drops `branch = master` (the committed SHA
              is the reproducibility anchor; --remote tracking would
              undermine it).
  #4 (MEDIUM) MockSail::matmul_q4_k now cross-checks a.len(),
              b.len(), and out.len() against the declared
              m × (k/QK_K) × Q4_K_BLOCK_BYTES shape, plus the f32
              output footprint. Helpers expected_a_bytes /
              expected_b_bytes / expected_out_bytes encapsulate
              the math with overflow safety.
  #5 (LOW)    The tautological assert_eq!(144, 144) test is replaced
              by q4_k_block_bytes_matches_component_layout, which
              reconstructs the byte total from the GGML-side
              static_assert (2*sizeof(ggml_half) + K_SCALE_SIZE +
              QK_K/2). Full bindgen-vs-GGML cross-check deferred —
              binding the GGML headers requires pulling GGML's full
              build graph for libclang and is out of scope here.
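
The size arithmetic from items #4 and #5 can be worked through in C as a cross-check. This is a sketch, not the crate's Rust code: the QK_K and K_SCALE_SIZE values are the usual GGML defaults, ggml_half_model is a 2-byte stand-in for ggml_half, expected_q4_k_bytes is a hypothetical helper, and the rule that k must be a nonzero multiple of QK_K is an assumption. __builtin_mul_overflow is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QK_K          256   /* GGML super-block size (default build) */
#define K_SCALE_SIZE  12    /* bytes of packed scales/mins per block */
typedef uint16_t ggml_half_model;   /* stand-in for ggml_half, 2 bytes */

/* 2*sizeof(ggml_half) + K_SCALE_SIZE + QK_K/2 = 4 + 12 + 128 = 144 */
#define Q4_K_BLOCK_BYTES \
    (2 * sizeof(ggml_half_model) + K_SCALE_SIZE + QK_K / 2)
static_assert(Q4_K_BLOCK_BYTES == 144, "component layout must total 144 bytes");

/* Computes rows * (k / QK_K) * Q4_K_BLOCK_BYTES into *out.
 * Returns false if k is not a nonzero multiple of QK_K (assumed rule)
 * or if the product would overflow size_t. */
static bool expected_q4_k_bytes(size_t rows, size_t k, size_t *out)
{
    size_t blocks;

    if (k == 0 || k % QK_K != 0)
        return false;
    if (__builtin_mul_overflow(rows, k / QK_K, &blocks))
        return false;
    return !__builtin_mul_overflow(blocks, (size_t)Q4_K_BLOCK_BYTES, out);
}
```

For example, 2 rows with k = 512 give 2 × 2 × 144 = 576 bytes, and a row count of zero yields a zero-byte expectation, which is consistent with treating m = 0 as a no-op.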

Test coverage gaps closed:
  - m=0 / n=0 with valid k accepted as documented no-ops, asserted
    in mock_accepts_{m,n}_zero_as_noop.
  - OutputTooSmall path covered (mock_returns_output_too_small_…).
  - Mismatched A/B slice lengths trigger BadDims (mock_rejects_…).
  - SailMatmul::matmul_q4_k → NotImplemented pinned by
    sail::tests::matmul_q4_k_returns_not_implemented; the Display
    impl is asserted to name the blocker ioctl so log-grepping
    callers find Spanker #9.

DEFERRED in this PR (with concrete blockers):
  - Real-device SailMatmul body. Blocked on
    SPANKER_IOC_WORK_SUBMIT, which is gated on the kernel-driver
    DDR3 work-dispatch PR (cross-stream Spanker #9). Today the
    UAPI header has only PING + GET_VERSION; once WORK_SUBMIT is
    added, bindgen picks it up automatically and the impl below
    the comment block in src/backends/ggml/src/sail.rs is fleshed
    out. The NotImplemented variant's Display message names
    SPANKER_IOC_WORK_SUBMIT explicitly.
  - bindgen over upstream GGML headers (block_q4_K, enum
    ggml_type). Out of scope as noted above.

CI:
  - actions/checkout@v4 now uses submodules: recursive.
  - libclang-dev installed before cargo build (bindgen runtime
    feature requires it).

Verification:
  - cargo build --workspace ........................... clean
  - cargo test --workspace --all-targets ............... 42 / 42
    (ggml-spanker: 14 unit + 5 integration; was 4 + 2 before)
  - cargo clippy --workspace --all-targets -- -D warnings clean
  - cargo fmt --check --all ............................ clean

Addresses #7 (partial — real-device SailMatmul deferred per above).

Authored by Agent 3 (Software Stack — Spanker).

Signed-off-by: Marcos <m@pop.coop>
Co-authored-by: Marcos <m@pop.coop>

Labels

review-pending PR awaiting reviewer agent (R) stream-3 Software Stack (Agent 3) — driver, runtime, GGML, Spanker
