Merged
66 changes: 66 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,66 @@
# SPDX-License-Identifier: CC-BY-SA-4.0
name: ci

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:

jobs:
  verilator-cocotb:
    name: Verilator + cocotb tests
    runs-on: ubuntu-24.04
    steps:
      - name: Checkout (with submodules)
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Install Verilator build deps
        run: |
          sudo apt-get update
          sudo apt-get install -y \
            git make autoconf flex bison \
            libfl2 libfl-dev zlib1g zlib1g-dev \
            help2man perl ccache

      - name: Cache Verilator install
        id: verilator-cache
        uses: actions/cache@v4
        with:
          path: ~/.local/verilator
          key: verilator-v5.040-${{ runner.os }}-x64

      - name: Build Verilator 5.040 from source (only on cache miss)
        if: steps.verilator-cache.outputs.cache-hit != 'true'
        run: |
          git clone --branch v5.040 --depth 1 \
            https://github.com/verilator/verilator.git /tmp/verilator
          cd /tmp/verilator
          autoconf
          ./configure --prefix=$HOME/.local/verilator
          make -j$(nproc)
          make install

      - name: Add Verilator to PATH and verify
        run: |
          echo "$HOME/.local/verilator/bin" >> $GITHUB_PATH
          $HOME/.local/verilator/bin/verilator --version

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install cocotb
        run: |
          python -m pip install --upgrade pip
          pip install cocotb cocotb-bus pytest
          cocotb-config --version

      - name: Run inner_jib_top tests (end-to-end with sum_ints.asm)
        run: |
          cd verif/inner_jib_top
          make
10 changes: 10 additions & 0 deletions .gitignore
@@ -0,0 +1,10 @@

# verif outputs
verif/.venv
verif/**/sim_build/
verif/**/results.xml
verif/**/*.vcd
verif/**/*.fst
**/__pycache__/
*.pyc
.pytest_cache/
7 changes: 6 additions & 1 deletion .gitmodules
@@ -1,3 +1,8 @@
[submodule "mast"]
path = mast
url = git@git.pop.coop:pop/MAST.git
# Use HTTPS so GitHub Actions CI (which has no SSH key for git.pop.coop)
# can clone the submodule. Local developers with SSH set up can still
# override via:
# git config submodule.mast.url git@git.pop.coop:pop/MAST.git
url = https://git.pop.coop/pop/MAST.git
branch = main
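
The comment in this hunk describes swapping the scp-style SSH URL for its HTTPS equivalent. As a rough illustration of that mapping (a sketch only; the helper name `scp_to_https` is hypothetical and not part of this PR), the rewrite can be expressed as:

```python
import re

# Matches scp-style Git URLs such as "git@host:group/repo.git".
# The negative lookahead keeps scheme URLs like "https://..." from matching.
_SCP_STYLE = re.compile(r"^(?:[^@/]+@)?([^:/]+):(?!//)(.+)$")


def scp_to_https(url: str) -> str:
    """Return the HTTPS form of an scp-style Git URL (hypothetical helper).

    URLs that already carry a scheme are returned unchanged.
    """
    match = _SCP_STYLE.match(url)
    if match is None:
        return url
    host, path = match.groups()
    return f"https://{host}/{path}"


print(scp_to_https("git@git.pop.coop:pop/MAST.git"))
```

This assumes the server exposes the same repository path over HTTPS as over SSH, which is what the new `url` line in this hunk relies on.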
94 changes: 62 additions & 32 deletions README.md
@@ -4,22 +4,39 @@

> First Sail of the PopSolutions fleet. The validation tape-out.

[![ci](https://github.com/popsolutions/InnerJib7EA/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/popsolutions/InnerJib7EA/actions/workflows/ci.yml)

**InnerJib7EA** is the first silicon product of PopSolutions Sails. SKU
**POPC_16A** — embedded entry single-board RISC-V accelerator with 16 GB DDR5,
targeting edge AI inference and on-device fine-tuning.

This is a deliberately small first silicon: monolithic die in Skywater 130nm
via Google Open MPW shuttle, low cost, low risk. The goal is to validate the
end-to-end design flow (RTL → simulation → synthesis → P&R → tape-out → driver
→ application) with the smallest possible blast radius. Lessons from
InnerJib7EA inform the chiplet-based ForeTopsail7EA and MainTopsail7EA that
follow.
end-to-end design flow with the smallest possible blast radius.

## Status

Starting (2026-05). RTL integration in progress. See open issues.
**Sprint C (`inner_jib_top.sv` end-to-end integration) landed.** A real
RISC-V program (`mast/examples/direct/sum_ints.asm`) now runs through:

```
upstream core.sv -> core_axi4_adapter -> axi4_master_simple -> axi4_mem_model
```

The cocotb test pre-populates instruction memory through the mem model's
loader back-door (added in [MAST#8](https://github.com/popsolutions/MAST/pull/8)),
sets the core's PC, and asserts `ena`. The core then fetches instructions,
executes the loop, and emits the expected output sequence `[0, 5, 4, 3, 2,
1, 0, 15]` before halting.
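
One plausible reading of that sequence, sketched as a small Python reference model: the program emits the initial accumulator (0), counts down from 5 to 0 while emitting each counter value, and finally emits the sum. This is an assumption about what `sum_ints.asm` does, inferred only from the expected output above; the function name `sum_ints_reference` and its parameter `n` are hypothetical.

```python
def sum_ints_reference(n: int = 5) -> list[int]:
    """Hypothetical golden model; inferred from the expected output,
    not derived from sum_ints.asm itself."""
    outputs: list[int] = []
    total = 0
    outputs.append(total)             # initial accumulator value
    for counter in range(n, -1, -1):  # n, n-1, ..., 0
        outputs.append(counter)
        total += counter
    outputs.append(total)             # final sum
    return outputs


EXPECTED = [0, 5, 4, 3, 2, 1, 0, 15]  # sequence quoted in this README
assert sum_ints_reference(5) == EXPECTED
```

A model like this could serve as the comparison target when more programs are added, though the real test may check the sequence differently.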

This is the first program of any kind running on PopSolutions
silicon-equivalent hardware in simulation. Every subsequent program (factorial,
primes, matrix multiply, eventually GGML kernels) lands on the same path.

## Quick spec (target — to be locked via ADR in this repo)
See [`docs/adr/0001-spec.md`](docs/adr/0001-spec.md) for the locked POPC_16A
specification.

## Quick spec

| Parameter | Target |
|---|---|
@@ -28,44 +45,57 @@ Starting (2026-05). RTL integration in progress. See open issues.
| DRAM | 16 GB DDR5-4800 SO-DIMM (single channel) |
| Host | PCIe Gen4 x4 (via LitePCIe) |
| TDP | < 25 W |
| Form factor | Mini-ITX SBC + M.2 accelerator variant |
| Form factor | M.2 22110 NGFF accelerator card |
| Reference workload | GGML int4 inference of TinyLlama-1.1B |
| BOM target | R$ 800–1500 |

## How this repo relates to MAST
## Run the testbench locally

One-time setup:

```bash
git clone --recursive git@git.pop.coop:pop/InnerJib7EA.git
cd InnerJib7EA
~/.pyenv/versions/3.12.10/bin/python3 -m venv mast/verif/.venv
source mast/verif/.venv/bin/activate
pip install cocotb cocotb-bus pytest
deactivate
ln -s ../mast/verif/.venv verif/.venv
```

Run:

```bash
source verif/.venv/bin/activate
cd verif/inner_jib_top
make
```

Expected output ends with:

```
** TESTS=2 PASS=2 FAIL=0 SKIP=0 **
```
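
To check that summary line from a script rather than by eye, something along these lines may be enough. It is a minimal sketch that assumes only the `TESTS=… PASS=… FAIL=… SKIP=…` format shown above; the helper names `parse_summary` and `all_passed` are hypothetical.

```python
import re

# Pattern for the summary line format shown above.
SUMMARY_RE = re.compile(r"TESTS=(\d+)\s+PASS=(\d+)\s+FAIL=(\d+)\s+SKIP=(\d+)")


def parse_summary(log_text: str) -> dict[str, int]:
    """Extract the counters from the last summary line in a log (hypothetical helper)."""
    matches = SUMMARY_RE.findall(log_text)
    if not matches:
        raise ValueError("no summary line found in log")
    tests, passed, failed, skipped = (int(field) for field in matches[-1])
    return {"tests": tests, "pass": passed, "fail": failed, "skip": skipped}


def all_passed(log_text: str) -> bool:
    """True when at least one test ran and none failed."""
    summary = parse_summary(log_text)
    return summary["tests"] > 0 and summary["fail"] == 0


print(parse_summary("** TESTS=2 PASS=2 FAIL=0 SKIP=0 **"))
```

The log would typically be captured first, for example with `make 2>&1 | tee sim.log`, where `sim.log` is just an illustrative file name.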

## Relationship to MAST

InnerJib7EA vendors [`popsolutions/MAST`](https://github.com/popsolutions/MAST)
as a git submodule under `mast/`. MAST holds the shared IP (RISC-V core,
compute unit, memory controller, AXI4 interconnect, verification harness).
This repo holds only product-specific integration: top-level Verilog,
configuration, PCB design, datasheets, product tests.
compute unit, AXI4 subsystem, verification harness). This repo holds only
product-specific integration: top-level Verilog (`src/inner_jib_top.sv`),
spec ADRs, and eventually PCB design files, datasheets, and product tests.

When InnerJib7EA tape-outs to silicon, the MAST submodule is frozen at the
specific MAST release used. That submodule pin is the reproducibility
contract.
When InnerJib7EA tapes out to silicon, the MAST submodule pin is frozen at
the specific MAST release used. That pin is the reproducibility contract.

## License

Same dual-license model as MAST. See
[`popsolutions/MAST/NOTICE.md`](https://github.com/popsolutions/MAST/blob/main/NOTICE.md):

- Hardware contributions: CERN-OHL-S v2 (commercial dual-license available)
- Software contributions: Apache 2.0
- Documentation: CC-BY-SA 4.0
[`mast/NOTICE.md`](https://github.com/popsolutions/MAST/blob/main/NOTICE.md).

## Contributing

See [`popsolutions/MAST/CONTRIBUTING.md`](https://github.com/popsolutions/MAST/blob/main/CONTRIBUTING.md).
See [`mast/CONTRIBUTING.md`](https://github.com/popsolutions/MAST/blob/main/CONTRIBUTING.md)
and the cooperative-affiliate-only policy in
[`mast/GOVERNANCE.md`](https://github.com/popsolutions/MAST/blob/main/GOVERNANCE.md).
DCO sign-off is required on every commit (`git commit -s`).

## Roadmap

See open issues. Major milestones for InnerJib7EA:

1. Lock spec (this repo, ADR-001-spec)
2. Top-level Verilog integration with MAST submodule
3. Verilator simulation runs end-to-end (TinyLlama-1.1B inference)
4. RTL synthesis area/timing report
5. Skywater 130nm Open MPW shuttle submission
6. First silicon validation
7. Driver + GGML backend hand-off to Spanker7EA
2 changes: 1 addition & 1 deletion mast
Submodule mast updated from 82921d to 9938d2
166 changes: 166 additions & 0 deletions src/inner_jib_top.sv
@@ -0,0 +1,166 @@
// SPDX-License-Identifier: CERN-OHL-S-2.0
// Copyright (c) 2026 PopSolutions Cooperative
//
// InnerJib7EA top-level: 1 upstream RISC-V core wired through
// MAST's AXI4 chain to a 256-line internal SRAM. Host loads the
// program through the mem_model's back-door loader port; once
// the program is in memory, a set_pc + ena pulse releases the
// core to execute it.
//
// This is the simulation top-level. The FPGA / silicon top-level
// will replace axi4_mem_model with a LiteDRAM controller and wire
// loader_en to a bootrom unrolling state machine; the core /
// adapter / master chain is unchanged from this module.

`default_nettype none

module inner_jib_top (
    input  wire                       clk,
    input  wire                       rst_n,

    // Core control
    input  wire                       ena,
    input  wire                       set_pc_req,
    input  wire [addr_width-1:0]      set_pc_addr,

    // Core observation
    output wire                       halt,
    output wire [data_width-1:0]      out,
    output wire                       outen,
    output wire                       outflen,

    // Program loader pass-through
    input  wire                       loader_en,
    input  wire [phys_addr_width-1:0] loader_addr,
    input  wire [data_width-1:0]      loader_data
);

    // Core ↔ adapter (upstream bespoke memory iface)
    wire                        core_mem_rd_req;
    wire                        core_mem_wr_req;
    wire [addr_width-1:0]       core_mem_addr;
    wire [data_width-1:0]       core_mem_rd_data;
    wire [data_width-1:0]       core_mem_wr_data;
    wire                        core_mem_busy;
    wire                        core_mem_ack;

    // Adapter ↔ master (simple req/resp)
    wire                        ma_we;
    wire [phys_addr_width-1:0]  ma_addr;
    wire [mem_data_width-1:0]   ma_wdata;
    wire [axi4_strb_width-1:0]  ma_wstrb;
    wire                        ma_start;
    wire                        ma_busy;
    wire                        ma_done;
    wire [mem_data_width-1:0]   ma_rdata;
    wire                        ma_err;

    // Master ↔ mem (AXI4)
    wire [axi4_id_width-1:0]    ax_awid;
    wire [phys_addr_width-1:0]  ax_awaddr;
    wire [axi4_len_width-1:0]   ax_awlen;
    wire [axi4_size_width-1:0]  ax_awsize;
    wire [axi4_burst_width-1:0] ax_awburst;
    wire                        ax_awvalid;
    wire                        ax_awready;
    wire [mem_data_width-1:0]   ax_wdata;
    wire [axi4_strb_width-1:0]  ax_wstrb;
    wire                        ax_wlast;
    wire                        ax_wvalid;
    wire                        ax_wready;
    wire [axi4_id_width-1:0]    ax_bid;
    wire [axi4_resp_width-1:0]  ax_bresp;
    wire                        ax_bvalid;
    wire                        ax_bready;
    wire [axi4_id_width-1:0]    ax_arid;
    wire [phys_addr_width-1:0]  ax_araddr;
    wire [axi4_len_width-1:0]   ax_arlen;
    wire [axi4_size_width-1:0]  ax_arsize;
    wire [axi4_burst_width-1:0] ax_arburst;
    wire                        ax_arvalid;
    wire                        ax_arready;
    wire [axi4_id_width-1:0]    ax_rid;
    wire [mem_data_width-1:0]   ax_rdata;
    wire [axi4_resp_width-1:0]  ax_rresp;
    wire                        ax_rlast;
    wire                        ax_rvalid;
    wire                        ax_rready;

    // Upstream RISC-V core
    core core1 (
        .rst(rst_n),
        .clk(clk),
        .clr(1'b0),  // no synchronous reset; rst_n handles init
        .ena(ena),
        .set_pc_req(set_pc_req),
        .set_pc_addr(set_pc_addr),

        .out(out),
        .outen(outen),
        .outflen(outflen),
        .halt(halt),

        .mem_addr(core_mem_addr),
        .mem_rd_data(core_mem_rd_data),
        .mem_wr_data(core_mem_wr_data),
        .mem_wr_req(core_mem_wr_req),
        .mem_rd_req(core_mem_rd_req),
        .mem_ack(core_mem_ack),
        .mem_busy(core_mem_busy)
    );

    // Bridge core ↔ AXI4
    core_axi4_adapter adapter (
        .clk(clk), .rst_n(rst_n),
        .core_mem_rd_req(core_mem_rd_req), .core_mem_wr_req(core_mem_wr_req),
        .core_mem_addr(core_mem_addr),
        .core_mem_rd_data(core_mem_rd_data), .core_mem_wr_data(core_mem_wr_data),
        .core_mem_busy(core_mem_busy), .core_mem_ack(core_mem_ack),
        .m_req_we(ma_we), .m_req_addr(ma_addr), .m_req_wdata(ma_wdata),
        .m_req_wstrb(ma_wstrb), .m_req_start(ma_start),
        .m_req_busy(ma_busy), .m_req_done(ma_done),
        .m_req_rdata(ma_rdata), .m_req_err(ma_err)
    );

    // AXI4 master
    axi4_master_simple master (
        .clk(clk), .rst_n(rst_n),
        .req_we(ma_we), .req_addr(ma_addr),
        .req_wdata(ma_wdata), .req_wstrb(ma_wstrb), .req_start(ma_start),
        .req_busy(ma_busy), .req_done(ma_done),
        .req_rdata(ma_rdata), .req_err(ma_err),
        .m_awid(ax_awid), .m_awaddr(ax_awaddr), .m_awlen(ax_awlen),
        .m_awsize(ax_awsize), .m_awburst(ax_awburst),
        .m_awvalid(ax_awvalid), .m_awready(ax_awready),
        .m_wdata(ax_wdata), .m_wstrb(ax_wstrb), .m_wlast(ax_wlast),
        .m_wvalid(ax_wvalid), .m_wready(ax_wready),
        .m_bid(ax_bid), .m_bresp(ax_bresp),
        .m_bvalid(ax_bvalid), .m_bready(ax_bready),
        .m_arid(ax_arid), .m_araddr(ax_araddr), .m_arlen(ax_arlen),
        .m_arsize(ax_arsize), .m_arburst(ax_arburst),
        .m_arvalid(ax_arvalid), .m_arready(ax_arready),
        .m_rid(ax_rid), .m_rdata(ax_rdata), .m_rresp(ax_rresp),
        .m_rlast(ax_rlast), .m_rvalid(ax_rvalid), .m_rready(ax_rready)
    );

    // 256-line internal SRAM (= 8 KB) with loader back-door
    axi4_mem_model #(.DEPTH_WORDS(256)) mem (
        .clk(clk), .rst_n(rst_n),
        .s_awid(ax_awid), .s_awaddr(ax_awaddr), .s_awlen(ax_awlen),
        .s_awsize(ax_awsize), .s_awburst(ax_awburst),
        .s_awvalid(ax_awvalid), .s_awready(ax_awready),
        .s_wdata(ax_wdata), .s_wstrb(ax_wstrb), .s_wlast(ax_wlast),
        .s_wvalid(ax_wvalid), .s_wready(ax_wready),
        .s_bid(ax_bid), .s_bresp(ax_bresp),
        .s_bvalid(ax_bvalid), .s_bready(ax_bready),
        .s_arid(ax_arid), .s_araddr(ax_araddr), .s_arlen(ax_arlen),
        .s_arsize(ax_arsize), .s_arburst(ax_arburst),
        .s_arvalid(ax_arvalid), .s_arready(ax_arready),
        .s_rid(ax_rid), .s_rdata(ax_rdata), .s_rresp(ax_rresp),
        .s_rlast(ax_rlast), .s_rvalid(ax_rvalid), .s_rready(ax_rready),
        .loader_en(loader_en),
        .loader_addr(loader_addr),
        .loader_data(loader_data)
    );

endmodule
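
The "256-line internal SRAM (= 8 KB)" comment implies a line width that is not stated in this file. As a back-of-the-envelope check (an inference from that comment only; the real `mem_data_width` is defined in MAST and is not shown here), the arithmetic works out as follows. The helper `split_address` is hypothetical and assumes byte addressing.

```python
DEPTH_LINES = 256         # DEPTH_WORDS parameter passed to axi4_mem_model
TOTAL_BYTES = 8 * 1024    # "= 8 KB" from the comment in the module

# Line width implied by the two figures above (an inference, not a MAST constant).
BYTES_PER_LINE = TOTAL_BYTES // DEPTH_LINES
BITS_PER_LINE = BYTES_PER_LINE * 8


def split_address(byte_addr: int) -> tuple[int, int]:
    """Split a byte address into (line index, byte offset within the line).

    Hypothetical helper; assumes byte addressing and no wrap-around.
    """
    if not 0 <= byte_addr < TOTAL_BYTES:
        raise ValueError("address outside the 8 KB model")
    return byte_addr // BYTES_PER_LINE, byte_addr % BYTES_PER_LINE


print(BYTES_PER_LINE, BITS_PER_LINE)
print(split_address(0x40))
```

If the real `mem_data_width` differs from the implied 256 bits, the "8 KB" figure in the comment would need to be revisited.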