Skip to content

[Speculation] Automated buffering#629

Open
shundroid wants to merge 8 commits intomainfrom
dev/shundroid/specv1-auto-buf
Open

[Speculation] Automated buffering#629
shundroid wants to merge 8 commits intomainfrom
dev/shundroid/specv1-auto-buf

Conversation

@shundroid
Copy link
Collaborator

@shundroid shundroid commented Oct 18, 2025

This PR modifies the speculation compilation flow to support automated buffering. To achieve this, the speculation pass has been split into two distinct passes: pre-buffering and post-buffering.

The automated buffering pass strictly assumes join semantics: meaning an elastic unit will only output a token to its output channels once it has joined (received) all input tokens. Because certain channels of speculator violate this timing model, we cannot place the complete speculator in a single pass before buffering.

Pre-Buffering Pass (Placeholder Generation):

  • Places temporary speculative units (SpecPreBuffer1 and 2) that strictly adhere to join semantics, allowing the buffering pass to run successfully.
    • For example, because SCSaveCtrl and dataOut emit a token upon receiving a trigger token, these channels are encapsulated into an elastic unit.
  • At this stage, SCCommitCtrl is intentionally omitted because it violates the join-based timing model (it only emits tokens after a misspeculation).
    • Because of this, the control network for the save-commit units is only partially constructed here.
image

Post-Buffering Pass (Finalization):

  • Coalesces the temporary SpecPreBuffer units into the final speculator unit.
  • Connects the non-join signals, primarily wiring up SCCommitCtrl.
  • Finalizes the control logic for the save-commit units.
  • Automatically places additional buffers to ensure high throughput and avoid deadlock.
image

I admit this method might not be ideal or optimal. Because of the decomposition of the speculator, the frequency regulation no longer works. This is a quick workaround to make spec v1 comparable with the new speculation and to mitigate the non-deterministic circuit construction.

I also fixed several bugs in the speculator/save-commit implementations and the speculation pass, which are necessary for the speculation integration tests to work.

@shundroid shundroid requested a review from murphe67 October 18, 2025 08:17
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I didn't realize the merge of #552
before this latencies of beta-backend units were fixed, and speculation tests are benchmarked with those old latencies

if (TailEn = '1') and (HeadEn = '0') then
-- if new tail index will reach head index
if ((Tail +1) mod {fifo_depth} = Head) then
if ((Tail + 2) mod {fifo_depth} = Head) then
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the ring-buffer thing: we cannot fill up all slots to distinguish the empty and full state.

Full is latched so we should compare using Tail + 2, instead of + 1.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean buffered? A latched value is usually accidental https://vhdlwhiz.com/why-latches-are-bad/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're correct, it's buffered (or registered)

static inline bool canGoThroughOutsideBlocks(Operation *op) {
return isa<handshake::ForkOp, handshake::ExtUIOp, handshake::ExtSIOp,
handshake::TruncIOp>(op);
handshake::TruncIOp, handshake::BufferOp>(op);
Copy link
Collaborator Author

@shundroid shundroid Oct 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to call isBackedge after buffering, we should include BufferOp

Comment on lines +1147 to +1149
let summary = "";
let description = [{
}];
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add these once this design is approved

Comment on lines +851 to +864
for (auto branch : funcOp.getOps<handshake::SpeculatingBranchOp>()) {
if (branch->getAttr("specv1_branchDiscardCondNonSpec")) {
unsigned bb = getLogicBB(specOp1).value();
ConditionalBranchOp controlBranch = findControlBranch(funcOp, bb);
if (controlBranch == nullptr) {
specOp1->emitError()
<< "Could not find backedge within speculation bb.\n";
return signalPassFailure();
}
auto conditionOperand = controlBranch.getConditionOperand();
branch->setOperand(0, conditionOperand);
branch->setOperand(1, conditionOperand);
}
}
Copy link
Collaborator Author

@shundroid shundroid Oct 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The operands of this SpeculatingBranch didn't follow this figure actually.
The loop condition is not necessarily produced by the speculator.
It only becomes definite after placing speculator and save-commit units (because the loop condition may be generated by either of them), so I'm doing this at the end of the pass.

image

@murphe67
Copy link
Collaborator

Thanks for this Shun!

Could you put this into the google test integration tests instead of bringing the python script back?

Could you also give an explanation of why we can't just run buffering after the previous spec pass? what is the problem with the save commit control signal?

If you could also double check your explanations, they don't quite make sense currently, e.g.

At this stage, SCCommitCtrl does not exist. It is only connected after buffering. Therefore, the control for save-commit units is connected temporarily.

I don't know what this means, is there a incorrect word in it maybe?

@murphe67
Copy link
Collaborator

also, what does

At this stage, we also place additional buffers to ensure high throughput and avoid deadlock.
mean?

You are still manually buffering the circuit?

@shundroid
Copy link
Collaborator Author

Could you put this into the google test integration tests instead of bringing the python script back?

This might take time because I first need to study what the google test integration tests are like.
Except for this, I think this PR is ready as long as all conflicts are resolved.

Could you also give an explanation of why we can't just run buffering after the previous spec pass?

Updated the description, this is because buffering requires join-based semantics that speculator doesn't follow.

what is the problem with the save commit control signal?

I don't know what this means, is there a incorrect word in it maybe?

The problem is mostly the presence of a merge before buffering.
So I intentionally omit SCCommitCtrl before buffering.

You are still manually buffering the circuit?

No, these buffers are still placed automatically by the post-buffering speculation pass, but not handled by the buffering algorithm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants