Bench/leios by fmaste · Pull Request #6494 · IntersectMBO/cardano-node

fmaste · 2026-03-19T21:40:49Z

Description

tx-centrifuge is a ground-up transaction generator for Cardano benchmarking, built to support Leios without risking regressions in the historical baselines that tx-generator has produced over several years of release benchmarks.

Retrofitting new capabilities into the existing tx-generator would risk behavioural changes that silently invalidate years of baseline data, so tx-centrifuge implements them from scratch behind a clean, pull-based architecture. The two generators coexist: tx-generator continues to produce comparable release benchmarks, while tx-centrifuge targets the higher TPS rates and workload isolation that Leios requires.

Why a new tool

The tx-generator (TpsThrottle.hs, SubmissionClient.hs, Submission.hs) was designed for earlier Cardano iterations with lower TPS targets. Eleven specific limitations motivated the new design:

Cumulative scheduling instead of per-tick sleep. The tx-generator's sendNTicks sleeps 1/TPS seconds between ticks. If the feeder falls behind (GC pause, jitter), those ticks are lost. At high TPS the per-tick sleep rounds to 0 or 1 microsecond, making rates above ~20k TPS unreliable. In the tx-centrifuge, the target time for token N is startTime + N * nanosPerToken. If the system falls behind, subsequent tokens are dispatched immediately until the schedule has caught up.
One rate-limit slot per fetch, not req blocking takes. The tx-generator's consumeTxsBlocking loops req times, each doing a blocking takeTMVar, serializing all workers through one variable. Here, blockingFetch claims exactly one slot per call. The client calls it once for the mandatory first tx, then drains the rest via nonBlockingFetch.
Non-blocking fills up to the batch size. The tx-generator's consumeTxsNonBlocking ignores the req parameter and returns 0 or 1 ticks regardless. Here, the client calls nonBlockingFetch in a loop up to maxBatchSize times, filling as many as the rate limit allows.
Closed-loop input recycling. The tx-generator consumes transactions from a pre-built stream. Once submitted, the funds are gone; runs must pre-generate all transactions. Here, consumed inputs are recycled back to the workload's input queue after each delivery, enabling indefinite-duration runs.
Independent pipelines per workload. The tx-generator shares one MVar stream across all workers. A slow node blocks the stream for all others. Here, each workload has its own input queue, payload builder, and payload queue. Workloads are fully independent.
Per-target fairness. The tx-generator has all workers competing for the same TMVar. Distribution depends on which thread wins the race. Here, per-target mode gives each target its own rate limiter at a configured TPS, fair by construction. Shared mode is also available when aggregate accuracy matters more than per-target balance.
Delay outside the critical section. The tx-generator calls threadDelay inside the feeder loop, the timing-critical path. Here, the delay is computed inside STM (pure integer arithmetic) and applied in the worker thread outside the critical section.
Monotonic clock. The tx-generator uses getCurrentTime (wall clock, subject to NTP adjustments). Here, all timing uses MonotonicRaw, immune to NTP slew and system clock steps.
Integer arithmetic for timing. The tx-generator computes delays as realToFrac (1.0 / rate) with Double conversions. At high TPS, floating-point error accumulates. Here, nanosPerToken = round (1e9 / tps) is computed once as an Integer. All subsequent scheduling uses integer multiplication. Rounding error is at most 0.5 ns per token.
Testable in isolation. The tx-generator's throttle depends on TxSource era, MVar StreamState, Trace, and the full cardano-api type hierarchy. Here, the pull-fiction sub-library has zero Cardano dependencies. The test suite validates TPS accuracy and per-target fairness at up to 100k TPS across 50 simulated targets using integer tokens and IORef counters.
Non-blocking STM. The tx-generator uses a single TMVar (Maybe Int) and calls retry when the buffer is full, parking threads inside STM. Here, the TBQueue is read with tryReadTBQueue, which never calls retry, so no thread parks inside STM. Critical sections are short; delays are applied outside the transaction.
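The cumulative schedule and the integer timing from the list above can be sketched as pure arithmetic. This is a minimal illustration, not the PR's code: nanosPerToken and the formula startTime + N * nanosPerToken come from the description, while targetTime and delayFor are hypothetical helper names.

```haskell
-- | Nanoseconds between consecutive tokens, computed once from the TPS target.
nanosPerToken :: Double -> Integer
nanosPerToken tps = round (1e9 / tps)

-- | Absolute target time (in ns) of token n on the cumulative schedule.
-- (Hypothetical helper name.)
targetTime :: Integer -> Integer -> Integer -> Integer
targetTime startTime perToken n = startTime + n * perToken

-- | Delay (in ns) before token n may be dispatched, given the current time.
-- Returns 0 when the caller is behind schedule, so late tokens are sent
-- immediately until the schedule has caught up. (Hypothetical helper name.)
delayFor :: Integer -> Integer -> Integer -> Integer -> Integer
delayFor startTime perToken now n =
  max 0 (targetTime startTime perToken n - now)

main :: IO ()
main = do
  let perToken = nanosPerToken 20000      -- 20k TPS
  print perToken
  print (delayFor 0 perToken 120000 3)    -- ahead of schedule
  print (delayFor 0 perToken 400000 3)    -- behind schedule
```

With these inputs, main prints 50000, 30000 and 0: token 3 is due at 150000 ns, so a caller arriving at 120000 ns waits 30000 ns, while one arriving at 400000 ns does not wait at all.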

Architecture

The project is split into two independent sub-libraries and one executable:

tx-centrifuge/
├── lib/
│   ├── pull-fiction/          # Domain-independent rate-limiting engine
│   │                          # Zero Cardano dependencies
│   │
│   └── tx-centrifuge/         # Cardano-specific layer
│                              # N2N protocols, tx building, fund loading
├── app/
│   └── Main.hs               # Wires both libraries together
│
├── test/
│   ├── pull-fiction/          # TPS accuracy & fairness tests (no node needed)
│   └── tx-centrifuge/         # Transaction building tests
│
└── bench/
    └── Bench.hs               # Criterion benchmarks

pull-fiction is the core engine. It provides rate-limited pipeline management (input queue, payload builder, bounded payload queue, closed-loop recycling), GCRA-based admission control, and workload orchestration. It is parameterised over abstract input and payload types -- it knows nothing about Cardano, transactions, or UTxOs. Dependencies: base, aeson, async, clock, containers, stm. No cardano-api, no ouroboros-*, no network.

tx-centrifuge-lib holds everything that touches cardano-api, ouroboros-network, and ouroboros-consensus: fund loading from JSON, Conway-era transaction assembly and signing, multiplexed N2N connections (TxSubmission2, ChainSync, BlockFetch, KeepAlive), transaction confirmation tracking via an observer pattern, and structured tracing via trace-dispatcher. It does not depend on pull-fiction -- the two sub-libraries are independent siblings.

app/Main.hs is the only component that depends on both libraries and on cardano-node. It loads config, creates the consensus protocol, resolves STM pipelines, spawns builders and workers, and connects to nodes.

Pipeline

inputQueue --> [payload builder] --> payloadQueue --> [worker/fetcher] --> node
    ^                                                    |
    +----------------- recycled inputs ------------------+

Each workload gets its own independent pipeline. Workers never push transactions to the node -- the node pulls via TxSubmission2 when its mempool has room. The generator's job is to keep the payload queue supplied and to pace delivery so aggregate TPS matches the configured ceiling.
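The closed loop in the diagram can be sketched with plain STM queues. This is a simplified illustration under stated assumptions: Payload, buildOne and fetchOne are hypothetical names, inputs are modelled as Int, and the spent input itself is recycled on pull, whereas the real generator presumably recycles the outputs of the built transaction.

```haskell
import Control.Concurrent.STM

-- | A built payload plus the input that funded it (hypothetical shape).
data Payload = Payload
  { payloadBody :: String
  , spentInput  :: Int
  }

-- | Builder step: move one input into the bounded payload queue.
-- Never blocks; returns False when there is no input or no room.
buildOne :: TQueue Int -> TBQueue Payload -> STM Bool
buildOne inputs payloads = do
  full <- isFullTBQueue payloads
  if full
    then pure False
    else do
      next <- tryReadTQueue inputs
      case next of
        Nothing -> pure False
        Just i  -> do
          writeTBQueue payloads (Payload ("tx-" ++ show i) i)
          pure True

-- | Worker step: non-blocking fetch that recycles the input on pull.
fetchOne :: TQueue Int -> TBQueue Payload -> STM (Maybe String)
fetchOne inputs payloads = do
  next <- tryReadTBQueue payloads
  case next of
    Nothing -> pure Nothing
    Just p  -> do
      writeTQueue inputs (spentInput p)  -- close the loop
      pure (Just (payloadBody p))

main :: IO ()
main = do
  inputs   <- newTQueueIO
  payloads <- newTBQueueIO 2
  atomically (mapM_ (writeTQueue inputs) [1, 2])
  _ <- atomically (buildOne inputs payloads)
  fetched <- atomically (fetchOne inputs payloads)
  print fetched
  remaining <- atomically (flushTQueue inputs)
  print remaining
```

Here main prints Just "tx-1" followed by [2,1]: input 1 is consumed by the builder, pulled by the worker, and then reappears at the back of the input queue.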

Rate limiter

The rate limiter uses GCRA (Generic Cell Rate Algorithm, ITU-T I.371). It is a receiver-side rate limiter (policer), not a sender-side traffic shaper: it admits or delays pull requests against a TPS ceiling.

Pre-computes nanosPerToken once as an integer; all scheduling is integer multiplication.
Never blocks inside STM; returns a delay that the caller sleeps outside the transaction.
Token slots are claimed atomically, providing FIFO-fair scheduling across workers.
Supports shared (aggregate ceiling) and per-target (independent ceiling per node) scopes.

Fairness matters because the benchmarking cluster uses per-target metrics. Skewed distribution produces misleading results.
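The "claim atomically, sleep outside STM" pattern can be sketched as follows. This is a simplified schedule-based limiter in the spirit of the description, not the PR's GCRA implementation: Limiter, claimSlot and paceOne are hypothetical names, and GHC.Clock's monotonic clock from base stands in for the MonotonicRaw clock the PR uses.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.STM
import GHC.Clock (getMonotonicTimeNSec)

-- | Hypothetical limiter state: schedule origin, spacing, next slot index.
data Limiter = Limiter
  { limStart    :: Integer       -- ^ schedule start (ns, monotonic)
  , limPerToken :: Integer       -- ^ nanoseconds per token
  , limNext     :: TVar Integer  -- ^ index of the next unclaimed slot
  }

-- | Claim the next slot atomically and return the delay (ns) to sleep.
-- Pure integer arithmetic; the transaction never calls retry.
claimSlot :: Limiter -> Integer -> STM Integer
claimSlot lim now = do
  n <- readTVar (limNext lim)
  writeTVar (limNext lim) (n + 1)
  let target = limStart lim + n * limPerToken lim
  pure (max 0 (target - now))

-- | Worker side: claim inside STM, then sleep outside the transaction.
paceOne :: Limiter -> IO ()
paceOne lim = do
  now     <- toInteger <$> getMonotonicTimeNSec
  delayNs <- atomically (claimSlot lim now)
  threadDelay (fromInteger (delayNs `div` 1000))  -- threadDelay takes microseconds

main :: IO ()
main = do
  start <- toInteger <$> getMonotonicTimeNSec
  next  <- newTVarIO 0
  let lim = Limiter start 1000000 next  -- 1 ms per token, i.e. 1000 TPS
  mapM_ (const (paceOne lim)) [1 .. 5 :: Int]
  end <- toInteger <$> getMonotonicTimeNSec
  putStrLn ("elapsed ns: " ++ show (end - start))
```

Because slot indices are handed out in the order transactions commit, concurrent workers sharing one Limiter receive consecutive slots. A per-target scope would, under this sketch, simply give each target its own Limiter value.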

Recycling strategies

Three strategies for returning spent UTxO outputs to the input queue:

Strategy	When outputs recycle	Trade-off
`on_build`	Immediately after tx construction	Highest throughput; assumes downstream success
`on_pull`	When a worker fetches the tx from the payload queue	Safe default; inputs lost only if worker killed mid-delivery
`on_confirm`	After observer confirms tx on-chain at configured depth	Safest; handles mempool eviction; requires observer connection
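One way to model the three strategies is as a sum type mapped to the pipeline stage at which inputs are returned. The strategy names on_build, on_pull and on_confirm come from the table; the type and function names below are hypothetical and only illustrate the idea.

```haskell
-- | The three recycling strategies from the table.
data RecycleStrategy = OnBuild | OnPull | OnConfirm
  deriving (Eq, Show)

-- | Pipeline stages at which recycling could happen (hypothetical).
data Stage = Built | Pulled | Confirmed
  deriving (Eq, Show)

-- | Parse the configuration spelling of a strategy.
parseStrategy :: String -> Maybe RecycleStrategy
parseStrategy "on_build"   = Just OnBuild
parseStrategy "on_pull"    = Just OnPull
parseStrategy "on_confirm" = Just OnConfirm
parseStrategy _            = Nothing

-- | The stage at which a strategy returns outputs to the input queue.
recyclesAt :: RecycleStrategy -> Stage
recyclesAt OnBuild   = Built
recyclesAt OnPull    = Pulled
recyclesAt OnConfirm = Confirmed

-- | Should outputs be recycled when a payload reaches this stage?
shouldRecycle :: RecycleStrategy -> Stage -> Bool
shouldRecycle strategy stage = recyclesAt strategy == stage

main :: IO ()
main = do
  print (parseStrategy "on_pull")
  print (shouldRecycle OnPull Pulled)
  print (shouldRecycle OnConfirm Pulled)
```

This prints Just OnPull, True and False: with on_confirm, a pulled transaction keeps its outputs out of the input queue until the observer reports confirmation.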

Configuration

JSON-based with cascading defaults: builder, rate_limit, max_batch_size, and on_exhaustion can be set at top-level, workload, or target level. Most specific wins. Three-tier config pipeline: Raw (JSON parsing, no validation) -> Validated (pure validation, hidden constructors) -> Runtime (live STM resources). Fund files are compatible with cardano-cli conway create-testnet-data --utxo-keys output.
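The "most specific wins" rule can be sketched with Maybe and the Alternative operator. The record and field names here are hypothetical and cover only two of the cascading settings; the actual config types in the PR may look different.

```haskell
import Control.Applicative ((<|>))

-- | Optional settings as they might appear at one config level
-- (hypothetical record; only two of the cascading keys are modelled).
data Overrides = Overrides
  { ovRateLimit    :: Maybe Double
  , ovMaxBatchSize :: Maybe Int
  }

-- | Resolve one setting: target beats workload, workload beats top level.
-- Returns Nothing when the setting is absent at every level.
resolve
  :: (Overrides -> Maybe a)
  -> Overrides  -- ^ top level
  -> Overrides  -- ^ workload level
  -> Overrides  -- ^ target level
  -> Maybe a
resolve field top workload target =
  field target <|> field workload <|> field top

main :: IO ()
main = do
  let top      = Overrides (Just 1000) (Just 16)
      workload = Overrides (Just 5000) Nothing
      target   = Overrides Nothing Nothing
  print (resolve ovRateLimit top workload target)
  print (resolve ovMaxBatchSize top workload target)
```

This prints Just 5000.0 and Just 16: the workload overrides the top-level rate limit, while the batch size falls through to the top-level value.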

Testing

pull-fiction-test: Validates rate-limiting accuracy and per-target fairness at up to 100k TPS across 50 simulated targets. No network, no node, no blockchain. Checks: elapsed time within 5% of target, global TPS within 5%, per-target token counts within 5% (per-target mode) or 15% (shared mode).
tx-centrifuge-test: Validates transaction construction and signing with real cardano-api types.
core-bench: Criterion benchmarks for shared-limiter and per-target-limiter modes.
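The tolerance checks described for pull-fiction-test can be sketched with integer arithmetic. This is an illustration only: withinPct and fairWithin are hypothetical names, and taking an equal share of the total as the per-target reference is an assumption, since the description does not say what each count is compared against.

```haskell
-- | True when observed lies within pct percent of expected.
withinPct :: Integer -> Integer -> Integer -> Bool
withinPct pct expected observed =
  abs (observed - expected) * 100 <= pct * expected

-- | Per-target fairness: every count within pct percent of an equal share
-- of the total. The comparison is scaled by the number of targets so that
-- no division is needed.
fairWithin :: Integer -> [Integer] -> Bool
fairWithin pct counts =
  let total = sum counts
      n     = toInteger (length counts)
  in n > 0 && all (\c -> abs (c * n - total) * 100 <= pct * total) counts

main :: IO ()
main = do
  print (withinPct 5 100000 96000)        -- global TPS check
  print (fairWithin 5 [1000, 1030, 970])  -- per-target mode tolerance
  print (fairWithin 5 [1000, 1100, 900])  -- too skewed for 5%
  print (fairWithin 15 [1000, 1100, 900]) -- acceptable at 15% (shared mode)
```

This prints True, True, False and True, which mirrors why the shared mode is given a looser 15% bound than the per-target mode.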


fmaste force-pushed the bench/leios branch 6 times, most recently from be2fd17 to 314ecc8 Compare March 20, 2026 15:01

fmaste force-pushed the bench/leios branch from 2cb6baf to e2036b3 Compare March 31, 2026 01:25

This was referenced Apr 15, 2026

System-level benchmarks: Scaling workload submission input-output-hk/ouroboros-leios#791

Open

Stress test tx submission and mempools with continous load input-output-hk/ouroboros-leios#789

Open

fmaste added 3 commits April 25, 2026 04:25

WIP: workbench hacks, make it work!

d60b8e9

bench | tx-generator strictness analysis

2e031e9

bench | tx-centrifuge: Leios tx-generator

ddbd64a

fmaste force-pushed the bench/leios branch from e2036b3 to ddbd64a Compare April 25, 2026 04:25

fmaste added 2 commits April 25, 2026 18:07

value-leios-nomadperf

556e37a

Increase mempool capacity
Copy MempoolCapacityBytesOverride from https://github.com/input-output-hk/ouroboros-leios/blob/ebc1f7c76b34e3d4f1b28485b720ed324633a6b3/demo/proto-devnet/config/config.yaml#L9

45be5aa

fmaste wants to merge 5 commits into master from bench/leios
