Description
A ground-up transaction generator for Cardano benchmarking, built to support Leios without risking regressions in the historical baselines produced by `tx-generator` over several years of release benchmarks.

Rather than retrofitting new capabilities into the existing `tx-generator`, where any behavioural change could silently invalidate years of baseline data, `tx-centrifuge` implements them from scratch behind a clean, pull-based architecture. The two generators coexist: `tx-generator` continues to produce comparable release benchmarks while `tx-centrifuge` targets the higher TPS rates and workload isolation that Leios requires.

Why a new tool
The `tx-generator` (`TpsThrottle.hs`, `SubmissionClient.hs`, `Submission.hs`) was designed for earlier Cardano iterations with lower TPS targets. Eleven specific limitations motivated the new design:

1. Cumulative scheduling instead of per-tick sleep. The tx-generator's `sendNTicks` sleeps `1/TPS` between each tick. If the feeder falls behind (GC pause, jitter), those ticks are lost. At high TPS the per-tick sleep rounds to 0 or 1 microsecond, making rates above ~20k TPS unreliable. In the tx-centrifuge, the target time for token N is `startTime + N * nanosPerToken`. If the system falls behind, subsequent tokens are dispatched immediately until the schedule is caught up.
2. One rate-limit slot per fetch, not `req` blocking takes. The tx-generator's `consumeTxsBlocking` loops `req` times, each doing a blocking `takeTMVar`, serializing all workers through one variable. Here, `blockingFetch` claims exactly one slot per call. The client calls it once for the mandatory first tx, then drains the rest via `nonBlockingFetch`.
3. Non-blocking fills up to the batch size. The tx-generator's `consumeTxsNonBlocking` ignores the `req` parameter and returns 0 or 1 ticks regardless. Here, the client calls `nonBlockingFetch` in a loop up to `maxBatchSize` times, filling as many as the rate limit allows.
4. Closed-loop input recycling. The tx-generator consumes transactions from a pre-built stream. Once submitted, the funds are gone; runs must pre-generate all transactions. Here, consumed inputs are recycled back to the workload's input queue after each delivery, enabling indefinite-duration runs.
5. Independent pipelines per workload. The tx-generator shares one `MVar` stream across all workers. A slow node blocks the stream for all others. Here, each workload has its own input queue, payload builder, and payload queue. Workloads are fully independent.
6. Per-target fairness. The tx-generator has all workers competing for the same `TMVar`. Distribution depends on which thread wins the race. Here, per-target mode gives each target its own rate limiter at a configured TPS, fair by construction. Shared mode is also available when aggregate accuracy matters more than per-target balance.
7. Delay outside the critical section. The tx-generator calls `threadDelay` inside the feeder loop, the timing-critical path. Here, the delay is computed inside STM (pure integer arithmetic) and applied in the worker thread outside the critical section.
8. Monotonic clock. The tx-generator uses `getCurrentTime` (wall clock, subject to NTP adjustments). Here, all timing uses `MonotonicRaw`, immune to NTP slew and system clock steps.
9. Integer arithmetic for timing. The tx-generator computes delays as `realToFrac (1.0 / rate)` with `Double` conversions. At high TPS, floating-point error accumulates. Here, `nanosPerToken = round (1e9 / tps)` is computed once as an `Integer`. All subsequent scheduling uses integer multiplication. Rounding error is at most 0.5 ns per token.
10. Testable in isolation. The tx-generator's throttle depends on `TxSource era`, `MVar StreamState`, `Trace`, and the full `cardano-api` type hierarchy. Here, the `pull-fiction` sub-library has zero Cardano dependencies. The test suite validates TPS accuracy and per-target fairness at up to 100k TPS across 50 simulated targets using integer tokens and `IORef` counters.
11. Non-blocking STM. The tx-generator uses a single `TMVar (Maybe Int)` with `retry` when the buffer is full, parking threads inside STM. Here, `TBQueue` with `tryReadTBQueue` (never `retry`) ensures no thread parks inside STM. Critical sections are short; delays are applied outside the transaction.

Architecture
The project is split into two independent sub-libraries and one executable:
- `pull-fiction` is the core engine. It provides rate-limited pipeline management (input queue, payload builder, bounded payload queue, closed-loop recycling), GCRA-based admission control, and workload orchestration. It is parameterised over abstract `input` and `payload` types -- it knows nothing about Cardano, transactions, or UTxOs. Dependencies: `base`, `aeson`, `async`, `clock`, `containers`, `stm`. No `cardano-api`, no `ouroboros-*`, no `network`.
- `tx-centrifuge-lib` holds everything that touches `cardano-api`, `ouroboros-network`, and `ouroboros-consensus`: fund loading from JSON, Conway-era transaction assembly and signing, multiplexed N2N connections (TxSubmission2, ChainSync, BlockFetch, KeepAlive), transaction confirmation tracking via an observer pattern, and structured tracing via `trace-dispatcher`. It does not depend on `pull-fiction` -- the two sub-libraries are independent siblings.
- `app/Main.hs` is the only component that depends on both libraries and on `cardano-node`. It loads config, creates the consensus protocol, resolves STM pipelines, spawns builders and workers, and connects to nodes.

Pipeline
Each workload gets its own independent pipeline. Workers never push transactions to the node -- the node pulls via TxSubmission2 when its mempool has room. The generator's job is to keep the payload queue supplied and to pace delivery so aggregate TPS matches the configured ceiling.
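As a rough illustration of the pull side, the sketch below drains a bounded payload queue without ever blocking inside STM, stopping at a batch limit. The names `drainUpTo` and `demo` are hypothetical and are not taken from `pull-fiction`; only `TBQueue` and `tryReadTBQueue` come from the description, and the rate limiter is deliberately left out.

```haskell
import Control.Concurrent.STM
  (TBQueue, atomically, newTBQueueIO, tryReadTBQueue, writeTBQueue)

-- | Take at most 'maxBatch' payloads from the queue.
-- Each read is its own short transaction and never calls 'retry':
-- an empty queue simply ends the batch early.
drainUpTo :: Int -> TBQueue a -> IO [a]
drainUpTo maxBatch queue = go maxBatch []
  where
    go k acc
      | k <= 0 = pure (reverse acc)
      | otherwise = do
          next <- atomically (tryReadTBQueue queue)
          case next of
            Nothing -> pure (reverse acc)   -- queue empty: stop, do not wait
            Just x  -> go (k - 1) (x : acc)

-- | Hypothetical usage: five integer "payloads", batch size three.
demo :: IO ([Int], [Int])
demo = do
  queue <- newTBQueueIO 8
  atomically (mapM_ (writeTBQueue queue) [1 .. 5])
  firstBatch  <- drainUpTo 3 queue
  secondBatch <- drainUpTo 3 queue
  pure (firstBatch, secondBatch)
```

Under these assumptions the first call returns a full batch and the second returns whatever is left, which mirrors how a node's pull request is answered with as many payloads as are ready rather than waiting for the batch to fill.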
Rate limiter
The rate limiter uses GCRA (Generic Cell Rate Algorithm, ITU-T I.371). It is a receiver-side rate limiter (policer), not a sender-side traffic shaper: it admits or delays pull requests against a TPS ceiling.
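The admit-or-delay decision can be sketched as follows. This is a simplified cumulative-schedule limiter, not a full GCRA implementation and not the code from `pull-fiction`: the `Limiter` record, `claimSlot`, and `paceOnce` are hypothetical names, and `getMonotonicTimeNSec` from `base` stands in for the `MonotonicRaw` clock mentioned in the description.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.STM
  (STM, TVar, atomically, newTVarIO, readTVar, writeTVar)
import GHC.Clock (getMonotonicTimeNSec)

-- | Hypothetical limiter state: schedule origin, spacing, next token index.
data Limiter = Limiter
  { limStartNs   :: Integer       -- schedule origin, in nanoseconds
  , limPerToken  :: Integer       -- nanoseconds between tokens
  , limNextToken :: TVar Integer  -- index of the next token to hand out
  }

newLimiter :: Integer -> Integer -> IO Limiter
newLimiter startNs perToken = Limiter startNs perToken <$> newTVarIO 0

-- | Claim one slot and return how long the caller should wait (ns).
-- Pure integer arithmetic only; no sleeping inside the transaction.
claimSlot :: Limiter -> Integer -> STM Integer
claimSlot lim nowNs = do
  n <- readTVar (limNextToken lim)
  writeTVar (limNextToken lim) (n + 1)
  let target = limStartNs lim + n * limPerToken lim
  pure (max 0 (target - nowNs))   -- behind schedule: dispatch immediately

-- | Claim a slot, then apply the delay outside the critical section.
paceOnce :: Limiter -> IO ()
paceOnce lim = do
  nowNs   <- toInteger <$> getMonotonicTimeNSec
  delayNs <- atomically (claimSlot lim nowNs)
  threadDelay (fromInteger (delayNs `div` 1000))  -- threadDelay takes microseconds
```

Because a late caller receives a delay of zero, a stalled worker catches up by dispatching back-to-back until it is on schedule again, rather than losing the missed slots.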
`nanosPerToken` is computed once as an integer; all scheduling is integer multiplication.

Fairness matters because the benchmarking cluster uses per-target metrics. Skewed distribution produces misleading results.
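The integer timing can be made concrete with a small worked sketch. The formula for `nanosPerToken` and the target time `startTime + N * nanosPerToken` are quoted from the description; the function `targetTime` and its argument order are assumptions made for illustration.

```haskell
-- | Spacing between tokens, rounded once to whole nanoseconds.
nanosPerToken :: Double -> Integer
nanosPerToken tps = round (1e9 / tps)

-- | Target time of token n on a schedule starting at 'startTime' (ns).
-- Only integer multiplication and addition, so no error builds up
-- beyond the single rounding in 'nanosPerToken'.
targetTime :: Integer -> Integer -> Integer -> Integer
targetTime startTime perToken n = startTime + n * perToken
```

At 20,000 TPS the spacing is exactly 50,000 ns. At 30,000 TPS the ideal spacing is about 33,333.33 ns, which rounds to 33,333 ns, an error of roughly 0.33 ns per token and within the 0.5 ns bound stated above.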
Recycling strategies
Three strategies for returning spent UTxO outputs to the input queue:
- `on_build`
- `on_pull`
- `on_confirm`

Configuration
JSON-based with cascading defaults: `builder`, `rate_limit`, `max_batch_size`, and `on_exhaustion` can be set at top-level, workload, or target level. Most specific wins.

Three-tier config pipeline: `Raw` (JSON parsing, no validation) -> `Validated` (pure validation, hidden constructors) -> `Runtime` (live STM resources).

Fund files are compatible with `cardano-cli conway create-testnet-data --utxo-keys` output.

Testing
- `pull-fiction-test`: Validates rate-limiting accuracy and per-target fairness at up to 100k TPS across 50 simulated targets. No network, no node, no blockchain. Checks: elapsed time within 5% of target, global TPS within 5%, per-target token counts within 5% (per-target mode) or 15% (shared mode).
- `tx-centrifuge-test`: Validates transaction construction and signing with real `cardano-api` types.
- `core-bench`: Criterion benchmarks for shared-limiter and per-target-limiter modes.
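The tolerance checks listed for `pull-fiction-test` amount to a relative-error comparison. The sketch below shows one way such a check could be written; `withinTolerance` and `fairTargets` are hypothetical helpers, and how the actual test suite implements its assertions is not shown in this description.

```haskell
-- | True when 'observed' deviates from 'expected' by at most the
-- given fraction (0.05 for 5%, 0.15 for 15%).
withinTolerance :: Double -> Integer -> Integer -> Bool
withinTolerance tol expected observed =
  fromIntegral (abs (observed - expected)) <= tol * fromIntegral expected

-- | Per-target fairness: every target's token count must be within
-- tolerance of the count expected for a single target.
fairTargets :: Double -> Integer -> [Integer] -> Bool
fairTargets tol expected = all (withinTolerance tol expected)
```

For example, with 2,000 tokens expected per target, a count of 2,090 passes the 5% per-target bound while 2,110 fails it but would still pass the looser 15% bound used in shared mode.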