Conversation
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Adds an end-to-end benchmarking suite: Makefile target, README, benchmark tests, config, load generators, collectors, reporters, and cluster API extensions to capture mempool, block, throughput, and latency metrics across baseline, mempool-only, and combined variants.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Test as Test
    participant Cluster as Cluster
    participant Load as LoadGenerator
    participant Poller as MempoolPoller
    participant Collector as Collector
    participant Reporter as Reporter
    Test->>Cluster: Setup cluster (BenchConfig, MemIAVL)
    Test->>Cluster: Collect initial account metas
    Test->>Test: Warmup transactions
    Test->>Cluster: Refresh metas, record start height
    Test->>Poller: Start mempool polling
    Test->>Load: Execute load (Burst/Sequential/OutOfOrder)
    Load->>Cluster: Submit transactions
    Load-->>Test: Return LoadResult
    Test->>Cluster: WaitForAllIncluded (drain + finality)
    Test->>Poller: Stop polling (get peak)
    Test->>Collector: CollectResults (query blocks, match submissions)
    Collector->>Cluster: QueryBlock (fetch & decode txs, timestamps)
    Collector-->>Test: Return BenchResult (TPS, latencies, BlockStats)
    Test->>Reporter: WriteResult / PrintComparisonTable
    Reporter-->>Test: Persist JSON & logs
```
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
---
Actionable comments posted: 12
🧹 Nitpick comments (5)
Makefile (1)
**247-249**: Parameterize the benchmark test filter and keep module mode read-only.

This target is useful as-is, but adding `BENCH_RUN` and `-mod=readonly` makes it safer and easier to run focused suites in CI/dev workflows.

♻️ Proposed improvement

```diff
+BENCH_RUN ?= TestBenchmark
+
 benchmark-e2e:
-	cd integration-tests/e2e && go test -v -tags benchmark -run TestBenchmark -timeout 30m -count=1 ./benchmark/
+	cd integration-tests/e2e && go test -mod=readonly -v -tags benchmark -run '$(BENCH_RUN)' -timeout 30m -count=1 ./benchmark/
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 247 - 249, update the benchmark-e2e target to accept a BENCH_RUN filter and run Go in module read-only mode: change the command it invokes (currently `go test -v -tags benchmark -run TestBenchmark -timeout 30m -count=1 ./benchmark/`) to use the BENCH_RUN variable and add the `-mod=readonly` flag, so the invocation becomes `go test -mod=readonly -v -tags benchmark -run "$(BENCH_RUN)" -timeout 30m -count=1 ./benchmark/`; ensure the target name benchmark-e2e and variable BENCH_RUN are used exactly as referenced.

integration-tests/e2e/benchmark/collector.go (1)
**199-199**: Make the finality wait context-aware.

This fixed sleep ignores cancellation and can keep tests waiting after timeout/cancel paths.

♻️ Proposed improvement

```diff
-	// Wait one more block to ensure finality
-	time.Sleep(3 * time.Second)
+	// Wait one more block to ensure finality
+	select {
+	case <-ctx.Done():
+		return 0, ctx.Err()
+	case <-time.After(3 * time.Second):
+	}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/collector.go` at line 199, replace the fixed sleep with a context-aware wait: instead of time.Sleep(3 * time.Second), use a select that waits on time.After(3*time.Second) and ctx.Done() (where ctx is the request/test context in scope) so the routine aborts promptly when cancelled; ensure the surrounding function (e.g., the finality wait routine) returns or propagates ctx.Err() on cancellation rather than blocking until the sleep completes.

integration-tests/e2e/cluster/cluster.go (1)
**294-303**: Add a request-level timeout for block RPC calls.

With benchmark flows using long-lived contexts, a stalled RPC can block result collection indefinitely.

♻️ Proposed improvement

```diff
 	url := fmt.Sprintf("http://127.0.0.1:%d/block?height=%d", n.Ports.RPC, height)
-	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
+	reqCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
+	defer cancel()
+	req, err := http.NewRequestWithContext(reqCtx, http.MethodGet, url, nil)
 	if err != nil {
 		return BlockResult{}, err
 	}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 294 - 303, the block RPC call currently uses the long-lived ctx and can hang; wrap the request in a short request-level timeout by creating a derived context, e.g. ctxWithTimeout, cancel := context.WithTimeout(ctx, rpcRequestTimeout) (choose a sensible duration constant like rpcRequestTimeout), defer cancel(), and use ctxWithTimeout in http.NewRequestWithContext instead of ctx; ensure the url, request creation, and http.DefaultClient.Do(req) call are covered so the request fails fast on timeout rather than blocking result collection.

integration-tests/e2e/benchmark/report.go (1)
**18-20**: Avoid overwriting previous benchmark runs with identical labels.

Using label-only filenames drops historical runs and makes trend comparisons harder.

♻️ Proposed improvement

```diff
 import (
 	"encoding/json"
 	"fmt"
 	"os"
 	"path/filepath"
 	"strings"
 	"testing"
+	"time"
 )
@@
 	safe := strings.NewReplacer("/", "_", " ", "_", "+", "_").Replace(result.Config.Label)
-	path := filepath.Join(dir, safe+".json")
+	stamp := time.Now().UTC().Format("20060102T150405Z")
+	path := filepath.Join(dir, fmt.Sprintf("%s_%s.json", safe, stamp))
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/report.go` around lines 18 - 20, the current filename uses only result.Config.Label (via safe := strings.NewReplacer(...).Replace(result.Config.Label) and path := filepath.Join(dir, safe+".json")), which overwrites previous runs with the same label; change the filename to include a unique suffix (e.g., a timestamp from time.Now().UTC().Format with second or millisecond precision, or a run ID/UUID) so each run produces a distinct file (e.g., safe + "_" + timestamp + ".json") and update the code that writes the new path accordingly.

integration-tests/e2e/benchmark/load.go (1)
**185-197**: Duplicate slice initialisation in `sequencePattern`.

The `[]uint64{base + 2, base, base + 1}` literal is initialised twice, once for the `count <= 3` branch and once before the `count > 3` loop. Unifying them removes the duplication.

♻️ Proposed refactor

```diff
 func sequencePattern(base uint64, count int) []uint64 {
+	seqs := []uint64{base + 2, base, base + 1}
 	if count <= 3 {
-		seqs := []uint64{base + 2, base, base + 1}
 		return seqs[:count]
 	}
-	seqs := []uint64{base + 2, base, base + 1}
 	for i := 3; i < count; i++ {
 		seqs = append(seqs, base+uint64(i))
 	}
 	return seqs
 }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/load.go` around lines 185 - 197, the function sequencePattern duplicates the literal []uint64{base + 2, base, base + 1} in both branches; to fix, initialize seqs once (seqs := []uint64{base + 2, base, base + 1}), then if count <= 3 return seqs[:count], otherwise run the existing for loop (for i := 3; i < count; i++ { seqs = append(seqs, base+uint64(i)) }) and return seqs; this removes the duplicated initialization while preserving behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 253-256: The benchmark creates a MempoolPoller per node but each
MempoolPoller already computes the max across all nodes, causing duplicated
metrics and the saved result uses a hardcoded peak_mempool_size=0; fix by
instantiating a single MempoolPoller via NewMempoolPoller(ctx, cluster,
mempoolPollInterval) (reuse that one for the whole test instead of creating one
per node), and when persisting results read the actual max value from the poller
(e.g., poller.Max or the appropriate accessor on MempoolPoller) into
peak_mempool_size rather than 0; apply the same change for the analogous block
around the other occurrence (lines 266-271).
In `@integration-tests/e2e/benchmark/collector.go`:
- Around line 157-164: The throughput window currently uses the first and last
entries of blockStats which includes empty blocks; instead scan blockStats to
find the first and last entries that actually contain benchmark transactions
(e.g., where a transaction count field such as TxCount > 0 or a Txs slice length
> 0), assign their Time values to first/last, compute blockDuration =
last.Sub(first).Seconds(), and only set totalDuration when blockDuration > 0;
update the code around blockStats, first, last, blockDuration and totalDuration
accordingly so the TPS window excludes empty blocks.
In `@integration-tests/e2e/benchmark/load.go`:
- Line 48: Check and guard cfg.NodeCount before performing the modulo in both
BurstLoad and OutOfOrderLoad: if cfg.NodeCount is zero (or otherwise invalid)
return an error or set a safe default before entering the loop so the expression
i % cfg.NodeCount cannot panic; update the entry checks in BurstLoad and add the
same pre-loop guard to OutOfOrderLoad (the code computing viaNode using i %
cfg.NodeCount) to ensure safe execution.
- Around line 13-19: TxSubmission currently lacks an application-level response
code and non-zero res.Code checks are ignored; add a Code int (or uint32) field
to the TxSubmission struct and populate it wherever responses are processed
(e.g., in BurstLoad where res.Code is checked) so callers can distinguish
success vs app-level failures; update the response handling logic in BurstLoad
to assign res.Code into the TxSubmission entries and apply the same change to
OutOfOrderLoad and SingleNodeLoad so all load paths record the code
consistently.
- Line 135: SingleNodeLoad currently takes targetNode int and passes it straight
to SendBankTxWithSequence without bounds checking; add an entry validation in
SingleNodeLoad to ensure 0 <= targetNode < cfg.NodeCount and handle invalid
values by returning an early LoadResult/error (or logging and aborting the load)
instead of calling SendBankTxWithSequence; apply the same validation to the
other similar usage site referenced in the diff so no out-of-range targetNode
can reach SendBankTxWithSequence and corrupt the benchmark.
- Around line 44-73: The per-account goroutine loops in BurstLoad (and similarly
in OutOfOrderLoad and SingleNodeLoad) do not observe ctx cancellation and will
continue dispatching txs after the context is done; update each loop that
iterates up to cfg.TxPerAccount (the goroutine that calls
cluster.SendBankTxWithSequence and appends to result.Submissions) to check
ctx.Done() at the top of each iteration (e.g., use a non-blocking select on
ctx.Done() or check ctx.Err()) and break/return if cancelled so the goroutine
stops sending work and allows wg.Wait() to complete promptly.
- Around line 185-197: sequencePattern currently emits gaps when count < 3 (e.g.
returns [base+2] for count==1), which leaves some sequence numbers never
submitted and transactions stuck; fix by changing sequencePattern to return a
contiguous sequence for small counts (for count <= 3 or specifically count < 3)
— i.e., when count < 3 return []uint64{base, base+1, ...} truncated to count,
otherwise keep the existing out-of-order logic; alternatively (or in addition)
validate TxPerAccount in OutOfOrderLoad and enforce a minimum of 3 before
calling sequencePattern to avoid calling it with invalid counts. Ensure you
update the function sequencePattern and/or the caller
OutOfOrderLoad/TxPerAccount validation accordingly.
- Around line 200-212: The Warmup function currently ignores the result of
SendBankTxWithSequence and doesn't update the metas map, which hides
transport/app errors and leaves stale on-chain sequence numbers; modify Warmup
to capture and check the return/error from cluster.SendBankTxWithSequence, log
any failures (including context like account name and node), and on success
increment the corresponding metas[name].Sequence (or refresh metas[name] from
chain) so subsequent calls use the updated sequence; also add a brief comment in
Warmup documenting the sequence-side effect and that callers must use the
updated metas or re-fetch account metadata.
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 87-92: Update the README to use one consistent default results
directory everywhere: replace occurrences of `results/` with
`benchmark/results/` (or vice versa based on project convention) so all
references (e.g., the "Results land in `benchmark/results/` as JSON." line and
the "After baseline results exist in `results/`:" line) point to the same path,
and run a quick grep to ensure no remaining mismatched occurrences remain in the
document.
- Around line 107-112: Add language identifiers (e.g., "text") to the fenced
code blocks in the README so Markdown lint (MD040) stops warning: update each
triple-backtick block that contains the table or code samples (the backtick
fences around the tabular benchmark output and the code snippets under the
"benchmark/" section) to start with ```text instead of ```; ensure all
occurrences noted in the comment (the table blocks and the code listing blocks)
are changed consistently.
- Around line 79-81: The E2E_INITIAD_BIN environment variable is currently
applied only to the cd command so go test may not see it; change how the env var
is set so go test receives it (either prefix the go test invocation with
E2E_INITIAD_BIN=./build/initiad or export E2E_INITIAD_BIN before running cd and
go test) to ensure the TestBenchmarkBaseline run uses the intended binary
(reference the E2E_INITIAD_BIN variable and the go test -run
TestBenchmarkBaseline invocation).
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 332-336: The loop over decoded.Result.Block.Data.Txs currently
ignores base64 decode failures (using continue) which silently under-reports
txs; replace the silent continue with a fail-fast error return that includes
block context and tx index: when base64Decode(txBase64) returns decErr,
construct and return an error (or propagate decErr) that includes
decoded.Result.Block.Header.Height (or block identifier from decoded), the tx
index within the loop, and the original decErr so callers can surface the
failure and metrics remain correct; update any surrounding function signature to
return an error if needed (identify code at the loop over
decoded.Result.Block.Data.Txs and the base64Decode call).
---
Nitpick comments:
In `@integration-tests/e2e/benchmark/collector.go`:
- Line 199: Replace the fixed sleep with a context-aware wait: instead of
time.Sleep(3 * time.Second) use a select that waits on time.After(3*time.Second)
and ctx.Done() (where ctx is the request/test context in scope) so the routine
aborts promptly when cancelled; ensure the surrounding function (e.g., the
finality wait routine) returns or propagates ctx.Err() on cancellation rather
than blocking until the sleep completes.
In `@integration-tests/e2e/benchmark/load.go`:
- Around line 185-197: The function sequencePattern duplicates the literal
[]uint64{base + 2, base, base + 1} in both branches; to fix, initialize seqs
once (seqs := []uint64{base + 2, base, base + 1}), then if count <= 3 return
seqs[:count], otherwise run the existing for loop (for i := 3; i < count; i++ {
seqs = append(seqs, base+uint64(i)) }) and return seqs; this removes the
duplicated initialization while preserving behavior.
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 18-20: The current filename uses only result.Config.Label (via
safe := strings.NewReplacer(...).Replace(result.Config.Label) and path :=
filepath.Join(dir, safe+".json")), which overwrites previous runs with the same
label; change the filename to include a unique suffix (e.g., a timestamp from
time.Now().UTC().Format with seconds or milliseconds, or a run ID/UUID) so each
run produces a distinct file (e.g., combine safe + "_" + timestamp + ".json")
and update the code that reads/writes using the new path variable accordingly.
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 294-303: The block RPC call currently uses the long-lived ctx and
can hang; wrap the request in a short request-level timeout by creating a
derived context, e.g. ctxWithTimeout, cancel := context.WithTimeout(ctx,
rpcRequestTimeout) (choose a sensible duration constant like rpcRequestTimeout),
defer cancel(), and use ctxWithTimeout in http.NewRequestWithContext instead of
ctx; ensure you reference url, req creation, and the http.DefaultClient.Do(req)
call so the request will fail fast on timeout rather than blocking result
collection.
In `@Makefile`:
- Around line 247-249: Update the benchmark-e2e Makefile target to accept a
BENCH_RUN filter and run Go in module readonly mode: change the command invoked
by the benchmark-e2e target (currently invoking go test -v -tags benchmark -run
TestBenchmark -timeout 30m -count=1 ./benchmark/) to use the BENCH_RUN variable
(e.g., -run "$(BENCH_RUN:-TestBenchmark)") and add the -mod=readonly flag so the
go test invocation becomes go test -mod=readonly -v -tags benchmark -run
"$(BENCH_RUN)" -timeout 30m -count=1 ./benchmark/; ensure the target name
benchmark-e2e and variable BENCH_RUN are used exactly as referenced.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (9)
- Makefile
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/benchmark/config.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/report.go
- integration-tests/e2e/cluster.go
- integration-tests/e2e/cluster/cluster.go
---
Actionable comments posted: 3
♻️ Duplicate comments (3)
integration-tests/e2e/benchmark/README.md (2)
**37-38**: ⚠️ Potential issue | 🟡 Minor

Use one consistent default results directory path throughout the doc.

`results/` and `benchmark/results/` are both documented as defaults, which is ambiguous for users and CI scripts.

✏️ Suggested cleanup

```diff
-- The JSON results persist in `results/`. Subsequent `TestBenchmarkFullComparison` runs load them automatically.
+- The JSON results persist in `benchmark/results/`. Subsequent `TestBenchmarkFullComparison` runs load them automatically.
@@
-| `BENCHMARK_RESULTS_DIR` | `results/` | Output directory for JSON results |
+| `BENCHMARK_RESULTS_DIR` | `benchmark/results/` | Output directory for JSON results |
```

Also applies to: 87-92, 189-189
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 37 - 38, The README documents two different default results directories ("results/" and "benchmark/results/"), causing ambiguity for users and CI; pick one canonical default path and update all mentions to it (including the sentence referencing TestBenchmarkFullComparison and the other occurrences noted around lines 87-92 and 189) so the doc consistently uses the same directory throughout; ensure any examples, scripts, and the TestBenchmarkFullComparison description reflect that single chosen path.
**107-112**: ⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced blocks to satisfy MD040.

These blocks should use an explicit language (for example `text`) on the opening fence to clear markdownlint warnings.

✏️ Suggested cleanup

````diff
-```
+```text
 Config | Variant | TPS | vs base | P50ms | vs base | P95ms | vs base | P99ms | vs base | Peak Mempool
 ...
-```
+```text
 Config | Variant | TPS | P50ms | P95ms | P99ms | Max ms | Peak Mempool
 ...
-```
+```text
 benchmark/
   config.go  Variant definitions, BenchConfig, preset constructors
 ...
````

Also applies to: 130-134, 155-163

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 107 - 112, the markdown fenced code blocks triggering MD040 should include an explicit language identifier (e.g., `text`) instead of a bare fence; update the shown table block and the other fenced blocks (around lines 130-134 and 155-163) by changing their opening fences so markdownlint MD040 no longer warns and the content renders correctly.

integration-tests/e2e/benchmark/collector.go (1)

**131-137**: ⚠️ Potential issue | 🟠 Major

TPS window is still derived from block non-emptiness, not benchmark tx inclusion.

At Line 133/160, `TxCount` tracks all txs in the block, so duration can still include blocks with no benchmark txs. The window should use first/last block times where matched benchmark txs were actually found.

🐛 Proposed fix

```diff
 	var (
 		blockStats []BlockStat
 		latencies  []float64
 		included   int
+		firstIncludedTime time.Time
+		lastIncludedTime  time.Time
 	)
@@
-	for _, txHash := range block.TxHashes {
+	includedInBlock := 0
+	for _, txHash := range block.TxHashes {
 		sub, ok := submitMap[txHash]
 		if !ok {
 			continue
 		}
 		included++
+		includedInBlock++
 		latencyMs := float64(block.BlockTime.Sub(sub.SubmitTime).Milliseconds())
 		if latencyMs < 0 {
 			latencyMs = 0
 		}
 		latencies = append(latencies, latencyMs)
 	}
+	if includedInBlock > 0 {
+		if firstIncludedTime.IsZero() {
+			firstIncludedTime = block.BlockTime
+		}
+		lastIncludedTime = block.BlockTime
+	}
@@
-	// use span from first to last block that actually included benchmark txs
-	var firstTxTime, lastTxTime time.Time
-	for _, bs := range blockStats {
-		if bs.TxCount == 0 {
-			continue
-		}
-		if firstTxTime.IsZero() {
-			firstTxTime = bs.Time
-		}
-		lastTxTime = bs.Time
-	}
-	if !firstTxTime.IsZero() {
-		blockDuration := lastTxTime.Sub(firstTxTime).Seconds()
+	if !firstIncludedTime.IsZero() {
+		blockDuration := lastIncludedTime.Sub(firstIncludedTime).Seconds()
 		if blockDuration > 0 {
 			totalDuration = blockDuration
 		}
 	}
```

Also applies to: 157-173

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.
In `@integration-tests/e2e/benchmark/collector.go` around lines 131 - 137, BlockStat currently uses TxCount: len(block.TxHashes) and block times which causes the TPS window to include blocks that contain no benchmark transactions; update the BlockStat population logic (where blockStats is appended) to count only matched benchmark transactions (e.g., filter block.TxHashes by the benchmark tx set and set TxCount to that filtered length) and also record the timestamp of the first/last matched benchmark tx (or a flag/field indicating presence) so that the overall window computation (the later logic around computing start/end times and TPS between lines ~157-173) uses the timestamps of the first and last blocks that actually contained benchmark txs rather than any non-empty block times. Ensure references to BlockStat, TxCount, blockStats and the window computation code are updated accordingly. ``` </details> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against the current code and only fix it if needed.
Inline comments:
In@integration-tests/e2e/benchmark/collector.go:
- Around line 48-64: NewMempoolPoller may create a time.Ticker with a
non-positive interval which panics; in the constructor (NewMempoolPoller) ensure
p.interval is a positive duration before starting the goroutine—if the provided
interval is <= 0, replace it with a sensible default (e.g., time.Second or
another minimum > 0) so that MempoolPoller.run can safely create a
time.NewTicker(p.interval) without panicking.
- Around line 206-208: The post-mempool wait currently uses a blocking
time.Sleep(3 * time.Second) which ignores ctx cancellation; replace it with a
context-aware timer: create a time.NewTimer(3*time.Second) and use select to
wait on either timer.C or ctx.Done(), stop the timer when appropriate, and if
ctx.Done() fires propagate/return the context error instead of proceeding to
call cluster.LatestHeight(ctx, 0).
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 886-890: The code only sets memiavl.enable to "true" when
c.opts.MemIAVL is true, leaving the false case to upstream defaults; update the
logic around the conditional that calls setTOMLValue(appPath, "memiavl",
"enable", ...) so that it always writes an explicit value: write "true" when
c.opts.MemIAVL is true and write "false" when it's false (i.e., call
setTOMLValue with "false" in the else branch), ensuring the memiavl.enable key
is deterministically set regardless of upstream config.
Duplicate comments:
In@integration-tests/e2e/benchmark/collector.go:
- Around line 131-137: BlockStat currently uses TxCount: len(block.TxHashes) and
block times which causes the TPS window to include blocks that contain no
benchmark transactions; update the BlockStat population logic (where blockStats
is appended) to count only matched benchmark transactions (e.g., filter
block.TxHashes by the benchmark tx set and set TxCount to that filtered length)
and also record the timestamp of the first/last matched benchmark tx (or a
flag/field indicating presence) so that the overall window computation (the
later logic around computing start/end times and TPS between lines ~157-173)
uses the timestamps of the first and last blocks that actually contained
benchmark txs rather than any non-empty block times. Ensure references to
BlockStat, TxCount, blockStats and the window computation code are updated
accordingly.
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 37-38: The README documents two different default results
directories ("results/" and "benchmark/results/"), causing ambiguity for users
and CI; pick one canonical default path and update all mentions to it (including
the sentence referencing TestBenchmarkFullComparison and the other occurrences
noted around lines 87-92 and 189) so the doc consistently uses the same
directory throughout; ensure any examples, scripts, and the
TestBenchmarkFullComparison description reflect that single chosen path.
- Around line 107-112: The markdown fenced code blocks triggering MD040 should
include an explicit language identifier (e.g., `text`) instead of a bare
fence; update the shown table block and the other fenced blocks (the blocks
around the snippet at lines 130-134 and 155-163) by changing their opening
fences to include a language (or another appropriate identifier) so markdownlint
MD040 no longer warns and the content renders correctly.

---

ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📥 Commits
Reviewing files that changed from the base of the PR and between de2baaf1144434fe86ea58dc43a0d066cde3821c and e9fab6663dd0532eb038f3353c191049f39ae171.
📒 Files selected for processing (5)
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/cluster/cluster.go
🚧 Files skipped from review as they are similar to previous changes (2)
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/load.go
---
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (2)
**886-892**: Past review comment addressed correctly.

`memiavl.enable` is now always written explicitly, eliminating dependence on upstream defaults for both variants.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 886 - 892, The code now always sets memiavl.enable explicitly; ensure the call to setTOMLValue writes the string value derived from c.opts.MemIAVL by keeping memIAVLEnable and the if block that sets it based on c.opts.MemIAVL, then pass memIAVLEnable to setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable) so memiavl.enable is unambiguous regardless of upstream defaults (verify in the function where memIAVLEnable, c.opts.MemIAVL, appPath and setTOMLValue are used).
**332-336**: Past review comment addressed correctly.

The fail-fast decode error with `height` and `idx` context directly addresses the previous concern about silent tx under-reporting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 332 - 336, The loop already fails fast on decode errors and includes contextual info; no change needed—keep the current base64Decode handling inside the for loop that returns fmt.Errorf("failed to decode tx at height=%d index=%d: %w", height, idx, decErr) so failures surface with height/idx context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 886-892: The code now always sets memiavl.enable explicitly;
ensure the call to setTOMLValue writes the string value derived from
c.opts.MemIAVL by keeping memIAVLEnable and the if block that sets it based on
c.opts.MemIAVL, then pass memIAVLEnable to setTOMLValue(appPath, "memiavl",
"enable", memIAVLEnable) so memiavl.enable is unambiguous regardless of upstream
defaults (verify in the function where memIAVLEnable, c.opts.MemIAVL, appPath
and setTOMLValue are used).
- Around line 332-336: The loop already fails fast on decode errors and includes
contextual info; no change needed—keep the current base64Decode handling inside
the for loop that returns fmt.Errorf("failed to decode tx at height=%d index=%d:
%w", height, idx, decErr) so failures surface with height/idx context.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/cluster/cluster.go
🚧 Files skipped from review as they are similar to previous changes (1)
- integration-tests/e2e/benchmark/collector.go
---
Actionable comments posted: 1
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (1)
**886-890**: ⚠️ Potential issue | 🟠 Major

Explicitly set `memiavl.enable` for both true and false paths.

At Line 886, only the `true` path is written. When `c.opts.MemIAVL` is false, config falls back to upstream defaults, which can skew baseline vs. variant comparisons.

🔧 Proposed fix

```diff
-	if c.opts.MemIAVL {
-		if err := setTOMLValue(appPath, "memiavl", "enable", "true"); err != nil {
-			return err
-		}
-	}
+	memIAVLEnable := "false"
+	if c.opts.MemIAVL {
+		memIAVLEnable = "true"
+	}
+	if err := setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable); err != nil {
+		return err
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 886 - 890, The code only sets memiavl.enable when c.opts.MemIAVL is true, causing defaults to leak when false; update the conditional around c.opts.MemIAVL to explicitly call setTOMLValue(appPath, "memiavl", "enable", "true") when true and call setTOMLValue(appPath, "memiavl", "enable", "false") when false (preserve error handling/returns), referencing c.opts.MemIAVL, setTOMLValue and appPath so the configuration is deterministic for both branches.integration-tests/e2e/benchmark/collector.go (1)
160-176: ⚠️ Potential issue | 🟠 Major

TPS window is still keyed off total block txs, not benchmark-matched txs.

Line 163 checks `bs.TxCount`, which includes non-benchmark traffic. That can shift first/last timestamps and distort TPS for this benchmark run.

🔧 Proposed fix

```diff
-	// use span from first to last block that actually included benchmark txs
-	var firstTxTime, lastTxTime time.Time
-	for _, bs := range blockStats {
-		if bs.TxCount == 0 {
-			continue
-		}
-		if firstTxTime.IsZero() {
-			firstTxTime = bs.Time
-		}
-		lastTxTime = bs.Time
-	}
+	// use span from first to last block that included matched benchmark txs
+	var firstTxTime, lastTxTime time.Time
+	for h := startHeight; h <= endHeight; h++ {
+		block, err := cluster.QueryBlock(ctx, 0, h)
+		if err != nil {
+			return BenchResult{}, fmt.Errorf("query block %d: %w", h, err)
+		}
+
+		matched := 0
+		for _, txHash := range block.TxHashes {
+			if _, ok := submitMap[txHash]; ok {
+				matched++
+			}
+		}
+		if matched == 0 {
+			continue
+		}
+		if firstTxTime.IsZero() {
+			firstTxTime = block.BlockTime
+		}
+		lastTxTime = block.BlockTime
+	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/collector.go` around lines 160 - 176, The TPS window selection currently uses bs.TxCount (which includes non-benchmark traffic) to find firstTxTime/lastTxTime; change the check to use the count of benchmark-matched transactions for each block instead (e.g., compute matchedCount from a field like bs.MatchedTxCount or by filtering bs.TxResults for the benchmark marker/flag), skip blocks when matchedCount == 0, and use those matched blocks to set firstTxTime/lastTxTime and compute blockDuration/totalDuration (update the condition that referenced bs.TxCount and any related logic in the loop that sets firstTxTime, lastTxTime, blockDuration and totalDuration).
🧹 Nitpick comments (1)
integration-tests/e2e/benchmark/report.go (1)
118-137: Consider sorting loaded results for deterministic reporting output.

`os.ReadDir` ordering is filesystem-dependent; sorting `results` (or filenames) before returning will keep table ordering stable across environments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/report.go` around lines 118 - 137, The code relies on os.ReadDir(dir) ordering which is filesystem-dependent; make the output deterministic by sorting the entries slice by entry.Name() right after calling os.ReadDir (before the for _, entry := range entries loop) using sort.Slice or sort.SliceStable, so the subsequent reads/unmarshals produce a stable results order (refer to the entries variable and the loop that reads files into results).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 88-92: The check uses only r.Config.Variant so it hides deltas for
any row sharing that Variant; instead detect the actual baseline row itself.
Replace the condition at the block that sets tpsDelta/p50Delta/p95Delta so it
compares the selected row to the baseline (e.g., r == baseline) or compares the
full config identity (e.g., reflect.DeepEqual(r.Config, baseline.Config) or
compare a unique ID field) rather than r.Config.Variant; keep the rest of the
delta assignment logic unchanged and only set "-" when the row equals the
baseline.
---
Duplicate comments:
In `@integration-tests/e2e/benchmark/collector.go`:
- Around line 160-176: The TPS window selection currently uses bs.TxCount (which
includes non-benchmark traffic) to find firstTxTime/lastTxTime; change the check
to use the count of benchmark-matched transactions for each block instead (e.g.,
compute matchedCount from a field like bs.MatchedTxCount or by filtering
bs.TxResults for the benchmark marker/flag), skip blocks when matchedCount == 0,
and use those matched blocks to set firstTxTime/lastTxTime and compute
blockDuration/totalDuration (update the condition that referenced bs.TxCount and
any related logic in the loop that sets firstTxTime, lastTxTime, blockDuration
and totalDuration).
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 886-890: The code only sets memiavl.enable when c.opts.MemIAVL is
true, causing defaults to leak when false; update the conditional around
c.opts.MemIAVL to explicitly call setTOMLValue(appPath, "memiavl", "enable",
"true") when true and call setTOMLValue(appPath, "memiavl", "enable", "false")
when false (preserve error handling/returns), referencing c.opts.MemIAVL,
setTOMLValue and appPath so the configuration is deterministic for both
branches.
---
Nitpick comments:
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 118-137: The code relies on os.ReadDir(dir) ordering which is
filesystem-dependent; make the output deterministic by sorting the entries slice
by entry.Name() right after calling os.ReadDir (before the for _, entry := range
entries loop) using sort.Slice or sort.SliceStable, so the subsequent
reads/unmarshals produce a stable results order (refer to the entries variable
and the loop that reads files into results).
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (4)
- integration-tests/e2e/benchmark/collector.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/report.go
- integration-tests/e2e/cluster/cluster.go
```go
if r.Config.Variant == baseline.Config.Variant {
	tpsDelta = "-"
	p50Delta = "-"
	p95Delta = "-"
}
```
Baseline row detection should compare the selected row, not just Variant.
At Line 88, using r.Config.Variant == baseline.Config.Variant suppresses deltas for every baseline-variant row. Only the actual baseline row should show -.
🔧 Proposed fix

```diff
-	for _, r := range results {
+	for i := range results {
+		r := results[i]
 		tpsDelta := deltaStr(baseline.TxPerSecond, r.TxPerSecond)
 		p50Delta := deltaStr(baseline.P50LatencyMs, r.P50LatencyMs)
 		p95Delta := deltaStr(baseline.P95LatencyMs, r.P95LatencyMs)
-		if r.Config.Variant == baseline.Config.Variant {
+		if &results[i] == baseline {
 			tpsDelta = "-"
 			p50Delta = "-"
 			p95Delta = "-"
 		}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@integration-tests/e2e/benchmark/report.go` around lines 88 - 92, The check
uses only r.Config.Variant so it hides deltas for any row sharing that Variant;
instead detect the actual baseline row itself. Replace the condition at the
block that sets tpsDelta/p50Delta/p95Delta so it compares the selected row to
the baseline (e.g., r == baseline) or compares the full config identity (e.g.,
reflect.DeepEqual(r.Config, baseline.Config) or compare a unique ID field)
rather than r.Config.Variant; keep the rest of the delta assignment logic
unchanged and only set "-" when the row equals the baseline.
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #481   +/-   ##
=======================================
  Coverage   37.07%   37.07%
=======================================
  Files         311      311
  Lines       29362    29362
=======================================
  Hits        10885    10885
  Misses      16742    16742
  Partials     1735     1735
```
Actionable comments posted: 2
♻️ Duplicate comments (2)
integration-tests/e2e/cluster/cluster.go (1)
938-942: ⚠️ Potential issue | 🟠 Major

`memiavl.enable` is still only set when `MemIAVL` is true — baseline runs rely on upstream defaults.

The code only writes `memiavl.enable=true` when the flag is set, but never explicitly writes `false`. For benchmark correctness, both variants must be deterministically configured. If a node's default `app.toml` ships with `memiavl.enable=true`, baseline and mempool-only runs would silently use MemIAVL, invalidating comparisons.

Proposed fix

```diff
-	if c.opts.MemIAVL {
-		if err := setTOMLValue(appPath, "memiavl", "enable", "true"); err != nil {
-			return err
-		}
+	memIAVLEnable := "false"
+	if c.opts.MemIAVL {
+		memIAVLEnable = "true"
+	}
+	if err := setTOMLValue(appPath, "memiavl", "enable", memIAVLEnable); err != nil {
+		return err
 	}
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 938 - 942, The code only sets memiavl.enable when c.opts.MemIAVL is true, so baseline runs may inherit an upstream default; change the block so setTOMLValue(appPath, "memiavl", "enable", ...) is always called with "true" if c.opts.MemIAVL else "false" (i.e., unconditionally write the explicit setting), and propagate/return any error from setTOMLValue; update the branch around c.opts.MemIAVL to call setTOMLValue regardless of its value.

integration-tests/e2e/benchmark/report.go (1)
83-92: ⚠️ Potential issue | 🟡 Minor

Baseline row detection compares `Variant`, not identity — suppresses deltas for all rows sharing the baseline variant.

If multiple rows have `Variant` Baseline (e.g., loaded baseline + a re-run), all of them will show "-" instead of actual deltas. Compare the specific row, not just the variant.

Proposed fix

```diff
-	for _, r := range results {
+	for i := range results {
+		r := results[i]
 		tpsDelta := deltaStr(baseline.TxPerSecond, r.TxPerSecond)
 		p50Delta := deltaStr(baseline.P50LatencyMs, r.P50LatencyMs)
 		p95Delta := deltaStr(baseline.P95LatencyMs, r.P95LatencyMs)
-		if r.Config.Variant == baseline.Config.Variant {
+		if &results[i] == baseline {
 			tpsDelta = "-"
 			p50Delta = "-"
 			p95Delta = "-"
 		}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/report.go` around lines 83 - 92, The code currently suppresses deltas by comparing r.Config.Variant to baseline.Config.Variant, which hides deltas for any row sharing the same variant; change the baseline detection to compare the specific row identity instead (e.g., compare the result object or its unique run ID) so only the exact baseline row sets tpsDelta/p50Delta/p95Delta = "-". Update the conditional in the loop that references results, baseline, and r.Config.Variant to use r == baseline or r.RunID/ResultID equality (whichever unique identifier exists) to locate the baseline row.
🧹 Nitpick comments (5)
integration-tests/e2e/benchmark/load.go (2)
346-357: `sequencePattern` still produces invalid sequences for `count < 3`, but the caller guards against it.

The guard in `OutOfOrderLoad` (line 160-162) ensures `count >= 3`, so this is safe in practice. Adding a defensive comment or panic inside `sequencePattern` itself for `count < 3` would prevent future misuse if someone calls it directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/load.go` around lines 346 - 357, The sequencePattern function can return invalid slices when called with count < 3; add a defensive check at the start of sequencePattern to validate count (e.g., panic or return an explicit error/empty slice) and document the requirement in a comment; reference the function name sequencePattern and note the existing caller OutOfOrderLoad enforces count >= 3 but sequencePattern should enforce or clearly document this precondition to avoid future misuse.
30-88: Significant structural duplication across load functions.
`BurstLoad`, `SequentialLoad`, and `SingleNodeLoad` share ~80% of their body (NodeCount guard, mutex/wg setup, goroutine-per-account with ctx check, submission recording). The only meaningful differences are how `viaNode` is computed and whether submissions are sequential or concurrent within an account. Consider extracting a common `runLoad` helper parameterized by a `viaNodeFn` and concurrency mode to reduce maintenance burden.

Not blocking for this PR — the duplication is manageable at the current scale.
Also applies to: 94-152, 220-277
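The proposed consolidation could be sketched as follows. All names here (`runLoad`, `viaNodeFn`, the `submit` callback, the `submission` record) are hypothetical stand-ins for the cluster APIs and result types the review references; the sketch only shows the shared mutex/wg/goroutine-per-account skeleton.

```go
package main

import (
	"fmt"
	"sync"
)

// submission is a hypothetical record of one submitted tx.
type submission struct {
	Account string
	Seq     uint64
	Node    int
}

// runLoad is a sketch of the shared helper: one goroutine per account,
// txPerAccount submissions each, with the target node chosen by viaNodeFn.
func runLoad(accounts []string, txPerAccount int,
	viaNodeFn func(txIdx int) int,
	submit func(acct string, seq uint64, node int)) []submission {
	var (
		mu   sync.Mutex
		wg   sync.WaitGroup
		subs []submission
	)
	for _, acct := range accounts {
		wg.Add(1)
		go func(acct string) {
			defer wg.Done()
			for i := 0; i < txPerAccount; i++ {
				node := viaNodeFn(i)
				submit(acct, uint64(i), node)
				mu.Lock()
				subs = append(subs, submission{acct, uint64(i), node})
				mu.Unlock()
			}
		}(acct)
	}
	wg.Wait()
	return subs
}

func main() {
	// BurstLoad-style round-robin node selection over a 4-node cluster;
	// SingleNodeLoad would instead pass func(int) int { return 0 }.
	subs := runLoad([]string{"acc0", "acc1"}, 3,
		func(i int) int { return i % 4 },
		func(string, uint64, int) {}) // no-op submit for the sketch
	fmt.Println(len(subs))
}
```

Each load variant then reduces to one call with its own `viaNodeFn`, which keeps the concurrency and recording logic in a single place.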
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/load.go` around lines 30 - 88, BurstLoad, SequentialLoad, and SingleNodeLoad duplicate most logic (NodeCount guard, mutex/wg setup, goroutine-per-account, ctx check, submission recording); refactor by extracting a shared helper runLoad that takes a viaNodeFn (func(i int) int) and a concurrency mode (sequential vs concurrent within an account), then have BurstLoad/SequentialLoad/SingleNodeLoad call runLoad with the appropriate viaNodeFn and mode; ensure runLoad performs the NodeCount check, sets up mu/wg, iterates cluster.AccountNames(), computes seq using meta.Sequence+uint64(i), calls cluster.SendBankTxWithSequence, and appends TxSubmission/Errors to result so callers keep the same behavior and signatures.

integration-tests/e2e/cluster/cluster.go (2)
1223-1230: Helpers are fine, but `base64Decode` is a trivial one-liner wrapper.

`base64Decode` just calls `base64.StdEncoding.DecodeString`. Inlining the call at the one usage site in `QueryBlock` (line 333) would be simpler. Minor nit — not blocking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 1223 - 1230, The helper base64Decode is a trivial one-liner wrapper around base64.StdEncoding.DecodeString; remove the base64Decode function and replace its single usage in QueryBlock with a direct call to base64.StdEncoding.DecodeString, updating any error handling there accordingly (ensure QueryBlock imports encoding/base64 is available and adjust variable names if needed).
628-678: `SendMoveExecuteJSONWithGas` is largely duplicated from `MoveExecuteJSONWithSequence`.

The only difference is that gas estimation is skipped and `gasLimit` is passed directly. Consider extracting the shared CLI-invocation logic into a private helper to reduce the ~50 lines of duplication. Not blocking, but worth addressing to keep maintenance manageable as the CLI surface grows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/cluster/cluster.go` around lines 628 - 678, SendMoveExecuteJSONWithGas duplicates most of MoveExecuteJSONWithSequence; factor the shared CLI invocation, JSON marshalling, c.exec call, parseTxResultFromOutput and logging into a private helper (e.g., execMoveExecuteJSON or sendMoveExecuteJSONCommon) that accepts parameters for moduleAddress, moduleName, functionName, typeArgs, args, fromName, accountNumber, sequence, gasLimit, viaNode and any flags like "--offline" vs estimation behavior; have both SendMoveExecuteJSONWithGas and MoveExecuteJSONWithSequence call that helper (preserving their public signatures) and only keep the differing gas-handling logic in each caller. Ensure the new helper uses c.exec, parseTxResultFromOutput and the same log format as in the original functions.

integration-tests/e2e/benchmark/README.md (1)
103-111: Fenced code block missing language identifier.

Static analysis (MD040) flags the code block at line 103. Adding `text` as the language identifier would satisfy the linter.

Proposed fix

~~~diff
-```
+```text
 benchmark/
   config.go  Variant definitions, BenchConfig, preset constructors
   ...
~~~

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration-tests/e2e/benchmark/README.md` around lines 103 - 111, The fenced code block starting with the benchmark/ listing is missing a language identifier and triggers MD040; update the fence that opens the block (the triple backticks before "benchmark/") to include the language identifier "text" (i.e., change ``` to ```text) so the linter accepts it while preserving the block contents (the section that lists config.go, load.go, collector.go, report.go, benchmark_test.go, results/).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration-tests/e2e/benchmark/benchmark_test.go`:
- Around line 31-47: setupCluster currently starts the cluster with
cluster.Start() but does not register a cleanup, so if require.NoError(t,
cluster.WaitForReady(...)) fails the test will Goexit and leave node processes
running; fix by calling t.Cleanup(func(){ cluster.Close(context.Background()) })
(or t.Cleanup(func(){ _ = cluster.Close(ctx) }) if context is available)
immediately after successful e2e.NewCluster and before calling
cluster.Start/WaitForReady, ensuring Close is always invoked on test teardown,
and then remove manual defer cluster.Close() calls from runBenchmark,
TestBenchmarkMemIAVLMoveExec, and TestBenchmarkGossipPropagation.
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 12-28: WriteResult can produce a hidden ".json" file when
result.Config.Label is empty; after computing safe :=
strings.NewReplacer(...).Replace(result.Config.Label) add a guard that if safe
== "" you assign a deterministic fallback (e.g. "result-"+timestamp or
"result-"+UnixNano) to avoid collisions, then continue to build path using that
safe value; reference symbols: function WriteResult, variable safe, and
result.Config.Label to locate and update the logic.
---
Duplicate comments:
In `@integration-tests/e2e/benchmark/report.go`:
- Around line 83-92: The code currently suppresses deltas by comparing
r.Config.Variant to baseline.Config.Variant, which hides deltas for any row
sharing the same variant; change the baseline detection to compare the specific
row identity instead (e.g., compare the result object or its unique run ID) so
only the exact baseline row sets tpsDelta/p50Delta/p95Delta = "-". Update the
conditional in the loop that references results, baseline, and r.Config.Variant
to use r == baseline or r.RunID/ResultID equality (whichever unique identifier
exists) to locate the baseline row.
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 938-942: The code only sets memiavl.enable when c.opts.MemIAVL is
true, so baseline runs may inherit an upstream default; change the block so
setTOMLValue(appPath, "memiavl", "enable", ...) is always called with "true" if
c.opts.MemIAVL else "false" (i.e., unconditionally write the explicit setting),
and propagate/return any error from setTOMLValue; update the branch around
c.opts.MemIAVL to call setTOMLValue regardless of its value.
---
Nitpick comments:
In `@integration-tests/e2e/benchmark/load.go`:
- Around line 346-357: The sequencePattern function can return invalid slices
when called with count < 3; add a defensive check at the start of
sequencePattern to validate count (e.g., panic or return an explicit error/empty
slice) and document the requirement in a comment; reference the function name
sequencePattern and note the existing caller OutOfOrderLoad enforces count >= 3
but sequencePattern should enforce or clearly document this precondition to
avoid future misuse.
- Around line 30-88: BurstLoad, SequentialLoad, and SingleNodeLoad duplicate
most logic (NodeCount guard, mutex/wg setup, goroutine-per-account, ctx check,
submission recording); refactor by extracting a shared helper runLoad that takes
a viaNodeFn (func(i int) int) and a concurrency mode (sequential vs concurrent
within an account), then have BurstLoad/SequentialLoad/SingleNodeLoad call
runLoad with the appropriate viaNodeFn and mode; ensure runLoad performs the
NodeCount check, sets up mu/wg, iterates cluster.AccountNames(), computes seq
using meta.Sequence+uint64(i), calls cluster.SendBankTxWithSequence, and appends
TxSubmission/Errors to result so callers keep the same behavior and signatures.
In `@integration-tests/e2e/benchmark/README.md`:
- Around line 103-111: The fenced code block starting with the benchmark/
listing is missing a language identifier and triggers MD040; update the fence
that opens the block (the triple backticks before "benchmark/") to include the
language identifier "text" (i.e., change ``` to ```text) so the linter accepts
it while preserving the block contents (the section that lists config.go,
load.go, collector.go, report.go, benchmark_test.go, results/).
In `@integration-tests/e2e/cluster/cluster.go`:
- Around line 1223-1230: The helper base64Decode is a trivial one-liner wrapper
around base64.StdEncoding.DecodeString; remove the base64Decode function and
replace its single usage in QueryBlock with a direct call to
base64.StdEncoding.DecodeString, updating any error handling there accordingly
(ensure QueryBlock imports encoding/base64 is available and adjust variable
names if needed).
- Around line 628-678: SendMoveExecuteJSONWithGas duplicates most of
MoveExecuteJSONWithSequence; factor the shared CLI invocation, JSON marshalling,
c.exec call, parseTxResultFromOutput and logging into a private helper (e.g.,
execMoveExecuteJSON or sendMoveExecuteJSONCommon) that accepts parameters for
moduleAddress,moduleName,functionName,typeArgs,args,fromName,accountNumber,sequence,gasLimit,viaNode
and any flags like "--offline" vs estimation behavior; have both
SendMoveExecuteJSONWithGas and MoveExecuteJSONWithSequence call that helper
(preserving their public signatures) and only keep the differing gas-handling
logic in each caller. Ensure the new helper uses c.exec, parseTxResultFromOutput
and the same log format as in the original functions.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (5)
- integration-tests/e2e/benchmark/README.md
- integration-tests/e2e/benchmark/benchmark_test.go
- integration-tests/e2e/benchmark/load.go
- integration-tests/e2e/benchmark/report.go
- integration-tests/e2e/cluster/cluster.go
```go
func setupCluster(t *testing.T, ctx context.Context, cfg BenchConfig) *e2e.Cluster {
	t.Helper()

	cluster, err := e2e.NewCluster(ctx, t, e2e.ClusterOptions{
		NodeCount:    cfg.NodeCount,
		AccountCount: cfg.AccountCount,
		ChainID:      "bench-e2e",
		BinaryPath:   os.Getenv("E2E_INITIAD_BIN"),
		MemIAVL:      cfg.MemIAVL,
	})
	require.NoError(t, err)

	require.NoError(t, cluster.Start(ctx))
	require.NoError(t, cluster.WaitForReady(ctx, clusterReadyTimeout))

	return cluster
}
```
Node processes leak if setupCluster fails after Start.
If WaitForReady fails (line 44), require.NoError triggers runtime.Goexit(). At that point cluster.Start() has already spawned node processes, but no Close() has been deferred — the caller's defer cluster.Close() hasn't been registered yet. This leaves orphan processes on the test host.
Proposed fix — register cleanup inside setupCluster
```diff
 func setupCluster(t *testing.T, ctx context.Context, cfg BenchConfig) *e2e.Cluster {
 	t.Helper()
 	cluster, err := e2e.NewCluster(ctx, t, e2e.ClusterOptions{
 		NodeCount:    cfg.NodeCount,
 		AccountCount: cfg.AccountCount,
 		ChainID:      "bench-e2e",
 		BinaryPath:   os.Getenv("E2E_INITIAD_BIN"),
 		MemIAVL:      cfg.MemIAVL,
 	})
 	require.NoError(t, err)
+	t.Cleanup(cluster.Close)
 	require.NoError(t, cluster.Start(ctx))
 	require.NoError(t, cluster.WaitForReady(ctx, clusterReadyTimeout))
 	return cluster
 }
```

Then remove the manual defer cluster.Close() calls in runBenchmark, TestBenchmarkMemIAVLMoveExec, and TestBenchmarkGossipPropagation since t.Cleanup will handle it.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
func setupCluster(t *testing.T, ctx context.Context, cfg BenchConfig) *e2e.Cluster {
	t.Helper()
	cluster, err := e2e.NewCluster(ctx, t, e2e.ClusterOptions{
		NodeCount:    cfg.NodeCount,
		AccountCount: cfg.AccountCount,
		ChainID:      "bench-e2e",
		BinaryPath:   os.Getenv("E2E_INITIAD_BIN"),
		MemIAVL:      cfg.MemIAVL,
	})
	require.NoError(t, err)
	t.Cleanup(cluster.Close)
	require.NoError(t, cluster.Start(ctx))
	require.NoError(t, cluster.WaitForReady(ctx, clusterReadyTimeout))
	return cluster
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@integration-tests/e2e/benchmark/benchmark_test.go` around lines 31 - 47,
setupCluster currently starts the cluster with cluster.Start() but does not
register a cleanup, so if require.NoError(t, cluster.WaitForReady(...)) fails
the test will Goexit and leave node processes running; fix by calling
t.Cleanup(func(){ cluster.Close(context.Background()) }) (or t.Cleanup(func(){ _
= cluster.Close(ctx) }) if context is available) immediately after successful
e2e.NewCluster and before calling cluster.Start/WaitForReady, ensuring Close is
always invoked on test teardown, and then remove manual defer cluster.Close()
calls from runBenchmark, TestBenchmarkMemIAVLMoveExec, and
TestBenchmarkGossipPropagation.
```go
// WriteResult writes a BenchResult as JSON to the results directory.
func WriteResult(t *testing.T, result BenchResult, dir string) error {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}

	safe := strings.NewReplacer("/", "_", " ", "_", "+", "_").Replace(result.Config.Label)
	path := filepath.Join(dir, safe+".json")

	data, err := json.MarshalIndent(result, "", " ")
	if err != nil {
		return err
	}

	t.Logf("Writing benchmark result to %s", path)
	return os.WriteFile(path, data, 0o600)
}
```
WriteResult could produce a file named .json if Label is empty.
If result.Config.Label is empty, safe will be empty and the output file will be .json — a hidden file on Unix that's easy to overlook and could collide across runs.
Proposed guard
```diff
 safe := strings.NewReplacer("/", "_", " ", "_", "+", "_").Replace(result.Config.Label)
+if safe == "" {
+	safe = "unnamed"
+}
 path := filepath.Join(dir, safe+".json")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@integration-tests/e2e/benchmark/report.go` around lines 12 - 28, WriteResult
can produce a hidden ".json" file when result.Config.Label is empty; after
computing safe := strings.NewReplacer(...).Replace(result.Config.Label) add a
guard that if safe == "" you assign a deterministic fallback (e.g.
"result-"+timestamp or "result-"+UnixNano) to avoid collisions, then continue to
build path using that safe value; reference symbols: function WriteResult,
variable safe, and result.Config.Label to locate and update the logic.
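The sanitize-plus-fallback behavior the review proposes for WriteResult can be exercised standalone. The `safeLabel` helper below is hypothetical — a sketch extracted from the sanitizer shown in the comment, with a fixed `"unnamed"` fallback as one possible deterministic choice.

```go
package main

import (
	"fmt"
	"strings"
)

// safeLabel sanitizes a benchmark label for use as a filename and falls
// back to a fixed name when the label is empty, so WriteResult never
// produces a hidden ".json" file.
func safeLabel(label string) string {
	safe := strings.NewReplacer("/", "_", " ", "_", "+", "_").Replace(label)
	if safe == "" {
		return "unnamed"
	}
	return safe
}

func main() {
	fmt.Println(safeLabel("baseline + memiavl")) // spaces and '+' become '_'
	fmt.Println(safeLabel(""))                   // empty label gets the fallback
}
```

A timestamp-based fallback (as the AI prompt suggests) would additionally avoid collisions across runs, at the cost of non-deterministic filenames.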
Description
Closes: #XXXX
Author Checklist
All items are required. Please add a note to the item if the item is not applicable and
please add links to any relevant follow up issues.
I have...
`!` in the type prefix if API or client breaking change

Reviewers Checklist
All items are required. Please add a note if the item is not applicable and please add
your handle next to the items reviewed if you only reviewed selected items.
I have...