[RFC] Add Comprehensive Performance Benchmark Suite for Intelligent Contract Execution #73

@riyannode

Description

Proposal

Add a dedicated BenchmarkRunner class and a gltest bench CLI command to the GenLayer Testing Suite. This would provide production-grade performance benchmarks for Intelligent Contracts, covering both deterministic and non-deterministic execution paths, as outlined in the attached analysis.

Motivation

GenLayer Intelligent Contracts introduce LLM calls, web fetches, and equivalence-based consensus. The current testing suite supports statistical analysis via .analyze(runs=100), but lacks standardized wall-clock, resource, and throughput benchmarks. This feature would enable:

  • Regression testing
  • LLM provider comparison
  • Gas-equivalent metering validation
  • Studio-mode network simulation under load

Proposed Implementation

  • New module: gltest/bench/benchmark_runner.py
  • CLI command: gltest bench --mode direct|studio --workload llm-heavy --validators 8 --iterations 1000
  • Core metrics: mean/p95 latency (ms), CPU/memory peak, TPS, consensus overhead
  • Reuse existing fixtures (direct_vm, studio_network) and psutil + statistics
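As one illustration of the proposed flags, the gltest bench command above could be wired with a small argument parser. The sketch below is purely hypothetical (argparse stands in for whatever CLI framework gltest actually uses); it only shows that the proposed flag set parses cleanly:

```python
import argparse

def build_bench_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the proposed `gltest bench` flags.
    p = argparse.ArgumentParser(prog="gltest bench")
    p.add_argument("--mode", choices=["direct", "studio"], default="direct")
    p.add_argument("--workload", default="llm-heavy")
    p.add_argument("--validators", type=int, default=8)
    p.add_argument("--iterations", type=int, default=1000)
    return p

# Parse the example invocation from the proposal.
args = build_bench_parser().parse_args(
    ["--mode", "studio", "--workload", "llm-heavy",
     "--validators", "8", "--iterations", "1000"]
)
print(args.mode, args.validators, args.iterations)
```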

Reference implementation (ready to add):

from genlayer.test import ContractFactory
import json
import statistics
import time

import psutil


class BenchmarkRunner:
    def __init__(self, mode: str = "direct"):
        # mode selects the execution path: "direct" VM or "studio" network simulation
        self.mode = mode
        self.factory = ContractFactory()

    def run(self, contract_code: str, method: str, inputs: list, iterations: int = 1000):
        contract = self.factory.deploy(contract_code)
        times = []        # per-call wall-clock latency, in milliseconds
        rss_samples = []  # resident set size sampled after each call, in MiB
        proc = psutil.Process()
        for _ in range(iterations):
            start = time.perf_counter_ns()
            contract.call(method, *inputs)
            times.append((time.perf_counter_ns() - start) / 1e6)
            rss_samples.append(proc.memory_info().rss / 1024**2)
        # quantiles(n=20) returns 19 cut points; index 18 is the 95th percentile
        return {
            "mean_ms": statistics.mean(times),
            "p95_ms": statistics.quantiles(times, n=20)[18],
            "memory_mb_peak": max(rss_samples),
            "results": json.dumps({"times": times}),  # raw samples for downstream analysis
        }
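A note on the p95 computation above: statistics.quantiles(times, n=20) returns 19 cut points dividing the samples into 20 equal groups, so index 18 is the 95th percentile. A minimal self-contained check with synthetic latency data (no GenLayer dependencies):

```python
import statistics

# 100 synthetic latency samples, 0.0 .. 99.0 ms
times = [float(i) for i in range(100)]

# n=20 yields 19 cut points; the last one (index 18) is the 95th percentile.
cuts = statistics.quantiles(times, n=20)
p95 = cuts[18]
print(len(cuts), p95)  # → 19 94.95
```

The default (exclusive) method interpolates between samples, which is why the result is 94.95 rather than a raw sample value.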
