Proposal
Add a dedicated BenchmarkRunner class and a gltest bench CLI command to the GenLayer Testing Suite. Together they provide standardized performance benchmarks for Intelligent Contracts, covering both deterministic and non-deterministic paths, as outlined in the attached analysis.
Motivation
GenLayer Intelligent Contracts introduce LLM calls, web fetches, and equivalence-based consensus. Current testing supports statistical analysis via .analyze(runs=100), but lacks standardized wall-clock, resource, and throughput benchmarks. This feature enables:
- Regression testing
- LLM provider comparison
- Gas-equivalent metering validation
- Studio-mode network simulation under load
Proposed Implementation
- New module: gltest/bench/benchmark_runner.py
- CLI command: gltest bench --mode direct|studio --workload llm-heavy --validators 8 --iterations 1000
- Core metrics: mean/p95 latency (ms), CPU/memory peak, TPS, consensus overhead
- Reuse existing fixtures (direct_vm, studio_network) and psutil + statistics
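As a rough illustration of the proposed command line, the flags listed above could be parsed as sketched below. This uses only the standard-library argparse module. How gltest actually registers subcommands is not covered in this proposal, so build_parser, the default values, and the help strings are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser for `gltest bench`; not the real gltest CLI wiring."""
    parser = argparse.ArgumentParser(prog="gltest")
    subcommands = parser.add_subparsers(dest="command", required=True)

    bench = subcommands.add_parser("bench", help="Run performance benchmarks")
    # The two modes named in the proposal; the default is an assumption.
    bench.add_argument("--mode", choices=["direct", "studio"], default="direct")
    # Workload names other than "llm-heavy" are not specified in the proposal.
    bench.add_argument("--workload", default="llm-heavy")
    bench.add_argument("--validators", type=int, default=8)
    bench.add_argument("--iterations", type=int, default=1000)
    return parser


# Parse the example invocation from the proposal.
args = build_parser().parse_args(
    ["bench", "--mode", "studio", "--workload", "llm-heavy",
     "--validators", "8", "--iterations", "1000"]
)
```

Restricting --mode with choices rejects unknown modes at parse time, before any contract is deployed.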
Reference implementation (ready to add):
import json
import statistics
import time

import psutil
from genlayer.test import ContractFactory


class BenchmarkRunner:
    def __init__(self, mode: str = "direct"):
        self.factory = ContractFactory()
        self.mode = mode  # "direct" or "studio"

    def run(self, contract_code: str, method: str, inputs: list, iterations: int = 1000):
        if iterations < 2:
            raise ValueError("iterations must be >= 2 to compute p95")
        contract = self.factory.deploy(contract_code)
        process = psutil.Process()  # handle to the current process, created once
        times = []
        resources = []
        for _ in range(iterations):
            start = time.perf_counter_ns()
            contract.call(method, *inputs)
            duration_ms = (time.perf_counter_ns() - start) / 1e6
            times.append(duration_ms)
            # Resident set size in MiB, sampled after each call
            resources.append(process.memory_info().rss / 1024**2)
        return {
            "mean_ms": statistics.mean(times),
            # quantiles(n=20) returns 19 cut points; index 18 is the 95th percentile
            "p95_ms": statistics.quantiles(times, n=20)[18],
            # Peak of the sampled values, not a true high-water mark
            "memory_mb_peak": max(resources),
            "results": json.dumps({"times": times}),
        }
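The reference implementation reports mean and p95 latency, while TPS is listed among the core metrics without a definition. One possible reading, sketched below, is sequential throughput: the number of calls divided by the summed call time. The helper names summarize and time_calls are hypothetical, and the sketch times a plain Python callable rather than a deployed contract so that it runs without GenLayer installed.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def summarize(times_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize per-call latencies given in milliseconds."""
    if len(times_ms) < 2:
        # statistics.quantiles needs at least two data points.
        raise ValueError("need at least two samples")
    total_s = sum(times_ms) / 1000.0
    return {
        "mean_ms": statistics.mean(times_ms),
        # 19 cut points for n=20; index 18 is the 95th percentile.
        "p95_ms": statistics.quantiles(times_ms, n=20)[18],
        # Assumed definition: calls per second of summed call time.
        "tps": len(times_ms) / total_s if total_s > 0 else float("inf"),
    }


def time_calls(fn: Callable[[], object], iterations: int) -> List[float]:
    """Time `fn` with a monotonic nanosecond clock; returns milliseconds."""
    times = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        fn()
        times.append((time.perf_counter_ns() - start) / 1e6)
    return times


# Synthetic data: 100 calls of exactly 10 ms each.
summary = summarize([10.0] * 100)
```

For concurrent workloads, such as studio mode under load, dividing by elapsed wall-clock time for the whole run would be the more appropriate denominator than the sum of individual call times.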