Add contracts, statistics, and chaos engineering modules #65

Merged
pratyush618 merged 8 commits into main from feature/contracts-statistics-chaos
Apr 7, 2026
@pratyush618
Collaborator

Summary

  • agenteval-contracts: Contract testing for AI agents — define behavioral invariants (safety, compliance, tool usage) verified across diverse inputs. Sealed Contract interface with deterministic and LLM-judged implementations, fluent builder API, ContractVerifier orchestrator, StandardContracts library, JSON definition loader, and JUnit 5 integration (@ContractTest, @Invariant).
  • agenteval-statistics: Statistical rigor for evaluation results — confidence intervals, paired t-test/Wilcoxon significance testing, Cohen's d effect sizes, bootstrap CIs, variance analysis, and sample size recommendations. Pure Java 21 math (no external libraries). StatisticalAnalyzer facade with single-run analysis, two-run comparison, and multi-run stability analysis.
  • agenteval-chaos: Chaos engineering for AI agents — inject infrastructure failures (tool errors, context corruption, latency, schema mutations) and measure agent resilience. ChaosInjector sealed interface with 4 implementations, ChaosSuite orchestrator, ResilienceEvaluator, and 14 built-in scenarios.

77 files, 6312 lines, 117 new tests — all passing.

Test plan

  • mvn test -pl agenteval-contracts — 38 tests pass
  • mvn test -pl agenteval-statistics — 59 tests pass
  • mvn test -pl agenteval-chaos — 20 tests pass
  • All pre-commit hooks pass (checkstyle, editorconfig, spotbugs)
  • Verify full reactor build: mvn clean install -Denforcer.skip=true

pratyush618 and others added 8 commits April 7, 2026 12:09
Sealed Contract interface (Deterministic, LLMJudged, Composite),
fluent builder, ContractVerifier orchestrator, StandardContracts
library, JSON definition loader, JUnit 5 integration, 38 tests.
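
As a rough illustration of the sealed-interface design this commit describes, the sketch below models a deterministic contract and a composite contract and evaluates them with an exhaustive `switch`. It is not the module's code: the record shapes, `ContractSketch`, `holds`, and `violations` are hypothetical names chosen for the example, and the LLM-judged variant is left out because it would need a model client.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch; the real agenteval-contracts types may look different.
sealed interface Contract permits Deterministic, Composite {
    String name();
}

// A contract checked by a plain predicate over the agent's output.
record Deterministic(String name, Predicate<String> check) implements Contract {}

// A contract that holds only if every child contract holds.
record Composite(String name, List<Contract> children) implements Contract {}

final class ContractSketch {
    // Exhaustive switch: compilation fails if a permitted subtype is not handled.
    static boolean holds(Contract contract, String output) {
        return switch (contract) {
            case Deterministic d -> d.check().test(output);
            case Composite c -> c.children().stream().allMatch(child -> holds(child, output));
        };
    }

    // Names of the contracts that are violated by at least one of the outputs.
    static List<String> violations(List<Contract> contracts, List<String> outputs) {
        return contracts.stream()
                .filter(c -> outputs.stream().anyMatch(o -> !holds(c, o)))
                .map(Contract::name)
                .toList();
    }
}
```

Sealing the interface means that adding a new contract kind forces every `switch` over `Contract` to be updated, which is presumably why the module uses it.
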
Distributions (normal/t CDF), DescriptiveCalculator, InferenceCalculator
(t-CI, bootstrap, paired t-test, Wilcoxon, Cohen's d, power analysis),
StatisticalAnalyzer facade, StatisticalConfig, 59 tests.
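
To make the two-run comparison concrete, here is a minimal sketch of a paired t statistic, a paired Cohen's d (mean difference divided by the standard deviation of the differences), and a percentile bootstrap interval for a mean, using only the JDK. The class and method names (`PairedStatsSketch`, `pairedT`, `cohensDz`, `bootstrapMeanCi`) are assumptions for illustration and are not taken from the module; p-values are omitted because they need a t-distribution CDF.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch; not the agenteval-statistics implementation.
final class PairedStatsSketch {
    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElseThrow();
    }

    // Sample standard deviation (n - 1 in the denominator).
    static double sampleSd(double[] xs) {
        double m = mean(xs);
        double ss = Arrays.stream(xs).map(x -> (x - m) * (x - m)).sum();
        return Math.sqrt(ss / (xs.length - 1));
    }

    // Per-case differences a[i] - b[i]; both runs must score the same cases.
    static double[] differences(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two equal-length samples with n >= 2");
        }
        double[] d = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            d[i] = a[i] - b[i];
        }
        return d;
    }

    // Paired t statistic: mean difference over its standard error.
    static double pairedT(double[] a, double[] b) {
        double[] d = differences(a, b);
        return mean(d) / (sampleSd(d) / Math.sqrt(d.length));
    }

    // Paired effect size: mean difference over the SD of the differences.
    static double cohensDz(double[] a, double[] b) {
        double[] d = differences(a, b);
        return mean(d) / sampleSd(d);
    }

    // Percentile bootstrap interval for the mean, seeded for reproducibility.
    static double[] bootstrapMeanCi(double[] xs, int resamples, double level, long seed) {
        Random rng = new Random(seed);
        double[] means = new double[resamples];
        for (int r = 0; r < resamples; r++) {
            double sum = 0.0;
            for (int i = 0; i < xs.length; i++) {
                sum += xs[rng.nextInt(xs.length)];
            }
            means[r] = sum / xs.length;
        }
        Arrays.sort(means);
        double alpha = 1.0 - level;
        int lo = (int) Math.floor(alpha / 2 * (resamples - 1));
        int hi = (int) Math.ceil((1 - alpha / 2) * (resamples - 1));
        return new double[] {means[lo], means[hi]};
    }
}
```

For paired samples the two quantities are related by t = d · √n, so a large t with a small d usually just reflects a large number of cases.
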
ChaosInjector sealed interface (ToolFailure, ContextCorruption,
Latency, SchemaMutation), ChaosSuite orchestrator, ResilienceEvaluator,
14 built-in scenarios, 20 tests.
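
The injector idea can be sketched as a wrapper around a tool call. The example below is an assumption-laden illustration, not the module's API: it models a tool as `Function<String, String>`, covers only two of the four injector kinds, and defines resilience as the fraction of runs in which no exception escapes the agent. `wrap`, `ResilienceSketch`, and `score` are hypothetical names.

```java
import java.util.Random;
import java.util.function.Function;

// Hypothetical sketch; the real ChaosInjector interface may have a different shape.
sealed interface ChaosInjector permits ToolFailure, Latency {
    Function<String, String> wrap(Function<String, String> tool);
}

// Makes the wrapped tool throw with the given probability.
record ToolFailure(double failureRate, Random rng) implements ChaosInjector {
    public Function<String, String> wrap(Function<String, String> tool) {
        return input -> {
            if (rng.nextDouble() < failureRate) {
                throw new IllegalStateException("injected tool failure");
            }
            return tool.apply(input);
        };
    }
}

// Delays every call to the wrapped tool by a fixed amount.
record Latency(long delayMillis) implements ChaosInjector {
    public Function<String, String> wrap(Function<String, String> tool) {
        return input -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
            return tool.apply(input);
        };
    }
}

final class ResilienceSketch {
    // Fraction of runs in which the agent returns without an exception escaping.
    static double score(Function<String, String> agent, int runs) {
        int ok = 0;
        for (int i = 0; i < runs; i++) {
            try {
                agent.apply("query " + i);
                ok++;
            } catch (RuntimeException e) {
                // An escaped exception counts as a failed run.
            }
        }
        return (double) ok / runs;
    }
}
```

An agent that catches the injected failure and falls back to a default answer scores 1.0 under this definition, while one that lets the exception propagate scores 0.0 at a failure rate of 1.0.
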
Bumps org.bsc.langgraph4j:langgraph4j-core-jdk8 from 1.0.0 to 1.1.5.

---
updated-dependencies:
- dependency-name: org.bsc.langgraph4j:langgraph4j-core-jdk8
  dependency-version: 1.1.5
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps `spring-ai.version` from 1.0.0 to 1.1.4.

Updates `org.springframework.ai:spring-ai-model` from 1.0.0 to 1.1.4
- [Release notes](https://github.com/spring-projects/spring-ai/releases)
- [Commits](spring-projects/spring-ai@v1.0.0...v1.1.4)

Updates `org.springframework.ai:spring-ai-client-chat` from 1.0.0 to 1.1.4
- [Release notes](https://github.com/spring-projects/spring-ai/releases)
- [Commits](spring-projects/spring-ai@v1.0.0...v1.1.4)

Updates `org.springframework.ai:spring-ai-commons` from 1.0.0 to 1.1.4
- [Release notes](https://github.com/spring-projects/spring-ai/releases)
- [Commits](spring-projects/spring-ai@v1.0.0...v1.1.4)

---
updated-dependencies:
- dependency-name: org.springframework.ai:spring-ai-model
  dependency-version: 1.1.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.springframework.ai:spring-ai-client-chat
  dependency-version: 1.1.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.springframework.ai:spring-ai-commons
  dependency-version: 1.1.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
pratyush618 merged commit 3b324fc into main on Apr 7, 2026
8 checks passed
pratyush618 deleted the feature/contracts-statistics-chaos branch on April 7, 2026, 19:15