An open-source AI co-pilot for the Performance Testing Lifecycle, from test creation to execution, monitoring, analysis, reporting, and delivery.
⚠️ This project is under active development. Repository structure, APIs, setup instructions, Docker files, and documentation may change frequently while the platform is being assembled.
PerfPilot is an AI-assisted performance testing platform that brings together:
- 🤖 Multi-agent orchestration for performance testing workflows
- 🛩️ PerfPilot Hub, a unified MCP gateway for performance testing tools
- 🧪 JMeter script generation and execution
- 🔥 BlazeMeter test execution and result collection
- Datadog metrics, logs, and APM correlation
- 🧠 PerfMemory, a persistent AI memory layer for debugging and lessons learned
- Performance analysis and reporting
- 💬 Collaboration integrations such as Confluence, MS Teams, and SharePoint
- 🖥️ A browser-based CopilotKit / React UI for human-in-the-loop workflows
PerfPilot extends the original mcp-perf-suite project into a broader platform that combines MCP tools, AI agents, an A2A server, an AG-UI backend, and a web frontend under one repository.
Performance testing often requires a human to move between many disconnected tools:
- Capture or design a user flow
- Generate a JMeter script
- Debug correlation and test data issues
- Execute a load test
- Monitor infrastructure and application metrics
- Analyze bottlenecks
- Write a report
- Publish results
- Notify stakeholders
PerfPilot is designed to turn that fragmented process into an agent-assisted workflow.
The goal is not to remove the human from the process. The goal is to give performance engineers an AI co-pilot that can handle the repetitive work while keeping humans in control of consequential decisions such as launching tests, approving reports, and publishing results.
PerfPilot uses an aviation-inspired naming model to make the platform easier to understand, remember, and explain.
Performance testing has many parallels to aviation: every test needs a plan, a controlled launch, real-time monitoring, telemetry, communication, analysis, and a final flight record. PerfPilot applies that metaphor to the Performance Testing Lifecycle while keeping the actual repository structure practical and developer-friendly.
The codebase uses clear folder names such as agent-framework/, mcp-perf-suite/, docker/, and docs/. The aviation names are used in the documentation and product language so users can quickly understand how the pieces fit together.
| Product Name | Technical Component | Meaning |
|---|---|---|
| PerfPilot | Overall framework and orchestrator concept | The pilot coordinating the full performance testing mission |
| Pilot | Orchestrator agent | Plans the workflow, delegates tasks, and keeps humans in control |
| Copilots | Specialized agents | Domain-specific agents that assist with scripting, execution, monitoring, analysis, reporting, and notifications |
| PerfPilot Hub | `mcp-perf-suite/gateway-mcp/` | The central MCP gateway that gives agents one endpoint for the full performance testing toolchain |
| FlightDeck | `agent-framework/frontend/` | The human-facing web UI where users interact with PerfPilot |
| ACARS | A2A server inside `agent-framework/` | The agent-to-agent communication layer for upstream and downstream AI frameworks |
| Flight Log | Test artifacts, reports, and run history | The record of what happened during a performance testing mission |
| Black Box | PerfMemory and persisted debugging context | The memory layer that helps agents recall prior issues, fixes, and lessons learned |
In short: PerfPilot is the pilot, the specialist agents are copilots, and PerfPilot Hub is the airport-style tool hub where they access the systems needed to complete the mission.
This gives users a simple mental model:
"My PerfPilots are handling my performance testing workflow, while I stay in control from the FlightDeck."

    perfpilot/
    ├── agent-framework/        # AG2 agents, A2A server, AG-UI backend, CopilotKit frontend
    │   ├── frontend/           # CopilotKit / React / Next.js web UI
    │   └── backend/            # Python AG-UI server and A2A server
    ├── mcp-perf-suite/         # Gateway + all MCP servers
    │   ├── gateway-mcp/        # PerfPilot Hub: unified MCP gateway via FastMCP
    │   ├── blazemeter-mcp/     # BlazeMeter API tools
    │   ├── confluence-mcp/     # Confluence publishing tools
    │   ├── datadog-mcp/        # Datadog metrics, logs, and APM tools
    │   ├── jmeter-mcp/         # JMeter script generation and execution tools
    │   ├── perfanalysis-mcp/   # Performance analysis and correlation tools
    │   ├── perfmemory-mcp/     # AI memory backed by PostgreSQL, pgvector, and Apache AGE
    │   ├── perfreport-mcp/     # Report generation tools
    │   ├── artifacts/          # Test artifacts, reports, JTLs, logs, and generated files
    │   └── streamlit-ui/       # Web UI for viewing performance test results
    ├── docker/                 # Compose files, Dockerfiles, and config templates
    └── docs/                   # Public documentation
PerfPilot Hub is the central MCP gateway. Instead of connecting an AI agent to many separate MCP servers, PerfPilot Hub exposes the performance testing toolchain through one MCP endpoint.
It routes requests to specialized MCP servers such as JMeter, BlazeMeter, Datadog, PerfAnalysis, PerfReport, Confluence, PerfMemory, MS Teams, and SharePoint.
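The single-endpoint idea can be illustrated with a small, framework-free sketch. It does not use the FastMCP API and is not the actual gateway code. `HubSketch`, the dot-separated `namespace.tool` naming, and the two sample tools are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical handler type: takes tool arguments, returns a result dict.
ToolHandler = Callable[[dict], dict]


@dataclass
class HubSketch:
    """Toy stand-in for a unified gateway; not the real PerfPilot Hub."""

    # Maps a namespace such as "jmeter" to that backend's tools.
    backends: Dict[str, Dict[str, ToolHandler]] = field(default_factory=dict)

    def register(self, namespace: str, tool: str, handler: ToolHandler) -> None:
        self.backends.setdefault(namespace, {})[tool] = handler

    def list_tools(self) -> List[str]:
        # Agents see one flat, namespaced tool list instead of many servers.
        return sorted(
            f"{ns}.{tool}" for ns, tools in self.backends.items() for tool in tools
        )

    def call(self, qualified_name: str, args: dict) -> dict:
        # Split "jmeter.generate_script" into backend and tool, then dispatch.
        namespace, _, tool = qualified_name.partition(".")
        try:
            handler = self.backends[namespace][tool]
        except KeyError:
            raise LookupError(f"unknown tool: {qualified_name}") from None
        return handler(args)


hub = HubSketch()
hub.register("jmeter", "generate_script", lambda a: {"script": f"{a['flow']}.jmx"})
hub.register("datadog", "query_metrics", lambda a: {"query": a["query"], "points": []})
```

With this shape, an agent only needs to know the hub: `hub.list_tools()` returns every namespaced tool, and `hub.call("jmeter.generate_script", {"flow": "checkout"})` is dispatched to the JMeter handler.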
The Pilot is the orchestrator agent. It understands the user's request, creates a plan, delegates work, coordinates progress, and keeps the human in control.
The Copilots are specialist agents. Each Copilot focuses on a specific phase of the Performance Testing Lifecycle, such as script generation, test execution, monitoring, analysis, reporting, or notifications.
Together, the Pilot and Copilots form the agent layer of PerfPilot.
Planned and/or evolving agents include:
| Agent | Purpose |
|---|---|
| Orchestrator Agent | Coordinates the full workflow and delegates work to specialists |
| Script Agent | Generates or adapts performance test scripts |
| Execution Agent | Starts and monitors performance test execution |
| Monitoring Agent | Pulls infrastructure metrics, application metrics, logs, and APM data |
| Analysis Agent | Correlates test results with monitoring data |
| Reporting Agent | Drafts performance test reports |
| Notifications Agent | Sends summaries, links, and status updates to stakeholders |
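The Pilot and Copilot relationship can be sketched as a plain dispatcher. This is a minimal illustration, not the AG2-based implementation. `PilotSketch`, `Task`, and the phase names are hypothetical, and the plan is hard-coded where a real orchestrator would presumably derive it from the user's request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    phase: str        # e.g. "script", "execution", "reporting"
    instruction: str  # what the Copilot is asked to do


class PilotSketch:
    """Toy orchestrator: plans phases and hands each one to a registered Copilot."""

    def __init__(self) -> None:
        self.copilots: Dict[str, Callable[[Task], str]] = {}

    def register(self, phase: str, copilot: Callable[[Task], str]) -> None:
        self.copilots[phase] = copilot

    def plan(self, request: str, phases: List[str]) -> List[Task]:
        # Assumption: the phases are supplied by the caller rather than inferred.
        return [Task(phase, f"{phase} step for: {request}") for phase in phases]

    def run(self, request: str, phases: List[str]) -> List[str]:
        results: List[str] = []
        for task in self.plan(request, phases):
            copilot = self.copilots.get(task.phase)
            if copilot is None:
                # Surface the gap instead of silently skipping the phase.
                results.append(f"{task.phase}: no copilot registered")
                continue
            results.append(copilot(task))
        return results


pilot = PilotSketch()
pilot.register("script", lambda t: f"script: drafted ({t.instruction})")
pilot.register("execution", lambda t: "execution: test started")
outcome = pilot.run("checkout load test", ["script", "execution", "reporting"])
```

Because no reporting Copilot is registered here, the last entry of `outcome` reports the missing specialist instead of failing the whole run.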
FlightDeck is the human-facing web UI. It gives users a cockpit-style command surface for chatting with PerfPilot, reviewing workflow progress, approving human-in-the-loop actions, and viewing results.
The implementation lives under agent-framework/frontend/, but the product experience is called PerfPilot FlightDeck.
It is built around:
- Next.js
- React
- CopilotKit
- AG-UI
- A Python backend
- Persistent conversation and workflow state
ACARS is the agent-to-agent communication layer. In aviation, ACARS is associated with structured aircraft communication. In PerfPilot, the term represents the A2A server that allows upstream and downstream AI frameworks to communicate with the PerfPilot agent runtime.
The implementation lives inside agent-framework/.
"PerfPilot gives performance engineers the feeling that 'my PerfPilots are handling my performance testing workflow' while the human remains in command."

    ┌────────────────────────────────────────────────────────────────┐
    │                          Human Users                           │
    │                                                                │
    │    Browser UI / Cursor / Claude / Other AI Agent Frameworks    │
    └───────────────────────────────┬────────────────────────────────┘
                                    │
                                    ▼
    ┌────────────────────────────────────────────────────────────────┐
    │                     PerfPilot Agent Layer                      │
    │                                                                │
    │  Orchestrator Agent                                            │
    │    ├── Script Agent                                            │
    │    ├── Execution Agent                                         │
    │    ├── Monitoring Agent                                        │
    │    ├── Analysis Agent                                          │
    │    ├── Reporting Agent                                         │
    │    └── Notifications Agent                                     │
    │                                                                │
    │  Surfaces:                                                     │
    │    - A2A server for agent-to-agent workflows                   │
    │    - AG-UI backend for browser-based human interaction         │
    └───────────────────────────────┬────────────────────────────────┘
                                    │
                                    ▼
    ┌────────────────────────────────────────────────────────────────┐
    │                         PerfPilot Hub                          │
    │                                                                │
    │     Unified MCP gateway exposing performance testing tools     │
    └───────────────────────────────┬────────────────────────────────┘
                                    │
              ┌─────────────────────┼─────────────────────┐
              ▼                     ▼                     ▼
      ┌────────────────┐    ┌────────────────┐    ┌────────────────┐
      │   JMeter MCP   │    │ BlazeMeter MCP │    │  Datadog MCP   │
      └────────────────┘    └────────────────┘    └────────────────┘
              ▼                     ▼                     ▼
      ┌────────────────┐    ┌────────────────┐    ┌────────────────┐
      │  PerfAnalysis  │    │   PerfReport   │    │   PerfMemory   │
      │      MCP       │    │      MCP       │    │      MCP       │
      └────────────────┘    └────────────────┘    └────────────────┘
              ▼                     ▼                     ▼
      ┌────────────────┐    ┌────────────────┐    ┌────────────────┐
      │   Confluence   │    │    MS Teams    │    │   SharePoint   │
      │      MCP       │    │      MCP       │    │      MCP       │
      └────────────────┘    └────────────────┘    └────────────────┘
A future end-to-end PerfPilot workflow may look like this:
- A user submits a request through the Web UI, Cursor, Claude, or an upstream AI framework.
- The Orchestrator Agent creates a performance testing plan.
- The Script Agent generates or updates a JMeter script.
- The human reviews and approves the generated script.
- The Execution Agent starts a test through BlazeMeter or another load testing backend.
- The Monitoring Agent collects metrics, logs, and traces during the test.
- The Analysis Agent correlates load test results with application and infrastructure telemetry.
- The Reporting Agent drafts an executive-friendly report.
- The human reviews, revises, and approves the report.
- The report is published to Confluence and/or archived to SharePoint.
- Stakeholders are notified through MS Teams or another notification channel.
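The human-in-the-loop checkpoints in the workflow above can be modeled as gated steps. The sketch below is only illustrative: `Step`, `run_workflow`, the step names, and the choice of which steps require approval are assumptions, and a real deployment would presumably ask for approval through the Web UI instead of a callback.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    needs_approval: bool = False  # True for consequential actions


def run_workflow(steps: List[Step], approve: Callable[[Step], bool]) -> List[str]:
    """Run steps in order; pause at the first gated step the human rejects."""
    log: List[str] = []
    for step in steps:
        if step.needs_approval and not approve(step):
            log.append(f"{step.name}: rejected, workflow paused")
            break
        log.append(f"{step.name}: done")
    return log


# Step names mirror the workflow above; which steps are gated is an assumption.
WORKFLOW = [
    Step("plan"),
    Step("generate_script"),
    Step("approve_script", needs_approval=True),
    Step("execute_test"),
    Step("collect_telemetry"),
    Step("analyze"),
    Step("draft_report"),
    Step("approve_report", needs_approval=True),
    Step("publish"),
    Step("notify"),
]

# A reviewer who approves the script but sends the report back for revision.
log = run_workflow(WORKFLOW, approve=lambda step: step.name == "approve_script")
```

In this run the test executes and a report is drafted, but nothing is published or sent to stakeholders because the report approval was withheld.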
| Layer | Technology |
|---|---|
| Agent framework | AG2 |
| Agent-to-agent communication | A2A |
| Tool protocol | MCP |
| MCP gateway | FastMCP |
| Backend APIs | Python / FastAPI |
| Browser UI | Next.js, React, CopilotKit |
| Database | PostgreSQL |
| Vector memory | pgvector |
| Knowledge graph memory | Apache AGE |
| Load testing | JMeter, BlazeMeter |
| Observability | Datadog |
| Reporting / collaboration | Confluence, SharePoint, MS Teams |
| Local orchestration | Docker Compose |
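To make the vector memory row concrete, here is a rough sketch of similarity-based recall in plain Python. It is not PerfMemory's code and does not use pgvector, which would run this kind of nearest-neighbor search inside PostgreSQL. The `recall` helper, the sample notes, and the three-dimensional vectors are invented for illustration.

```python
import math
from typing import List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query: Vector, memories: List[Tuple[str, Vector]], top_k: int = 1) -> List[str]:
    """Return the notes whose embeddings are closest to the query."""
    ranked = sorted(memories, key=lambda m: cosine_similarity(query, m[1]), reverse=True)
    return [note for note, _ in ranked[:top_k]]


# Hand-written 3-dimensional "embeddings"; real ones would come from a model.
MEMORIES = [
    ("Session token must be correlated from the login response", [0.9, 0.1, 0.0]),
    ("CSV test data ran out after 500 virtual users", [0.1, 0.9, 0.1]),
    ("Database connection pool saturated at peak load", [0.0, 0.2, 0.9]),
]

best = recall([0.8, 0.2, 0.1], MEMORIES)
```

Here the query vector sits closest to the first note, so `best` contains the correlation lesson. The same ranking idea is what lets an agent pull up a prior fix when it meets a similar debugging problem.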
PerfPilot is currently being assembled from the original mcp-perf-suite project and the experimental agent-framework branch.
| Area | Status |
|---|---|
| MCP Perf Suite | Existing project being moved under the new root repo |
| PerfPilot Hub | Existing MCP gateway concept from gateway-mcp |
| Agent Framework | Active development |
| A2A Server | Active development |
| AG-UI Backend | Active development |
| CopilotKit Web UI | Active development |
| Docker Compose | Planned / evolving |
| Public documentation | In progress |
The repository is currently in early setup. Full installation instructions will be added as the folder structure stabilizes.
For now, the expected local development flow is:

    # Clone the repository
    git clone https://github.com/canyonlabz/perfpilot.git
    cd perfpilot

Then follow the setup instructions inside each major module:
| Module | Path | Purpose |
|---|---|---|
| Agent Framework | `agent-framework/` | Multi-agent orchestration, A2A server, AG-UI backend, and frontend |
| MCP Perf Suite | `mcp-perf-suite/` | MCP gateway and specialized performance testing MCP servers |
| Docker | `docker/` | Local containers, databases, and service orchestration |
| Docs | `docs/` | Public documentation and architecture notes |
Planned areas of work include:
- Move `mcp-perf-suite` into the new `perfpilot` repository
- Move the experimental `agent-framework` branch into the new root structure
- Normalize environment configuration across agents, MCPs, and Docker
- Add root-level Docker Compose orchestration
- Add one-command local startup for database, MCP gateway, agents, and UI
- Expand specialist agents beyond the first working vertical slices
- Add human-in-the-loop approval cards in the Web UI
- Add persistent multi-thread conversation history
- Add task progress streaming and test-run result views
- Add public architecture documentation
- Add contribution guidelines
Contributions, ideas, and feedback are welcome.
This project is still early and moving quickly, so please expect breaking changes while the architecture stabilizes.
This project is licensed under the MIT License.
See the LICENSE file for details.
Created with ❤️ for performance engineers, quality engineers, SREs, and AI-assisted testing workflows.