PolyPilot uses a two-layer testing strategy: deterministic unit tests that run without the app, and executable UI scenarios that validate against a running instance using MauiDevFlow.
Run the unit tests from the test project directory:

```bash
cd PolyPilot.Tests
dotnet test                                                                             # Run all tests
dotnet test --filter "FullyQualifiedName~ChatMessageTests"                              # One test class
dotnet test --filter "FullyQualifiedName~ChatMessageTests.UserMessage_SetsRoleAndType"  # Single test
```

The test project targets `net10.0` (not MAUI), so tests run on any machine without platform SDKs. Because the MAUI project can't be referenced directly from a plain .NET project, the test `.csproj` uses `<Compile Include>` links to pull in source files from the main project:
```xml
<!-- PolyPilot.Tests.csproj -->
<Compile Include="..\PolyPilot\Models\ChatMessage.cs" Link="Linked\ChatMessage.cs" />
<Compile Include="..\PolyPilot\Services\CopilotService.cs" Link="Linked\CopilotService.cs" />
<!-- ... 70+ linked files -->
```

This means any new model or service class with no MAUI dependencies can be tested by adding a `<Compile Include>` entry to the test project.
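With 70+ linked files, an entry can silently go stale when a source file is moved or renamed. The following is a minimal sketch, not part of PolyPilot, of a check that lists `<Compile Include>` entries whose target file no longer exists. The function names `linked_sources` and `missing_links` are hypothetical, and the sketch assumes every entry names a single file (wildcard includes are not expanded).

```python
import xml.etree.ElementTree as ET
from pathlib import Path, PureWindowsPath


def linked_sources(csproj_path):
    """Return (include_path, exists) pairs for every <Compile Include> in a .csproj."""
    csproj = Path(csproj_path)
    root = ET.parse(csproj).getroot()
    results = []
    # iter() walks the whole tree, so the ItemGroup nesting does not matter.
    for element in root.iter():
        # Strip an XML namespace prefix if the project file declares one.
        tag = element.tag.rsplit("}", 1)[-1]
        include = element.get("Include")
        if tag != "Compile" or include is None:
            continue
        # .csproj files conventionally use backslashes; normalise for any OS.
        relative = Path(*PureWindowsPath(include).parts)
        results.append((include, (csproj.parent / relative).is_file()))
    return results


def missing_links(csproj_path):
    """List the Include paths that do not resolve to a file on disk."""
    return [include for include, exists in linked_sources(csproj_path) if not exists]
```

Calling `missing_links("PolyPilot.Tests/PolyPilot.Tests.csproj")` from the repository root would return an empty list when every link resolves. Paths are resolved relative to the `.csproj` file, which is how the `..\PolyPilot\...` entries above are written.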
| Area | Test Files | What's Covered |
|---|---|---|
| Models | ChatMessageTests, AgentSessionInfoTests, BridgeMessageTests, FiestaModelTests, ConnectionSettingsTests | Serialization, round-trip JSON, enum parsing, property defaults |
| Multi-Agent | MultiAgentRegressionTests, MultiAgentGapTests, ReflectionCycleTests, WorktreeStrategyTests | Orchestration modes, reflection loops, stall detection, worktree isolation strategies |
| Organization | SessionOrganizationTests, SquadDiscoveryTests, SquadWriterTests | Group stability, .squad directory parsing, charter round-tripping |
| Services | CopilotServiceInitializationTests, ServerManagerTests, RepoManagerTests | Mode switching, reconnection, save guards, thread safety |
| Watchdog | ProcessingWatchdogTests, StuckSessionRecoveryTests | Timeout tiers, stuck session detection, recovery messages |
| Persistence | SessionPersistenceTests, UiStatePersistenceTests | Session merge, partial restore, UI state round-trip |
| Parsing | DiffParserTests, CommandParserTests, EventsJsonlParsingTests | Git diffs, slash commands, JSONL event streams |
| UI Logic | InputSelectionTests, SlashCommandAutocompleteTests, BottomBarTooltipTests | CSS classes, autocomplete ranking, tooltip formatting |
| Bridge | WsBridgeIntegrationTests, WsBridgeServerAuthTests | WebSocket messaging, authentication flow |
- `[Collection("BaseDir")]`: Any test class that calls `SetBaseDirForTesting()` must use this xUnit collection attribute. xUnit runs classes in the same collection sequentially, which prevents parallel tests from corrupting the shared base directory.
- Fake services: Tests use lightweight fakes (e.g., `FakeRepoManager`, `FakeCopilotService`) rather than mocks. Look in `TestStubs.cs` and individual test files for patterns.
- No network required: All unit tests are fully offline. Copilot SDK interactions are faked.
Scenarios are JSON files in PolyPilot.Tests/Scenarios/ that describe end-to-end user flows. They run against a live PolyPilot instance using MauiDevFlow's CDP (Chrome DevTools Protocol) commands to interact with the Blazor WebView.
| File | Scenarios | What's Covered |
|---|---|---|
| `multi-agent-scenarios.json` | 25 | OrchestratorReflect loop, stall detection, group lifecycle, Squad discovery, preset creation, charter injection, round-trip write-back |
| `mode-switch-scenarios.json` | 10+ | Persistent ↔ Embedded ↔ Remote mode switching, session persistence across restarts, CLI source switching, stuck session recovery |
1. Build and launch the app:

   ```bash
   cd PolyPilot
   # macOS
   ./relaunch.sh
   # Windows
   powershell -ExecutionPolicy Bypass -File relaunch.ps1
   ```

2. Wait for the MauiDevFlow agent to connect:

   ```bash
   maui-devflow MAUI status   # Poll until connected
   ```

3. Execute scenario steps via CDP:

   ```bash
   # Navigate
   maui-devflow cdp Input dispatchClickEvent "a[href='/settings']"
   # Read state
   maui-devflow cdp Runtime evaluate "document.querySelectorAll('.session-item').length"
   # Fill input
   maui-devflow cdp Input fill ".branch-input" "feature/my-branch"
   # Take screenshot for visual verification
   maui-devflow MAUI screenshot --output check.png
   ```
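The "poll until connected" step can be automated with a small retry loop. The sketch below is one possible approach, not something shipped with PolyPilot or MauiDevFlow. The helper names are hypothetical, and `agent_connected` rests on an unverified assumption: that `maui-devflow MAUI status` exits with code 0 once the agent is connected. If the command reports its state differently, replace that check (for example by inspecting its output).

```python
import subprocess
import time


def poll_until(check, timeout=60.0, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def agent_connected():
    """Report whether the MauiDevFlow agent looks connected.

    ASSUMPTION: the status command exits with code 0 when connected.
    This behaviour is not confirmed by the documentation above.
    """
    try:
        result = subprocess.run(
            ["maui-devflow", "MAUI", "status"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False  # CLI missing or hung: treat as "not connected yet"
    return result.returncode == 0
```

A script could call `poll_until(agent_connected, timeout=120, interval=2)` after launching the app and only start issuing CDP commands when it returns `True`. The `sleep` and `clock` parameters exist so the loop can be exercised without real waiting.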
Each scenario is a JSON object with an id, name, description, optional invariants (what must be true), and steps:
```json
{
  "id": "reflect-loop-completes-goal-met",
  "name": "OrchestratorReflect loop runs to goal completion",
  "invariants": [
    "ReflectionState.GoalMet == true on exit",
    "ReflectionState.IsActive == false on exit"
  ],
  "steps": [
    { "action": "createGroup", "mode": "OrchestratorReflect", "workers": 2 },
    { "action": "sendPrompt", "text": "Analyze the project structure" },
    { "action": "waitForPhase", "phase": "Complete", "timeout": 600 },
    { "action": "assertReflectionState", "field": "GoalMet", "expected": true }
  ]
}
```

`ScenarioReferenceTests.cs` bridges the two layers: it validates that scenario JSON files are well-formed and documents which unit tests cover the same invariants as each scenario. This ensures that every CDP scenario has a fast, deterministic unit-test equivalent.
For example, the scenario "mode-switch-persistent-to-embedded-and-back" is cross-referenced to ModeSwitch_RapidModeSwitches_NoCorruption in the unit test suite.
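To make the two checks concrete, here is a minimal sketch of a well-formedness check and a cross-reference check. It is not the code in `ScenarioReferenceTests.cs`. The function names are hypothetical, the cross-reference map (scenario id to unit test names) is an assumed data shape, and the sketch treats `description` and `invariants` as optional because the example scenario above omits `description`.

```python
import json

# ASSUMPTION: only these keys are treated as mandatory in this sketch.
REQUIRED_KEYS = ("id", "name", "steps")


def validate_scenario(scenario):
    """Return a list of problems; an empty list means the scenario is well-formed."""
    if not isinstance(scenario, dict):
        return ["scenario is not a JSON object"]
    problems = [f"missing required key '{key}'" for key in REQUIRED_KEYS if key not in scenario]
    if "steps" in scenario:
        steps = scenario["steps"]
        if not isinstance(steps, list) or not steps:
            problems.append("'steps' must be a non-empty list")
        else:
            for index, step in enumerate(steps):
                # Every step in the example format carries an "action" name.
                if not isinstance(step, dict) or "action" not in step:
                    problems.append(f"step {index} has no 'action'")
    invariants = scenario.get("invariants", [])
    if not isinstance(invariants, list) or not all(isinstance(item, str) for item in invariants):
        problems.append("'invariants' must be a list of strings")
    return problems


def uncovered_scenarios(scenarios, cross_references):
    """Return ids of scenarios with no unit test listed in cross_references."""
    return [
        scenario["id"]
        for scenario in scenarios
        if isinstance(scenario, dict) and "id" in scenario
        and not cross_references.get(scenario["id"])
    ]


example = json.loads("""
{
  "id": "reflect-loop-completes-goal-met",
  "name": "OrchestratorReflect loop runs to goal completion",
  "steps": [{ "action": "sendPrompt", "text": "Analyze the project structure" }]
}
""")
print(validate_scenario(example))          # prints []
print(uncovered_scenarios([example], {}))  # prints ['reflect-loop-completes-goal-met']
```

With a map such as `{"mode-switch-persistent-to-embedded-and-back": ["ModeSwitch_RapidModeSwitches_NoCorruption"]}`, `uncovered_scenarios` returns an empty list for that scenario. How the scenario files wrap their scenario objects at the top level is not shown here, so both functions take already-parsed objects rather than file paths.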
To add a unit test:

1. Create `YourFeatureTests.cs` in `PolyPilot.Tests/`.
2. If testing a new source file, add a `<Compile Include>` to `PolyPilot.Tests.csproj`.
3. Use `[Fact]` for single cases and `[Theory]` with `[InlineData]` for parameterized cases.
4. If your test uses `SetBaseDirForTesting()`, add `[Collection("BaseDir")]`.
To add a UI scenario:

1. Add a scenario object to the appropriate JSON file in `Scenarios/`.
2. Add a cross-reference test in `ScenarioReferenceTests.cs`.
3. Write a matching deterministic unit test that covers the same invariant.
For the multi-agent orchestration architecture, invariants, and detailed test matrix, see docs/multi-agent-orchestration.md.