Add end-to-end trust demo flow integration test#46

Merged
00PrabalK00 merged 2 commits into main from qa/demo-flow-e2e
May 31, 2026

Conversation

@00PrabalK00
Owner

What this proves

Adds tests/test_demo_flow.py, a single class-based unittest integration test that drives the full advertised v0.9.0 trust pipeline as one chained flow on a real temporary git project, through the public API (MemoryStore + managers) and the CLI surface (continuum.cli.main([...])):

continuum init -> objective --mode schedule (tasks + worktree lanes + file claims) -> real change + commit inside a lane worktree -> record_tests (PASS) + record_review (APPROVED) -> evidence -> pr-packet -> flight-record -> roi -> benchmark capture/compare.

Assertions

  1. objective ... --mode schedule produces tasks with lanes/owned paths and worktrees (exit 0).
  2. After a real worktree change+commit and recorded PASS/APPROVED gates, gather_evidence shows the changed file and PASS/APPROVED with no risks.
  3. flight-record final_status == "merge_ready".
  4. roi reports flight_records >= 1 and merge_ready_tasks >= 1.
  5. benchmark capture then compare against a synthetic worse baseline yields verdict "Continuum improved the measured run." (both via the pure functions and the CLI render path).
  6. Each CLI command (objective, evidence, pr-packet, flight-record, roi, benchmark capture/compare) returns exit 0 on the populated project and exit 1 (no traceback, graceful Error: on stderr) on a bogus task id.
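The exit-code contract in assertion 6 can be checked with a small capture helper. The sketch below is illustrative only: `main` is a hypothetical stand-in written for this example, not `continuum.cli.main`, and it assumes the entry point returns its exit code as an integer rather than raising `SystemExit`. The task id `task-1` is likewise made up.

```python
import contextlib
import io
import sys


def main(argv):
    """Hypothetical stand-in for a CLI entry point (NOT continuum.cli.main)."""
    known_tasks = {"task-1"}  # made-up task id for illustration
    if len(argv) != 2 or argv[0] != "evidence":
        print("Error: usage: evidence <task-id>", file=sys.stderr)
        return 1
    if argv[1] not in known_tasks:
        # Graceful failure: a one-line message on stderr, no exception escapes.
        print(f"Error: unknown task id {argv[1]!r}", file=sys.stderr)
        return 1
    print(f"evidence for {argv[1]}")
    return 0


def run_cli(argv):
    """Call main(argv) and return (exit_code, stdout_text, stderr_text)."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        code = main(argv)
    return code, out.getvalue(), err.getvalue()


if __name__ == "__main__":
    code, _, err = run_cli(["evidence", "bogus"])
    assert code == 1
    assert err.startswith("Error:")
    assert "Traceback" not in err
```

If the real entry point raises `SystemExit` instead of returning, the helper would need to catch it and read `exc.code`; which of the two applies here is not stated in this PR.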

Git identity is persisted in setUp (CI runners have no global identity); the project is a tempfile.TemporaryDirectory and all CLI calls pass --project so the real repo is never touched.
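A minimal sketch of this kind of fixture, assuming `git` is on `PATH`. The class name, the `git` helper, and the identity values are hypothetical and are not taken from tests/test_demo_flow.py; the point is only that a repo-local identity lets commits succeed when no global git config exists.

```python
import subprocess
import tempfile
import unittest
from pathlib import Path


def git(cwd, *args):
    """Run a git command in cwd and return its stripped stdout."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


class TempGitProjectTest(unittest.TestCase):
    """Hypothetical fixture: a throwaway repo with a repo-local identity."""

    def setUp(self):
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.project = Path(self._tmp.name)
        git(self.project, "init", "--quiet")
        # Written to the repo's own .git/config, so it works even when
        # the runner has no global user.name / user.email.
        git(self.project, "config", "user.email", "demo@example.invalid")
        git(self.project, "config", "user.name", "Demo Flow")

    def test_commit_works_with_local_identity(self):
        (self.project / "app.py").write_text("print('hi')\n", encoding="utf-8")
        git(self.project, "add", "app.py")
        git(self.project, "commit", "--quiet", "-m", "initial change")
        self.assertEqual(git(self.project, "log", "--format=%an", "-1"), "Demo Flow")
```

Because every command runs with `cwd` set to the temporary directory, nothing in the surrounding checkout is read or written.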

Integration result

The pipeline chains correctly end to end with no code changes required to objective.py or cli.py.

Finding (documented, not a blocker)

The out-of-scope risk heuristic in evidence._risks matches changed files against claimed paths by exact path, so owning a directory lane (e.g. --path backend=src) does not cover a nested change like src/app.py — it is flagged as an out-of-scope edit. The demo therefore owns precise file paths (--path backend=src/app.py), which is also how a reviewer wants scope expressed. This is in evidence.py (owned by another agent), so it was not modified; the behavior is documented in the README demo section and called out here for routing if directory-prefix scope matching is desired.
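To make the difference concrete, here is a sketch contrasting exact-path matching with directory-prefix matching. Both function names are hypothetical and this is not the code in `evidence._risks`; it only models the behavior described above and one possible prefix-aware alternative.

```python
from pathlib import PurePosixPath


def covered_exact(changed, claimed):
    """Exact-path matching: a change is in scope only if it is itself claimed."""
    return changed in claimed


def covered_by_prefix(changed, claimed):
    """Prefix matching: a claimed directory also covers files nested under it."""
    changed_path = PurePosixPath(changed)
    for claim in claimed:
        claim_path = PurePosixPath(claim)
        # Compare whole path components, so "src" does not cover "src2/app.py".
        if changed_path == claim_path or claim_path in changed_path.parents:
            return True
    return False


if __name__ == "__main__":
    print(covered_exact("src/app.py", ["src"]))        # False: flagged as out of scope
    print(covered_exact("src/app.py", ["src/app.py"]))  # True: precise file claim
    print(covered_by_prefix("src/app.py", ["src"]))     # True: directory lane covers it
```

Matching on path components rather than raw string prefixes avoids treating `src2/` as part of a `src` claim.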

Docs

Adds an "End-To-End Trust Demo" subsection to the README "Plan Objectives And Evidence" section showing the full chained flow and the file-level scoping note.

Verification

CI-mimic from the worktree root:

GIT_CONFIG_GLOBAL=/dev/null GIT_CONFIG_SYSTEM=/dev/null python -m unittest discover -s tests

All 289 tests pass (286 prior + 3 new).

🤖 Generated with Claude Code

00PrabalK00 and others added 2 commits May 31, 2026 23:45
Drives the full advertised v0.9.0 pipeline as one chained flow on a real
temporary git project: init -> objective (schedule: tasks + worktree lanes +
claims) -> commit change in lane worktree -> record PASS/APPROVED gates ->
evidence -> pr-packet -> flight-record (merge_ready) -> roi -> benchmark
capture/compare (improved verdict). Also asserts each CLI command exits 0 on
the populated project and exits 1 (no traceback) on a bogus task id.

Documents the file-level lane-scoping detail in the README demo section.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
00PrabalK00 merged commit f6a8159 into main on May 31, 2026
12 checks passed