Add end-to-end trust demo flow integration test by 00PrabalK00 · Pull Request #46 · 00PrabalK00/Continuum

00PrabalK00 · 2026-05-31T16:46:11Z

What this proves

Adds tests/test_demo_flow.py, a single class-based unittest integration test that drives the full advertised v0.9.0 trust pipeline as one chained flow on a real temporary git project, through the public API (MemoryStore + managers) and the CLI surface (continuum.cli.main([...])):

continuum init -> objective --mode schedule (tasks + worktree lanes + file claims) -> real change + commit inside a lane worktree -> record_tests (PASS) + record_review (APPROVED) -> evidence -> pr-packet -> flight-record -> roi -> benchmark capture/compare.

Assertions

- objective ... --mode schedule produces tasks with lanes/owned paths and worktrees (exit 0).
- After a real worktree change+commit and recorded PASS/APPROVED gates, gather_evidence shows the changed file and PASS/APPROVED with no risks.
- flight-record final_status == "merge_ready".
- roi reports flight_records >= 1 and merge_ready_tasks >= 1.
- benchmark capture then compare against a synthetic worse baseline yields verdict "Continuum improved the measured run." (both via the pure functions and the CLI render path).
- Each CLI command (objective, evidence, pr-packet, flight-record, roi, benchmark capture/compare) returns exit 0 on the populated project and exit 1 (no traceback, graceful Error: on stderr) on a bogus task id.
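The exit-code contract in the last assertion can be checked with a small helper. The sketch below is illustrative only: fake_main is a hypothetical stand-in written for this example, not continuum.cli.main, and the task id and error text are made up. It assumes the entry point either returns an integer exit code or raises SystemExit.

```python
import contextlib
import io
import sys


def run_cli(main, argv):
    """Call a CLI entry point and return (exit_code, captured_stderr)."""
    stderr = io.StringIO()
    with contextlib.redirect_stderr(stderr):
        try:
            code = main(argv)
        except SystemExit as exc:
            # Some entry points exit instead of returning a code.
            code = exc.code
    return (0 if code is None else code), stderr.getvalue()


def fake_main(argv):
    """Hypothetical stand-in for a CLI entry point (assumption, not the real CLI)."""
    known_tasks = {"task-1"}
    task_id = argv[-1]
    if task_id not in known_tasks:
        # Graceful failure: a one-line error on stderr and a non-zero code.
        print(f"Error: unknown task {task_id}", file=sys.stderr)
        return 1
    return 0


ok_code, ok_err = run_cli(fake_main, ["evidence", "task-1"])
bad_code, bad_err = run_cli(fake_main, ["evidence", "no-such-task"])
```

In a real test the same helper would wrap the actual entry point, and the "no traceback" check reduces to asserting that the captured stderr does not contain "Traceback".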

Git identity is persisted in setUp (CI runners have no global identity); the project is a tempfile.TemporaryDirectory and all CLI calls pass --project so the real repo is never touched.
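A minimal sketch of that setup pattern, assuming git is on PATH. The class name, helper, identity values, and file name are hypothetical and are not taken from tests/test_demo_flow.py. The identity is written to the repository's local config, so commits succeed even when no global identity exists.

```python
import shutil
import subprocess
import tempfile
import unittest
from pathlib import Path


def git(project, *args):
    """Run a git command inside the project and return its trimmed stdout."""
    result = subprocess.run(
        ["git", *args], cwd=project, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


@unittest.skipUnless(shutil.which("git"), "git is required for this sketch")
class TempGitProjectTest(unittest.TestCase):
    def setUp(self):
        # Throwaway project directory; removed again after each test.
        self._tmp = tempfile.TemporaryDirectory()
        self.addCleanup(self._tmp.cleanup)
        self.project = Path(self._tmp.name)
        git(self.project, "init")
        # Repo-local identity (hypothetical values) so commits do not depend
        # on a global git config being present on the runner.
        git(self.project, "config", "user.email", "demo@example.invalid")
        git(self.project, "config", "user.name", "Demo Runner")

    def test_commit_uses_local_identity(self):
        (self.project / "app.py").write_text("print('hi')\n")
        git(self.project, "add", "app.py")
        git(self.project, "commit", "-m", "initial")
        self.assertEqual(git(self.project, "log", "-1", "--format=%an"), "Demo Runner")
```

Every later step then receives self.project explicitly, which mirrors passing --project to each CLI call.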

Integration result

The pipeline chains correctly end to end with no code changes required to objective.py or cli.py.

Finding (documented, not a blocker)

The out-of-scope risk heuristic in evidence._risks compares changed files against claimed paths by exact path, so owning a directory lane (e.g. --path backend=src) does not cover a nested change such as src/app.py, which is flagged as an out-of-scope edit. The demo therefore owns precise file paths (--path backend=src/app.py), which is also how a reviewer wants scope expressed. The heuristic lives in evidence.py (owned by another agent), so it was not modified; the behavior is documented in the README demo section and called out here for routing if directory-prefix scope matching is desired.
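To make the difference concrete, here is a rough sketch of exact-path matching next to one possible directory-prefix variant. Both functions are hypothetical illustrations written for this note; neither is the code in evidence._risks, and the prefix variant is only one way such matching could be done.

```python
from pathlib import PurePosixPath


def out_of_scope_exact(changed, claimed):
    """Flag every changed path that is not literally one of the claimed paths."""
    claimed_set = set(claimed)
    return [path for path in changed if path not in claimed_set]


def out_of_scope_prefix(changed, claimed):
    """Flag changed paths that are neither a claimed path nor nested under one."""
    claims = [PurePosixPath(c) for c in claimed]
    flagged = []
    for path in changed:
        p = PurePosixPath(path)
        # Compare whole path components, so a claim on "src" does not cover "srcs/".
        covered = any(p == c or c in p.parents for c in claims)
        if not covered:
            flagged.append(path)
    return flagged


# A directory claim does not cover a nested file under exact matching...
exact_dir_claim = out_of_scope_exact(["src/app.py"], ["src"])
# ...which is why the demo claims the file itself.
exact_file_claim = out_of_scope_exact(["src/app.py"], ["src/app.py"])
# A prefix variant would accept the directory claim.
prefix_dir_claim = out_of_scope_prefix(["src/app.py"], ["src"])
```

The prefix variant compares path components rather than raw string prefixes, so a sibling directory whose name merely starts with the claimed name is still flagged.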

Docs

Adds an "End-To-End Trust Demo" subsection to the README "Plan Objectives And Evidence" section showing the full chained flow and the file-level scoping note.

Verification

CI-mimic from the worktree root:

GIT_CONFIG_GLOBAL=/dev/null GIT_CONFIG_SYSTEM=/dev/null python -m unittest discover -s tests

All 289 tests pass (286 prior + 3 new).

🤖 Generated with Claude Code

Drives the full advertised v0.9.0 pipeline as one chained flow on a real temporary git project: init -> objective (schedule: tasks + worktree lanes + claims) -> commit change in lane worktree -> record PASS/APPROVED gates -> evidence -> pr-packet -> flight-record (merge_ready) -> roi -> benchmark capture/compare (improved verdict). Also asserts each CLI command exits 0 on the populated project and exits 1 (no traceback) on a bogus task id. Documents the file-level lane-scoping detail in the README demo section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

00PrabalK00 and others added 2 commits May 31, 2026 23:45

Merge branch 'main' into qa/demo-flow-e2e

c3dc210

00PrabalK00 merged commit f6a8159 into main May 31, 2026
12 checks passed

00PrabalK00 mentioned this pull request May 31, 2026

Evidence: directory-level lane scope flags nested files as out-of-scope #48

Open

00PrabalK00 merged 2 commits into main from qa/demo-flow-e2e
