ci: trigger CI on ci-fix branches and ci-fixer-e2e-test PRs by rnagulapalle · Pull Request #8 · usephalanx/phalanx

rnagulapalle · 2026-04-15T23:08:30Z

Summary

- Add phalanx/ci-fix/** to push triggers, so CI runs when the fixer commits a fix branch
- Add ci-fixer-e2e-test to the pull_request base branch list, so CI runs on fix PRs targeting that branch

Why

CI fix PRs (e.g. PR #7) were opening with no CI checks because the workflow only triggered on PRs targeting main or develop. Fix branches target the original failing branch (ci-fixer-e2e-test), so CI never fired.
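The trigger gap can be illustrated with a minimal Python sketch of the decision the workflow makes. This is a simplified model, not GitHub's implementation: `ci_should_run`, `glob_to_regex`, and the two lists are hypothetical names, and treating `main`/`develop` as push filters is an assumption (the PR only names them as pull_request bases).

```python
import re

# Branch filters after this PR. Only "phalanx/ci-fix/**" and "ci-fixer-e2e-test"
# are named in the PR; "main"/"develop" as push filters are an assumption.
PUSH_BRANCHES = ["main", "develop", "phalanx/ci-fix/**"]
PR_BASE_BRANCHES = ["main", "develop", "ci-fixer-e2e-test"]


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Simplified branch glob: '**' may cross '/', a single '*' may not."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def ci_should_run(event: str, branch: str) -> bool:
    """For push, `branch` is the pushed branch; for pull_request, the PR base."""
    if event == "push":
        patterns = PUSH_BRANCHES
    elif event == "pull_request":
        patterns = PR_BASE_BRANCHES
    else:
        patterns = []
    return any(glob_to_regex(p).fullmatch(branch) for p in patterns)
```

Under this model a PR whose base is ci-fixer-e2e-test matches only because that branch is now in the base list; with just main and develop it would not.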

Result

Every phalanx/ci-fix/* push now runs the full quality gate suite, giving the fix PR a green/red CI signal before anyone reviews it.

rnagulapalle · 2026-04-15T23:10:16Z

🔍 Phalanx CI Fixer investigated the ruff failure but could not produce a safe fix.

Diagnosed root cause: Multiple files contain unused imports, unsorted import blocks, and f-strings without placeholders, requiring edits in several locations.

Reason: max_turns_exceeded

No code was committed. Fix run: 60f1de27-7352-4194-ad67-4c6217b07d8e

- ruff --fix auto-resolved 131 of 160 violations (F401, I001, UP037, F541, TC003, TC005)
- pyproject.toml: add per-file-ignores for tests/** (N806, SIM117, F811, E402, SIM105); mock class names, nested-with patterns, and late imports are intentional conventions in test scaffolding, not real bugs

ruff check phalanx/ tests/ → All checks passed!

rnagulapalle · 2026-04-15T23:16:16Z

🔍 Phalanx CI Fixer investigated the mypy failure but could not produce a safe fix.

Diagnosed root cause: Multiple files assign objects of the wrong type to variables typed as 'LintError', causing mypy type errors.

Reason: max_turns_exceeded

No code was committed. Fix run: 71511e35-4fa6-4f0e-8689-cd5e80d8e86f

…PI endpoints

Introduces the foundational shared-state layer for the multi-agent CI fix pipeline, PR deduplication, and inspectable API endpoints for pipeline state.

What's in Phase 1:
- phalanx/ci_fixer/context.py: CIFixContext dataclass (+ StructuredFailure, ClassifiedFailure, ReproductionResult, VerifiedPatch, VerificationResult) persisted as JSON in pipeline_context_json; full to_dict/from_dict round-trip; current_stage property walks populated fields
- alembic/versions/20260415_0001_ci_fix_context.py: migration adding pipeline_context_json Text column to ci_fix_runs
- phalanx/db/models.py: pipeline_context_json mapped_column
- phalanx/agents/ci_fixer.py: _find_existing_fix_pr() checks the GitHub API for open phalanx/ci-fix/* PRs before opening a new one (no duplicate PRs); _persist_context() helper; context initialised + updated at each stage
- phalanx/api/routes/ci_fix_runs.py: GET /v1/ci-fix-runs/{run_id}/context, GET /v1/ci-fix-runs/{run_id}, GET /v1/ci-fix-runs (list w/ filters)
- phalanx/api/main.py: register ci_fix_runs_router
- docs/MULTI_AGENT_CI_FIXER.md: full architecture brainstorm doc (7-agent DAG, sandbox design, fallback ladder, phased plan)

Tests: 46 new unit tests across test_ci_fix_context.py and test_ci_fix_runs_api.py; Phase 1 files at 100% coverage; full suite 1666 passed, 80.57% overall.
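The to_dict/from_dict round-trip and the current_stage walk described for Phase 1 could look roughly like the sketch below. It is a cut-down illustration, not the real context.py: only two of the nested result types are shown, and every field name (`run_id`, `verdict`) and stage label (`intake`, `reproduced`, `verified`) is hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class ReproductionResult:
    verdict: str  # hypothetical field


@dataclass
class VerificationResult:
    verdict: str  # hypothetical field


@dataclass
class CIFixContext:
    run_id: str  # hypothetical field
    reproduction_result: Optional[ReproductionResult] = None
    verification_result: Optional[VerificationResult] = None

    def to_dict(self) -> Dict[str, Any]:
        # asdict() recurses into nested dataclasses, so the result is JSON-safe
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "CIFixContext":
        repro = data.get("reproduction_result")
        verif = data.get("verification_result")
        return cls(
            run_id=data["run_id"],
            reproduction_result=ReproductionResult(**repro) if repro else None,
            verification_result=VerificationResult(**verif) if verif else None,
        )

    @property
    def current_stage(self) -> str:
        # Walk from the latest stage backwards; the first populated field wins
        if self.verification_result is not None:
            return "verified"
        if self.reproduction_result is not None:
            return "reproduced"
        return "intake"


# Round-trip through the JSON text that would be stored in pipeline_context_json
ctx = CIFixContext(run_id="r1", reproduction_result=ReproductionResult("confirmed"))
restored = CIFixContext.from_dict(json.loads(json.dumps(ctx.to_dict())))
```

Deriving the stage from which fields are populated means no separate stage column has to be kept in sync with the stored JSON.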

Adds the sandbox provisioning and failure reproduction layer to the multi-agent CI fix pipeline.

What's in Phase 2:

phalanx/ci_fixer/sandbox.py (NEW):
- SandboxProvisioner.detect_stack(): pure file-existence detection for python/node/go/rust/unknown; python wins tie-breaks (checked first)
- SandboxProvisioner.provision(): returns SandboxResult with sandbox_id, stack, image, workspace_path; async for Phase 3 Docker forward-compat
- sandbox_enabled=False fast-path → returns None → reproducer skips
- _STACK_FILES / _STACK_IMAGES module constants; stack_hint bypass

phalanx/ci_fixer/reproducer.py (NEW):
- ReproducerAgent.reproduce(): runs reproducer_cmd in subprocess, classifies into: confirmed / flaky / env_mismatch / timeout / skipped
- _output_matches_failure(): conservative match; tool name OR any structured error code (e.g. F401) in stdout/stderr → confirmed
- asyncio.create_subprocess_shell + asyncio.wait_for for timeout
- Process killed + reaped on timeout breach
- Empty/whitespace cmd guard → skipped

phalanx/agents/ci_fixer.py (MODIFIED):
- Import SandboxProvisioner + ReproducerAgent at module level
- Phase 2 block inserted after clone, before analyst loop: provision → update ctx → reproduce → update ctx → flaky gate → env_mismatch gate; both gate exits call _mark_failed + ctx.complete

phalanx/config/settings.py (MODIFIED):
- sandbox_docker_cmd, sandbox_timeout_seconds, sandbox_enabled settings

Tests: 39 new unit tests; sandbox.py 100%, reproducer.py 97%; full suite 1705 passed, 80.71% overall.
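The reproducer flow listed above (empty-command guard, shell subprocess, timeout with kill and reap, conservative output match) can be sketched as follows. The function names and signatures are hypothetical. The commit does not say how flaky and env_mismatch are told apart, so the mapping used here (exit code 0 means flaky, non-zero without a match means env_mismatch) is purely an assumption.

```python
import asyncio
from typing import List


def output_matches_failure(output: str, tool: str, error_codes: List[str]) -> bool:
    # Conservative match: the tool name OR any structured error code appears
    return (bool(tool) and tool in output) or any(c in output for c in error_codes)


async def reproduce(
    cmd: str, tool: str, error_codes: List[str], timeout: float = 60.0
) -> str:
    if not cmd or not cmd.strip():
        return "skipped"  # empty/whitespace command guard

    proc = await asyncio.create_subprocess_shell(
        cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into stdout
    )
    try:
        stdout, _ = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # stop the runaway command
        await proc.wait()  # reap it so no zombie process is left behind
        return "timeout"

    output = stdout.decode(errors="replace")
    if proc.returncode == 0:
        return "flaky"  # ASSUMPTION: the failure did not reproduce
    if output_matches_failure(output, tool, error_codes):
        return "confirmed"
    return "env_mismatch"  # ASSUMPTION: failed, but not with the CI failure
```

A call such as `asyncio.run(reproduce("ruff check .", "ruff", ["F401"]))` would then return one of the five verdict strings.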

Adds a broad post-fix verification sweep to catch regressions before committing, and wires ctx.verification_result into the pipeline.

What's in Phase 3:

phalanx/ci_fixer/verifier.py (NEW):
- VerifierAgent.verify(): runs the full verification profile for the detected stack; returns VerificationResult(verdict: passed/failed/skipped/timeout)
- Per-stack profiles: python (pytest if infra detected + ruff full repo), node (npm test), go (go test ./...), rust (cargo test)
- Unknown stack → skipped (non-blocking)
- Timeout per step is skipped (conservative); all-timed-out → "timeout"
- FileNotFoundError → step reports tool-not-found, does not raise
- _has_pytest(): detects pyproject.toml / pytest.ini / setup.cfg
- _run_cmd(): asyncio.create_subprocess_exec + wait_for timeout

phalanx/agents/ci_fixer.py (MODIFIED):
- Import VerifierAgent at module level
- Post-fix verification block: verify() → ctx.verification_result set → verdict="failed" → ctx.complete("escalated"), _mark_failed, early return
- Bug fix: removed Phase 1 stub that unconditionally overwrote ctx.reproduction_result with verdict="skipped" on the success path; now only sets it as a fallback when sandbox was disabled (still None)

Tests: 23 new unit tests in test_ci_fixer_verifier.py; verifier.py 97% coverage; full suite 1728 passed, 80.80% overall.
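The per-stack profiles and the verdict rules above can be modelled with a small pure function. This is a sketch under assumptions: `PROFILES`, `plan_for`, and `aggregate` are hypothetical names, the exact python commands are guesses (the commit only says "pytest" and "ruff full repo"), and treating any failed step as an overall failure is inferred from the escalation path rather than stated.

```python
from typing import Dict, List, Literal

Verdict = Literal["passed", "failed", "skipped", "timeout"]
StepOutcome = Literal["passed", "failed", "timeout"]

# Commands per stack; the python entries are assumed spellings
PROFILES: Dict[str, List[List[str]]] = {
    "python": [["pytest"], ["ruff", "check", "."]],
    "node": [["npm", "test"]],
    "go": [["go", "test", "./..."]],
    "rust": [["cargo", "test"]],
}


def plan_for(stack: str) -> List[List[str]]:
    # Unknown stacks get an empty plan, which aggregates to "skipped"
    return PROFILES.get(stack, [])


def aggregate(outcomes: List[StepOutcome]) -> Verdict:
    if not outcomes:
        return "skipped"  # nothing ran: non-blocking
    if any(o == "failed" for o in outcomes):
        return "failed"   # ASSUMPTION: one failing step fails the sweep
    if all(o == "timeout" for o in outcomes):
        return "timeout"  # every step timed out
    return "passed"       # individual timeouts are skipped over
```

Typing the verdict as a `Literal` rather than a plain `str` lets mypy reject a misspelled verdict at the call site.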

…, container exec

CircleCI:
- log_fetcher: full v2 API (workflow jobs → step logs → raw output)
- ci_webhooks: signature verify, workflow-completed handler, double-prefix bug fix
- settings: circleci_token, circleci_webhook_secret

Sandbox pool:
- sandbox_pool.py: SandboxPool with asyncio.Queue per stack, checkout/checkin/borrow, background refill + reaper, Celery-fork-safe lazy singleton, docker exec helpers
- sandbox.py: SandboxResult gains container_id + mount_path; SandboxProvisioner uses pool (provision/release); fallback chain preserved (available=False on error)
- reproducer.py: _run_subprocess wraps command with docker exec when container_id set
- verifier.py: _run_cmd wraps command with docker exec when container_id set
- docker/sandbox/: Dockerfiles for python/node/go/rust + reset.sh

Tests: 1794 passing, 81% coverage, all new modules ≥85%
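A queue-per-stack pool with checkout/checkin/borrow, as listed above, might be shaped like this. It is a sketch only: the method signatures are assumptions, containers are represented by plain ID strings, and the background refill, reaper, lazy singleton, and docker exec helpers are left out.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Dict, Iterable, Optional


class SandboxPool:
    def __init__(self, stacks: Iterable[str]) -> None:
        # One queue of ready container IDs per stack
        self._queues: Dict[str, "asyncio.Queue[str]"] = {
            stack: asyncio.Queue() for stack in stacks
        }

    def checkin(self, stack: str, container_id: str) -> None:
        self._queues[stack].put_nowait(container_id)

    async def checkout(self, stack: str, timeout: float = 1.0) -> Optional[str]:
        queue = self._queues.get(stack)
        if queue is None:
            return None  # unknown stack: caller falls back
        try:
            return await asyncio.wait_for(queue.get(), timeout)
        except asyncio.TimeoutError:
            return None  # pool exhausted: caller falls back

    @asynccontextmanager
    async def borrow(
        self, stack: str, timeout: float = 1.0
    ) -> AsyncIterator[Optional[str]]:
        container_id = await self.checkout(stack, timeout)
        try:
            yield container_id
        finally:
            # Return the container even if the borrowing block raised
            if container_id is not None:
                self.checkin(stack, container_id)


async def demo() -> list:
    pool = SandboxPool(["python"])  # built inside the running event loop
    pool.checkin("python", "c1")
    seen = []
    async with pool.borrow("python") as cid:
        seen.append(cid)                                      # borrowed container
        seen.append(await pool.checkout("python", 0.05))      # pool is now empty
    seen.append(await pool.checkout("python", 0.05))          # returned after borrow
    seen.append(await pool.checkout("rust", 0.05))            # unknown stack
    return seen
```

Returning `None` instead of raising on an empty pool mirrors the "fallback chain preserved" note: the caller can continue without a container.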

… for CI gate

Wrap the reproduce→verify→commit block in try/finally so provisioner.release() is guaranteed on every exit path — normal return, early return (flaky/env_mismatch/validation_failed), and uncaught exception. Container is returned to the pool rather than leaking until the reaper kills it.
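The release guarantee described above comes down to a try/finally around the whole block. The sketch below uses a hypothetical `FakeProvisioner` and `run_fix` to exercise the three exit paths; whether the real provision/release calls are coroutines is an assumption.

```python
import asyncio
from typing import List


class FakeProvisioner:
    """Stand-in for the real provisioner; records every release call."""

    def __init__(self) -> None:
        self.released: List[str] = []

    async def provision(self) -> str:
        return "sandbox-1"

    async def release(self, sandbox_id: str) -> None:
        self.released.append(sandbox_id)


async def run_fix(
    provisioner: FakeProvisioner, reproduce_verdict: str, crash: bool = False
) -> str:
    sandbox_id = await provisioner.provision()
    try:
        if reproduce_verdict in ("flaky", "env_mismatch"):
            return "failed"  # early return: finally still runs
        if crash:
            raise RuntimeError("verifier crashed")  # exception: finally still runs
        return "committed"  # normal return
    finally:
        await provisioner.release(sandbox_id)
```

In all three cases the sandbox ID ends up in `released`, so the container goes back to the pool immediately instead of waiting for the reaper.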

rnagulapalle

lgtm

Three test files, 13 tests, ~0.8s. Each one is a static or schema-level guard against a bug class that bit us during canary, without requiring real Anthropic, real OpenAI, or real Docker:

test_techlead_openai_message_shape.py (5 tests, bug #5)
Mimics the OpenAI Responses API's input contract via a small schema validator. Re-runs cifix_techlead._tool_result_message and asserts it would be ACCEPTED. If a future refactor regresses to role='tool' or top-level tool_use_id (the actual canary failure), the validator raises ResponsesApiSchemaError before deploy.

test_engineer_wires_llm_call.py (5 tests, bug #6)
Source-level inspection of cifix_engineer.execute(). Asserts:
- run_coder_subagent is called
- llm_call= is passed (not the test-only NotImplementedError stub)
- build_sonnet_coder_callable + coder_subagent_tool_schemas + CODER_SUBAGENT_SYSTEM_PROMPT are imported
Plus a sister check that v2's _call_sonnet_llm IS still a stub; the day someone wires it for real, this test reminds us we no longer need the explicit injection.

test_state_transition_audit.py (3 tests, bug #2)
Asserts ALL four v3 agents inherit BaseAgent._audit unchanged (no shadowing). The signature-mismatch bug from canary #2 fails this check at import time. Plus a real-DB integration test that runs cifix_commander._transition_run('INTAKE','RESEARCHING') against a live Postgres row and verifies it doesn't TypeError; skips cleanly if DATABASE_URL isn't reachable so dev workflow isn't blocked.

conftest.py
Real-Postgres fixtures (db_engine module-scoped, db_session per-test with rollback) following tests/integration/test_db_constraints pattern. Plus cifix_project + cifix_work_order fixtures with work_order_type='ci_fix' shape.

Coverage of the canary bug list now:

Bug | Class     | Tier-1 | Tier-2 | Tier-3
#1  | infra     | ✓      |        |
#2  | shadowing |        | ✓      |
#3  | infra     | ✓      |        |
#4  | parser    | ✓      |        |
#5  | provider  |        | ✓      |
#6  | wiring    |        | ✓      |
#7  | prompt    |        |        | (canary)
#8  | prompt    |        |        | (canary)
apt | regex     | ✓      |        |

6 of 8 humanize-canary bugs are now caught locally pre-deploy. The remaining 2 (prompt issues) require real LLM + real repo and stay in the canary process.

Combined harness runtime: 51 + 13 = 64 tests, ~2 seconds total.

Run with:
pytest tests/integration/v3_harness/ (Tier-1, no deps)
pytest tests/integration/v3_harness_t2/ (Tier-2, skips DB tests if Postgres absent)
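The "inherit BaseAgent._audit unchanged" guard from test_state_transition_audit.py can be approximated with an identity check on the class attribute. The classes below are stand-ins invented for illustration, not the real v3 agents, and `shadows_audit` is a hypothetical helper name.

```python
class BaseAgent:
    def _audit(self, event: str) -> str:
        return f"audit:{event}"


class InheritingAgent(BaseAgent):
    """Uses the base implementation as-is."""


class ShadowingAgent(BaseAgent):
    # Redefines _audit with a different signature: the bug class being guarded
    def _audit(self, event: str, extra: str) -> str:
        return f"audit:{event}:{extra}"


def shadows_audit(agent_cls: type) -> bool:
    # Attribute lookup on a class returns the same function object unless a
    # subclass (or an intermediate base) has redefined it.
    return agent_cls._audit is not BaseAgent._audit
```

Because the check compares function objects instead of calling them, it needs no database or LLM and fails as soon as a subclass redefines the method.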

ci: trigger on phalanx/ci-fix/** push + ci-fixer-e2e-test PRs

f0c7d7a

FORGE added 2 commits April 15, 2026 16:12

fix(format): run ruff format — 28 files reformatted

e053c6f

FORGE added 8 commits April 15, 2026 17:27

fix(lint): sort imports in test_ci_fix_context + test_ci_fix_runs_api…

c21b945

… for CI gate

fix(format): ruff format all modified files for CI gate

61453df

fix(types): annotate verdict as Literal type to satisfy mypy

a1e2c64

rnagulapalle commented Apr 18, 2026


rnagulapalle merged commit 739af6f into main Apr 18, 2026
6 of 8 checks passed

rnagulapalle deleted the fix/ci-workflow-triggers branch April 18, 2026 00:03

rnagulapalle merged 11 commits into main from fix/ci-workflow-triggers
