Deterministic CI root-cause analysis for failed CI runs.
From the curated MVP benchmark (6 cases):
- Classification accuracy: 100% (6/6)
- Baseline classification accuracy: 66.67% (4/6)
- Improvement: +33.33 percentage points (about 50% relative lift vs. baseline)
- Artifact hash reproducibility: 100%
- Confidence reproducibility: 100%
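The improvement figures follow directly from the two accuracy counts. A minimal sketch of the arithmetic (the function name `lift` is illustrative and not part of the project):

```python
def lift(candidate_correct: int, baseline_correct: int, total: int) -> tuple[float, float]:
    """Return (absolute lift in percentage points, relative lift in percent)."""
    candidate = 100.0 * candidate_correct / total   # 6/6 -> 100.0
    baseline = 100.0 * baseline_correct / total     # 4/6 -> 66.67
    absolute = candidate - baseline                 # percentage points
    relative = 100.0 * absolute / baseline          # percent of the baseline
    return round(absolute, 2), round(relative, 2)


print(lift(6, 4, 6))  # (33.33, 50.0)
```

The absolute figure is a difference of percentages; the relative figure divides that difference by the baseline accuracy.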
Benchmark source:
Primary path (recommended):
- Install the GitHub App for your target repository.
- Configure app runtime with safe defaults:
  - `enabled=true`
  - `post_comment=true`
  - `enable_pr_mode=false`
  - `create_fix_pr=false`
- Trigger a failed `workflow_run` and verify:
  - RCA comment appears on PR/commit context
  - `ci-rca.json` and `ci-rca.md` paths are returned
  - Outcome status/reason codes are machine-readable
Setup references:
- `docs/app-first-mvp.md`
- `docs/app-config-contract.md`
- `docs/app-outcome-codes.md`
- `docs/app-operations.md`
- `docs/migration-action-to-app.md`
Use this path when you want explicit workflow YAML control. It keeps PR creation disabled (create_fix_pr: "false"), so you get RCA output first.
```yaml
name: ci-rootcause
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]
permissions:
  contents: read
  pull-requests: write
  actions: read
jobs:
  root-cause:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Analyze failed CI run
        id: rca
        uses: ibrahim1023/ci-rootcause-action@v0
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          create_fix_pr: "false"
          post_pr_comment: "true"
```

Expected outputs from the action step:
- `classification`
- `confidence`
- `primary_root_cause_title`
- `pr_failure_reason_code` + `pr_failure_reason` (`CREATE_FIX_PR_DISABLED` + `create_fix_pr=false` in safe mode)
- `rca_json_path` and `rca_md_path` (artifact files)
Reference artifact examples:
- `artifacts/benchmark-mvp/case-typecheck-ts2345/ci-rca.json`
- `artifacts/benchmark-mvp/case-typecheck-ts2345/ci-rca.md`
Recommended default for new users: deterministic.
| Mode | Autonomy | Key requirement | Cost profile | Risk profile |
|---|---|---|---|---|
| `deterministic` | Rule-based only | None | Lowest | Lowest |
| `agentic_assist` | LLM proposes candidate fix steps, deterministic pipeline validates/falls back | Hosted providers require API key; local does not | Medium | Low-medium |
| `agentic_full` | Highest autonomy path (explicit opt-in gate required) | Hosted providers require API key; local does not | Highest | Highest |
Provider support:
- Hosted: `openai`, `gemini`, `anthropic` (require `provider_api_key` in agentic modes).
- Local: `local` (Ollama endpoint compatible, no paid vendor API key required).
Action secret examples:
- `provider: openai` + `provider_api_key: ${{ secrets.OPENAI_API_KEY }}`
- `provider: gemini` + `provider_api_key: ${{ secrets.GEMINI_API_KEY }}`
- `provider: anthropic` + `provider_api_key: ${{ secrets.ANTHROPIC_API_KEY }}`
- `provider: local` + no `provider_api_key`
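The mode and provider rules above can be read as a small pre-flight check. The sketch below is an illustration only: the function name and the message strings are hypothetical, and the action's real validation may differ.

```python
HOSTED_PROVIDERS = {"openai", "gemini", "anthropic"}
AGENTIC_MODES = {"agentic_assist", "agentic_full"}


def check_agentic_config(
    mode: str,
    provider: str,
    provider_api_key: str = "",
    enable_agentic_full: bool = False,
) -> list[str]:
    """Return a list of problems; an empty list means the combination is allowed."""
    problems: list[str] = []
    if mode not in AGENTIC_MODES | {"deterministic"}:
        problems.append(f"unsupported mode: {mode}")
    if provider not in HOSTED_PROVIDERS | {"local"}:
        problems.append(f"unsupported provider: {provider}")
    # agentic_full sits behind an explicit opt-in gate.
    if mode == "agentic_full" and not enable_agentic_full:
        problems.append("agentic_full requires enable_agentic_full=true")
    # Hosted providers need a key only in agentic modes; local never does.
    if mode in AGENTIC_MODES and provider in HOSTED_PROVIDERS and not provider_api_key:
        problems.append("hosted provider requires provider_api_key")
    return problems


print(check_agentic_config("agentic_assist", "local"))  # []
```

In `deterministic` mode the provider settings are not consulted by this check, which matches the "None" key requirement in the table.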
ci-rootcause analyzes CI failures and produces:
- Structured failure graph
- Deterministic root-cause ranking
- Deterministic confidence score
- Evidence-backed fix plan
- Deterministic patch plan operations (`modify`/`create`/`delete`/`rename`)
- Optional guarded fix PR (never auto-merged)
- `ci-rca.json` and `ci-rca.md` artifacts
- `ci-rca-observability.json` run telemetry artifact (trace/timing/failure taxonomy)
Primary runtime target is GitHub Actions. Provider adapter defaults support GitHub Actions and GitLab CI metadata resolution.
Ideal use cases:
- CI failed and you need deterministic root-cause ranking with evidence, not just a generic summary.
- You want machine-readable RCA artifacts (`ci-rca.json`) for automation/reporting.
- You want safe, guardrailed fix PR proposals with explicit confidence thresholds.
- You need consistent behavior across repeated runs on the same inputs.
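One way to check "consistent behavior across repeated runs" yourself is to hash the artifact bytes from each run and compare the digests. This is a sketch under that assumption; it does not claim to be how the project's benchmark computes artifact-hash reproducibility, and the helper names are illustrative.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """SHA-256 hex digest of one artifact's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def same_artifact(run_outputs: list[bytes]) -> bool:
    """True when every run produced byte-identical output."""
    return len({artifact_digest(data) for data in run_outputs}) == 1


first = b'{"classification": "example"}'   # placeholder content, not a real artifact
second = b'{"classification": "example"}'
print(same_artifact([first, second]))  # True
```

To compare real runs, read each `ci-rca.json` with `pathlib.Path(...).read_bytes()` and pass the results to `same_artifact`. Pinning `--timestamp` and `--run-id`, as the CLI examples in this README do, keeps those inputs identical across runs.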
Not a fit (non-goals):
- Running arbitrary autonomous repo-wide refactors.
- Replacing your normal test/lint/build workflows.
- Auto-merging remediation changes without human review.
Comparison with formatter-only autofix workflows:
| Capability | ci-rootcause | Formatter/Linter autofix flow |
|---|---|---|
| Works from failed CI logs + diff | Yes | Usually no |
| Root-cause classification | Yes | No |
| Ranked RCA with confidence | Yes | No |
| Structured RCA artifact (`ci-rca.json`) | Yes | No |
| Guardrailed optional fix PRs | Yes | Yes (tool-dependent) |
| Designed for deterministic replay | Yes | Varies |
```mermaid
flowchart LR
  A[CI Logs + Diff] --> B[Log Ingest Agent]
  A --> C[Diff Analysis Agent]
  B --> D[Failure Classification Agent]
  C --> E[Root Cause Ranker Agent]
  D --> E
  E --> F[Fix Planner Agent]
  E --> G[Reporter Agent]
  F --> H[PR Creation Agent]
  G --> I[Artifacts ci-rca.json + ci-rca.md]
  H --> J[Guarded Fix PR]
```
Requirements:
- Python 3.11+
Install tools:

```bash
python -m pip install --upgrade pip
pip install -r requirements.txt
pre-commit install
```

Run checks:

```bash
ruff check .
ruff format --check .
pytest
```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the local pipeline once:

  ```bash
  ci-rootcause \
    --log-path fixtures/ci-logs/github-actions-python-failure.log \
    --diff-path fixtures/diffs/refactor-only.diff \
    --output-dir artifacts \
    --timestamp 2026-02-21T00:00:00Z \
    --commit abc123 \
    --run-id gha_quickstart_1 \
    --base-commit abc122 \
    --head-commit abc123 \
    --repository owner/repo
  ```

- Inspect generated artifacts:
  - `artifacts/ci-rca.json`
  - `artifacts/ci-rca.md`
Run three reproducible demo scenarios:

```bash
for case in \
  fixtures/demos/01-dependency-lockfile-drift \
  fixtures/demos/02-typecheck-ts2345 \
  fixtures/demos/03-infra-timeout
do
  name="$(basename "$case")"
  ci-rootcause \
    --log-path "$case/ci.log" \
    --diff-path "$case/change.diff" \
    --output-dir "artifacts/demo/$name" \
    --timestamp 2026-02-21T00:00:00Z \
    --commit abc123 \
    --run-id "demo_${name}" \
    --base-commit abc122 \
    --head-commit abc123 \
    --repository owner/repo
done
```

Demo fixture pack:
- `fixtures/demos/README.md`
- `fixtures/demos/01-dependency-lockfile-drift`
- `fixtures/demos/02-typecheck-ts2345`
- `fixtures/demos/03-infra-timeout`
Run end-to-end deterministic analysis locally:

```bash
ci-rootcause \
  --log-path fixtures/ci-logs/github-actions-python-failure.log \
  --diff-path fixtures/diffs/refactor-only.diff \
  --historical-runs-path fixtures/classification/historical-runs.sample.json \
  --output-dir artifacts \
  --timestamp 2026-02-20T00:00:00Z \
  --commit abc123 \
  --run-id gha_local_1 \
  --base-commit abc122 \
  --head-commit abc123 \
  --repository owner/repo
```

CLI behavior:
- Writes `ci-rca.json` and `ci-rca.md` into `--output-dir`
- Prints a machine-readable JSON summary to stdout
- Exits `0` for `completed`/`partial` analysis runs, `2` for runtime/input errors
- Supports optional deterministic flaky-test detection via `--historical-runs-path`
- Supports local `--config-path` (simple `key: value`) and single-stream stdin input via `-`
- Supports `--offline-only` to force no remote PR creation/network calls
- Supports rollout profile `--profile safe-github-rollout` (enforces min PR confidence >= `0.90`)
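A wrapper script can branch on the exit code and parse the stdout summary. The sketch below assumes only what is stated above (exit `0` or `2`, JSON on stdout); the helper name and the returned dictionary shape are illustrative, and the `status` key in the demo call is a placeholder rather than a documented field.

```python
import json


def interpret_cli_result(returncode: int, stdout: str) -> dict:
    """Map a finished ci-rootcause process to a small result dictionary."""
    if returncode == 0:
        # completed/partial runs print a JSON summary on stdout
        return {"ok": True, "summary": json.loads(stdout)}
    if returncode == 2:
        return {"ok": False, "error": "runtime or input error"}
    return {"ok": False, "error": f"unexpected exit code {returncode}"}


# Placeholder payload: the real summary fields are not specified here.
print(interpret_cli_result(0, '{"status": "completed"}')["ok"])  # True
```

In practice the two arguments would come from `subprocess.run([...], capture_output=True, text=True)`, using its `returncode` and `stdout` attributes.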
Runtime mode:
- Uses Google ADK runtime orchestration by default when `google-adk` is installed
- Falls back to deterministic local orchestration if ADK runtime initialization fails
- Uses deterministic local orchestration when `--fail-fast` is enabled
The action is defined in action.yml.
Inputs:
- `github_token` (required)
- `create_fix_pr` (default `false`)
- `post_pr_comment` (default `true`)
- `base_ref`, `head_ref`
- `config_path` (default `.ci-rootcause.yml`)
- `max_fix_files` (default `5`)
- `min_pr_confidence` (default `0.75`)
- `rollout_profile` (default empty, supported: `safe-github-rollout`)
- `offline_only` (default `false`)
- `mode` (default `deterministic`, supported: `deterministic`, `agentic_assist`, `agentic_full`)
- `enable_agentic_full` (default `false`, required for `mode=agentic_full`)
- `provider` (default `local`, supported: `openai`, `gemini`, `anthropic`, `local`)
- `model` (default provider-specific)
- `provider_api_key` (required for hosted providers in agentic modes)
- `validation_commands` (optional `;` or newline separated validation gate commands)
- Local/adapter detection supports GitHub Actions and GitLab CI context fields
Outputs:
- `classification`, `confidence`, `primary_root_cause_title`
- `rca_json_path`, `rca_md_path`
- `pr_created`, `pr_url`, `pr_number`, `pr_failure_reason_code`, `pr_failure_reason`
Agentic failure reason code categories surfaced in `pr_failure_reason_code`:
- `AGENTIC_MISSING_KEY`
- `AGENTIC_PROVIDER_ERROR`
- `VALIDATION_FAILED`
- `AGENTIC_MAX_ATTEMPTS_EXCEEDED`
Reference:
Autonomous PR note:
- If `create_fix_pr=true` and explicit `validated_changes` are not provided, the pipeline can synthesize deterministic evidence-backed changes for `TYPECHECK` cases.
- For simple `int` assignment mismatches (for example `value: int = "7"`), synthesis prefers semantic correction (`value: int = 7`) over suppression comments.
- For non-`TYPECHECK` cases, explicit validated changes are still required for PR creation.
Safe rollout profile:
- `safe-github-rollout` keeps guarded PR flow conservative during rollout
- Enforces `min_pr_confidence >= 0.90`
- Keeps `create_fix_pr=false` unless explicitly enabled by workflow input/config
- Example local config: `fixtures/config/safe-github-rollout.yml`
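The profile's effect on the confidence gate can be pictured as a floor on the configured threshold. This is a sketch assuming the profile raises `min_pr_confidence` to at least `0.90` and never lowers a stricter value; the function names are illustrative.

```python
SAFE_ROLLOUT_FLOOR = 0.90          # from the safe-github-rollout profile
DEFAULT_MIN_PR_CONFIDENCE = 0.75   # documented action default


def effective_min_pr_confidence(
    configured: float = DEFAULT_MIN_PR_CONFIDENCE, rollout_profile: str = ""
) -> float:
    """Apply the rollout floor without lowering a stricter configured value."""
    if rollout_profile == "safe-github-rollout":
        return max(configured, SAFE_ROLLOUT_FLOOR)
    return configured


def may_open_fix_pr(
    confidence: float,
    create_fix_pr: bool,
    configured: float = DEFAULT_MIN_PR_CONFIDENCE,
    rollout_profile: str = "",
) -> bool:
    """PR creation needs both the explicit opt-in and enough confidence."""
    threshold = effective_min_pr_confidence(configured, rollout_profile)
    return create_fix_pr and confidence >= threshold


print(may_open_fix_pr(0.85, True, rollout_profile="safe-github-rollout"))  # False
```

A run at confidence `0.85` would pass the default `0.75` gate but is held back under the profile.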
Required workflow permissions:
- `contents: write` (PR creation only)
- `pull-requests: write`
- `actions: read`
Marketplace wrapper sync automation:
- Source repo keeps development workflows; Marketplace wrapper is ibrahim1023/ci-rootcause-action.
- Release tags (`v*`) trigger `.github/workflows/publish-wrapper.yml` to sync `src/`, `action.yml`, `README.md`, and `requirements.txt` into the wrapper repo.
- Wrapper tags are updated to the same release tag and major alias `v0`.
- Wrapper GitHub Release objects are created/updated automatically for the synced tag.
- Required repository secret in source repo: `CI_ROOTCAUSE_ACTION_REPO_TOKEN` (PAT with write access to `ibrahim1023/ci-rootcause-action`).
Execution order is deterministic and fixed:
1. `log_ingest`
2. `diff_analysis`
3. `failure_classification`
4. `root_cause_ranker`
5. `fix_planner`
6. `reporter`
7. `pr_creation`
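A fixed execution order can be expressed as a constant tuple that the runner iterates, so the order never depends on how stages were registered. This is a sketch of the idea; `run_pipeline` and the dict-based state are hypothetical, and only the stage names come from this README.

```python
from typing import Callable

STAGE_ORDER = (
    "log_ingest",
    "diff_analysis",
    "failure_classification",
    "root_cause_ranker",
    "fix_planner",
    "reporter",
    "pr_creation",
)

Stage = Callable[[dict], dict]


def run_pipeline(stages: dict[str, Stage], state: dict) -> dict:
    """Run every stage in STAGE_ORDER, threading the state through each one."""
    for name in STAGE_ORDER:
        state = stages[name](state)  # KeyError if a stage is missing
    return state


def make_stage(name: str) -> Stage:
    # Demo stage: records its own name so the visiting order is observable.
    return lambda state: {**state, "trace": state.get("trace", []) + [name]}


demo = {name: make_stage(name) for name in reversed(STAGE_ORDER)}
print(run_pipeline(demo, {})["trace"][0])  # log_ingest
```

The demo registers the stages in reverse order, yet the trace still starts with `log_ingest`.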
Runtime behavior:
- ADK runtime is used by default when available.
- Deterministic local fallback executes on ADK initialization/runtime failure.
- `fail_fast` uses deterministic local orchestration to preserve exception behavior.
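The runtime selection described above amounts to "try ADK, fall back to local, and skip ADK entirely under fail-fast". The sketch below shows that control flow with hypothetical callables; it does not use or depict the real `google-adk` API.

```python
from typing import Callable, Optional

Runner = Callable[[dict], dict]


def run_with_fallback(
    adk_runner: Optional[Runner],
    local_runner: Runner,
    payload: dict,
    fail_fast: bool = False,
) -> tuple[str, dict]:
    """Return (runtime_used, result)."""
    if fail_fast or adk_runner is None:
        # Local orchestration only: exceptions propagate unchanged.
        return "local", local_runner(payload)
    try:
        return "adk", adk_runner(payload)
    except Exception:
        # ADK initialization/runtime failure: deterministic local fallback.
        return "local", local_runner(payload)


def broken_adk(payload: dict) -> dict:
    raise RuntimeError("simulated ADK failure")


print(run_with_fallback(broken_adk, lambda p: {"done": True}, {}))  # ('local', {'done': True})
```

Passing `adk_runner=None` models the case where the ADK package is not installed.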
Live PR creation/idempotency validation is available as an opt-in integration test:

```bash
scripts/run_live_github_test.sh \
  --repo-path /path/to/disposable/repo \
  --repository owner/repo \
  --token ghp_xxx \
  --target-branch main
```

Notes:
- Test is skipped unless `CI_ROOTCAUSE_LIVE_GITHUB=1`.
- Use a disposable repository with push + PR permissions.
- Script prints a cleanup checklist after the test run.
You can run a manual in-repo smoke test of the GitHub Action from the Actions tab:
- Workflow: `Smoke Test Marketplace Action`
- Trigger: `workflow_dispatch`
- Behavior:
  - runs `uses: ibrahim1023/ci-rootcause-action@v0.1.6` only
  - uses a deterministic typecheck-style log signal and `create_fix_pr=true`
  - prints `pr_failure_reason` when PR creation is skipped
  - asserts `pr_created=true` and non-empty `pr_url`
  - uploads RCA artifacts (`ci-rca.json`, `ci-rca.md`, `ci-rca-observability.json`)
This workflow is isolated from default CI and is intended only for manual validation.
- Benchmark report JSON: `docs/reports/mvp-benchmark-report.json`
- Benchmark report summary: `docs/reports/mvp-benchmark-report.md`
- Release checklist: `docs/release-checklist-v0.1.1.md`
- Benchmark metrics include classification/primary RCA accuracy, confidence reproducibility, artifact-hash reproducibility, timing distribution (`mean`/`median`/`p95`), and deterministic lift against `basic-log-summarizer-v1` baseline classification accuracy.
- Release notes: `docs/release-notes-v0.1.0.md`
- Known limitations: `docs/limitations.md`
- Current curated benchmark corpus is intentionally small (MVP scope).
- Classification coverage is deterministic-rule based and pattern limited.
- Timing metrics are runtime-derived and marked as nondeterministic metadata.
- Automated fix generation is guardrailed and intentionally conservative.
- No automatic merge or branch-protection bypass is supported.
- No CI rerun orchestration is included in MVP.
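For readers who want to reproduce the `mean`/`median`/`p95` timing figures from their own runs, a minimal sketch follows. It assumes the nearest-rank definition of the 95th percentile, which may differ from the method used in the benchmark report; the function name and sample values are illustrative.

```python
import math
import statistics


def timing_summary(samples_ms: list[float]) -> dict[str, float]:
    """Summarize run timings; p95 uses the nearest-rank method."""
    if not samples_ms:
        raise ValueError("at least one timing sample is required")
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


print(timing_summary([10, 20, 30, 40, 100]))
# {'mean': 40.0, 'median': 30, 'p95': 100}
```

Because timings vary between machines and runs, these values are better treated as metadata than as part of any reproducibility comparison.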
Contribution standards are documented in CONTRIBUTING.md.