ci(llm-gate): matrix jobs failing with 'Exit prior to config file resolving' on ~50% of recent runs

## Symptom

The LLM-Based Quality Gate (`.github/workflows/llm-based-quality-gate.yml`, added in #253) has failed at the workflow level on 5 of the last 6 runs. The failures are not the LLM finding issues: the Gemini CLI fails to start, so every matrix job exits with no result and the aggregate posts a `NO RESULTS` comment.

## Evidence

Last 6 LLM-gate runs on `main` and recent PRs (collected 2026-05-27):

| Run | PR | Conclusion | Aggregate verdict |
|---|---|---|---|
| [26490355255](https://github.com/UseJunior/safe-docx/actions/runs/26490355255) | #271 | success | ✅ PASS (14/14 with real justifications) |
| [26487987643](https://github.com/UseJunior/safe-docx/actions/runs/26487987643) | (PR closed) | failure | `NO RESULTS` |
| [26481977858](https://github.com/UseJunior/safe-docx/actions/runs/26481977858) | (PR closed) | failure | `NO RESULTS` |
| [26481976773](https://github.com/UseJunior/safe-docx/actions/runs/26481976773) | (PR closed) | failure | not checked |
| [26479483400](https://github.com/UseJunior/safe-docx/actions/runs/26479483400) | (`fix/allure-labels-validator-false-positives`) | failure | not checked |
| [26447905249](https://github.com/UseJunior/safe-docx/actions/runs/26447905249) | (`llm-gate-phase1`) | failure | not checked |

`NO RESULTS` comments were observed on PRs #270 and #269 ([safe-docx#270 LLM-gate comment](https://github.com/UseJunior/safe-docx/pull/270#issuecomment-4550837889), [safe-docx#269 LLM-gate comment](https://github.com/UseJunior/safe-docx/pull/269#issuecomment-4550007026)).

## Root cause snippet

From a representative failed matrix job (`Check 02 — Live DOM namespace-safe OOXML writes` on run 26487987643, job 77999406077):

```
[...checkout + setup steps succeed...]
Exit prior to config file resolving
##[error]Process completed with exit code 1.
```

The error is emitted by the Gemini CLI (`@google/gemini-cli@0.39.1`) BEFORE any model invocation. It happens before the `.gemini/settings.json` file the composite action writes is even read.

The Gemini CLI's "Exit prior to config file resolving" message is internal: it surfaces when the CLI's own config-resolution layer can't bootstrap. Common causes documented in [google-gemini/gemini-cli issues](https://github.com/google-gemini/gemini-cli/issues) include API quota exhaustion at the project level, malformed CLI config, a missing `GEMINI_API_KEY` env var, and npm install corruption.

## Why this blocks LLM-gate promotion

The user asked whether to promote safe-docx from advisory (`LLM_GATE_BLOCKING=0`) to enforcing (`LLM_GATE_BLOCKING=1` + required status check). Promotion now would create a worst-of-both-worlds state:

- `Aggregate and post review` still exits 0 on `NO RESULTS` because the blocking check fires on `warns > 0`, not on `total == 0`.
- A required status check `Aggregate and post review` would pass on `NO RESULTS` runs → PRs merge without any real LLM review having occurred.
- The setup looks enforcing, but actually isn't.

The gate must consistently PRODUCE results (PASS or WARN, not `NO RESULTS`) before promotion is safe.
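
One way to close the gap described above is to treat an empty result set as a failure whenever blocking is on. The following is a minimal sketch, not the workflow's actual code: the function name `gate_exit_code` and its three arguments are hypothetical, and it assumes the aggregate step already has the total and WARN counts available as shell values.

```bash
# Hypothetical helper: decide the aggregate step's exit code.
# Arguments: total results, WARN count, blocking flag (0 or 1).
gate_exit_code() {
  local total="$1" warns="$2" blocking="$3"
  # Advisory mode (LLM_GATE_BLOCKING=0) never fails the job.
  if [ "$blocking" != "1" ]; then
    echo 0
    return
  fi
  # NO RESULTS: no matrix job produced a verdict, so nothing was reviewed.
  if [ "$total" -eq 0 ]; then
    echo 1
    return
  fi
  # Existing behaviour: fail when any check warned.
  if [ "$warns" -gt 0 ]; then
    echo 1
    return
  fi
  echo 0
}
```

With this shape, `exit "$(gate_exit_code "$total" "$warns" "$LLM_GATE_BLOCKING")"` would make a required status check go red on `NO RESULTS` instead of passing silently.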

## Investigation directions (in order of likelihood)

1. **API quota exhaustion on the Google project**. As of 2026-05-27 a second repo (`UseJunior/tests-renderer`) now consumes the same Google project's free-tier quota via a separate `GEMINI_API_KEY`. The free tier for `gemini-3.5-flash` is rate-limited per project. Verify in [Google AI Studio → Project quotas](https://aistudio.google.com/) whether quota was already constrained when these runs happened; if so, either upgrade the project tier or rotate the gate to a dedicated Google project per repo.

2. **`@google/gemini-cli@0.39.1` regression**. Test whether bumping `LLM_GATE_CLI_VERSION` to a newer Gemini CLI version fixes the bootstrap. Pin candidate: latest stable from `npm view @google/gemini-cli versions`. If a newer version resolves the bootstrap, update via `gh variable set LLM_GATE_CLI_VERSION --body <new> --repo UseJunior/safe-docx`.

3. **npm install corruption under the `$RUNNER_TEMP` empty-config pattern**. The composite action installs Gemini CLI from `$RUNNER_TEMP` with `NPM_CONFIG_USERCONFIG` / `NPM_CONFIG_GLOBALCONFIG` forced to empty files (security hardening in #253). Verify the install actually completes — add a `gemini --version` check after install, and surface the output to the Actions log so we can see whether the binary even runs.

4. **A missing `GEMINI_API_KEY` env var on certain runner spawns**. The composite action passes `GEMINI_API_KEY: ${{ inputs.gemini-api-key }}`. Verify the workflow secret is set at the org or repo level (not user-level) so it survives across matrix-job spawns.
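
Directions 3 and 4 could be checked with a small preflight step that runs after the CLI install and before any model call. This is a sketch under assumptions: the function name `preflight` and its return codes are hypothetical, and it relies only on the `gemini --version` check suggested above and on the `GEMINI_API_KEY` variable the composite action already sets.

```bash
# Hypothetical preflight: separates "key missing" from "binary cannot start".
# $1 is the CLI binary to probe (defaults to `gemini`).
preflight() {
  local cli="${1:-gemini}"
  # Catch an unset or empty key before the CLI has a chance to fail opaquely.
  if [ -z "${GEMINI_API_KEY:-}" ]; then
    echo "preflight: GEMINI_API_KEY is empty" >&2
    return 2
  fi
  # Confirms the installed binary is on PATH and can start at all;
  # its version output lands in the Actions log.
  if ! "$cli" --version; then
    echo "preflight: '$cli --version' failed" >&2
    return 3
  fi
  return 0
}
```

Distinct return codes let a failed matrix job show at a glance which of the two conditions tripped, without printing the key itself.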

## Acceptance criteria

- 10 consecutive LLM-gate runs across real PRs end with the aggregate posting a verdict comment that is NOT `NO RESULTS` (i.e., at least one PASS or WARN per matrix item).
- The "Exit prior to config file resolving" line does not appear in any matrix job's log across those 10 runs.
- Once that holds for ~2 weeks, the gate can be promoted via:

  ```bash
  gh variable set LLM_GATE_BLOCKING --body 1 --repo UseJunior/safe-docx
  # plus add 'Aggregate and post review' to branch protection required_status_checks
  ```
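
The second criterion could be checked mechanically against saved job logs. A minimal sketch, assuming each run's log has already been written to a local file (for example with `gh run view <run-id> --log --repo UseJunior/safe-docx > run.log`); the function name `count_bootstrap_failures` is hypothetical.

```bash
# Hypothetical helper: count log files that contain the bootstrap error.
# Arguments: one or more saved job-log files.
count_bootstrap_failures() {
  local needle='Exit prior to config file resolving'
  local count=0 f
  for f in "$@"; do
    # -F matches the message as a literal string; -q suppresses output.
    if grep -Fq "$needle" "$f"; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}
```

A result of `0` across the 10 saved logs would satisfy the criterion; any other value gives the number of runs still hitting the bootstrap failure.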

## Related

- `UseJunior/tests-renderer` now runs the same gate workflow pattern (verbatim copy, see [tests-renderer#13](https://github.com/UseJunior/tests-renderer/pull/13)). Its smoke run [PR #14](https://github.com/UseJunior/tests-renderer/pull/14) succeeded end-to-end with a real `GEMINI_API_KEY`, so the workflow itself is sound; the failure is specific to safe-docx's API-key + project + version combination.
- The same Google project (`projects/522706245871`) now backs two API keys (safe-docx's existing key + the newly-created `Tests Renderer CI Gemini API Key`). Watching for cross-repo quota interaction may be necessary once both repos are active.
