fix(ci): add mock LLM service and switch drills to config-driven agents #207
GitHub Actions Linux runners cannot reach host.docker.internal:11434 (Ollama). The conversation agent's chatModel.Generate() call hangs indefinitely, causing all 6 perf-gate jobs to time out with completion_ratio=0.00, failing the P0 gate.

Fix:
- cmd/llm-mock/main.go: minimal Go HTTP server on :11434 that returns valid OpenAI-format chat completion responses instantly (no model required, pure stdlib, zero external deps)
- deployments/compose/Dockerfile: build llm-mock binary in builder stage and include it in the runtime image
- deployments/compose/docker-compose.ci.yml: CI Compose overlay that adds llm-mock as a service and redirects MODEL_LLM_PROVIDERS_OPENAI_BASE_URL from host.docker.internal:11434 to http://llm-mock:11434/v1 for api, worker1, and worker2
- scripts/local-2.0-stack.sh: auto-detect CI=true (set by GitHub Actions) and activate the CI overlay; extend health-wait from 30s to 90s in CI to allow time for the mock service build

No application code changes. Local (non-CI) behaviour is unchanged.
…ved)
Agents are now config-driven; POST /api/agents endpoint no longer exists. Replace all create_agent() calls in the drill script with DRILL_AGENT_ID (defaults to 'conversation') so the drill uses the pre-loaded config agent.

Changes:
- scripts/release-p0-drill.sh: add DRILL_AGENT_ID env var, replace every create_agent() call in main() / Drill D / Drill E with the env var, simplify create_job_resilient to remove the 404-recreate-agent path
- .github/workflows/release-gates.yml: add DRILL_AGENT_ID: 'conversation' to the release-gates job env block
@copilot resolve the merge conflicts in this pull request
Pull request overview
This PR aims to make the CI “release gates” runnable on Linux GitHub Actions runners by introducing an in-network mock OpenAI-compatible LLM service (avoiding host.docker.internal) and by switching release drills to use pre-configured, config-driven agents.
Changes:
- Add a minimal `llm-mock` HTTP server binary and include it in the compose image build.
- Add a CI compose override file and update the local stack script to automatically layer it when `CI=true`.
- Update the P0 drill script/workflow to use a pre-configured agent ID instead of dynamically creating agents.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| scripts/release-p0-drill.sh | Use DRILL_AGENT_ID (pre-configured agent) and treat 404 as a real failure. |
| scripts/local-2.0-stack.sh | Auto-detect CI to apply compose overlay; extend health wait time in CI. |
| deployments/compose/Dockerfile | Build and ship the new llm-mock binary in the compose image. |
| deployments/compose/docker-compose.ci.yml | Add llm-mock service and override OpenAI base URL for api/workers in CI. |
| cmd/llm-mock/main.go | New minimal OpenAI-compatible mock server for CI. |
| .github/workflows/release-gates.yml | Provide DRILL_AGENT_ID env for release drills. |
@copilot resolve the merge conflicts in this pull request
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@copilot resolve the merge conflicts in this pull request
…ment improvements
Merge conflicts resolved in 6d21aa8. Kept our improved versions for all three conflicted files:
@copilot resolve the merge conflicts in this pull request
Summary
Makes the CI "release gates" runnable on Linux GitHub Actions runners by introducing an in-network mock OpenAI-compatible LLM service (avoiding `host.docker.internal`) and switching release drills to use pre-configured, config-driven agents.

GitHub Actions Linux runners cannot reach `host.docker.internal:11434` (Ollama). The conversation agent's `chatModel.Generate()` call hangs indefinitely, causing all perf-gate jobs to time out with `completion_ratio=0.00`, failing the P0 gate.

Changes
- `cmd/llm-mock/main.go`: minimal Go HTTP server on `:11434` returning valid OpenAI-format chat completion responses instantly (no model required, pure stdlib, zero external deps). Uses `http.Server` with explicit `ReadHeaderTimeout`/`ReadTimeout`/`WriteTimeout`/`IdleTimeout` to prevent hung builds.
- `deployments/compose/Dockerfile`: build the `llm-mock` binary in the builder stage and include it in the runtime image.
- `deployments/compose/docker-compose.ci.yml`: CI Compose overlay that adds `llm-mock` as a service and redirects `MODEL_LLM_PROVIDERS_OPENAI_BASE_URL` from `host.docker.internal:11434` to `http://llm-mock:11434/v1` for `api`, `worker1`, and `worker2`. Each service explicitly depends on both `postgres` and `llm-mock` for unambiguous startup ordering.
- `scripts/local-2.0-stack.sh`: auto-detect `CI=true` (set by GitHub Actions) and activate the CI overlay; use a consistent `CI=true` check for both overlay activation and health-wait extension (30s → 90s).
- `scripts/release-p0-drill.sh`: switch to a pre-configured agent (`DRILL_AGENT_ID`) instead of dynamically creating agents via the removed `POST /api/agents` endpoint; remove the unused `ts` parameter from `create_job_resilient` and all call sites; remove the dead `create_agent` helper.
- `.github/workflows/release-gates.yml`: pass `DRILL_AGENT_ID=conversation` to the drill job.

No application code changes. Local (non-CI) behaviour is unchanged.
Validation
- `go test ./...`
- `go build ./...`

Compatibility / Risk
No breaking changes. The CI overlay is only activated when `CI=true`; local developer workflows are unaffected. The `llm-mock` binary is included in the compose image but only started when the CI overlay is used.