diff --git a/.codex/evals/README.md b/.codex/evals/README.md new file mode 100644 index 0000000..3a9abbf --- /dev/null +++ b/.codex/evals/README.md @@ -0,0 +1,20 @@ +# Evals + +Use this directory for repo-local eval definitions that measure whether the AI +workflow and the product behavior are improving or regressing. + +Recommended layout: + +```text +.codex/evals/ + templates/ + .md + .log +``` + +For non-trivial changes, define: + +- capability evals for the new behavior +- regression evals for the old behavior that must keep working +- clear pass or fail evidence + diff --git a/.codex/evals/templates/feature-delivery.md b/.codex/evals/templates/feature-delivery.md new file mode 100644 index 0000000..529f0cd --- /dev/null +++ b/.codex/evals/templates/feature-delivery.md @@ -0,0 +1,22 @@ +# EVAL: + +## Capability evals + +- [ ] The intended user-visible behavior works end to end. +- [ ] The relevant Playwright journey passes. +- [ ] The expected log evidence is present. + +## Regression evals + +- [ ] Existing adjacent behavior still works. +- [ ] No new console or runtime errors appear. +- [ ] Build, lint, typecheck, and tests still pass. + +## Evidence + +- Plan: +- Playwright artifact path: +- CDP artifact path: +- Log query: +- Notes: + diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..c93012d --- /dev/null +++ b/.gitignore @@ -0,0 +1,8 @@ +.runtime/ +.worktrees/ +.artifacts/ +.idea/ +playwright-report/ +test-results/ +dist/ +coverage/ diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 0000000..4f8d989 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,82 @@ +# git-ranker-workflow AGENTS + +This repository is the control plane for the `git-ranker` backend and the +`git-ranker-client` frontend. Keep this file short. The system of record lives +in [ARCHITECTURE.md](ARCHITECTURE.md) and [docs/](docs/index.md). 
+ +## What this repo owns + +- Repository-local knowledge store and operating rules for coding agents +- Cross-repo feature delivery workflow, QA loop, and observability workflow +- ExecPlan conventions for long-running tasks +- Guardrails for frontend/backend coordination across the two submodule repos + +## Repo map + +- `git-ranker/`: backend repo (submodule) +- `git-ranker-client/`: frontend repo (submodule) +- `ARCHITECTURE.md`: top-level control-plane architecture +- `PLANS.md`: rules for long-running ExecPlans +- `docs/`: knowledge store; treat this as the source of truth +- `scripts/`: lightweight verification and scaffolding helpers +- `harness/`: local observability and QA harness configuration +- `.codex/evals/`: eval definitions and templates + +## How to start a task + +1. Read [ARCHITECTURE.md](ARCHITECTURE.md). +2. Read [docs/index.md](docs/index.md) and the specific docs for the change + surface. +3. If the request spans multiple files, multiple repos, new behavior, or a + likely multi-hour effort, create an ExecPlan in + `docs/exec-plans/active/-.md` and follow [PLANS.md](PLANS.md). +4. Restate the request in terms of: + - user-visible outcome + - impacted repos + - acceptance checks + - required Playwright/CDP/Loki evidence +5. Work inside a task-specific isolated runtime footprint under `.runtime/` and + `.worktrees/`. 
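As a rough illustration of step 5, the isolated footprint for one task can be derived from its slug alone. The sketch below assumes the directory layout described in `ARCHITECTURE.md`, assumes that the artifact folders sit under `.runtime/` plus the slug, and assumes slugs are lowercase kebab-case. The names `taskFootprint`, `TaskFootprint`, and `isValidSlug` are hypothetical and do not exist in this repository.

```typescript
// Hypothetical helper: derives the isolated footprint for one task slug.
// Directory names follow the worktree model in ARCHITECTURE.md; the function
// and type names are illustrative only.

interface TaskFootprint {
  backendWorktree: string;
  frontendWorktree: string;
  runtimeRoot: string;
  artifactDirs: string[];
}

// Artifact folders listed in the worktree model (assumed to live per task).
const ARTIFACT_DIRS = [
  "logs",
  "traces",
  "screenshots",
  "videos",
  "playwright",
  "observability",
] as const;

// Assumption: slugs are lowercase kebab-case, e.g. "rank-comparison-filtering".
function isValidSlug(slug: string): boolean {
  return /^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug);
}

function taskFootprint(slug: string): TaskFootprint {
  if (!isValidSlug(slug)) {
    // Rejecting odd slugs also keeps path traversal out of the footprint.
    throw new Error(`invalid task slug: ${slug}`);
  }
  const runtimeRoot = `.runtime/${slug}`;
  return {
    backendWorktree: `.worktrees/backend/${slug}`,
    frontendWorktree: `.worktrees/frontend/${slug}`,
    runtimeRoot,
    artifactDirs: ARTIFACT_DIRS.map((dir) => `${runtimeRoot}/${dir}`),
  };
}
```

Keeping the mapping a pure function means the same slug always yields the same paths, which makes artifact locations easy to record in an ExecPlan.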
+ +## System of record + +- Product intent: [docs/product-specs/index.md](docs/product-specs/index.md) +- Architectural rules: [docs/design-docs/index.md](docs/design-docs/index.md) +- UX and UI behavior: [docs/DESIGN.md](docs/DESIGN.md), + [docs/FRONTEND.md](docs/FRONTEND.md) +- Backend and data behavior: [docs/BACKEND.md](docs/BACKEND.md), + [docs/SECURITY.md](docs/SECURITY.md), [docs/RELIABILITY.md](docs/RELIABILITY.md) +- Quality and cleanup rules: [docs/QUALITY_SCORE.md](docs/QUALITY_SCORE.md) +- Generated facts: [docs/generated/README.md](docs/generated/README.md) +- Workflow loop: [docs/workflows/feature-delivery-loop.md](docs/workflows/feature-delivery-loop.md), + [docs/workflows/qa-feedback-loop.md](docs/workflows/qa-feedback-loop.md) + +## Non-negotiables + +- Do not turn `AGENTS.md` into a large manual. Promote durable rules into + `docs/` or scripts. +- Do not implement from vague intent. Convert feature requests into explicit + acceptance criteria first. +- Do not ship a user-visible change without QA evidence from: + - automated tests + - Playwright + - browser inspection via CDP or equivalent + - worktree-local logs in Loki or the configured log backend +- Do not treat Slack, chat history, or memory as source of truth. If it matters + later, check it into the repo. +- Do not handwave cross-repo changes. Contract changes must be reflected in + backend, frontend, docs, and validation steps. + +## Delivery loop + +1. Intake and clarify the request. +2. Write or update an ExecPlan if the task is non-trivial. +3. Implement in backend/frontend worktrees. +4. Run build, typecheck, lint, and tests. +5. Boot the isolated stack for the task. +6. Run Playwright journeys. +7. Inspect UI, network, console, and DOM with CDP tooling. +8. Query logs, metrics, and traces for the same task runtime. +9. Feed findings back into code, docs, and the ExecPlan. +10. Record outcomes and remaining debt before handoff or merge. 
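The QA evidence rule under "Non-negotiables" lends itself to a mechanical check before handoff. The following is a minimal sketch, assuming each piece of evidence is recorded as a kind plus an artifact path. The type names and the `missingEvidence` function are hypothetical, and the four kinds simply mirror the four bullets listed above.

```typescript
// Hypothetical evidence gate: reports which required evidence kinds are
// still missing for a user-visible change. Names are illustrative only.

type EvidenceKind = "automatedTests" | "playwright" | "cdp" | "logs";

interface EvidenceRecord {
  kind: EvidenceKind;
  artifactPath: string; // where the proof lives, e.g. a report or log capture
}

const REQUIRED_EVIDENCE: readonly EvidenceKind[] = [
  "automatedTests",
  "playwright",
  "cdp",
  "logs",
];

// A kind counts as present only if at least one record has a non-blank path.
function missingEvidence(records: readonly EvidenceRecord[]): EvidenceKind[] {
  return REQUIRED_EVIDENCE.filter(
    (kind) =>
      !records.some(
        (record) => record.kind === kind && record.artifactPath.trim() !== "",
      ),
  );
}
```

An empty result would mean every required kind has at least one artifact path. It says nothing about whether the artifacts actually prove the behavior, which still needs review.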
+ diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md new file mode 100644 index 0000000..5a96934 --- /dev/null +++ b/ARCHITECTURE.md @@ -0,0 +1,153 @@ +# git-ranker Workflow Architecture + +## Purpose + +This repository is the orchestration layer for an agent-first development +workflow across two application repositories: + +- `git-ranker`: backend system of record for APIs, jobs, persistence, and domain + rules +- `git-ranker-client`: frontend system of record for routes, components, user + flows, and client-side state + +The control plane in this repo exists to make the product legible to coding +agents, not to store application logic. + +## Current repo facts + +The submodules are initialized in this workspace and currently expose these +high-level facts: + +- backend: Spring Boot 3.4, Java 21, JPA, Batch, Security, Actuator, Prometheus, + structured JSON logging, Testcontainers, ArchUnit +- frontend: Next.js App Router, React 19, TypeScript, ESLint, React Query, + Zustand, Tailwind, Radix UI + +Those facts should shape the workflow and harness choices instead of generic +defaults. + +## Core principle + +Repository-local knowledge is the system of record. A coding agent should be +able to understand the product, architecture, quality bar, and execution flow +from versioned artifacts in this repository plus the checked-out submodules. + +## Control-plane flow + +```text +feature request + -> request intake and acceptance contract + -> ExecPlan for non-trivial work + -> backend contract / behavior changes + -> frontend integration / UI changes + -> isolated task runtime + -> Playwright + CDP validation + -> logs / metrics / traces review + -> fix loop + -> PR / merge / debt update +``` + +## Worktree model + +Every non-trivial task should use an isolated runtime footprint keyed by a task +slug, for example `rank-comparison-filtering`. 
+ +Expected layout: + +```text +.worktrees/ + backend// + frontend// +.runtime/ + / + logs/ + traces/ + screenshots/ + videos/ + playwright/ + observability/ +``` + +The goal matches OpenAI's harness model: + +- one isolated app instance per task +- one isolated observability context per task +- artifacts are disposable once the task is complete + +## Knowledge-store layout + +```text +AGENTS.md +ARCHITECTURE.md +PLANS.md +docs/ + design-docs/ + exec-plans/ + generated/ + product-specs/ + references/ + workflows/ +``` + +`AGENTS.md` is only the table of contents. The durable knowledge lives in +`docs/`. + +## Cross-repo contract + +The repositories are versioned independently, but the workflow treats them as a +single product system. A change request must identify which of the following are +affected: + +- backend domain rules +- backend API or event contracts +- frontend route or component behavior +- shared product language and acceptance criteria +- reliability, security, or QA evidence + +Any contract change must update both sides of the boundary plus the knowledge +store if the change affects future tasks. + +## Layering model + +The two repos should converge on one directional dependency model: + +```text +Types -> Schemas/Contracts -> Repository/Gateway -> Service/Use Case + -> Runtime/Delivery -> UI or HTTP surface + +Cross-cutting concerns enter only through Providers: +auth, feature flags, telemetry, configuration, external connectors +``` + +This is intentionally rigid. Agents move faster when the allowed edges are +obvious and mechanically enforceable. 
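To show what "mechanically enforceable" could look like, here is a small sketch of a dependency-direction check. It assumes the arrows mean that a later layer may import from its own layer or any earlier one, and never the reverse. Provider handling is left out. The layer identifiers and the functions `mayImport` and `findViolations` are hypothetical names, not existing tooling in either repo.

```typescript
// Hypothetical layering check. Layers are listed in dependency order:
// a module may import from its own layer or any earlier layer (assumption),
// but never from a later one. Providers are intentionally not modeled here.

const LAYERS = [
  "types",
  "contracts",
  "gateway",
  "service",
  "runtime",
  "surface",
] as const;

type Layer = (typeof LAYERS)[number];

interface ImportEdge {
  from: Layer; // layer of the importing module
  to: Layer; // layer of the imported module
}

function mayImport(from: Layer, to: Layer): boolean {
  return LAYERS.indexOf(to) <= LAYERS.indexOf(from);
}

// Returns only the edges that point from an earlier layer to a later one.
function findViolations(edges: readonly ImportEdge[]): ImportEdge[] {
  return edges.filter((edge) => !mayImport(edge.from, edge.to));
}
```

A real check would first need to map file paths to layers, for example by directory convention, which this sketch does not attempt.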
+ +## QA and observability loop + +Every user-visible change is expected to produce: + +- automated regression evidence +- a Playwright run over the affected journey +- CDP evidence for DOM, console, network, and screenshot state +- log evidence from the isolated task runtime +- metrics and trace evidence when performance or async flow matters + +The recommended local stack is documented in +[docs/workflows/local-observability-stack.md](docs/workflows/local-observability-stack.md). +The implementation provided in `harness/` uses Loki, Prometheus, Tempo, and +Grafana to preserve the same agent-facing query model described by OpenAI: +LogQL, PromQL, and TraceQL. + +## What stays out of this repo + +- application code that belongs in `git-ranker` or `git-ranker-client` +- private tribal knowledge that should instead be turned into docs +- ad hoc task notes that never graduate into reusable rules + +## Current limitations + +- the frontend repo does not yet contain committed Playwright or test config +- the harness knows the backend metrics endpoint, but frontend metrics and trace + export wiring are still generic +- repo-specific start scripts and local env bootstrapping still need to be + codified into the harness diff --git a/PLANS.md b/PLANS.md new file mode 100644 index 0000000..a9c52d3 --- /dev/null +++ b/PLANS.md @@ -0,0 +1,83 @@ +# ExecPlans for git-ranker-workflow + +This document adapts OpenAI's `PLANS.md` pattern to a two-repository product +workflow. Use it for any task that is likely to take more than one session, +spans multiple files or repos, changes contracts, or requires non-trivial QA. 
+ +## When to create an ExecPlan + +Create an ExecPlan when any of the following are true: + +- the request spans backend and frontend +- the request changes API, schema, routing, or product behavior +- the work is expected to last more than 30 minutes +- you need a reproducible QA and feedback loop +- you expect to stop and resume later + +Store plans in `docs/exec-plans/active/-.md`. + +## Non-negotiable rules + +- Every ExecPlan must be self-contained. +- Every ExecPlan must remain a living document. +- Every ExecPlan must let a novice continue from only the working tree and the + plan file. +- Every ExecPlan must describe observable outcomes, not just code edits. +- Every ExecPlan must define the validation loop clearly. + +## Repo-specific additions + +Every plan in this repository must also include: + +- impacted repo list: backend, frontend, or both +- request intake summary in plain language +- contract boundary notes +- exact task runtime slug +- expected Playwright journeys +- expected CDP evidence +- expected Loki or log-backend queries +- rollback or retry notes for each risky step + +## Required sections + +Every ExecPlan must keep these sections current: + +- `Purpose / Big Picture` +- `Progress` +- `Surprises & Discoveries` +- `Decision Log` +- `Outcomes & Retrospective` +- `Context and Orientation` +- `Plan of Work` +- `Concrete Steps` +- `Validation and Acceptance` +- `Idempotence and Recovery` +- `Artifacts and Notes` +- `Interfaces and Dependencies` + +## Formatting + +The plan file itself should contain one single fenced code block labeled `md`. +Do not nest other fenced blocks inside the plan. Use indentation for commands, +snippets, and transcripts. + +## Required execution rhythm + +1. Clarify the user's request in product language. +2. Identify impacted repos and documents. +3. Research before implementation. +4. Update the plan before and after every material milestone. +5. Validate behavior in the isolated task runtime. +6. 
Record the evidence path for screenshots, videos, traces, and logs. +7. Update docs when a new durable rule or system fact is discovered. + +## Plan naming + +Use a sortable filename: + +`docs/exec-plans/active/2026-03-07-rank-comparison-filtering.md` + +## Template + +Start from `docs/exec-plans/_template.md`. + diff --git a/docs/BACKEND.md b/docs/BACKEND.md new file mode 100644 index 0000000..0454426 --- /dev/null +++ b/docs/BACKEND.md @@ -0,0 +1,59 @@ +--- +summary: Backend implementation, contract, and observability rules for git-ranker. +read_when: + - working in the backend repo + - modifying APIs, jobs, persistence, or ranking logic +--- + +# Backend + +## Current repo facts + +- framework: Spring Boot 3.4 +- language/runtime: Java 21 +- architecture hints: domain, infrastructure, global, and batch packages +- observability already present: Actuator, Prometheus endpoint, structured + logback JSON encoder, trace-id MDC support +- test stack already present: JUnit 5, Testcontainers, ArchUnit, Jacoco + +## What agents must optimize for + +- explicit contracts +- narrow IO boundaries +- observable behavior +- safe migrations +- reproducible startup and request behavior + +## Required workflow for backend changes + +1. Define the affected contract and acceptance behavior. +2. Identify the layer changes required: contract, repository, service, runtime. +3. Implement the change. +4. Run backend build and tests. +5. Boot the isolated task runtime. +6. Exercise the changed API or worker path. +7. Query logs, metrics, and traces for the affected path. +8. Record evidence and findings in the ExecPlan. + +## Contract rules + +- Parse inputs at the boundary. +- Version or clearly document contract changes. +- Never let controllers or handlers own business logic. +- Prefer explicit repositories or gateways over ad hoc IO scattered through the + codebase. 
+ +## Observability bar + +Every important backend change should leave behind: + +- structured log evidence for the changed flow +- at least one metric or timing check for latency-sensitive paths +- trace evidence when multiple async steps or external calls are involved +- a note describing which log query proves the behavior worked + +## Expected commands to codify + +- `./gradlew build` +- `./gradlew test` +- `./gradlew integrationTest` when Docker-backed integration coverage matters diff --git a/docs/DESIGN.md b/docs/DESIGN.md new file mode 100644 index 0000000..96b482a --- /dev/null +++ b/docs/DESIGN.md @@ -0,0 +1,32 @@ +--- +summary: UI and interaction design rules for agent-authored frontend changes. +read_when: + - changing visible UI + - altering copy, layout, or interaction flow +--- + +# Design + +## Goal + +Frontend changes must be legible to users and to future agents. Design choices +should be deliberate enough that screenshots, DOM snapshots, and acceptance docs +all tell the same story. + +## Rules + +- Start from user journeys, not component churn. +- Reuse existing visual patterns unless a design doc says the new pattern is + intentionally different. +- Capture meaningful empty, loading, success, and error states. +- Name visual states explicitly in docs and tests. +- If a visible workflow changes, update the relevant product and QA docs in the + same task. + +## Required evidence for visual changes + +- before/after screenshots or the first implementation screenshot plus expected + final state +- a Playwright assertion for the intended state +- a CDP check for console cleanliness and final DOM state + diff --git a/docs/FRONTEND.md b/docs/FRONTEND.md new file mode 100644 index 0000000..e801d80 --- /dev/null +++ b/docs/FRONTEND.md @@ -0,0 +1,68 @@ +--- +summary: Frontend implementation and QA rules for git-ranker-client. 
+read_when: + - working in the frontend repo + - validating a user-visible change +--- + +# Frontend + +## Current repo facts + +- framework: Next.js App Router under `src/app` +- language: TypeScript with strict mode +- runtime: React 19 +- data/state: React Query and Zustand +- linting: ESLint via `eslint.config.mjs` + +## Current gap + +No committed unit-test or Playwright config was found in `git-ranker-client` +during this setup. That means the workflow requirement is stricter than the +current repo state. For any meaningful frontend feature, part of the work should +be adding or wiring the missing QA harness. + +## What agents must optimize for + +- predictable route behavior +- explicit data loading and failure handling +- testable UI states +- clear contract boundaries with the backend + +## Required workflow for user-visible changes + +1. Confirm or create acceptance criteria in product language. +2. Identify affected routes, components, and client contracts. +3. Implement the change. +4. Run frontend build, lint, and any available tests. +5. Boot the isolated task runtime. +6. Run the Playwright journey for the changed surface. +7. Inspect the final state with CDP: + - screenshot + - DOM snapshot + - console logs + - failed network requests +8. Record artifact paths in the ExecPlan. + +## Frontend contract rules + +- Parse and validate incoming server data at the boundary. +- Do not let raw backend payloads leak through the UI tree. +- Put orchestration in loaders, hooks, or services; keep components focused on + rendering and event wiring. +- Make loading, empty, and error states explicit. 
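The contract rules above can be illustrated with a hand-written boundary parser and an explicit view-state union. This is only a sketch: the `RankEntry` shape (`username`, `score`) is an invented placeholder, since the real ranking API contract is not shown here, and `parseRankEntries` and `toViewState` are hypothetical names. A schema library could replace the manual checks.

```typescript
// Hypothetical payload shape; the real backend contract is not shown here.
interface RankEntry {
  username: string;
  score: number;
}

// Loading, empty, error, and success are separate, explicit states.
type ViewState =
  | { status: "loading" }
  | { status: "empty" }
  | { status: "error"; message: string }
  | { status: "success"; entries: RankEntry[] };

// Boundary parser: accepts unknown server data and returns only validated,
// trimmed-down entries, so raw payloads never reach the UI tree.
function parseRankEntries(payload: unknown): RankEntry[] {
  if (!Array.isArray(payload)) {
    throw new Error("expected an array of rank entries");
  }
  return payload.map((item: unknown, index: number) => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`entry ${index} is not an object`);
    }
    const { username, score } = item as Record<string, unknown>;
    if (
      typeof username !== "string" ||
      typeof score !== "number" ||
      !isFinite(score)
    ) {
      throw new Error(`entry ${index} has an invalid shape`);
    }
    // Copy only the known fields; extra backend fields are dropped.
    return { username, score };
  });
}

// Maps a raw response to a view state that components can render directly.
function toViewState(payload: unknown): ViewState {
  try {
    const entries = parseRankEntries(payload);
    return entries.length === 0
      ? { status: "empty" }
      : { status: "success", entries };
  } catch (err) {
    const message = err instanceof Error ? err.message : "unknown error";
    return { status: "error", message };
  }
}
```

With a union like this, a component can switch on `status` and the compiler will flag any state that is not handled.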
+ +## Minimum QA bar + +Every frontend feature should leave behind: + +- at least one Playwright path for the happy path +- at least one assertion for the most important failure or empty state +- a reproducible screenshot or video path +- a CDP artifact path for the final DOM and console state + +## Expected commands to codify + +- `npm run build` +- `npm run lint` +- a future committed Playwright command such as `npx playwright test` diff --git a/docs/PRODUCT_SENSE.md b/docs/PRODUCT_SENSE.md new file mode 100644 index 0000000..7bd19d3 --- /dev/null +++ b/docs/PRODUCT_SENSE.md @@ -0,0 +1,31 @@ +--- +summary: Product framing rules that convert requests into stable acceptance criteria. +read_when: + - clarifying scope + - deciding whether a change is complete +--- + +# Product Sense + +## Principle + +Do not implement from ambiguous desire statements. Convert requests into stable, +testable behavior statements first. + +## Required questions + +- Who benefits from the change? +- What exact workflow improves? +- What is the smallest observable version of the outcome? +- What must remain unchanged? +- How will we know the feature actually works? + +## Completion test + +A feature is not complete when the code exists. It is complete when: + +- the user-visible outcome is real +- the acceptance checks pass +- the QA evidence exists +- the docs explain the new durable behavior + diff --git a/docs/QUALITY_SCORE.md b/docs/QUALITY_SCORE.md new file mode 100644 index 0000000..945afa1 --- /dev/null +++ b/docs/QUALITY_SCORE.md @@ -0,0 +1,37 @@ +--- +summary: Quality scoring rubric and continuous cleanup loop. 
+read_when: + - reviewing architecture drift + - scheduling cleanup or follow-up refactors +--- + +# Quality Score + +Use a simple A to F score per major surface: + +- contract clarity +- test coverage +- docs freshness +- observability coverage +- layering discipline +- UX state completeness + +## Grade meanings + +- `A`: clear boundaries, current docs, strong tests, observable behavior +- `B`: acceptable but missing one non-critical reinforcement +- `C`: functional but agent legibility is degraded +- `D`: drift is visible and likely to spread +- `F`: unsafe to scale without cleanup + +## Garbage-collection rule + +If a change uncovers a durable bad pattern, either: + +- fix it in the same task, or +- add it to [exec-plans/tech-debt-tracker.md](exec-plans/tech-debt-tracker.md) + with a clear trigger and consequence + +The desired operating mode is continuous small cleanup, not occasional large +rewrite weeks. + diff --git a/docs/RELIABILITY.md b/docs/RELIABILITY.md new file mode 100644 index 0000000..1f482ef --- /dev/null +++ b/docs/RELIABILITY.md @@ -0,0 +1,35 @@ +--- +summary: Reliability expectations and evidence rules for runtime behavior. +read_when: + - changing startup flow + - touching async jobs, APIs, or critical user journeys +--- + +# Reliability + +## Principle + +Reliability requirements must be phrased as observable behavior, not vague +intent. + +## Every reliability-sensitive task should answer + +- which journey matters? +- what latency or failure budget matters? +- how will logs, metrics, or traces prove compliance? 
+ +## Default expectations + +- startup should be measured, not assumed +- critical journeys should have named owners and evidence +- no regression claim is valid without an artifact path + +## Evidence examples + +- `LogQL`: service startup completed without retries or fatal errors +- `PromQL`: request or job latency remained under the target threshold +- `TraceQL`: no span in the named journey exceeded the agreed threshold + +Exact thresholds belong in the relevant ExecPlan until stable enough to promote +into a permanent doc. + diff --git a/docs/SECURITY.md b/docs/SECURITY.md new file mode 100644 index 0000000..3b18850 --- /dev/null +++ b/docs/SECURITY.md @@ -0,0 +1,33 @@ +--- +summary: Security baseline for frontend/backend workflow changes. +read_when: + - handling user input + - changing auth, secrets, or external integrations +--- + +# Security + +## Baseline + +- validate all untrusted input at the boundary +- keep secrets out of the frontend +- log safely; do not leak secrets or raw credentials +- prefer least-privilege connectors +- treat auth and authorization as explicit requirements, not assumptions + +## Required review triggers + +Do an explicit security pass when the change touches: + +- authentication +- authorization +- user-generated content +- file upload or download +- external webhooks or callbacks +- tokens, API keys, or cookies + +## Documentation rule + +If a task changes a security-relevant behavior, the relevant doc and ExecPlan +must say what changed and how it was verified. + diff --git a/docs/design-docs/core-beliefs.md b/docs/design-docs/core-beliefs.md new file mode 100644 index 0000000..cd4879c --- /dev/null +++ b/docs/design-docs/core-beliefs.md @@ -0,0 +1,46 @@ +--- +summary: Core beliefs for an agent-first repository. +read_when: + - making architecture or workflow decisions + - deciding whether a rule belongs in code, docs, or a prompt +--- + +# Core Beliefs + +## 1. 
Repository-local knowledge wins + +If a fact matters to future work, it must live in versioned files inside this +repository or the application repos. Chat logs, memory, and oral tradition do +not count. + +## 2. Legibility beats cleverness + +Prefer technologies, abstractions, and folder structures that a stateless agent +can inspect, understand, and modify without hidden context. + +## 3. AGENTS is a map, not the encyclopedia + +Keep `AGENTS.md` concise. Promote durable instructions into purpose-built docs or +mechanical checks. + +## 4. Boundaries are leverage + +Strict layering, naming, and evidence requirements are not bureaucracy. They are +what lets agents move quickly without spreading architectural drift. + +## 5. Behavior matters more than code motion + +Every change must end in observable behavior: a journey that passes, an error +that disappears, a metric that stays below target, or a trace that no longer +regresses. + +## 6. Feedback loops are part of the product + +Playwright specs, CDP inspection, logs, metrics, traces, and review loops are +first-class system components. If they are missing, the workflow is incomplete. + +## 7. Continuous cleanup is mandatory + +Bad patterns compound quickly in an AI-heavy codebase. Capture taste once, +enforce it repeatedly, and keep the debt surface small. + diff --git a/docs/design-docs/domain-layering.md b/docs/design-docs/domain-layering.md new file mode 100644 index 0000000..cb16080 --- /dev/null +++ b/docs/design-docs/domain-layering.md @@ -0,0 +1,95 @@ +--- +summary: Required dependency direction and layer meanings across backend and frontend. 
+read_when: + - adding a new module + - reviewing dependency direction + - designing cross-repo contracts +--- + +# Domain Layering + +## Dependency direction + +All code should depend only in the following forward direction: + +```text +Types -> Schemas/Contracts -> Repository/Gateway -> Service/Use Case + -> Runtime/Delivery -> UI or HTTP surface +``` + +Cross-cutting concerns enter only through explicitly named provider interfaces: + +- auth providers +- feature-flag providers +- telemetry providers +- config providers +- external connector providers + +No other reverse or sideways dependencies are allowed. + +## Layer meaning + +### Types + +Pure domain types and names. No IO. No framework imports. + +### Schemas/Contracts + +Validation rules, request/response shapes, event payloads, serialized forms, and +frontend/backend contract models. + +### Repository/Gateway + +Persistence or remote access layers. Database clients, HTTP clients, queues, and +third-party APIs live here behind narrow interfaces. + +### Service/Use Case + +Business logic. The place where ranking behavior, orchestration, and policy are +implemented. + +### Runtime/Delivery + +The runtime boundary that wires providers and use cases into the actual program. 
+Examples: + +- backend handlers, jobs, schedulers +- frontend loaders, route state wiring, query orchestration + +### UI or HTTP surface + +The final user or network surface: + +- frontend components and route shells +- backend controllers, route modules, or transport handlers + +## Repo-specific mapping + +### `git-ranker` backend + +- Types: domain entities and value objects +- Schemas/Contracts: DTOs, validation schemas, API contracts, job payloads +- Repository/Gateway: DB access, cache, queue, external API connectors +- Service/Use Case: ranking algorithms, workflows, business rules +- Runtime/Delivery: route wiring, workers, scheduled tasks +- HTTP surface: API endpoints and transport adapters + +### `git-ranker-client` frontend + +- Types: view-agnostic domain models +- Schemas/Contracts: API client contracts, form schemas, router payloads +- Repository/Gateway: API clients and local persistence adapters +- Service/Use Case: client-side orchestration and derived state logic +- Runtime/Delivery: route loaders, providers, suspense/query setup +- UI surface: pages, sections, components, and interaction handlers + +## Guardrails to encode later + +These rules should eventually become lint rules or structural tests: + +- boundary parsing happens at contracts, not ad hoc in UI or handlers +- services may not import UI modules +- repositories may not import runtime or surface layers +- no direct third-party IO from UI or business logic +- cross-repo contracts must be named and versioned explicitly + diff --git a/docs/design-docs/index.md b/docs/design-docs/index.md new file mode 100644 index 0000000..dacb699 --- /dev/null +++ b/docs/design-docs/index.md @@ -0,0 +1,12 @@ +--- +summary: Index of durable architectural and workflow design rules. 
+read_when: + - changing architecture + - deciding where new code or docs should live +--- + +# design-docs index + +- [core-beliefs.md](core-beliefs.md) +- [domain-layering.md](domain-layering.md) + diff --git a/docs/exec-plans/README.md b/docs/exec-plans/README.md new file mode 100644 index 0000000..b50a154 --- /dev/null +++ b/docs/exec-plans/README.md @@ -0,0 +1,31 @@ +--- +summary: How ExecPlans are stored and maintained in this repository. +read_when: + - creating or resuming a non-trivial task +--- + +# Exec Plans + +## Layout + +- `active/`: plans that are still being executed +- `completed/`: plans whose work and retrospective are complete +- `_template.md`: starting point for new plans +- `tech-debt-tracker.md`: backlog of durable issues discovered by the workflow + +## Naming + +`-.md` + +Example: + +`2026-03-07-ranking-filter-panel.md` + +## Workflow + +1. Create from `_template.md`. +2. Fill in the request intake and context before coding. +3. Update `Progress`, `Decision Log`, and `Surprises & Discoveries` during work. +4. Record artifact paths for Playwright, CDP, logs, metrics, and traces. +5. Move the plan to `completed/` when the work and retrospective are finished. + diff --git a/docs/exec-plans/_template.md b/docs/exec-plans/_template.md new file mode 100644 index 0000000..8aa5c9e --- /dev/null +++ b/docs/exec-plans/_template.md @@ -0,0 +1,95 @@ +```md +# + +This ExecPlan is a living document. Maintain it according to `PLANS.md` at the +repository root. + +## Purpose / Big Picture + +Explain what user-visible behavior will exist after this change and how to see +it working. + +Request intake: + + Problem: + User-visible outcome: + Affected repos: + Contract surface: + Acceptance checks: + QA evidence: + Non-goals: + Risks: + +Task runtime slug: + + + +## Progress + +- [ ] Example incomplete step. +- [ ] Example partially completed step. +- [x] Example completed step with timestamp. 
+ +## Surprises & Discoveries + +- Observation: + Evidence: + +## Decision Log + +- Decision: + Rationale: + Date/Author: + +## Outcomes & Retrospective + +- Outcome: + Remaining gap: + Lesson: + +## Context and Orientation + +Describe the current system state, key files, terms, and constraints as if the +reader knows nothing about the repo. + +## Plan of Work + +Describe the sequence of changes in prose. Name exact files, modules, and +surfaces to edit. + +## Concrete Steps + +List exact commands, working directories, and expected observations. + +## Validation and Acceptance + +Describe: + + - test commands + - Playwright journeys + - CDP checks + - Loki or log-backend queries + - metrics or trace checks when relevant + +## Idempotence and Recovery + +Describe which steps are safe to rerun and how to recover from partial failure. + +## Artifacts and Notes + +Record paths for: + + - screenshots + - videos + - Playwright reports + - DOM snapshots + - console captures + - log query output + - metric or trace captures + +## Interfaces and Dependencies + +List the interfaces, contracts, libraries, providers, and service boundaries the +task depends on or creates. +``` + diff --git a/docs/exec-plans/active/README.md b/docs/exec-plans/active/README.md new file mode 100644 index 0000000..875227b --- /dev/null +++ b/docs/exec-plans/active/README.md @@ -0,0 +1,4 @@ +# Active Exec Plans + +Put in-progress or not-yet-retired plans here. + diff --git a/docs/exec-plans/completed/README.md b/docs/exec-plans/completed/README.md new file mode 100644 index 0000000..edf7d81 --- /dev/null +++ b/docs/exec-plans/completed/README.md @@ -0,0 +1,4 @@ +# Completed Exec Plans + +Move plans here after the work and retrospective are complete. 
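The naming and move rules for ExecPlans can be sketched as two pure path helpers. The example assumes the date prefix is a UTC calendar date in `YYYY-MM-DD` form, matching the sample filename in `PLANS.md`. `planFileName` and `completedPlanPath` are hypothetical names, and the helpers only compute paths without touching the file system.

```typescript
// Hypothetical helpers for ExecPlan paths. They only build strings; moving
// the file itself (for example with `git mv`) is left to the caller.

const ACTIVE_DIR = "docs/exec-plans/active/";
const COMPLETED_DIR = "docs/exec-plans/completed/";

// Builds a sortable filename from a date and a task slug.
// Assumption: the date prefix uses the UTC calendar date.
function planFileName(date: Date, slug: string): string {
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `${day}-${slug}.md`;
}

// Maps an active plan path to its location under completed/.
function completedPlanPath(activePath: string): string {
  if (activePath.slice(0, ACTIVE_DIR.length) !== ACTIVE_DIR) {
    throw new Error(`not an active plan path: ${activePath}`);
  }
  return COMPLETED_DIR + activePath.slice(ACTIVE_DIR.length);
}
```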
+ diff --git a/docs/exec-plans/tech-debt-tracker.md b/docs/exec-plans/tech-debt-tracker.md new file mode 100644 index 0000000..1e46d0c --- /dev/null +++ b/docs/exec-plans/tech-debt-tracker.md @@ -0,0 +1,42 @@ +--- +summary: Durable debt surfaced by the AI workflow and not yet encoded away. +read_when: + - deciding whether to open a cleanup plan + - looking for recurring workflow friction +--- + +# Tech Debt Tracker + +## Open items + +### 1. Generated fact pipelines are not implemented + +- Problem: `docs/generated/` is defined but not populated automatically. +- Consequence: agents must still read code directly for many repo facts. +- Desired fix: add generators for API surface, schema, route map, and dependency + graph documents. + +### 2. Harness command wiring is still generic + +- Problem: the workflow defines Playwright/CDP/Loki loops, but the exact app + commands are not yet wired to this project's real scripts. +- Consequence: the docs are operationally ready, but runtime automation still + needs repo-specific command binding. +- Desired fix: once submodules are initialized, codify build, start, and test + commands in `harness/` and verification scripts. + +### 3. Frontend automated QA harness is missing from the repo + +- Problem: `git-ranker-client` currently has no committed Playwright config or + frontend test files. +- Consequence: the OpenAI-style browser feedback loop cannot yet be enforced by + code in the frontend repo itself. +- Desired fix: add Playwright, artifact paths, and at least one critical user + journey spec to `git-ranker-client`. + +## Resolved recently + +### Submodule bootstrap + +- `scripts/bootstrap-submodules.sh` was added. +- `git-ranker` and `git-ranker-client` are initialized in this workspace. 
diff --git a/docs/generated/README.md b/docs/generated/README.md new file mode 100644 index 0000000..4ffb76d --- /dev/null +++ b/docs/generated/README.md @@ -0,0 +1,27 @@ +--- +summary: Placeholder contract for machine-generated repository facts. +read_when: + - you need repo facts faster than code search + - deciding whether to hand-edit generated docs +--- + +# generated + +This directory is reserved for machine-generated facts that agents should prefer +over broad code search once generation pipelines exist. + +Expected outputs: + +- `backend-api-surface.md` +- `backend-schema.md` +- `frontend-route-map.md` +- `frontend-state-surfaces.md` +- `dependency-graph.md` + +## Rules + +- Generated docs should be reproducible from the application repos. +- Once a generator exists, do not hand-edit the generated file. +- If a durable fact is still missing here, document it in a design or product doc + until the generator exists. + diff --git a/docs/index.md b/docs/index.md new file mode 100644 index 0000000..04b096d --- /dev/null +++ b/docs/index.md @@ -0,0 +1,52 @@ +--- +summary: Repository knowledge-store index for agents and humans. +read_when: + - starting work in this repository + - looking for the source of truth for a requirement or rule +--- + +# docs index + +This directory is the system of record for the AI workflow in +`git-ranker-workflow`. 
+ +## Read first + +- [../ARCHITECTURE.md](../ARCHITECTURE.md) +- [../PLANS.md](../PLANS.md) + +## Product + +- [product-specs/index.md](product-specs/index.md) +- [PRODUCT_SENSE.md](PRODUCT_SENSE.md) + +## Architecture and rules + +- [design-docs/index.md](design-docs/index.md) +- [FRONTEND.md](FRONTEND.md) +- [BACKEND.md](BACKEND.md) +- [SECURITY.md](SECURITY.md) +- [RELIABILITY.md](RELIABILITY.md) +- [QUALITY_SCORE.md](QUALITY_SCORE.md) + +## Workflow loop + +- [workflows/feature-delivery-loop.md](workflows/feature-delivery-loop.md) +- [workflows/qa-feedback-loop.md](workflows/qa-feedback-loop.md) +- [workflows/local-observability-stack.md](workflows/local-observability-stack.md) + +## Planning + +- [exec-plans/README.md](exec-plans/README.md) +- [exec-plans/tech-debt-tracker.md](exec-plans/tech-debt-tracker.md) + +## Generated facts + +- [generated/README.md](generated/README.md) + +## Source analyses that informed this structure + +- [references/openai-harness-engineering-analysis.md](references/openai-harness-engineering-analysis.md) +- [references/openai-execplans-analysis.md](references/openai-execplans-analysis.md) +- [references/steipete-ai-workflow-analysis.md](references/steipete-ai-workflow-analysis.md) + diff --git a/docs/product-specs/index.md b/docs/product-specs/index.md new file mode 100644 index 0000000..c001754 --- /dev/null +++ b/docs/product-specs/index.md @@ -0,0 +1,11 @@ +--- +summary: Product-intent documents and intake rules for feature requests. +read_when: + - translating a user request into acceptance criteria + - deciding what behavior must exist before coding +--- + +# product-specs index + +- [request-intake.md](request-intake.md) + diff --git a/docs/product-specs/request-intake.md b/docs/product-specs/request-intake.md new file mode 100644 index 0000000..9abe9ef --- /dev/null +++ b/docs/product-specs/request-intake.md @@ -0,0 +1,66 @@ +--- +summary: Intake contract for feature requests before implementation begins. 
+read_when: + - before starting any new feature + - when the request is vague or spans frontend and backend +--- + +# Feature Request Intake + +Before implementation begins, rewrite every request in the following shape. + +## 1. Problem + +What user problem or business problem is being solved? + +## 2. User-visible outcome + +What can the user do after this change that they could not do before? + +## 3. Affected repos + +- `git-ranker` +- `git-ranker-client` +- `knowledge-store only` + +## 4. Contract surface + +List any API, schema, event, route, or copy changes. + +## 5. Acceptance checks + +Describe the exact behavior to observe when the change works. + +## 6. QA evidence + +List the minimum required evidence: + +- test commands +- Playwright journey names +- CDP checks +- log queries +- performance or trace checks when relevant + +## 7. Non-goals + +State what this request is not trying to solve. + +## 8. Risks + +List migration, compatibility, reliability, or security risks. + +## Output format + +Use this checklist at the top of an ExecPlan or task note: + +```text +Problem: +User-visible outcome: +Affected repos: +Contract surface: +Acceptance checks: +QA evidence: +Non-goals: +Risks: +``` + diff --git a/docs/references/openai-execplans-analysis.md b/docs/references/openai-execplans-analysis.md new file mode 100644 index 0000000..37dc6c0 --- /dev/null +++ b/docs/references/openai-execplans-analysis.md @@ -0,0 +1,38 @@ +--- +summary: Analysis of OpenAI's PLANS.md article and how it is adapted here. 
+read_when: + - writing or reviewing an ExecPlan + - deciding what a self-contained plan must include +--- + +# OpenAI ExecPlans analysis + +Source: + +- https://developers.openai.com/cookbook/articles/codex_exec_plans + +## Extracted rules + +- plans are for multi-hour or multi-session work +- plans are living documents +- plans must be self-contained for a novice reader +- plans must describe observable outcomes +- plans must track progress, discoveries, decisions, and retrospective outcomes + +## Adaptation in this repository + +The generic OpenAI structure is extended with repository-specific requirements: + +- impacted repo list +- task runtime slug +- explicit frontend/backend contract notes +- Playwright, CDP, and log-query evidence +- recovery notes for cross-repo or environment failures + +## Why the adaptation is needed + +This repository coordinates two separately versioned repos. A plan that only +describes code edits is not enough. It must also describe the integration +surface, the QA harness, and the observability evidence that proves the change +worked across the system. + diff --git a/docs/references/openai-harness-engineering-analysis.md b/docs/references/openai-harness-engineering-analysis.md new file mode 100644 index 0000000..3bb0c12 --- /dev/null +++ b/docs/references/openai-harness-engineering-analysis.md @@ -0,0 +1,43 @@ +--- +summary: Analysis of OpenAI's February 11, 2026 harness-engineering article and how it maps to this repo. +read_when: + - understanding why this repository is structured this way + - checking whether a workflow rule is grounded in the source article +--- + +# OpenAI harness-engineering analysis + +Source: + +- https://openai.com/index/harness-engineering/ + +## Extracted operating principles + +1. Start from an empty repo and make the environment, not the human, do the + heavy lifting. +2. Treat repository-local knowledge as the system of record. +3. Keep `AGENTS.md` short and use it as a map into richer docs. +4. 
Increase agent legibility through bootable isolated runtimes, browser + inspection, and queryable observability. +5. Enforce architecture and taste mechanically instead of depending on memory. +6. Prefer fast correction loops over heavyweight blocking gates. +7. Continuously garbage-collect drift through explicit quality rules. + +## Direct mappings in this repository + +- `AGENTS.md` is intentionally short and points into `docs/`. +- `docs/` mirrors the knowledge-store layout described in the article. +- `PLANS.md` and `docs/exec-plans/` provide the living-document execution model. +- `harness/` is reserved for per-task runtime and observability wiring. +- `docs/workflows/qa-feedback-loop.md` encodes the Playwright plus CDP plus log + analysis loop. +- `docs/QUALITY_SCORE.md` and `docs/exec-plans/tech-debt-tracker.md` encode the + continuous cleanup model. + +## Important nuance + +The article describes a single large app repository. This project is split into +two application repositories plus one orchestration repository. The adaptation +here is to make the control plane live in this repo while keeping application +logic in the backend and frontend repos. + diff --git a/docs/references/steipete-ai-workflow-analysis.md b/docs/references/steipete-ai-workflow-analysis.md new file mode 100644 index 0000000..26c0519 --- /dev/null +++ b/docs/references/steipete-ai-workflow-analysis.md @@ -0,0 +1,46 @@ +--- +summary: Useful workflow patterns adapted from steipete repositories. 
+read_when: + - refining docs structure + - deciding how much detail belongs in AGENTS versus docs +--- + +# steipete workflow analysis + +Sources reviewed: + +- https://github.com/steipete +- https://raw.githubusercontent.com/steipete/oracle/main/AGENTS.md +- https://raw.githubusercontent.com/steipete/agent-scripts/main/AGENTS.MD +- https://raw.githubusercontent.com/steipete/Peekaboo/main/docs/ARCHITECTURE.md +- https://raw.githubusercontent.com/steipete/Peekaboo/main/docs/testing/tools.md + +## Patterns worth reusing + +### 1. Short AGENTS, dense docs + +steipete keeps `AGENTS` directive-heavy but still points to focused docs rather +than collapsing everything into one file. That supports fast context loading. + +### 2. `read_when` front matter + +Purpose-driven docs are easier for agents to load when the file itself states +when it should be consulted. This repository adopts that pattern in `docs/`. + +### 3. Artifact-oriented testing + +The Peekaboo testing docs record exact logs, artifact paths, execution loops, and +pass criteria. That pattern maps well to Playwright, CDP, and local observability +artifacts here. + +### 4. Docs are part of the implementation + +steipete's repos treat doc updates as part of finishing a feature. This +repository adopts the same rule. + +## What was not copied directly + +Those repos are optimized for different products and toolchains. The structure +here preserves the transferable workflow ideas while staying aligned to OpenAI's +harness-engineering model as the primary source of truth. + diff --git a/docs/workflows/feature-delivery-loop.md b/docs/workflows/feature-delivery-loop.md new file mode 100644 index 0000000..0fc45be --- /dev/null +++ b/docs/workflows/feature-delivery-loop.md @@ -0,0 +1,60 @@ +--- +summary: End-to-end feature workflow from user request to QA feedback loop. 
+read_when: + - starting a feature + - deciding what the next task phase should be +--- + +# Feature Delivery Loop + +## Phase 1: Intake + +Convert the request into the shape defined in +[../product-specs/request-intake.md](../product-specs/request-intake.md). + +## Phase 2: Plan + +For any non-trivial task, create an ExecPlan. The plan becomes the living record +for progress, discoveries, decisions, evidence, and outcomes. + +## Phase 3: Implement + +Make the smallest set of backend and frontend changes that satisfy the plan +while preserving the architectural layer rules. + +## Phase 4: Verify locally + +Run build, typecheck, lint, and automated tests in the affected repos. + +## Phase 5: Boot isolated runtime + +Launch the backend, frontend, and observability stack for the task-specific +runtime slug. + +## Phase 6: QA + +Run Playwright for the changed user journeys. Inspect the final UI and network +state using CDP tooling. + +## Phase 7: Observe + +Query the isolated runtime's logs, metrics, and traces. Confirm the system +behavior matches the plan, not just the UI. + +## Phase 8: Feedback + +If QA or observability reveals a gap: + +- update code +- update the ExecPlan +- rerun the relevant checks +- capture the new evidence + +## Phase 9: Retrospective + +At the end of the task: + +- update `Outcomes & Retrospective` +- promote durable lessons into docs or scripts +- add remaining debt to the tracker if it cannot be fixed now + diff --git a/docs/workflows/local-observability-stack.md b/docs/workflows/local-observability-stack.md new file mode 100644 index 0000000..8fe9360 --- /dev/null +++ b/docs/workflows/local-observability-stack.md @@ -0,0 +1,75 @@ +--- +summary: Local per-task observability model using Loki, Prometheus, Tempo, and Grafana. 
+read_when: + - setting up the task runtime + - deciding where logs, metrics, and traces should go +--- + +# Local Observability Stack + +## Goal + +Mirror the agent-facing observability workflow described by OpenAI: + +- one isolated observability context per task +- logs, metrics, and traces queryable by the agent +- disposable runtime after the task completes + +## Implementation choice + +OpenAI's article describes local queryable logs, metrics, and traces with +LogQL, PromQL, and TraceQL. This repository uses Loki, Prometheus, Tempo, and +Grafana as a practical equivalent for self-hosted local development. + +This is an implementation inference, not a direct quote from OpenAI. + +## Directory contract + +```text +.runtime// + logs/ + metrics/ + traces/ + observability/ +``` + +## Required labels + +Every emitted signal should carry at least: + +- `task_slug` +- `service` +- `repo` +- `environment=local` + +## Example queries + +### LogQL + +```text +{task_slug="",service="backend"} |= "startup complete" +``` + +### PromQL + +```text +histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket{task_slug=""}[5m]))) +``` + +### TraceQL + +```text +{ resource.task_slug = "" } | duration > 2s +``` + +## Config location + +See `harness/observability/`. + +## Current project-specific wiring + +- backend metrics are available from Spring Actuator on port `9090` at + `/actuator/prometheus` +- backend logs are already structured through `logback-spring.xml` +- frontend metrics and trace export are not yet committed as part of the client + repo, so the harness keeps those pieces generic for now diff --git a/docs/workflows/qa-feedback-loop.md b/docs/workflows/qa-feedback-loop.md new file mode 100644 index 0000000..3926763 --- /dev/null +++ b/docs/workflows/qa-feedback-loop.md @@ -0,0 +1,47 @@ +--- +summary: Playwright, CDP, and log-driven feedback loop for user-visible changes. 
+read_when: + - validating a fix or feature + - debugging a regression +--- + +# QA Feedback Loop + +## Required inputs + +- task runtime slug +- affected user journey name +- expected final UI state +- expected backend or log behavior + +## Loop + +1. Boot the isolated backend and frontend for the task. +2. Run the Playwright journey that exercises the change. +3. Capture: + - screenshots + - video when useful + - Playwright traces or reports +4. Inspect the same run through CDP: + - DOM snapshot + - console output + - failed requests + - final URL and app state +5. Query logs for the same time window and task slug. +6. If performance or async orchestration matters, inspect metrics and traces too. +7. Compare observations against the acceptance section of the ExecPlan. +8. If anything disagrees, fix the system and rerun the loop. + +## Minimum artifact set + +- `.runtime//playwright/` +- `.runtime//screenshots/` +- `.runtime//videos/` when relevant +- `.runtime//logs/` +- `.runtime//observability/queries.md` or equivalent note + +## Why this exists + +This is the core OpenAI harness pattern applied locally: the agent must be able +to see the product behavior directly, not infer success from code changes alone. 
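The minimum artifact set listed in the QA feedback loop could be checked mechanically before a task is called done. The following is a minimal sketch, not a script from this repository: the function name `check_artifacts` and its optional second argument (the runtime root) are hypothetical, and `videos/` is skipped because the doc marks it as "when relevant".

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the repository's scripts): verify that the
# minimum QA artifact locations exist for one task runtime.
check_artifacts() {
  slug="$1"
  root="${2:-.runtime}"   # runtime root; defaults to the repo-local .runtime/
  missing=0

  # videos/ is omitted on purpose: the doc lists it only "when relevant".
  for dir in playwright screenshots logs; do
    if [ ! -d "$root/$slug/$dir" ]; then
      printf 'missing artifact directory: %s\n' "$root/$slug/$dir" >&2
      missing=1
    fi
  done

  # The doc allows queries.md "or equivalent"; only the default name is checked.
  if [ ! -f "$root/$slug/observability/queries.md" ]; then
    printf 'missing query note: %s\n' "$root/$slug/observability/queries.md" >&2
    missing=1
  fi

  return "$missing"
}
```

Called as `check_artifacts my-task`, it returns non-zero and names each missing path on stderr, which keeps the check usable both interactively and from another script.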
+ diff --git a/git-ranker b/git-ranker index d0197ca..3a5f37f 160000 --- a/git-ranker +++ b/git-ranker @@ -1 +1 @@ -Subproject commit d0197caa2f2c4c67f74082e09a4343c7ed30f3ba +Subproject commit 3a5f37f1a06d756de5bd58ca3afb1a9ba3b9d2c6 diff --git a/git-ranker-client b/git-ranker-client index 6ec13d8..378ace0 160000 --- a/git-ranker-client +++ b/git-ranker-client @@ -1 +1 @@ -Subproject commit 6ec13d8ea85af1e0ee952c4b82fcd54c0e2b925b +Subproject commit 378ace0283755649a92a8a0fb7e7f9d3b54afdb8 diff --git a/harness/README.md b/harness/README.md new file mode 100644 index 0000000..92f2048 --- /dev/null +++ b/harness/README.md @@ -0,0 +1,36 @@ +# Harness + +This directory contains the local runtime and observability skeleton for the +OpenAI-style harness workflow used by this repository. + +## What lives here + +- `task.env.example`: environment template for a task-scoped runtime +- `observability/`: Docker Compose and config for Loki, Prometheus, Tempo, and + Grafana + +## Suggested flow + +1. Create a task runtime: + + ./scripts/init-task-runtime.sh + +2. Review and adjust `.runtime//task.env`. + +3. Start the observability stack: + + docker compose \ + --env-file .runtime//task.env \ + -f harness/observability/docker-compose.yml \ + up -d + +4. Start backend and frontend using the same task slug and matching ports. + +5. Run Playwright, inspect with CDP, and query the local stack. + +## Current state + +The config is intentionally generic because the application commands are not +yet wired to this project's real scripts. Once the real app commands are available, +bind their log, metric, and trace endpoints into this harness. 
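Step 4 of the suggested flow asks the backend and frontend to use ports that match the task's env file. A minimal sketch of reading one value from a `KEY=VALUE` file without sourcing it follows; the helper name `task_env_get` is hypothetical and is not provided by this harness.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the harness): print the value of KEY from a
# simple KEY=VALUE env file such as the task.env created for a task runtime.
task_env_get() {
  file="$1"
  key="$2"
  # Print only lines that start with "KEY=", with the prefix stripped.
  # If the key appears more than once, the last occurrence wins.
  sed -n "s/^${key}=//p" "$file" | tail -n 1
}
```

With the values from `harness/task.env.example`, `task_env_get harness/task.env.example BACKEND_PORT` prints `4000`. Reading the file this way avoids executing it as shell, at the cost of not handling quoting or comments.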
+ diff --git a/harness/observability/docker-compose.yml b/harness/observability/docker-compose.yml new file mode 100644 index 0000000..55a91ee --- /dev/null +++ b/harness/observability/docker-compose.yml @@ -0,0 +1,61 @@ +services: + loki: + image: grafana/loki + command: + - -config.file=/etc/loki/loki-config.yml + ports: + - "${LOKI_PORT:-3100}:3100" + volumes: + - ./loki-config.yml:/etc/loki/loki-config.yml:ro + - ../../.runtime/${TASK_SLUG:-example-task}/observability/loki:/loki + + promtail: + image: grafana/promtail + command: + - -config.file=/etc/promtail/promtail-config.yml + - -config.expand-env=true + depends_on: + - loki + volumes: + - ./promtail-config.yml:/etc/promtail/promtail-config.yml:ro + - ../../.runtime/${TASK_SLUG:-example-task}/logs:/var/log/task:ro + + prometheus: + image: prom/prometheus + command: + - --config.file=/etc/prometheus/prometheus.yml + - --web.enable-remote-write-receiver + ports: + - "${PROMETHEUS_PORT:-9090}:9090" + volumes: + - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro + - ../../.runtime/${TASK_SLUG:-example-task}/observability/prometheus:/prometheus + + tempo: + image: grafana/tempo + command: + - -config.file=/etc/tempo/tempo.yml + ports: + - "${TEMPO_PORT:-3200}:3200" + - "${OTLP_GRPC_PORT:-4317}:4317" + - "${OTLP_HTTP_PORT:-4318}:4318" + volumes: + - ./tempo.yml:/etc/tempo/tempo.yml:ro + - ../../.runtime/${TASK_SLUG:-example-task}/observability/tempo:/var/tempo + + grafana: + image: grafana/grafana-oss + depends_on: + - loki + - prometheus + - tempo + environment: + GF_AUTH_ANONYMOUS_ENABLED: "true" + GF_AUTH_DISABLE_LOGIN_FORM: "true" + GF_FEATURE_TOGGLES_ENABLE: traceqlEditor + ports: + - "${GRAFANA_PORT:-3001}:3000" + volumes: + - ./grafana/provisioning:/etc/grafana/provisioning:ro + - ../../.runtime/${TASK_SLUG:-example-task}/observability/grafana:/var/lib/grafana + diff --git a/harness/observability/grafana/provisioning/datasources/datasources.yml 
b/harness/observability/grafana/provisioning/datasources/datasources.yml new file mode 100644 index 0000000..0781b9e --- /dev/null +++ b/harness/observability/grafana/provisioning/datasources/datasources.yml @@ -0,0 +1,30 @@ +apiVersion: 1 + +datasources: + - name: Loki + uid: loki + type: loki + access: proxy + url: http://loki:3100 + isDefault: true + + - name: Prometheus + uid: prometheus + type: prometheus + access: proxy + url: http://prometheus:9090 + + - name: Tempo + uid: tempo + type: tempo + access: proxy + url: http://tempo:3200 + jsonData: + tracesToLogsV2: + datasourceUid: loki + tracesToMetrics: + datasourceUid: prometheus + nodeGraph: + enabled: true + serviceMap: + datasourceUid: prometheus diff --git a/harness/observability/loki-config.yml b/harness/observability/loki-config.yml new file mode 100644 index 0000000..4d652bd --- /dev/null +++ b/harness/observability/loki-config.yml @@ -0,0 +1,33 @@ +auth_enabled: false + +server: + http_listen_port: 3100 + +common: + path_prefix: /loki + storage: + filesystem: + chunks_directory: /loki/chunks + rules_directory: /loki/rules + replication_factor: 1 + ring: + kvstore: + store: inmemory + +schema_config: + configs: + - from: 2024-01-01 + store: tsdb + object_store: filesystem + schema: v13 + index: + prefix: index_ + period: 24h + +ruler: + storage: + type: local + local: + directory: /loki/rules + rule_path: /tmp/loki/rules-temp + diff --git a/harness/observability/prometheus.yml b/harness/observability/prometheus.yml new file mode 100644 index 0000000..ab667ad --- /dev/null +++ b/harness/observability/prometheus.yml @@ -0,0 +1,18 @@ +global: + scrape_interval: 15s + +scrape_configs: + - job_name: prometheus + static_configs: + - targets: + - localhost:9090 + + - job_name: backend + metrics_path: /actuator/prometheus + static_configs: + - targets: + - host.docker.internal:9090 + labels: + repo: git-ranker + service: backend + environment: local diff --git a/harness/observability/promtail-config.yml 
b/harness/observability/promtail-config.yml new file mode 100644 index 0000000..f52d9d3 --- /dev/null +++ b/harness/observability/promtail-config.yml @@ -0,0 +1,21 @@ +server: + http_listen_port: 9080 + grpc_listen_port: 0 + +positions: + filename: /tmp/positions.yml + +clients: + - url: http://loki:3100/loki/api/v1/push + +scrape_configs: + - job_name: task-logs + static_configs: + - targets: + - localhost + labels: + job: task-logs + task_slug: ${TASK_SLUG} + environment: local + __path__: ${TASK_LOG_DIR}/*.log + diff --git a/harness/observability/tempo.yml b/harness/observability/tempo.yml new file mode 100644 index 0000000..bcdcd11 --- /dev/null +++ b/harness/observability/tempo.yml @@ -0,0 +1,33 @@ +server: + http_listen_port: 3200 + +distributor: + receivers: + otlp: + protocols: + grpc: + endpoint: 0.0.0.0:4317 + http: + endpoint: 0.0.0.0:4318 + +storage: + trace: + backend: local + local: + path: /var/tempo/traces + +compactor: + compaction: + block_retention: 24h + +metrics_generator: + storage: + path: /var/tempo/generator + +overrides: + defaults: + metrics_generator: + processors: + - service-graphs + - span-metrics + diff --git a/harness/task.env.example b/harness/task.env.example new file mode 100644 index 0000000..60a530f --- /dev/null +++ b/harness/task.env.example @@ -0,0 +1,10 @@ +TASK_SLUG=example-task +BACKEND_PORT=4000 +FRONTEND_PORT=3000 +LOKI_PORT=13100 +PROMETHEUS_PORT=19090 +TEMPO_PORT=13200 +OTLP_GRPC_PORT=14317 +OTLP_HTTP_PORT=14318 +GRAFANA_PORT=13001 +TASK_LOG_DIR=/var/log/task diff --git a/scripts/bootstrap-submodules.sh b/scripts/bootstrap-submodules.sh new file mode 100755 index 0000000..00ad653 --- /dev/null +++ b/scripts/bootstrap-submodules.sh @@ -0,0 +1,7 @@ +#!/usr/bin/env sh +set -eu + +git submodule sync --recursive +git submodule update --init --recursive + +printf 'submodule bootstrap complete\n' diff --git a/scripts/check-submodules.sh b/scripts/check-submodules.sh new file mode 100755 index 0000000..e337583 --- /dev/null 
+++ b/scripts/check-submodules.sh @@ -0,0 +1,24 @@ +#!/usr/bin/env sh +set -eu + +check_dir() { + path="$1" + + if [ ! -d "$path" ]; then + printf 'missing directory: %s\n' "$path" >&2 + return 1 + fi + + if [ -z "$(find "$path" -mindepth 1 -maxdepth 1 2>/dev/null)" ]; then + printf 'submodule appears uninitialized: %s\n' "$path" >&2 + return 2 + fi + + printf 'submodule looks populated: %s\n' "$path" +} + +status=0 +check_dir git-ranker || status=$? +check_dir git-ranker-client || status=$? +exit "$status" + diff --git a/scripts/init-task-runtime.sh b/scripts/init-task-runtime.sh new file mode 100755 index 0000000..592f47a --- /dev/null +++ b/scripts/init-task-runtime.sh @@ -0,0 +1,28 @@ +#!/usr/bin/env sh +set -eu + +if [ "$#" -ne 1 ]; then + printf 'usage: %s \n' "$0" >&2 + exit 1 +fi + +slug="$1" +runtime_dir=".runtime/$slug" +env_file="$runtime_dir/task.env" + +mkdir -p "$runtime_dir/logs" +mkdir -p "$runtime_dir/metrics" +mkdir -p "$runtime_dir/traces" +mkdir -p "$runtime_dir/screenshots" +mkdir -p "$runtime_dir/videos" +mkdir -p "$runtime_dir/playwright" +mkdir -p "$runtime_dir/observability" +mkdir -p ".worktrees/backend" +mkdir -p ".worktrees/frontend" + +if [ ! 
-f "$env_file" ]; then + sed "s/^TASK_SLUG=.*/TASK_SLUG=$slug/" harness/task.env.example > "$env_file" +fi + +printf 'initialized task runtime: %s\n' "$runtime_dir" +printf 'environment file: %s\n' "$env_file" diff --git a/scripts/new-exec-plan.sh b/scripts/new-exec-plan.sh new file mode 100755 index 0000000..d830b80 --- /dev/null +++ b/scripts/new-exec-plan.sh @@ -0,0 +1,20 @@ +#!/usr/bin/env sh +set -eu + +if [ "$#" -ne 1 ]; then + printf 'usage: %s \n' "$0" >&2 + exit 1 +fi + +slug="$1" +date_prefix="$(date +%F)" +target="docs/exec-plans/active/${date_prefix}-${slug}.md" + +if [ -e "$target" ]; then + printf 'plan already exists: %s\n' "$target" >&2 + exit 1 +fi + +cp docs/exec-plans/_template.md "$target" +printf 'created %s\n' "$target" + diff --git a/scripts/validate-knowledge-store.sh b/scripts/validate-knowledge-store.sh new file mode 100755 index 0000000..4a229b9 --- /dev/null +++ b/scripts/validate-knowledge-store.sh @@ -0,0 +1,53 @@ +#!/usr/bin/env sh +set -eu + +required_files=' +AGENTS.md +ARCHITECTURE.md +PLANS.md +harness/README.md +harness/task.env.example +harness/observability/docker-compose.yml +scripts/bootstrap-submodules.sh +docs/index.md +docs/DESIGN.md +docs/FRONTEND.md +docs/BACKEND.md +docs/PRODUCT_SENSE.md +docs/QUALITY_SCORE.md +docs/RELIABILITY.md +docs/SECURITY.md +docs/design-docs/index.md +docs/product-specs/index.md +docs/exec-plans/README.md +docs/exec-plans/_template.md +docs/generated/README.md +docs/workflows/feature-delivery-loop.md +docs/workflows/qa-feedback-loop.md +docs/workflows/local-observability-stack.md +' + +missing=0 + +for file in $required_files; do + if [ ! -f "$file" ]; then + printf 'missing required file: %s\n' "$file" >&2 + missing=1 + fi +done + +if [ ! -d "docs/exec-plans/active" ]; then + printf 'missing required directory: docs/exec-plans/active\n' >&2 + missing=1 +fi + +if [ ! 
-d "docs/exec-plans/completed" ]; then + printf 'missing required directory: docs/exec-plans/completed\n' >&2 + missing=1 +fi + +if [ "$missing" -ne 0 ]; then + exit 1 +fi + +printf 'knowledge store validation passed\n' diff --git a/scripts/verify-workflow.sh b/scripts/verify-workflow.sh new file mode 100755 index 0000000..79b61e7 --- /dev/null +++ b/scripts/verify-workflow.sh @@ -0,0 +1,11 @@ +#!/usr/bin/env sh +set -eu + +./scripts/validate-knowledge-store.sh + +if ./scripts/check-submodules.sh; then + printf 'workflow verification passed\n' +else + printf 'workflow verification is partial: submodules are not initialized\n' >&2 + exit 2 +fi
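`scripts/verify-workflow.sh` can end in three ways: exit 0 when both checks pass, exit 2 when the submodule check does not pass, and exit 1 when `validate-knowledge-store.sh` fails under `set -e`. A minimal sketch of how a caller might turn that status into a label follows; the function name `describe_verify_status` and the label strings are hypothetical.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the repository's scripts): map an exit
# status from scripts/verify-workflow.sh to a short label.
describe_verify_status() {
  case "$1" in
    0) printf 'ok\n' ;;        # knowledge store and submodule checks passed
    2) printf 'partial\n' ;;   # knowledge store passed, submodule check did not
    *) printf 'failed\n' ;;    # knowledge store validation failed, or other error
  esac
}
```

A caller could capture the status with `status=0; ./scripts/verify-workflow.sh || status=$?` and then run `describe_verify_status "$status"`. Treating any unknown status as `failed` keeps the mapping safe if the script gains new exit codes later.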