diff --git a/.codex/evals/README.md b/.codex/evals/README.md
new file mode 100644
index 0000000..3a9abbf
--- /dev/null
+++ b/.codex/evals/README.md
@@ -0,0 +1,20 @@
+# Evals
+
+Use this directory for repo-local eval definitions that measure whether the AI
+workflow and the product behavior are improving or regressing.
+
+Recommended layout:
+
+```text
+.codex/evals/
+  templates/
+  <feature-name>.md
+  <feature-name>.log
+```
+
+For non-trivial changes, define:
+
+- capability evals for the new behavior
+- regression evals for the old behavior that must keep working
+- clear pass or fail evidence
+
diff --git a/.codex/evals/templates/feature-delivery.md b/.codex/evals/templates/feature-delivery.md
new file mode 100644
index 0000000..529f0cd
--- /dev/null
+++ b/.codex/evals/templates/feature-delivery.md
@@ -0,0 +1,22 @@
+# EVAL: <feature-name>
+
+## Capability evals
+
+- [ ] The intended user-visible behavior works end to end.
+- [ ] The relevant Playwright journey passes.
+- [ ] The expected log evidence is present.
+
+## Regression evals
+
+- [ ] Existing adjacent behavior still works.
+- [ ] No new console or runtime errors appear.
+- [ ] Build, lint, typecheck, and tests still pass.
+
+## Evidence
+
+- Plan:
+- Playwright artifact path:
+- CDP artifact path:
+- Log query:
+- Notes:
+
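Note on `feature-delivery.md`: a filled-in copy of this template could be scored mechanically. The sketch below assumes evals are marked with task-list checkboxes (`- [ ]` / `- [x]`) under `## ` headings, as in the template; `summarize_eval` is a hypothetical helper and not part of this change.

```python
import re

# Matches "- [ ] text" and "- [x] text" task-list items.
CHECKBOX = re.compile(r"^- \[( |x|X)\] (.+)$")


def summarize_eval(markdown: str) -> dict:
    """Group checklist items by their `## ` section and report pass/fail."""
    sections: dict[str, list[tuple[bool, str]]] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections.setdefault(current, [])
            continue
        match = CHECKBOX.match(line.strip())
        if match and current is not None:
            sections[current].append((match.group(1) != " ", match.group(2)))
    items = [item for group in sections.values() for item in group]
    unchecked = [text for done, text in items if not done]
    # An eval with no checkboxes at all is treated as not passed.
    return {
        "sections": sections,
        "unchecked": unchecked,
        "passed": bool(items) and not unchecked,
    }
```

Lines in the `Evidence` section are plain bullets, so they are ignored by the checkbox match and would need a separate check.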
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..c93012d
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,8 @@
+.runtime/
+.worktrees/
+.artifacts/
+.idea/
+playwright-report/
+test-results/
+dist/
+coverage/
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..4f8d989
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,82 @@
+# git-ranker-workflow AGENTS
+
+This repository is the control plane for the `git-ranker` backend and the
+`git-ranker-client` frontend. Keep this file short. The system of record lives
+in [ARCHITECTURE.md](ARCHITECTURE.md) and [docs/](docs/index.md).
+
+## What this repo owns
+
+- Repository-local knowledge store and operating rules for coding agents
+- Cross-repo feature delivery workflow, QA loop, and observability workflow
+- ExecPlan conventions for long-running tasks
+- Guardrails for frontend/backend coordination across the two submodule repos
+
+## Repo map
+
+- `git-ranker/`: backend repo (submodule)
+- `git-ranker-client/`: frontend repo (submodule)
+- `ARCHITECTURE.md`: top-level control-plane architecture
+- `PLANS.md`: rules for long-running ExecPlans
+- `docs/`: knowledge store; treat this as the source of truth
+- `scripts/`: lightweight verification and scaffolding helpers
+- `harness/`: local observability and QA harness configuration
+- `.codex/evals/`: eval definitions and templates
+
+## How to start a task
+
+1. Read [ARCHITECTURE.md](ARCHITECTURE.md).
+2. Read [docs/index.md](docs/index.md) and the specific docs for the change
+   surface.
+3. If the request spans multiple files, multiple repos, new behavior, or a
+   likely multi-hour effort, create an ExecPlan in
+   `docs/exec-plans/active/<yyyy-mm-dd>-<slug>.md` and follow [PLANS.md](PLANS.md).
+4. Restate the request in terms of:
+   - user-visible outcome
+   - impacted repos
+   - acceptance checks
+   - required Playwright/CDP/Loki evidence
+5. Work inside a task-specific isolated runtime footprint under `.runtime/` and
+   `.worktrees/`.
+
+## System of record
+
+- Product intent: [docs/product-specs/index.md](docs/product-specs/index.md)
+- Architectural rules: [docs/design-docs/index.md](docs/design-docs/index.md)
+- UX and UI behavior: [docs/DESIGN.md](docs/DESIGN.md),
+  [docs/FRONTEND.md](docs/FRONTEND.md)
+- Backend and data behavior: [docs/BACKEND.md](docs/BACKEND.md),
+  [docs/SECURITY.md](docs/SECURITY.md), [docs/RELIABILITY.md](docs/RELIABILITY.md)
+- Quality and cleanup rules: [docs/QUALITY_SCORE.md](docs/QUALITY_SCORE.md)
+- Generated facts: [docs/generated/README.md](docs/generated/README.md)
+- Workflow loop: [docs/workflows/feature-delivery-loop.md](docs/workflows/feature-delivery-loop.md),
+  [docs/workflows/qa-feedback-loop.md](docs/workflows/qa-feedback-loop.md)
+
+## Non-negotiables
+
+- Do not turn `AGENTS.md` into a large manual. Promote durable rules into
+  `docs/` or scripts.
+- Do not implement from vague intent. Convert feature requests into explicit
+  acceptance criteria first.
+- Do not ship a user-visible change without QA evidence from:
+  - automated tests
+  - Playwright
+  - browser inspection via CDP or equivalent
+  - worktree-local logs in Loki or the configured log backend
+- Do not treat Slack, chat history, or memory as source of truth. If it matters
+  later, check it into the repo.
+- Do not handwave cross-repo changes. Contract changes must be reflected in
+  backend, frontend, docs, and validation steps.
+
+## Delivery loop
+
+1. Intake and clarify the request.
+2. Write or update an ExecPlan if the task is non-trivial.
+3. Implement in backend/frontend worktrees.
+4. Run build, typecheck, lint, and tests.
+5. Boot the isolated stack for the task.
+6. Run Playwright journeys.
+7. Inspect UI, network, console, and DOM with CDP tooling.
+8. Query logs, metrics, and traces for the same task runtime.
+9. Feed findings back into code, docs, and the ExecPlan.
+10. Record outcomes and remaining debt before handoff or merge.
+
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
new file mode 100644
index 0000000..5a96934
--- /dev/null
+++ b/ARCHITECTURE.md
@@ -0,0 +1,153 @@
+# git-ranker Workflow Architecture
+
+## Purpose
+
+This repository is the orchestration layer for an agent-first development
+workflow across two application repositories:
+
+- `git-ranker`: backend system of record for APIs, jobs, persistence, and domain
+  rules
+- `git-ranker-client`: frontend system of record for routes, components, user
+  flows, and client-side state
+
+The control plane in this repo exists to make the product legible to coding
+agents, not to store application logic.
+
+## Current repo facts
+
+The submodules are initialized in this workspace and currently expose these
+high-level facts:
+
+- backend: Spring Boot 3.4, Java 21, JPA, Batch, Security, Actuator, Prometheus,
+  structured JSON logging, Testcontainers, ArchUnit
+- frontend: Next.js App Router, React 19, TypeScript, ESLint, React Query,
+  Zustand, Tailwind, Radix UI
+
+Those facts should shape the workflow and harness choices instead of generic
+defaults.
+
+## Core principle
+
+Repository-local knowledge is the system of record. A coding agent should be
+able to understand the product, architecture, quality bar, and execution flow
+from versioned artifacts in this repository plus the checked-out submodules.
+
+## Control-plane flow
+
+```text
+feature request
+  -> request intake and acceptance contract
+  -> ExecPlan for non-trivial work
+  -> backend contract / behavior changes
+  -> frontend integration / UI changes
+  -> isolated task runtime
+  -> Playwright + CDP validation
+  -> logs / metrics / traces review
+  -> fix loop
+  -> PR / merge / debt update
+```
+
+## Worktree model
+
+Every non-trivial task should use an isolated runtime footprint keyed by a task
+slug, for example `rank-comparison-filtering`.
+
+Expected layout:
+
+```text
+.worktrees/
+  backend/<task-slug>/
+  frontend/<task-slug>/
+.runtime/
+  <task-slug>/
+    logs/
+    traces/
+    screenshots/
+    videos/
+    playwright/
+    observability/
+```
+
+The goal matches OpenAI's harness model:
+
+- one isolated app instance per task
+- one isolated observability context per task
+- artifacts are disposable once the task is complete
+
+## Knowledge-store layout
+
+```text
+AGENTS.md
+ARCHITECTURE.md
+PLANS.md
+docs/
+  design-docs/
+  exec-plans/
+  generated/
+  product-specs/
+  references/
+  workflows/
+```
+
+`AGENTS.md` is only the table of contents. The durable knowledge lives in
+`docs/`.
+
+## Cross-repo contract
+
+The repositories are versioned independently, but the workflow treats them as a
+single product system. A change request must identify which of the following are
+affected:
+
+- backend domain rules
+- backend API or event contracts
+- frontend route or component behavior
+- shared product language and acceptance criteria
+- reliability, security, or QA evidence
+
+Any contract change must update both sides of the boundary plus the knowledge
+store if the change affects future tasks.
+
+## Layering model
+
+The two repos should converge on one directional dependency model:
+
+```text
+Types -> Schemas/Contracts -> Repository/Gateway -> Service/Use Case
+      -> Runtime/Delivery -> UI or HTTP surface
+
+Cross-cutting concerns enter only through Providers:
+auth, feature flags, telemetry, configuration, external connectors
+```
+
+This is intentionally rigid. Agents move faster when the allowed edges are
+obvious and mechanically enforceable.
+
+## QA and observability loop
+
+Every user-visible change is expected to produce:
+
+- automated regression evidence
+- a Playwright run over the affected journey
+- CDP evidence for DOM, console, network, and screenshot state
+- log evidence from the isolated task runtime
+- metrics and trace evidence when performance or async flow matters
+
+The recommended local stack is documented in
+[docs/workflows/local-observability-stack.md](docs/workflows/local-observability-stack.md).
+The implementation provided in `harness/` uses Loki, Prometheus, Tempo, and
+Grafana to preserve the same agent-facing query model described by OpenAI:
+LogQL, PromQL, and TraceQL.
+
+## What stays out of this repo
+
+- application code that belongs in `git-ranker` or `git-ranker-client`
+- private tribal knowledge that should instead be turned into docs
+- ad hoc task notes that never graduate into reusable rules
+
+## Current limitations
+
+- the frontend repo does not yet contain committed Playwright or test config
+- the harness knows the backend metrics endpoint, but frontend metrics and trace
+  export wiring are still generic
+- repo-specific start scripts and local env bootstrapping still need to be
+  codified into the harness
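Note on the worktree model in `ARCHITECTURE.md`: the per-task layout can be derived from the task slug alone. The sketch below is one possible way to do that; `task_footprint` and `create_runtime_dirs` are hypothetical names, and it assumes the worktree directories themselves are created by a separate tool (for example `git worktree add`), so only `.runtime/` is created here.

```python
from pathlib import Path

# Subdirectories listed under .runtime/<task-slug>/ in ARCHITECTURE.md.
RUNTIME_SUBDIRS = ("logs", "traces", "screenshots", "videos", "playwright", "observability")


def task_footprint(root: Path, slug: str) -> list[Path]:
    """Return every directory the worktree model expects for one task slug."""
    paths = [
        root / ".worktrees" / "backend" / slug,
        root / ".worktrees" / "frontend" / slug,
    ]
    paths += [root / ".runtime" / slug / name for name in RUNTIME_SUBDIRS]
    return paths


def create_runtime_dirs(root: Path, slug: str) -> None:
    """Create only the .runtime/ directories; worktrees are left to another tool."""
    for path in task_footprint(root, slug):
        if ".runtime" in path.parts:
            path.mkdir(parents=True, exist_ok=True)
```

Because everything hangs off the slug, removing `.runtime/<task-slug>/` discards the task's artifacts without touching other tasks.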
diff --git a/PLANS.md b/PLANS.md
new file mode 100644
index 0000000..a9c52d3
--- /dev/null
+++ b/PLANS.md
@@ -0,0 +1,83 @@
+# ExecPlans for git-ranker-workflow
+
+This document adapts OpenAI's `PLANS.md` pattern to a two-repository product
+workflow. Use it for any task that is likely to take more than one session,
+spans multiple files or repos, changes contracts, or requires non-trivial QA.
+
+## When to create an ExecPlan
+
+Create an ExecPlan when any of the following are true:
+
+- the request spans backend and frontend
+- the request changes API, schema, routing, or product behavior
+- the work is expected to last more than 30 minutes
+- you need a reproducible QA and feedback loop
+- you expect to stop and resume later
+
+Store plans in `docs/exec-plans/active/<yyyy-mm-dd>-<slug>.md`.
+
+## Non-negotiable rules
+
+- Every ExecPlan must be self-contained.
+- Every ExecPlan must remain a living document.
+- Every ExecPlan must let a novice continue from only the working tree and the
+  plan file.
+- Every ExecPlan must describe observable outcomes, not just code edits.
+- Every ExecPlan must define the validation loop clearly.
+
+## Repo-specific additions
+
+Every plan in this repository must also include:
+
+- impacted repo list: backend, frontend, or both
+- request intake summary in plain language
+- contract boundary notes
+- exact task runtime slug
+- expected Playwright journeys
+- expected CDP evidence
+- expected Loki or log-backend queries
+- rollback or retry notes for each risky step
+
+## Required sections
+
+Every ExecPlan must keep these sections current:
+
+- `Purpose / Big Picture`
+- `Progress`
+- `Surprises & Discoveries`
+- `Decision Log`
+- `Outcomes & Retrospective`
+- `Context and Orientation`
+- `Plan of Work`
+- `Concrete Steps`
+- `Validation and Acceptance`
+- `Idempotence and Recovery`
+- `Artifacts and Notes`
+- `Interfaces and Dependencies`
+
+## Formatting
+
+The plan file itself should contain exactly one fenced code block labeled `md`.
+Do not nest other fenced blocks inside the plan. Use indentation for commands,
+snippets, and transcripts.
+
+## Required execution rhythm
+
+1. Clarify the user's request in product language.
+2. Identify impacted repos and documents.
+3. Research before implementation.
+4. Update the plan before and after every material milestone.
+5. Validate behavior in the isolated task runtime.
+6. Record the evidence path for screenshots, videos, traces, and logs.
+7. Update docs when a new durable rule or system fact is discovered.
+
+## Plan naming
+
+Use a sortable filename:
+
+`docs/exec-plans/active/2026-03-07-rank-comparison-filtering.md`
+
+## Template
+
+Start from `docs/exec-plans/_template.md`.
+
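Note on `PLANS.md`: the "Required sections" list lends itself to a mechanical check. A minimal sketch, assuming each required section appears as a `## ` heading with exactly the name listed above (as it does in `docs/exec-plans/_template.md`); `missing_sections` is a hypothetical helper.

```python
# Section names copied from the "Required sections" list in PLANS.md.
REQUIRED_SECTIONS = (
    "Purpose / Big Picture",
    "Progress",
    "Surprises & Discoveries",
    "Decision Log",
    "Outcomes & Retrospective",
    "Context and Orientation",
    "Plan of Work",
    "Concrete Steps",
    "Validation and Acceptance",
    "Idempotence and Recovery",
    "Artifacts and Notes",
    "Interfaces and Dependencies",
)


def missing_sections(plan_text: str) -> list[str]:
    """Return the required section names that have no `## ` heading in the plan."""
    headings = {
        line[3:].strip()
        for line in plan_text.splitlines()
        if line.startswith("## ")
    }
    return [name for name in REQUIRED_SECTIONS if name not in headings]
```

This only checks that the headings exist; whether each section is current still needs a human or agent review.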
diff --git a/docs/BACKEND.md b/docs/BACKEND.md
new file mode 100644
index 0000000..0454426
--- /dev/null
+++ b/docs/BACKEND.md
@@ -0,0 +1,59 @@
+---
+summary: Backend implementation, contract, and observability rules for git-ranker.
+read_when:
+  - working in the backend repo
+  - modifying APIs, jobs, persistence, or ranking logic
+---
+
+# Backend
+
+## Current repo facts
+
+- framework: Spring Boot 3.4
+- language/runtime: Java 21
+- architecture hints: domain, infrastructure, global, and batch packages
+- observability already present: Actuator, Prometheus endpoint, structured
+  logback JSON encoder, trace-id MDC support
+- test stack already present: JUnit 5, Testcontainers, ArchUnit, Jacoco
+
+## What agents must optimize for
+
+- explicit contracts
+- narrow IO boundaries
+- observable behavior
+- safe migrations
+- reproducible startup and request behavior
+
+## Required workflow for backend changes
+
+1. Define the affected contract and acceptance behavior.
+2. Identify the layer changes required: contract, repository, service, runtime.
+3. Implement the change.
+4. Run backend build and tests.
+5. Boot the isolated task runtime.
+6. Exercise the changed API or worker path.
+7. Query logs, metrics, and traces for the affected path.
+8. Record evidence and findings in the ExecPlan.
+
+## Contract rules
+
+- Parse inputs at the boundary.
+- Version or clearly document contract changes.
+- Never let controllers or handlers own business logic.
+- Prefer explicit repositories or gateways over ad hoc IO scattered through the
+  codebase.
+
+## Observability bar
+
+Every important backend change should leave behind:
+
+- structured log evidence for the changed flow
+- at least one metric or timing check for latency-sensitive paths
+- trace evidence when multiple async steps or external calls are involved
+- a note describing which log query proves the behavior worked
+
+## Expected commands to codify
+
+- `./gradlew build`
+- `./gradlew test`
+- `./gradlew integrationTest` when Docker-backed integration coverage matters
diff --git a/docs/DESIGN.md b/docs/DESIGN.md
new file mode 100644
index 0000000..96b482a
--- /dev/null
+++ b/docs/DESIGN.md
@@ -0,0 +1,32 @@
+---
+summary: UI and interaction design rules for agent-authored frontend changes.
+read_when:
+  - changing visible UI
+  - altering copy, layout, or interaction flow
+---
+
+# Design
+
+## Goal
+
+Frontend changes must be legible to users and to future agents. Design choices
+should be deliberate enough that screenshots, DOM snapshots, and acceptance docs
+all tell the same story.
+
+## Rules
+
+- Start from user journeys, not component churn.
+- Reuse existing visual patterns unless a design doc says the new pattern is
+  intentionally different.
+- Capture meaningful empty, loading, success, and error states.
+- Name visual states explicitly in docs and tests.
+- If a visible workflow changes, update the relevant product and QA docs in the
+  same task.
+
+## Required evidence for visual changes
+
+- before/after screenshots or the first implementation screenshot plus expected
+  final state
+- a Playwright assertion for the intended state
+- a CDP check for console cleanliness and final DOM state
+
diff --git a/docs/FRONTEND.md b/docs/FRONTEND.md
new file mode 100644
index 0000000..e801d80
--- /dev/null
+++ b/docs/FRONTEND.md
@@ -0,0 +1,68 @@
+---
+summary: Frontend implementation and QA rules for git-ranker-client.
+read_when:
+  - working in the frontend repo
+  - validating a user-visible change
+---
+
+# Frontend
+
+## Current repo facts
+
+- framework: Next.js App Router under `src/app`
+- language: TypeScript with strict mode
+- runtime: React 19
+- data/state: React Query and Zustand
+- linting: ESLint via `eslint.config.mjs`
+
+## Current gap
+
+No committed unit-test or Playwright config was found in `git-ranker-client`
+during this setup. That means the workflow requirement is stricter than the
+current repo state. For any meaningful frontend feature, part of the work should
+be adding or wiring the missing QA harness.
+
+## What agents must optimize for
+
+- predictable route behavior
+- explicit data loading and failure handling
+- testable UI states
+- clear contract boundaries with the backend
+
+## Required workflow for user-visible changes
+
+1. Confirm or create acceptance criteria in product language.
+2. Identify affected routes, components, and client contracts.
+3. Implement the change.
+4. Run frontend build, lint, and any available tests.
+5. Boot the isolated task runtime.
+6. Run the Playwright journey for the changed surface.
+7. Inspect the final state with CDP:
+   - screenshot
+   - DOM snapshot
+   - console logs
+   - failed network requests
+8. Record artifact paths in the ExecPlan.
+
+## Frontend contract rules
+
+- Parse and validate incoming server data at the boundary.
+- Do not let raw backend payloads leak through the UI tree.
+- Put orchestration in loaders, hooks, or services; keep components focused on
+  rendering and event wiring.
+- Make loading, empty, and error states explicit.
+
+## Minimum QA bar
+
+Every frontend feature should leave behind:
+
+- at least one Playwright journey covering the happy path
+- at least one assertion for the most important failure or empty state
+- a reproducible screenshot or video path
+- a CDP artifact path for the final DOM and console state
+
+## Expected commands to codify
+
+- `npm run build`
+- `npm run lint`
+- a future committed Playwright command such as `npx playwright test`
diff --git a/docs/PRODUCT_SENSE.md b/docs/PRODUCT_SENSE.md
new file mode 100644
index 0000000..7bd19d3
--- /dev/null
+++ b/docs/PRODUCT_SENSE.md
@@ -0,0 +1,31 @@
+---
+summary: Product framing rules that convert requests into stable acceptance criteria.
+read_when:
+  - clarifying scope
+  - deciding whether a change is complete
+---
+
+# Product Sense
+
+## Principle
+
+Do not implement from ambiguous desire statements. Convert requests into stable,
+testable behavior statements first.
+
+## Required questions
+
+- Who benefits from the change?
+- What exact workflow improves?
+- What is the smallest observable version of the outcome?
+- What must remain unchanged?
+- How will we know the feature actually works?
+
+## Completion test
+
+A feature is not complete when the code exists. It is complete when:
+
+- the user-visible outcome is real
+- the acceptance checks pass
+- the QA evidence exists
+- the docs explain the new durable behavior
+
diff --git a/docs/QUALITY_SCORE.md b/docs/QUALITY_SCORE.md
new file mode 100644
index 0000000..945afa1
--- /dev/null
+++ b/docs/QUALITY_SCORE.md
@@ -0,0 +1,37 @@
+---
+summary: Quality scoring rubric and continuous cleanup loop.
+read_when:
+  - reviewing architecture drift
+  - scheduling cleanup or follow-up refactors
+---
+
+# Quality Score
+
+Score each major surface from A to F on these dimensions:
+
+- contract clarity
+- test coverage
+- docs freshness
+- observability coverage
+- layering discipline
+- UX state completeness
+
+## Grade meanings
+
+- `A`: clear boundaries, current docs, strong tests, observable behavior
+- `B`: acceptable but missing one non-critical reinforcement
+- `C`: functional but agent legibility is degraded
+- `D`: drift is visible and likely to spread
+- `F`: unsafe to scale without cleanup
+
+## Garbage-collection rule
+
+If a change uncovers a durable bad pattern, either:
+
+- fix it in the same task, or
+- add it to [exec-plans/tech-debt-tracker.md](exec-plans/tech-debt-tracker.md)
+  with a clear trigger and consequence
+
+The desired operating mode is continuous small cleanup, not occasional weeks of
+large rewrites.
+
diff --git a/docs/RELIABILITY.md b/docs/RELIABILITY.md
new file mode 100644
index 0000000..1f482ef
--- /dev/null
+++ b/docs/RELIABILITY.md
@@ -0,0 +1,35 @@
+---
+summary: Reliability expectations and evidence rules for runtime behavior.
+read_when:
+  - changing startup flow
+  - touching async jobs, APIs, or critical user journeys
+---
+
+# Reliability
+
+## Principle
+
+Reliability requirements must be phrased as observable behavior, not vague
+intent.
+
+## Every reliability-sensitive task should answer
+
+- which journey matters?
+- what latency or failure budget matters?
+- how will logs, metrics, or traces prove compliance?
+
+## Default expectations
+
+- startup should be measured, not assumed
+- critical journeys should have named owners and evidence
+- a claim of "no regression" is not valid without an artifact path
+
+## Evidence examples
+
+- `LogQL`: service startup completed without retries or fatal errors
+- `PromQL`: request or job latency remained under the target threshold
+- `TraceQL`: no span in the named journey exceeded the agreed threshold
+
+Exact thresholds belong in the relevant ExecPlan until stable enough to promote
+into a permanent doc.
+
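Note on `RELIABILITY.md`: the three evidence examples could be turned into concrete query strings per task. This is only a sketch under stated assumptions: the `service` and `task_slug` log labels, the `git-ranker` service name, and the default threshold are hypothetical, and the metric name assumes Micrometer's HTTP server histogram is enabled in the backend. `evidence_queries` is a hypothetical helper.

```python
def evidence_queries(task_slug: str, latency_seconds: float = 0.5) -> dict[str, str]:
    """Build one LogQL, PromQL, and TraceQL query for a task runtime."""
    threshold_ms = int(latency_seconds * 1000)
    return {
        # LogQL: an empty result means no ERROR or FATAL lines were logged.
        "logql": f'{{service="git-ranker", task_slug="{task_slug}"}} |~ "ERROR|FATAL"',
        # PromQL: returns a series only while p95 latency is under the threshold.
        "promql": (
            "histogram_quantile(0.95, sum by (le) "
            "(rate(http_server_requests_seconds_bucket[5m]))) "
            f"< {latency_seconds}"
        ),
        # TraceQL: an empty result means no span exceeded the threshold.
        "traceql": (
            f'{{ resource.service.name = "git-ranker" && duration > {threshold_ms}ms }}'
        ),
    }
```

Recording the exact query strings in the ExecPlan keeps the evidence reproducible; the thresholds themselves still belong in the plan, as stated above.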
diff --git a/docs/SECURITY.md b/docs/SECURITY.md
new file mode 100644
index 0000000..3b18850
--- /dev/null
+++ b/docs/SECURITY.md
@@ -0,0 +1,33 @@
+---
+summary: Security baseline for frontend/backend workflow changes.
+read_when:
+  - handling user input
+  - changing auth, secrets, or external integrations
+---
+
+# Security
+
+## Baseline
+
+- validate all untrusted input at the boundary
+- keep secrets out of the frontend
+- log safely; do not leak secrets or raw credentials
+- prefer least-privilege connectors
+- treat auth and authorization as explicit requirements, not assumptions
+
+## Required review triggers
+
+Do an explicit security pass when the change touches:
+
+- authentication
+- authorization
+- user-generated content
+- file upload or download
+- external webhooks or callbacks
+- tokens, API keys, or cookies
+
+## Documentation rule
+
+If a task changes a security-relevant behavior, the relevant doc and ExecPlan
+must say what changed and how it was verified.
+
diff --git a/docs/design-docs/core-beliefs.md b/docs/design-docs/core-beliefs.md
new file mode 100644
index 0000000..cd4879c
--- /dev/null
+++ b/docs/design-docs/core-beliefs.md
@@ -0,0 +1,46 @@
+---
+summary: Core beliefs for an agent-first repository.
+read_when:
+  - making architecture or workflow decisions
+  - deciding whether a rule belongs in code, docs, or a prompt
+---
+
+# Core Beliefs
+
+## 1. Repository-local knowledge wins
+
+If a fact matters to future work, it must live in versioned files inside this
+repository or the application repos. Chat logs, memory, and oral tradition do
+not count.
+
+## 2. Legibility beats cleverness
+
+Prefer technologies, abstractions, and folder structures that a stateless agent
+can inspect, understand, and modify without hidden context.
+
+## 3. AGENTS is a map, not the encyclopedia
+
+Keep `AGENTS.md` concise. Promote durable instructions into purpose-built docs or
+mechanical checks.
+
+## 4. Boundaries are leverage
+
+Strict layering, naming, and evidence requirements are not bureaucracy. They are
+what lets agents move quickly without spreading architectural drift.
+
+## 5. Behavior matters more than code motion
+
+Every change must end in observable behavior: a journey that passes, an error
+that disappears, a metric that stays below target, or a trace that no longer
+regresses.
+
+## 6. Feedback loops are part of the product
+
+Playwright specs, CDP inspection, logs, metrics, traces, and review loops are
+first-class system components. If they are missing, the workflow is incomplete.
+
+## 7. Continuous cleanup is mandatory
+
+Bad patterns compound quickly in an AI-heavy codebase. Capture taste once,
+enforce it repeatedly, and keep the debt surface small.
+
diff --git a/docs/design-docs/domain-layering.md b/docs/design-docs/domain-layering.md
new file mode 100644
index 0000000..cb16080
--- /dev/null
+++ b/docs/design-docs/domain-layering.md
@@ -0,0 +1,95 @@
+---
+summary: Required dependency direction and layer meanings across backend and frontend.
+read_when:
+  - adding a new module
+  - reviewing dependency direction
+  - designing cross-repo contracts
+---
+
+# Domain Layering
+
+## Dependency direction
+
+A layer may depend only on layers to its left in this chain:
+
+```text
+Types -> Schemas/Contracts -> Repository/Gateway -> Service/Use Case
+      -> Runtime/Delivery -> UI or HTTP surface
+```
+
+Cross-cutting concerns enter only through explicitly named provider interfaces:
+
+- auth providers
+- feature-flag providers
+- telemetry providers
+- config providers
+- external connector providers
+
+No other reverse or sideways dependencies are allowed.
+
+## Layer meaning
+
+### Types
+
+Pure domain types and names. No IO. No framework imports.
+
+### Schemas/Contracts
+
+Validation rules, request/response shapes, event payloads, serialized forms, and
+frontend/backend contract models.
+
+### Repository/Gateway
+
+Persistence or remote access layers. Database clients, HTTP clients, queues, and
+third-party APIs live here behind narrow interfaces.
+
+### Service/Use Case
+
+Business logic. The place where ranking behavior, orchestration, and policy are
+implemented.
+
+### Runtime/Delivery
+
+The runtime boundary that wires providers and use cases into the actual program.
+Examples:
+
+- backend handlers, jobs, schedulers
+- frontend loaders, route state wiring, query orchestration
+
+### UI or HTTP surface
+
+The final user or network surface:
+
+- frontend components and route shells
+- backend controllers, route modules, or transport handlers
+
+## Repo-specific mapping
+
+### `git-ranker` backend
+
+- Types: domain entities and value objects
+- Schemas/Contracts: DTOs, validation schemas, API contracts, job payloads
+- Repository/Gateway: DB access, cache, queue, external API connectors
+- Service/Use Case: ranking algorithms, workflows, business rules
+- Runtime/Delivery: route wiring, workers, scheduled tasks
+- HTTP surface: API endpoints and transport adapters
+
+### `git-ranker-client` frontend
+
+- Types: view-agnostic domain models
+- Schemas/Contracts: API client contracts, form schemas, router payloads
+- Repository/Gateway: API clients and local persistence adapters
+- Service/Use Case: client-side orchestration and derived state logic
+- Runtime/Delivery: route loaders, providers, suspense/query setup
+- UI surface: pages, sections, components, and interaction handlers
+
+## Guardrails to encode later
+
+These rules should eventually become lint rules or structural tests:
+
+- boundary parsing happens at contracts, not ad hoc in UI or handlers
+- services may not import UI modules
+- repositories may not import runtime or surface layers
+- no direct third-party IO from UI or business logic
+- cross-repo contracts must be named and versioned explicitly
+
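Note on `domain-layering.md`: until the guardrails become lint rules, the dependency direction can be expressed as a rank comparison. A minimal sketch, assuming short hypothetical layer keys and assuming imports within the same layer are allowed (the doc does not say either way); provider interfaces are deliberately left out. `is_allowed` and `violations` are hypothetical helpers.

```python
# Layers in chain order, from Types through to the UI or HTTP surface.
LAYERS = ("types", "contracts", "repository", "service", "runtime", "surface")
RANK = {name: index for index, name in enumerate(LAYERS)}


def is_allowed(importer: str, imported: str) -> bool:
    """A layer may import from itself or an earlier layer, never a later one."""
    return RANK[imported] <= RANK[importer]


def violations(edges: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the (importer, imported) edges that break the layering rule."""
    return [(a, b) for a, b in edges if not is_allowed(a, b)]
```

For example, `violations([("service", "surface"), ("surface", "service")])` flags only the first edge, which matches the rule that services may not import UI modules.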
diff --git a/docs/design-docs/index.md b/docs/design-docs/index.md
new file mode 100644
index 0000000..dacb699
--- /dev/null
+++ b/docs/design-docs/index.md
@@ -0,0 +1,12 @@
+---
+summary: Index of durable architectural and workflow design rules.
+read_when:
+  - changing architecture
+  - deciding where new code or docs should live
+---
+
+# design-docs index
+
+- [core-beliefs.md](core-beliefs.md)
+- [domain-layering.md](domain-layering.md)
+
diff --git a/docs/exec-plans/README.md b/docs/exec-plans/README.md
new file mode 100644
index 0000000..b50a154
--- /dev/null
+++ b/docs/exec-plans/README.md
@@ -0,0 +1,31 @@
+---
+summary: How ExecPlans are stored and maintained in this repository.
+read_when:
+  - creating or resuming a non-trivial task
+---
+
+# Exec Plans
+
+## Layout
+
+- `active/`: plans that are still being executed
+- `completed/`: plans whose work and retrospective are complete
+- `_template.md`: starting point for new plans
+- `tech-debt-tracker.md`: backlog of durable issues discovered by the workflow
+
+## Naming
+
+`<yyyy-mm-dd>-<slug>.md`
+
+Example:
+
+`2026-03-07-ranking-filter-panel.md`
+
+## Workflow
+
+1. Create from `_template.md`.
+2. Fill in the request intake and context before coding.
+3. Update `Progress`, `Decision Log`, and `Surprises & Discoveries` during work.
+4. Record artifact paths for Playwright, CDP, logs, metrics, and traces.
+5. Move the plan to `completed/` when the work and retrospective are finished.
+
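Note on `docs/exec-plans/README.md`: the `<yyyy-mm-dd>-<slug>.md` convention can be generated and checked with a few lines. A sketch, assuming slugs are lowercase ASCII words joined by hyphens (the doc shows this shape but does not define the character set); `plan_filename` and `PLAN_NAME` are hypothetical names.

```python
import re
from datetime import date

# Sortable plan filename: ISO date, hyphen, lowercase hyphenated slug, ".md".
PLAN_NAME = re.compile(r"^\d{4}-\d{2}-\d{2}-[a-z0-9]+(?:-[a-z0-9]+)*\.md$")


def plan_filename(title: str, day: date) -> str:
    """Turn a free-form title into a sortable ExecPlan filename."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{day.isoformat()}-{slug}.md"
```

The ISO date prefix is what makes a plain directory listing sort plans chronologically.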
diff --git a/docs/exec-plans/_template.md b/docs/exec-plans/_template.md
new file mode 100644
index 0000000..8aa5c9e
--- /dev/null
+++ b/docs/exec-plans/_template.md
@@ -0,0 +1,95 @@
+```md
+# <Short, action-oriented title>
+
+This ExecPlan is a living document. Maintain it according to `PLANS.md` at the
+repository root.
+
+## Purpose / Big Picture
+
+Explain what user-visible behavior will exist after this change and how to see
+it working.
+
+Request intake:
+
+    Problem:
+    User-visible outcome:
+    Affected repos:
+    Contract surface:
+    Acceptance checks:
+    QA evidence:
+    Non-goals:
+    Risks:
+
+Task runtime slug:
+
+    <task-slug>
+
+## Progress
+
+- [ ] Example incomplete step.
+- [ ] Example partially completed step.
+- [x] Example completed step with timestamp.
+
+## Surprises & Discoveries
+
+- Observation:
+  Evidence:
+
+## Decision Log
+
+- Decision:
+  Rationale:
+  Date/Author:
+
+## Outcomes & Retrospective
+
+- Outcome:
+  Remaining gap:
+  Lesson:
+
+## Context and Orientation
+
+Describe the current system state, key files, terms, and constraints as if the
+reader knows nothing about the repo.
+
+## Plan of Work
+
+Describe the sequence of changes in prose. Name exact files, modules, and
+surfaces to edit.
+
+## Concrete Steps
+
+List exact commands, working directories, and expected observations.
+
+## Validation and Acceptance
+
+Describe:
+
+    - test commands
+    - Playwright journeys
+    - CDP checks
+    - Loki or log-backend queries
+    - metrics or trace checks when relevant
+
+## Idempotence and Recovery
+
+Describe which steps are safe to rerun and how to recover from partial failure.
+
+## Artifacts and Notes
+
+Record paths for:
+
+    - screenshots
+    - videos
+    - Playwright reports
+    - DOM snapshots
+    - console captures
+    - log query output
+    - metric or trace captures
+
+## Interfaces and Dependencies
+
+List the interfaces, contracts, libraries, providers, and service boundaries the
+task depends on or creates.
+```
+
diff --git a/docs/exec-plans/active/README.md b/docs/exec-plans/active/README.md
new file mode 100644
index 0000000..875227b
--- /dev/null
+++ b/docs/exec-plans/active/README.md
@@ -0,0 +1,4 @@
+# Active Exec Plans
+
+Put in-progress or not-yet-retired plans here.
+
diff --git a/docs/exec-plans/completed/README.md b/docs/exec-plans/completed/README.md
new file mode 100644
index 0000000..edf7d81
--- /dev/null
+++ b/docs/exec-plans/completed/README.md
@@ -0,0 +1,4 @@
+# Completed Exec Plans
+
+Move plans here after the work and retrospective are complete.
+
diff --git a/docs/exec-plans/tech-debt-tracker.md b/docs/exec-plans/tech-debt-tracker.md
new file mode 100644
index 0000000..1e46d0c
--- /dev/null
+++ b/docs/exec-plans/tech-debt-tracker.md
@@ -0,0 +1,42 @@
+---
+summary: Durable debt surfaced by the AI workflow and not yet encoded away.
+read_when:
+  - deciding whether to open a cleanup plan
+  - looking for recurring workflow friction
+---
+
+# Tech Debt Tracker
+
+## Open items
+
+### 1. Generated fact pipelines are not implemented
+
+- Problem: `docs/generated/` is defined but not populated automatically.
+- Consequence: agents must still read code directly for many repo facts.
+- Desired fix: add generators for API surface, schema, route map, and dependency
+  graph documents.
+
+### 2. Harness command wiring is still generic
+
+- Problem: the workflow defines Playwright/CDP/Loki loops, but the exact app
+  commands are not yet wired to this project's real scripts.
+- Consequence: the docs are operationally ready, but runtime automation still
+  needs repo-specific command binding.
+- Desired fix: now that the submodules are initialized, codify build, start,
+  and test commands in `harness/` and verification scripts.
+
+### 3. Frontend automated QA harness is missing from the repo
+
+- Problem: `git-ranker-client` currently has no committed Playwright config or
+  frontend test files.
+- Consequence: the OpenAI-style browser feedback loop cannot yet be enforced by
+  code in the frontend repo itself.
+- Desired fix: add Playwright, artifact paths, and at least one critical user
+  journey spec to `git-ranker-client`.
+
+## Resolved recently
+
+### Submodule bootstrap
+
+- `scripts/bootstrap-submodules.sh` was added.
+- `git-ranker` and `git-ranker-client` are initialized in this workspace.
diff --git a/docs/generated/README.md b/docs/generated/README.md
new file mode 100644
index 0000000..4ffb76d
--- /dev/null
+++ b/docs/generated/README.md
@@ -0,0 +1,27 @@
+---
+summary: Placeholder contract for machine-generated repository facts.
+read_when:
+  - you need repo facts faster than code search
+  - deciding whether to hand-edit generated docs
+---
+
+# generated
+
+This directory is reserved for machine-generated facts that agents should prefer
+over broad code search once generation pipelines exist.
+
+Expected outputs:
+
+- `backend-api-surface.md`
+- `backend-schema.md`
+- `frontend-route-map.md`
+- `frontend-state-surfaces.md`
+- `dependency-graph.md`
+
+## Rules
+
+- Generated docs should be reproducible from the application repos.
+- Once a generator exists, do not hand-edit the generated file.
+- If a durable fact is still missing here, document it in a design or product doc
+  until the generator exists.
+
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..04b096d
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,52 @@
+---
+summary: Repository knowledge-store index for agents and humans.
+read_when:
+  - starting work in this repository
+  - looking for the source of truth for a requirement or rule
+---
+
+# docs index
+
+This directory is the system of record for the AI workflow in
+`git-ranker-workflow`.
+
+## Read first
+
+- [../ARCHITECTURE.md](../ARCHITECTURE.md)
+- [../PLANS.md](../PLANS.md)
+
+## Product
+
+- [product-specs/index.md](product-specs/index.md)
+- [PRODUCT_SENSE.md](PRODUCT_SENSE.md)
+
+## Architecture and rules
+
+- [design-docs/index.md](design-docs/index.md)
+- [FRONTEND.md](FRONTEND.md)
+- [BACKEND.md](BACKEND.md)
+- [SECURITY.md](SECURITY.md)
+- [RELIABILITY.md](RELIABILITY.md)
+- [QUALITY_SCORE.md](QUALITY_SCORE.md)
+
+## Workflow loop
+
+- [workflows/feature-delivery-loop.md](workflows/feature-delivery-loop.md)
+- [workflows/qa-feedback-loop.md](workflows/qa-feedback-loop.md)
+- [workflows/local-observability-stack.md](workflows/local-observability-stack.md)
+
+## Planning
+
+- [exec-plans/README.md](exec-plans/README.md)
+- [exec-plans/tech-debt-tracker.md](exec-plans/tech-debt-tracker.md)
+
+## Generated facts
+
+- [generated/README.md](generated/README.md)
+
+## Source analyses that informed this structure
+
+- [references/openai-harness-engineering-analysis.md](references/openai-harness-engineering-analysis.md)
+- [references/openai-execplans-analysis.md](references/openai-execplans-analysis.md)
+- [references/steipete-ai-workflow-analysis.md](references/steipete-ai-workflow-analysis.md)
+
diff --git a/docs/product-specs/index.md b/docs/product-specs/index.md
new file mode 100644
index 0000000..c001754
--- /dev/null
+++ b/docs/product-specs/index.md
@@ -0,0 +1,11 @@
+---
+summary: Product-intent documents and intake rules for feature requests.
+read_when:
+  - translating a user request into acceptance criteria
+  - deciding what behavior must exist before coding
+---
+
+# product-specs index
+
+- [request-intake.md](request-intake.md)
+
diff --git a/docs/product-specs/request-intake.md b/docs/product-specs/request-intake.md
new file mode 100644
index 0000000..9abe9ef
--- /dev/null
+++ b/docs/product-specs/request-intake.md
@@ -0,0 +1,66 @@
+---
+summary: Intake contract for feature requests before implementation begins.
+read_when:
+  - before starting any new feature
+  - when the request is vague or spans frontend and backend
+---
+
+# Feature Request Intake
+
+Before implementation begins, rewrite every request in the following shape.
+
+## 1. Problem
+
+What user problem or business problem is being solved?
+
+## 2. User-visible outcome
+
+What can the user do after this change that they could not do before?
+
+## 3. Affected repos
+
+- `git-ranker`
+- `git-ranker-client`
+- `knowledge-store only`
+
+## 4. Contract surface
+
+List any API, schema, event, route, or copy changes.
+
+## 5. Acceptance checks
+
+Describe the exact behavior to observe when the change works.
+
+## 6. QA evidence
+
+List the minimum required evidence:
+
+- test commands
+- Playwright journey names
+- CDP checks
+- log queries
+- performance or trace checks when relevant
+
+## 7. Non-goals
+
+State what this request is not trying to solve.
+
+## 8. Risks
+
+List migration, compatibility, reliability, or security risks.
+
+## Output format
+
+Use this checklist at the top of an ExecPlan or task note:
+
+```text
+Problem:
+User-visible outcome:
+Affected repos:
+Contract surface:
+Acceptance checks:
+QA evidence:
+Non-goals:
+Risks:
+```
+
diff --git a/docs/references/openai-execplans-analysis.md b/docs/references/openai-execplans-analysis.md
new file mode 100644
index 0000000..37dc6c0
--- /dev/null
+++ b/docs/references/openai-execplans-analysis.md
@@ -0,0 +1,38 @@
+---
+summary: Analysis of OpenAI's PLANS.md article and how it is adapted here.
+read_when:
+  - writing or reviewing an ExecPlan
+  - deciding what a self-contained plan must include
+---
+
+# OpenAI ExecPlans analysis
+
+Source:
+
+- https://developers.openai.com/cookbook/articles/codex_exec_plans
+
+## Extracted rules
+
+- plans are for multi-hour or multi-session work
+- plans are living documents
+- plans must be self-contained for a novice reader
+- plans must describe observable outcomes
+- plans must track progress, discoveries, decisions, and retrospective outcomes
+
+## Adaptation in this repository
+
+The generic OpenAI structure is extended with repository-specific requirements:
+
+- impacted repo list
+- task runtime slug
+- explicit frontend/backend contract notes
+- Playwright, CDP, and log-query evidence
+- recovery notes for cross-repo or environment failures
+
+## Why the adaptation is needed
+
+This repository coordinates two separately versioned repos. A plan that only
+describes code edits is not enough. It must also describe the integration
+surface, the QA harness, and the observability evidence that proves the change
+worked across the system.
+
diff --git a/docs/references/openai-harness-engineering-analysis.md b/docs/references/openai-harness-engineering-analysis.md
new file mode 100644
index 0000000..3bb0c12
--- /dev/null
+++ b/docs/references/openai-harness-engineering-analysis.md
@@ -0,0 +1,43 @@
+---
+summary: Analysis of OpenAI's February 11, 2026 harness-engineering article and how it maps to this repo.
+read_when:
+  - understanding why this repository is structured this way
+  - checking whether a workflow rule is grounded in the source article
+---
+
+# OpenAI harness-engineering analysis
+
+Source:
+
+- https://openai.com/index/harness-engineering/
+
+## Extracted operating principles
+
+1. Start from an empty repo and make the environment, not the human, do the
+   heavy lifting.
+2. Treat repository-local knowledge as the system of record.
+3. Keep `AGENTS.md` short and use it as a map into richer docs.
+4. Increase agent legibility through bootable isolated runtimes, browser
+   inspection, and queryable observability.
+5. Enforce architecture and taste mechanically instead of depending on memory.
+6. Prefer fast correction loops over heavyweight blocking gates.
+7. Continuously garbage-collect drift through explicit quality rules.
+
+## Direct mappings in this repository
+
+- `AGENTS.md` is intentionally short and points into `docs/`.
+- `docs/` mirrors the knowledge-store layout described in the article.
+- `PLANS.md` and `docs/exec-plans/` provide the living-document execution model.
+- `harness/` is reserved for per-task runtime and observability wiring.
+- `docs/workflows/qa-feedback-loop.md` encodes the Playwright plus CDP plus log
+  analysis loop.
+- `docs/QUALITY_SCORE.md` and `docs/exec-plans/tech-debt-tracker.md` encode the
+  continuous cleanup model.
+
+## Important nuance
+
+The article describes a single large app repository. This project is split into
+two application repositories plus one orchestration repository. The adaptation
+here is to make the control plane live in this repo while keeping application
+logic in the backend and frontend repos.
+
diff --git a/docs/references/steipete-ai-workflow-analysis.md b/docs/references/steipete-ai-workflow-analysis.md
new file mode 100644
index 0000000..26c0519
--- /dev/null
+++ b/docs/references/steipete-ai-workflow-analysis.md
@@ -0,0 +1,46 @@
+---
+summary: Useful workflow patterns adapted from steipete repositories.
+read_when:
+  - refining docs structure
+  - deciding how much detail belongs in AGENTS versus docs
+---
+
+# steipete workflow analysis
+
+Sources reviewed:
+
+- https://github.com/steipete
+- https://raw.githubusercontent.com/steipete/oracle/main/AGENTS.md
+- https://raw.githubusercontent.com/steipete/agent-scripts/main/AGENTS.MD
+- https://raw.githubusercontent.com/steipete/Peekaboo/main/docs/ARCHITECTURE.md
+- https://raw.githubusercontent.com/steipete/Peekaboo/main/docs/testing/tools.md
+
+## Patterns worth reusing
+
+### 1. Short AGENTS, dense docs
+
+steipete keeps `AGENTS` directive-heavy but still points to focused docs rather
+than collapsing everything into one file. That supports fast context loading.
+
+### 2. `read_when` front matter
+
+Purpose-driven docs are easier for agents to load when the file itself states
+when it should be consulted. This repository adopts that pattern in `docs/`.
+
+### 3. Artifact-oriented testing
+
+The Peekaboo testing docs record exact logs, artifact paths, execution loops, and
+pass criteria. That pattern maps well to Playwright, CDP, and local observability
+artifacts here.
+
+### 4. Docs are part of the implementation
+
+steipete's repos treat doc updates as part of finishing a feature. This
+repository adopts the same rule.
+
+## What was not copied directly
+
+Those repos are optimized for different products and toolchains. The structure
+here preserves the transferable workflow ideas while staying aligned to OpenAI's
+harness-engineering model as the primary source of truth.
+
diff --git a/docs/workflows/feature-delivery-loop.md b/docs/workflows/feature-delivery-loop.md
new file mode 100644
index 0000000..0fc45be
--- /dev/null
+++ b/docs/workflows/feature-delivery-loop.md
@@ -0,0 +1,60 @@
+---
+summary: End-to-end feature workflow from user request to QA feedback loop.
+read_when:
+  - starting a feature
+  - deciding what the next task phase should be
+---
+
+# Feature Delivery Loop
+
+## Phase 1: Intake
+
+Convert the request into the shape defined in
+[../product-specs/request-intake.md](../product-specs/request-intake.md).
+
+## Phase 2: Plan
+
+For any non-trivial task, create an ExecPlan. The plan becomes the living record
+for progress, discoveries, decisions, evidence, and outcomes.
+
+## Phase 3: Implement
+
+Make the smallest set of backend and frontend changes that satisfy the plan
+while preserving the architectural layer rules.
+
+## Phase 4: Verify locally
+
+Run build, typecheck, lint, and automated tests in the affected repos.
+
+## Phase 5: Boot isolated runtime
+
+Launch the backend, frontend, and observability stack for the task-specific
+runtime slug.
+
+## Phase 6: QA
+
+Run Playwright for the changed user journeys. Inspect the final UI and network
+state using CDP tooling.
+
+## Phase 7: Observe
+
+Query the isolated runtime's logs, metrics, and traces. Confirm the system
+behavior matches the plan, not just the UI.
+
+## Phase 8: Feedback
+
+If QA or observability reveals a gap:
+
+- update code
+- update the ExecPlan
+- rerun the relevant checks
+- capture the new evidence
+
+## Phase 9: Retrospective
+
+At the end of the task:
+
+- update `Outcomes & Retrospective`
+- promote durable lessons into docs or scripts
+- add remaining debt to the tracker if it cannot be fixed now
+
diff --git a/docs/workflows/local-observability-stack.md b/docs/workflows/local-observability-stack.md
new file mode 100644
index 0000000..8fe9360
--- /dev/null
+++ b/docs/workflows/local-observability-stack.md
@@ -0,0 +1,75 @@
+---
+summary: Local per-task observability model using Loki, Prometheus, Tempo, and Grafana.
+read_when:
+  - setting up the task runtime
+  - deciding where logs, metrics, and traces should go
+---
+
+# Local Observability Stack
+
+## Goal
+
+Mirror the agent-facing observability workflow described by OpenAI:
+
+- one isolated observability context per task
+- logs, metrics, and traces queryable by the agent
+- disposable runtime after the task completes
+
+## Implementation choice
+
+OpenAI's article describes local queryable logs, metrics, and traces with
+LogQL, PromQL, and TraceQL. This repository uses Loki, Prometheus, Tempo, and
+Grafana as a practical equivalent for self-hosted local development.
+
+This is an implementation inference, not a direct quote from OpenAI.
+
+## Directory contract
+
+```text
+.runtime/<task-slug>/
+  logs/
+  metrics/
+  traces/
+  observability/
+```
+
+## Required labels
+
+Every emitted signal should carry at least:
+
+- `task_slug`
+- `service`
+- `repo`
+- `environment=local`
+
+## Example queries
+
+### LogQL
+
+```text
+{task_slug="<task-slug>",service="backend"} |= "startup complete"
+```
+
+### PromQL
+
+```text
+histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket{task_slug="<task-slug>"}[5m])))
+```
+
+### TraceQL
+
+```text
+{ resource.task_slug = "<task-slug>" } | duration > 2s
+```
+
+## Config location
+
+See `harness/observability/`.
+
+## Current project-specific wiring
+
+- backend metrics are available from Spring Actuator on port `9090` at
+  `/actuator/prometheus`
+- backend logs are already structured through `logback-spring.xml`
+- frontend metrics and trace export are not yet committed as part of the client
+  repo, so the harness keeps those pieces generic for now
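The committed `promtail-config.yml` attaches only the `job`, `task_slug`, and `environment` labels to shipped logs. A LogQL selector that also matches on `service` would therefore presumably return nothing until that label is added. Below is a minimal sketch that builds a selector from the labels that are attached. `logql_selector` is a hypothetical helper, not part of this repository, and the commented `curl` call assumes Loki is reachable on the `LOKI_PORT` from `task.env`.

```shell
# Hypothetical helper (not in the repo): print a LogQL stream selector that
# uses only the labels the committed Promtail config attaches.
logql_selector() {
  printf '{task_slug="%s",environment="local"}' "$1"
}

# Usage sketch, assuming Loki listens on LOKI_PORT from task.env:
#   curl -G -s "http://localhost:${LOKI_PORT:-3100}/loki/api/v1/query_range" \
#     --data-urlencode "query=$(logql_selector my-task) |= \"startup complete\""
```

Keeping the selector in one helper means a later `service` label only needs to be added in one place.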
diff --git a/docs/workflows/qa-feedback-loop.md b/docs/workflows/qa-feedback-loop.md
new file mode 100644
index 0000000..3926763
--- /dev/null
+++ b/docs/workflows/qa-feedback-loop.md
@@ -0,0 +1,47 @@
+---
+summary: Playwright, CDP, and log-driven feedback loop for user-visible changes.
+read_when:
+  - validating a fix or feature
+  - debugging a regression
+---
+
+# QA Feedback Loop
+
+## Required inputs
+
+- task runtime slug
+- affected user journey name
+- expected final UI state
+- expected backend or log behavior
+
+## Loop
+
+1. Boot the isolated backend and frontend for the task.
+2. Run the Playwright journey that exercises the change.
+3. Capture:
+   - screenshots
+   - video when useful
+   - Playwright traces or reports
+4. Inspect the same run through CDP:
+   - DOM snapshot
+   - console output
+   - failed requests
+   - final URL and app state
+5. Query logs for the same time window and task slug.
+6. If performance or async orchestration matters, inspect metrics and traces too.
+7. Compare observations against the acceptance section of the ExecPlan.
+8. If anything disagrees, fix the system and rerun the loop.
+
+## Minimum artifact set
+
+- `.runtime/<slug>/playwright/`
+- `.runtime/<slug>/screenshots/`
+- `.runtime/<slug>/videos/` when relevant
+- `.runtime/<slug>/logs/`
+- `.runtime/<slug>/observability/queries.md` or equivalent note
+
+## Why this exists
+
+This is the core OpenAI harness pattern applied locally: the agent must be able
+to see the product behavior directly, not infer success from code changes alone.
+
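The minimum artifact set in `docs/workflows/qa-feedback-loop.md` could be checked mechanically before a task is closed. The sketch below assumes the directory names listed in that file and skips `videos/` because it is only required "when relevant". `missing_artifacts` is a hypothetical helper, not an existing script. It checks only that the directories exist, not that they contain evidence.

```shell
# Hypothetical helper (not in the repo): print every required artifact
# directory that is missing under a task runtime directory.
# Returns 0 when all are present and 1 otherwise.
missing_artifacts() {
  runtime_dir="$1"
  status=0
  for name in playwright screenshots logs observability; do
    if [ ! -d "$runtime_dir/$name" ]; then
      printf 'missing artifact directory: %s/%s\n' "$runtime_dir" "$name"
      status=1
    fi
  done
  return "$status"
}
```

A call such as `missing_artifacts ".runtime/<slug>"` could run at the end of the loop, before observations are compared against the ExecPlan.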
diff --git a/git-ranker b/git-ranker
index d0197ca..3a5f37f 160000
--- a/git-ranker
+++ b/git-ranker
@@ -1 +1 @@
-Subproject commit d0197caa2f2c4c67f74082e09a4343c7ed30f3ba
+Subproject commit 3a5f37f1a06d756de5bd58ca3afb1a9ba3b9d2c6
diff --git a/git-ranker-client b/git-ranker-client
index 6ec13d8..378ace0 160000
--- a/git-ranker-client
+++ b/git-ranker-client
@@ -1 +1 @@
-Subproject commit 6ec13d8ea85af1e0ee952c4b82fcd54c0e2b925b
+Subproject commit 378ace0283755649a92a8a0fb7e7f9d3b54afdb8
diff --git a/harness/README.md b/harness/README.md
new file mode 100644
index 0000000..92f2048
--- /dev/null
+++ b/harness/README.md
@@ -0,0 +1,36 @@
+# Harness
+
+This directory contains the local runtime and observability skeleton for the
+OpenAI-style harness workflow used by this repository.
+
+## What lives here
+
+- `task.env.example`: environment template for a task-scoped runtime
+- `observability/`: Docker Compose and config for Loki, Prometheus, Tempo, and
+  Grafana
+
+## Suggested flow
+
+1. Create a task runtime:
+
+       ./scripts/init-task-runtime.sh <task-slug>
+
+2. Review and adjust `.runtime/<task-slug>/task.env`.
+
+3. Start the observability stack:
+
+       docker compose \
+         --env-file .runtime/<task-slug>/task.env \
+         -f harness/observability/docker-compose.yml \
+         up -d
+
+4. Start backend and frontend using the same task slug and matching ports.
+
+5. Run Playwright, inspect with CDP, and query the local stack.
+
+## Current state
+
+The config is intentionally generic because the applications' real build,
+start, and test commands are not yet wired in. Once they are, bind each app's
+log, metric, and trace endpoints into this harness.
+
diff --git a/harness/observability/docker-compose.yml b/harness/observability/docker-compose.yml
new file mode 100644
index 0000000..55a91ee
--- /dev/null
+++ b/harness/observability/docker-compose.yml
@@ -0,0 +1,65 @@
+services:
+  loki:
+    image: grafana/loki
+    command:
+      - -config.file=/etc/loki/loki-config.yml
+    ports:
+      - "${LOKI_PORT:-3100}:3100"
+    volumes:
+      - ./loki-config.yml:/etc/loki/loki-config.yml:ro
+      - ../../.runtime/${TASK_SLUG:-example-task}/observability/loki:/loki
+
+  promtail:
+    image: grafana/promtail
+    command:
+      - -config.file=/etc/promtail/promtail-config.yml
+      - -config.expand-env=true
+    depends_on:
+      - loki
+    # -config.expand-env reads the container environment, so pass the values in.
+    environment:
+      TASK_SLUG: ${TASK_SLUG:-example-task}
+      TASK_LOG_DIR: ${TASK_LOG_DIR:-/var/log/task}
+    volumes:
+      - ./promtail-config.yml:/etc/promtail/promtail-config.yml:ro
+      - ../../.runtime/${TASK_SLUG:-example-task}/logs:/var/log/task:ro
+
+  prometheus:
+    image: prom/prometheus
+    command:
+      - --config.file=/etc/prometheus/prometheus.yml
+      - --web.enable-remote-write-receiver
+    ports:
+      - "${PROMETHEUS_PORT:-9090}:9090"
+    volumes:
+      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
+      - ../../.runtime/${TASK_SLUG:-example-task}/observability/prometheus:/prometheus
+
+  tempo:
+    image: grafana/tempo
+    command:
+      - -config.file=/etc/tempo/tempo.yml
+    ports:
+      - "${TEMPO_PORT:-3200}:3200"
+      - "${OTLP_GRPC_PORT:-4317}:4317"
+      - "${OTLP_HTTP_PORT:-4318}:4318"
+    volumes:
+      - ./tempo.yml:/etc/tempo/tempo.yml:ro
+      - ../../.runtime/${TASK_SLUG:-example-task}/observability/tempo:/var/tempo
+
+  grafana:
+    image: grafana/grafana-oss
+    depends_on:
+      - loki
+      - prometheus
+      - tempo
+    environment:
+      GF_AUTH_ANONYMOUS_ENABLED: "true"
+      GF_AUTH_DISABLE_LOGIN_FORM: "true"
+      GF_FEATURE_TOGGLES_ENABLE: traceqlEditor
+    ports:
+      - "${GRAFANA_PORT:-3001}:3000"
+    volumes:
+      - ./grafana/provisioning:/etc/grafana/provisioning:ro
+      - ../../.runtime/${TASK_SLUG:-example-task}/observability/grafana:/var/lib/grafana
+
diff --git a/harness/observability/grafana/provisioning/datasources/datasources.yml b/harness/observability/grafana/provisioning/datasources/datasources.yml
new file mode 100644
index 0000000..0781b9e
--- /dev/null
+++ b/harness/observability/grafana/provisioning/datasources/datasources.yml
@@ -0,0 +1,30 @@
+apiVersion: 1
+
+datasources:
+  - name: Loki
+    uid: loki
+    type: loki
+    access: proxy
+    url: http://loki:3100
+    isDefault: true
+
+  - name: Prometheus
+    uid: prometheus
+    type: prometheus
+    access: proxy
+    url: http://prometheus:9090
+
+  - name: Tempo
+    uid: tempo
+    type: tempo
+    access: proxy
+    url: http://tempo:3200
+    jsonData:
+      tracesToLogsV2:
+        datasourceUid: loki
+      tracesToMetrics:
+        datasourceUid: prometheus
+      nodeGraph:
+        enabled: true
+      serviceMap:
+        datasourceUid: prometheus
diff --git a/harness/observability/loki-config.yml b/harness/observability/loki-config.yml
new file mode 100644
index 0000000..4d652bd
--- /dev/null
+++ b/harness/observability/loki-config.yml
@@ -0,0 +1,33 @@
+auth_enabled: false
+
+server:
+  http_listen_port: 3100
+
+common:
+  path_prefix: /loki
+  storage:
+    filesystem:
+      chunks_directory: /loki/chunks
+      rules_directory: /loki/rules
+  replication_factor: 1
+  ring:
+    kvstore:
+      store: inmemory
+
+schema_config:
+  configs:
+    - from: 2024-01-01
+      store: tsdb
+      object_store: filesystem
+      schema: v13
+      index:
+        prefix: index_
+        period: 24h
+
+ruler:
+  storage:
+    type: local
+    local:
+      directory: /loki/rules
+  rule_path: /tmp/loki/rules-temp
+
diff --git a/harness/observability/prometheus.yml b/harness/observability/prometheus.yml
new file mode 100644
index 0000000..ab667ad
--- /dev/null
+++ b/harness/observability/prometheus.yml
@@ -0,0 +1,18 @@
+global:
+  scrape_interval: 15s
+
+scrape_configs:
+  - job_name: prometheus
+    static_configs:
+      - targets:
+          - localhost:9090
+
+  - job_name: backend
+    metrics_path: /actuator/prometheus
+    static_configs:
+      - targets:
+          - host.docker.internal:9090
+        labels:
+          repo: git-ranker
+          service: backend
+          environment: local
diff --git a/harness/observability/promtail-config.yml b/harness/observability/promtail-config.yml
new file mode 100644
index 0000000..f52d9d3
--- /dev/null
+++ b/harness/observability/promtail-config.yml
@@ -0,0 +1,21 @@
+server:
+  http_listen_port: 9080
+  grpc_listen_port: 0
+
+positions:
+  filename: /tmp/positions.yml
+
+clients:
+  - url: http://loki:3100/loki/api/v1/push
+
+scrape_configs:
+  - job_name: task-logs
+    static_configs:
+      - targets:
+          - localhost
+        labels:
+          job: task-logs
+          task_slug: ${TASK_SLUG}
+          environment: local
+          __path__: ${TASK_LOG_DIR}/*.log
+
diff --git a/harness/observability/tempo.yml b/harness/observability/tempo.yml
new file mode 100644
index 0000000..bcdcd11
--- /dev/null
+++ b/harness/observability/tempo.yml
@@ -0,0 +1,33 @@
+server:
+  http_listen_port: 3200
+
+distributor:
+  receivers:
+    otlp:
+      protocols:
+        grpc:
+          endpoint: 0.0.0.0:4317
+        http:
+          endpoint: 0.0.0.0:4318
+
+storage:
+  trace:
+    backend: local
+    local:
+      path: /var/tempo/traces
+
+compactor:
+  compaction:
+    block_retention: 24h
+
+metrics_generator:
+  storage:
+    path: /var/tempo/generator
+
+overrides:
+  defaults:
+    metrics_generator:
+      processors:
+        - service-graphs
+        - span-metrics
+
diff --git a/harness/task.env.example b/harness/task.env.example
new file mode 100644
index 0000000..60a530f
--- /dev/null
+++ b/harness/task.env.example
@@ -0,0 +1,10 @@
+TASK_SLUG=example-task
+BACKEND_PORT=4000
+FRONTEND_PORT=3000
+LOKI_PORT=13100
+PROMETHEUS_PORT=19090
+TEMPO_PORT=13200
+OTLP_GRPC_PORT=14317
+OTLP_HTTP_PORT=14318
+GRAFANA_PORT=13001
+TASK_LOG_DIR=/var/log/task
diff --git a/scripts/bootstrap-submodules.sh b/scripts/bootstrap-submodules.sh
new file mode 100755
index 0000000..00ad653
--- /dev/null
+++ b/scripts/bootstrap-submodules.sh
@@ -0,0 +1,7 @@
+#!/usr/bin/env sh
+set -eu
+
+git submodule sync --recursive
+git submodule update --init --recursive
+
+printf 'submodule bootstrap complete\n'
diff --git a/scripts/check-submodules.sh b/scripts/check-submodules.sh
new file mode 100755
index 0000000..e337583
--- /dev/null
+++ b/scripts/check-submodules.sh
@@ -0,0 +1,24 @@
+#!/usr/bin/env sh
+set -eu
+
+check_dir() {
+  path="$1"
+
+  if [ ! -d "$path" ]; then
+    printf 'missing directory: %s\n' "$path" >&2
+    return 1
+  fi
+
+  if [ -z "$(ls -A "$path" 2>/dev/null)" ]; then
+    printf 'submodule appears uninitialized: %s\n' "$path" >&2
+    return 2
+  fi
+
+  printf 'submodule looks populated: %s\n' "$path"
+}
+
+status=0
+check_dir git-ranker || status=$?
+check_dir git-ranker-client || status=$?
+exit "$status"
+
diff --git a/scripts/init-task-runtime.sh b/scripts/init-task-runtime.sh
new file mode 100755
index 0000000..592f47a
--- /dev/null
+++ b/scripts/init-task-runtime.sh
@@ -0,0 +1,28 @@
+#!/usr/bin/env sh
+set -eu
+
+if [ "$#" -ne 1 ]; then
+  printf 'usage: %s <task-slug>\n' "$0" >&2
+  exit 1
+fi
+
+slug="$1"
+runtime_dir=".runtime/$slug"
+env_file="$runtime_dir/task.env"
+
+mkdir -p "$runtime_dir/logs"
+mkdir -p "$runtime_dir/metrics"
+mkdir -p "$runtime_dir/traces"
+mkdir -p "$runtime_dir/screenshots"
+mkdir -p "$runtime_dir/videos"
+mkdir -p "$runtime_dir/playwright"
+mkdir -p "$runtime_dir/observability"
+mkdir -p ".worktrees/backend"
+mkdir -p ".worktrees/frontend"
+
+if [ ! -f "$env_file" ]; then
+  sed "s/^TASK_SLUG=.*/TASK_SLUG=$slug/" harness/task.env.example > "$env_file"
+fi
+
+printf 'initialized task runtime: %s\n' "$runtime_dir"
+printf 'environment file: %s\n' "$env_file"
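`scripts/init-task-runtime.sh` interpolates the slug into a `sed` replacement and into directory paths, so a slug containing `/` or `&` would likely break the substitution or write outside `.runtime/`. One possible guard is sketched below. `is_valid_slug` is a hypothetical helper and the allowed character set is an assumption, not a rule stated in this repository.

```shell
# Hypothetical helper (not in the repo): accept slugs made only of the
# characters a-z, 0-9, and hyphen that do not start with a hyphen.
# Range matching follows the current locale.
is_valid_slug() {
  case "$1" in
    ''|-*|*[!a-z0-9-]*) return 1 ;;
    *) return 0 ;;
  esac
}
```

The script could call `is_valid_slug "$1"` right after the argument-count check and exit with the usage message on failure.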
diff --git a/scripts/new-exec-plan.sh b/scripts/new-exec-plan.sh
new file mode 100755
index 0000000..d830b80
--- /dev/null
+++ b/scripts/new-exec-plan.sh
@@ -0,0 +1,20 @@
+#!/usr/bin/env sh
+set -eu
+
+if [ "$#" -ne 1 ]; then
+  printf 'usage: %s <slug>\n' "$0" >&2
+  exit 1
+fi
+
+slug="$1"
+date_prefix="$(date +%F)"
+target="docs/exec-plans/active/${date_prefix}-${slug}.md"
+
+if [ -e "$target" ]; then
+  printf 'plan already exists: %s\n' "$target" >&2
+  exit 1
+fi
+
+cp docs/exec-plans/_template.md "$target"
+printf 'created %s\n' "$target"
+
diff --git a/scripts/validate-knowledge-store.sh b/scripts/validate-knowledge-store.sh
new file mode 100755
index 0000000..4a229b9
--- /dev/null
+++ b/scripts/validate-knowledge-store.sh
@@ -0,0 +1,53 @@
+#!/usr/bin/env sh
+set -eu
+
+required_files='
+AGENTS.md
+ARCHITECTURE.md
+PLANS.md
+harness/README.md
+harness/task.env.example
+harness/observability/docker-compose.yml
+scripts/bootstrap-submodules.sh
+docs/index.md
+docs/DESIGN.md
+docs/FRONTEND.md
+docs/BACKEND.md
+docs/PRODUCT_SENSE.md
+docs/QUALITY_SCORE.md
+docs/RELIABILITY.md
+docs/SECURITY.md
+docs/design-docs/index.md
+docs/product-specs/index.md
+docs/exec-plans/README.md
+docs/exec-plans/_template.md
+docs/generated/README.md
+docs/workflows/feature-delivery-loop.md
+docs/workflows/qa-feedback-loop.md
+docs/workflows/local-observability-stack.md
+'
+
+missing=0
+
+for file in $required_files; do
+  if [ ! -f "$file" ]; then
+    printf 'missing required file: %s\n' "$file" >&2
+    missing=1
+  fi
+done
+
+if [ ! -d "docs/exec-plans/active" ]; then
+  printf 'missing required directory: docs/exec-plans/active\n' >&2
+  missing=1
+fi
+
+if [ ! -d "docs/exec-plans/completed" ]; then
+  printf 'missing required directory: docs/exec-plans/completed\n' >&2
+  missing=1
+fi
+
+if [ "$missing" -ne 0 ]; then
+  exit 1
+fi
+
+printf 'knowledge store validation passed\n'
diff --git a/scripts/verify-workflow.sh b/scripts/verify-workflow.sh
new file mode 100755
index 0000000..79b61e7
--- /dev/null
+++ b/scripts/verify-workflow.sh
@@ -0,0 +1,11 @@
+#!/usr/bin/env sh
+set -eu
+
+./scripts/validate-knowledge-store.sh
+
+if ./scripts/check-submodules.sh; then
+  printf 'workflow verification passed\n'
+else
+  printf 'workflow verification is partial: submodules are missing or not initialized\n' >&2
+  exit 2
+fi