feat: adopt harness engineering practices for agent-first development #3

@logosc

Overview

Adopt key practices from OpenAI's harness engineering approach to make the repo more agent-friendly. The codebase already has solid docs (docs/design.md, docs/plans/) and comprehensive tests — but lacks the glue that lets agents (Codex, Claude Code, etc.) self-orient and self-validate.

Tasks

1. Add CLAUDE.md as the sole agent instruction file (~100 lines)

The core insight from the article: treat the agent instruction file as a map, not an encyclopedia. A short root-level file that points to deeper sources of truth.

Note: No AGENTS.md exists in this repo. CLAUDE.md is the only agent instruction file — it serves as both the navigational map and the authoritative quick-reference. Keep it tool-agnostic (useful for Codex, Claude Code, Cursor, etc.) and link out to docs/ for anything longer than a paragraph to avoid frequent churn.

Should include:

  • Package layout and key interfaces (LLMProvider, Tool[S], ChatInterface)
  • Dependency layering rule (see task 3 for the full matrix)
  • How to run tests: go test -race ./... for Go, and cd debug/frontend && npm ci && npx tsc --noEmit for the frontend type check
  • Go version: must match go.mod (currently 1.25.6)
  • Key invariants (Message.ImageData is raw bytes, providers are stateless, etc.)
  • Pointers to docs/design.md and docs/plans/
  • Feedback loop protocol (see task 6)

2. Add CI pipeline (.github/workflows/ci.yml)

Agents need a fast feedback loop.

on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with: { go-version-file: go.mod }
      - run: go vet ./...
      - run: go test -race ./...

CI Go version must come from go.mod (currently 1.25.6) — use go-version-file: go.mod rather than hardcoding a version string to avoid drift.

TypeScript check: Keep out of CI for now — npx tsc --noEmit requires npm ci and a Node setup step, adding complexity for a small frontend. Document it in CLAUDE.md pre-submit checks instead. Can be added to CI later if the frontend grows.

3. Add TestDependencyLayers structural test

Mechanically enforce the dependency layering invariant (the article's "enforce invariants, not implementations" pattern).

Explicit allowed/forbidden import matrix:

| Source file(s) | May import from root package? | May import concrete providers? | Notes |
| --- | --- | --- | --- |
| provider.go | No (defines core types) | No | Zero intra-package deps |
| tool.go, chat.go | Only provider.go types | No | |
| agent.go (engine) | provider.go, tool.go, chat.go types | No (anthropic.go, gemini.go, openai.go) | Engine must stay provider-agnostic |
| anthropic.go, gemini.go, openai.go | Core types only | No cross-provider imports | Each provider is self-contained |
| debug/ | May import root package | No | |
| Root package | Must not import debug/ | | |

Since this is a single flat package (not sub-packages), "imports" here means references to functions and types, not Go import paths. The test should use go/ast to parse each file and verify that, e.g., agent.go never references AnthropicProvider, GeminiProvider, or OpenAICompatibleProvider.

This test should run as part of go test ./... (no build tags) so CI catches violations automatically.
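
A minimal sketch of the identifier check described above, assuming a name-based scan is enough (it does not resolve types, so a local variable that happens to share a forbidden name would also be flagged). The helper name forbiddenRefs, the package name agent, and the sample source are hypothetical; a real TestDependencyLayers would read agent.go from disk and call t.Errorf for each hit.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
)

// forbiddenRefs parses Go source and returns the sorted, de-duplicated
// names from forbidden that appear as identifiers in the source.
// Comments are not identifiers, so mentions in comments are ignored.
func forbiddenRefs(filename, src string, forbidden []string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	banned := make(map[string]bool, len(forbidden))
	for _, name := range forbidden {
		banned[name] = true
	}
	seen := map[string]bool{}
	ast.Inspect(file, func(n ast.Node) bool {
		if id, ok := n.(*ast.Ident); ok && banned[id.Name] {
			seen[id.Name] = true
		}
		return true
	})
	hits := make([]string, 0, len(seen))
	for name := range seen {
		hits = append(hits, name)
	}
	sort.Strings(hits)
	return hits, nil
}

func main() {
	// Hypothetical stand-in for agent.go that violates the layering rule.
	src := `package agent

type Engine struct{ p LLMProvider }

func bad() any { return &GeminiProvider{} }
`
	providers := []string{"AnthropicProvider", "GeminiProvider", "OpenAICompatibleProvider"}
	hits, err := forbiddenRefs("agent.go", src, providers)
	if err != nil {
		panic(err)
	}
	fmt.Println(hits) // prints [GeminiProvider]
}
```

The same helper could be reused for the other rows of the matrix by passing a different file and forbidden list per row.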

4. Add pre-submit checks to CLAUDE.md

A machine-readable checklist in CLAUDE.md (not a separate file) that any agent or human runs before committing:

## Pre-submit checks
1. go vet ./...
2. go test -race ./...
3. cd debug/frontend && npx tsc --noEmit  (requires: npm ci)

Optionally wire up steps 1-2 as a git pre-commit hook later.

5. Add entropy management: doc-drift detection

The article describes agents that run periodically to find inconsistencies in documentation and constraint violations. Lightweight version for this repo:

Add a TestDocDrift test (runs in go test ./...) that validates CLAUDE.md and docs/design.md stay in sync with reality:

  • Every public interface mentioned in CLAUDE.md actually exists in code
  • Every provider listed in the dependency matrix has a corresponding *_test.go file
  • The Go version stated in CLAUDE.md matches go.mod

This catches the slow rot where docs describe a codebase that no longer exists. Keeping it as a Go test means CI enforces it automatically — no scheduled jobs or extra infrastructure.
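
Two of the checks above can be sketched as small helpers. This is an illustration under assumptions, not the final test: the names goVersion and missingTypes, the module path, and the sample inputs are hypothetical, the list of interface names is passed in rather than extracted from CLAUDE.md, and the version check is a naive substring match.

```go
package main

import (
	"bufio"
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// goVersion returns the version from the "go" directive in go.mod
// content, or "" if no such directive is found.
func goVersion(goMod string) string {
	sc := bufio.NewScanner(strings.NewReader(goMod))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "go" {
			return fields[1]
		}
	}
	return ""
}

// missingTypes returns the names from wanted that are not declared as
// top-level types in src. A generic type such as Tool[S] is matched by
// its bare name, Tool.
func missingTypes(src string, wanted []string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	declared := map[string]bool{}
	for _, decl := range file.Decls {
		gen, ok := decl.(*ast.GenDecl)
		if !ok || gen.Tok != token.TYPE {
			continue
		}
		for _, spec := range gen.Specs {
			if ts, ok := spec.(*ast.TypeSpec); ok {
				declared[ts.Name.Name] = true
			}
		}
	}
	var missing []string
	for _, name := range wanted {
		if !declared[name] {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

func main() {
	// Hypothetical inputs; a real TestDocDrift would read these files from disk.
	goMod := "module example.com/hypothetical\n\ngo 1.25.6\n"
	claude := "Go version: must match go.mod (currently 1.25.6)\n"
	src := "package agent\n\ntype LLMProvider interface{}\n\ntype Tool[S any] interface{}\n"

	v := goVersion(goMod)
	fmt.Println("version documented:", v != "" && strings.Contains(claude, v)) // prints version documented: true

	missing, err := missingTypes(src, []string{"LLMProvider", "ChatInterface", "Tool"})
	if err != nil {
		panic(err)
	}
	fmt.Println("missing types:", missing) // prints missing types: [ChatInterface]
}
```

In the actual test, missingTypes would need to aggregate declarations across every .go file in the root package before reporting a name as missing.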

6. Add "agent struggles = missing context" feedback protocol

The article's central mental model: when an agent produces bad output, treat it as a signal that something is missing (docs, guardrails, tools) and feed it back into the repo — don't just fix the output.

Add a section to CLAUDE.md:

## Feeding back agent failures

When an agent (or a human following this guide) makes a repeated mistake:
1. Identify what was missing — unclear invariant? undocumented convention? missing test?
2. Fix the root cause in this file, docs/, or tests — not just the generated code.
3. If a new invariant emerges, add it to TestDependencyLayers or TestDocDrift.

The goal: every class of mistake only happens once.

This is a process practice, not code — but encoding it in the instruction file means agents internalize it too.

Future directions (not in scope)

  • Dynamic context providers (observability data, runtime state): the article emphasizes these, but they matter mainly at larger scale
  • Full development loop encoding (PR templates, review checklists, automated feedback/recovery) — worth revisiting once the basics are in place
  • Custom linter framework beyond go vet (repo is small, not worth the config overhead)
  • ArchUnit-style dependency-checking library (the structural test covers it without external deps)
  • Separate AGENTS.md / CHECKS.md files (everything lives in CLAUDE.md to avoid split guidance)
