
Harness Engineering Foundations

Harness engineering is the discipline of designing environments, repo knowledge, and feedback loops so an AI agent can execute work reliably. The work shifts from "write the code by hand" to "make the intended behavior legible, verifiable, and recoverable."

This repository is organized around three pillars: Legibility (navigation), Autonomy (verification and safe operational CLI access), and the Human-AI Interface (the bidirectional channel between humans and the system, mediated by the agent: inbound to refine requests, outbound to interpret operational signals). The six outcomes in docs/outcomes.md are the concrete proof model.

Principle 1: Legibility — Make The Codebase AI-Legible

If the agent cannot discover the structure, purpose, and boundaries of the repository from in-repo artifacts, it will guess. Durable repo knowledge matters more than chat explanations.

What Legibility Requires

  • a short root AGENTS.md that acts as a map, not an encyclopedia
  • subdirectory AGENTS.md files for complex or high-risk modules
  • references and examples that live in-repo rather than in people's heads
  • explicit UX intent for modules that affect user-visible behavior

Progressive Disclosure

Structure context in layers:

  • Root AGENTS.md: project overview, commands, directory map, conventions
  • Subdirectory AGENTS.md: local purpose, UX intent, key files, gotchas
  • Reference docs: deeper examples, history, schemas, and guides
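As one illustration of this layering, a root AGENTS.md might look like the sketch below. All commands, paths, and section names here are hypothetical, not defined by this repo:

```markdown
# AGENTS.md

## Overview
One paragraph: what this project does and who uses it.

## Commands
- `make test`: run the fast unit suite (hypothetical command)
- `make smoke`: run the one trusted, non-destructive smoke path

## Directory Map
- `src/service/`: business logic; see `src/service/AGENTS.md`
- `src/ui/`: user-facing views; depends on service, never the reverse

## Conventions
- Extend existing modules before creating new files.
- Update this file whenever structure or key commands change.
```

The point is the shape, not the content: a map short enough to read in full, with pointers down into subdirectory files for detail.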

Maintenance Rule

Any change to structure, key commands, conventions, or module boundaries should update the relevant AGENTS.md files. Stale guidance is worse than missing guidance because it teaches the agent the wrong thing confidently.

Principle 2: Autonomy — Make Self-Verification Possible

An agent that can only write code is not enough. It needs trusted ways to tell whether the repository is working right now, whether a change passed the right tests, and whether a runtime path is meaningfully usable. Autonomy here includes tests, smoke paths, and, where applicable, operational CLIs used for evidence; it does not require full parity with production.

What Self-Verification Requires

  • baseline audits that describe the current state before remediation
  • fast targeted test execution for the code that changed
  • isolated or safe environments for verification
  • one trusted, non-destructive smoke path
  • explicit approval boundaries for higher-risk actions
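A minimal smoke runner along these lines could look like the sketch below. The steps shown are placeholders; a real harness would take its steps from the repo's own AGENTS.md, and every step must stay read-only:

```python
import subprocess
import sys

# Each step is (name, argv). Every step must be read-only and non-destructive.
SMOKE_STEPS = [
    ("interpreter boots", [sys.executable, "-c", "pass"]),
    ("stdlib imports resolve", [sys.executable, "-c", "import json, pathlib"]),
]

def run_smoke_path(steps=SMOKE_STEPS, timeout=60):
    """Run each step and return pass/fail evidence the agent can cite."""
    results = {}
    for name, argv in steps:
        proc = subprocess.run(argv, capture_output=True, timeout=timeout)
        results[name] = (proc.returncode == 0)
    return results
```

Because the runner returns named results rather than a bare exit code, the agent can cite exactly which step failed when it reports back.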

The Required Proof Loops

The user should be able to point to concrete evidence for each north-star outcome. Names, proof criteria, and primary skills are defined only in docs/outcomes.md; in recommended order they are: Validate Current State, Navigate, Self-Test, Smoke Path, Bug Reproduction, and SRE Investigation.

Agent Authorization Tiers

Document three permission tiers in the root AGENTS.md:

  1. Autonomous: safe read, test, and local verification steps
  2. Supervised: actions that require review before they take effect
  3. Restricted: actions that require explicit approval every time

Never leave these boundaries implicit.
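One way to make the tiers operational rather than implicit is a small gate in tooling. The command lists below are invented for illustration; a real mapping would be maintained alongside the root AGENTS.md:

```python
# Hypothetical mapping from command prefixes to permission tiers.
TIERS = {
    "autonomous": ["git status", "pytest", "grep"],
    "supervised": ["git push", "npm publish --dry-run"],
    "restricted": ["terraform apply", "kubectl delete"],
}

def tier_of(command: str) -> str:
    """Return the tier whose prefixes match; unknown commands stay restricted."""
    for tier in ("autonomous", "supervised", "restricted"):
        if any(command.startswith(prefix) for prefix in TIERS[tier]):
            return tier
    # Defaulting unknown commands to the most cautious tier keeps the
    # boundary explicit even for actions nobody anticipated.
    return "restricted"
```

Defaulting to "restricted" is the design choice that matters: an unlisted action should never be silently autonomous.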

Principle 3: Human-AI Interface — The Bidirectional Channel

Legibility and Autonomy let the agent read and verify. The Human-AI Interface pillar is about the agent being a channel between humans and the system — in both directions. A codebase can be perfectly legible and fully self-verifying and still fail its people if vague requests arrive unchallenged or if production pain never surfaces back to the decision-makers.

What The Interface Requires

The Interface pillar compounds after Legibility and Autonomy are in place. Without AGENTS.md context or trustworthy tests/smoke paths, the agent cannot critique a request or interpret a signal with grounded confidence.

Inbound channel — human → system (interface-ticket-writer)

  • Refines vague requests into tickets an agent can one-shot
  • Surfaces missing edge cases, assumptions, success criteria before work starts
  • A workshop pattern, not an autocomplete

Outbound channel — system → human (interface-sre-agent)

  • Reproduce-before-fix discipline for bugs (fixes tie back to evidence the repo can rerun)
  • Reads logs, metrics, traces, CI, and cloud CLIs — surfaces ranked hypotheses with evidence
  • Relies on autonomy-sre-auditor having first proven that the required CLIs work

Why two channels

A system where humans can critique intake but production signals never reach the user through the agent is half-wired. A system where the agent diagnoses alerts but no one is refining the resulting work into executable tickets is also half-wired. Code-mint keeps both channels visible because agent readiness is not only about code execution; it is also about interpretation at the boundaries where humans and systems meet.

Outcome Map

Outcome names, proof criteria, and primary skill mappings are defined in docs/outcomes.md. Track progress and evidence in docs/onboarding-checklist.md. The file .agents/code-mint-status.json provides a machine-readable index of outcome statuses for cross-repo scanning.
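The schema of .agents/code-mint-status.json is defined by the repo, not here. Purely as an illustration, a cross-repo scanner over an assumed `{"outcomes": {name: status}}` shape could look like:

```python
import json
from collections import Counter
from pathlib import Path

def summarize_statuses(payload: dict) -> Counter:
    """Count outcome statuses in one status payload (assumed schema)."""
    return Counter(payload.get("outcomes", {}).values())

def scan_repos(roots):
    """Yield (repo name, status summary) for each repo with a status index."""
    for root in map(Path, roots):
        index = root / ".agents" / "code-mint-status.json"
        if index.is_file():
            yield root.name, summarize_statuses(json.loads(index.read_text()))
```

A machine-readable index earns its keep exactly here: one small script can answer "which repos still lack a smoke path?" across a whole organization.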

Related AIDLC Practices

AWS AI-DLC is a useful external reference, but code-mint does not implement or vendor its full lifecycle. Two ideas are adopted directly because they improve audit quality without changing code-mint's outcome model:

Practice         How code-mint uses it
Adaptive depth   Auditors can run at quick, standard, or deep depth depending on repo age, risk, and recency of prior evidence.
Calibration      Audit reports name confidence, what was not checked, and what would raise confidence.
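As a sketch only, the adaptive-depth idea could be encoded as a small heuristic over those three signals. The thresholds below are invented for illustration and are not part of code-mint:

```python
def choose_audit_depth(repo_age_days: int, risk: str, days_since_evidence: int) -> str:
    """Pick an audit depth from repo age, risk level, and evidence recency.

    All thresholds here are illustrative, not prescribed by this repo.
    """
    # High risk or long-stale evidence always warrants a deep pass.
    if risk == "high" or days_since_evidence > 180:
        return "deep"
    # Young repos with fresh evidence can get a quick pass.
    if repo_age_days < 30 and days_since_evidence <= 30:
        return "quick"
    return "standard"
```

Making the heuristic explicit, even as a few lines, gives auditors something to calibrate against instead of an unstated gut call.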

Workspace heritage (Greenfield, Brownfield, Legacy) is also used during onboarding as a lightweight calibration aid. AIDLC construction workflows, opt-in extension systems, and operations-phase artifacts are intentionally not part of this repository.

Cross-Cutting Standards

These standards keep the repo legible as agent throughput increases.

Dependency Direction

Dependencies should flow one way:

Types → Config → Repository → Service → Runtime → UI

Agents are fast enough to create structural drift quickly. Clear dependency boundaries reduce that risk.
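A structural test can enforce the direction mechanically. This sketch checks a single import edge against the layer order; mapping real modules onto layers would come from the repo itself:

```python
# Layer order mirrors the dependency direction above: left may be imported
# by anything to its right, never the reverse.
LAYERS = ["types", "config", "repository", "service", "runtime", "ui"]

def import_allowed(importer_layer: str, imported_layer: str) -> bool:
    """A layer may import itself or any layer to its left, never to its right."""
    return LAYERS.index(imported_layer) <= LAYERS.index(importer_layer)
```

A test suite would walk the real import graph and assert `import_allowed` for every edge, turning the arrow diagram into a failing build the moment drift appears.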

Extend Before Creating

Before creating a new file, the agent should:

  1. look for an existing home for the logic
  2. check local AGENTS.md guidance
  3. place any new file in the correct module structure

Single Source Of Truth

Keep types, config, business rules, and database access in one clear layer each. Duplicated logic creates contradictions that both humans and agents will keep reinforcing.

Error Handling

  • user-facing errors should be clear and actionable
  • internal failures should include enough context to debug
  • error handling patterns should be consistent and documented
  • swallowed errors and generic catch-all behavior should be treated as drift
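A tiny structural check can promote the "no swallowed errors" rule from prose into tooling. This sketch flags bare or Exception-wide handlers whose body is only `pass`; a real linter would cover more patterns:

```python
import ast

def find_swallowed_errors(source: str):
    """Return line numbers of except handlers that silently discard errors."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            body_is_pass = all(isinstance(stmt, ast.Pass) for stmt in node.body)
            too_broad = node.type is None or (
                isinstance(node.type, ast.Name) and node.type.id == "Exception"
            )
            if body_is_pass and too_broad:
                flagged.append(node.lineno)
    return flagged
```

Wired into CI, a check like this treats every new catch-all as drift the moment it lands, rather than waiting for a human reviewer to notice.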

Mechanical Enforcement

Documentation alone is not enough. The best harnesses gradually promote important guidance into tooling:

  • AGENTS.md for discoverability
  • rules for persistent context
  • linters for naming and architecture constraints
  • structural tests for dependency boundaries and coverage expectations
  • recurring cleanup work that catches drift before it spreads

When prose keeps getting ignored, encode the constraint directly into the system.