Forge Rules (Machine Reference)

Compact reference for AI coding agents. Every rule includes the problem it solves. For full methodology, see forge_process.md. For the narrative origin story, see ../docs/forge_coding.md.


What This Is

Forge is specification-driven development for AI-assisted projects. Documentation is the primary artifact. The spec must be precise enough that a cold AI session (no prior context) produces a correct implementation without guessing.

Core doctrine: Any specification gap that forces an AI agent to guess becomes a defect baked into the output.

Generality: Forge applies to any spec-to-output pipeline -- software, policy, legal, compliance. The output type doesn't change the methodology.


Document Hierarchy

When documents conflict, follow this priority:

Constitution (immutable architectural laws)
  > Conventions (implementation rules)
    > Architecture Domain Docs (domain specifications)
      > Engineering Plan (build sequence and phases)

Higher-priority documents override lower.
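
As a minimal sketch of the priority rule above, conflicting statements can be resolved by picking the one from the highest-ranked document. The `DocLevel` enum, the `resolve_conflict` helper, and the sample statements are hypothetical names used for illustration; they are not part of Forge itself.

```python
from enum import IntEnum


class DocLevel(IntEnum):
    # Lower value = higher priority, mirroring the hierarchy above.
    CONSTITUTION = 0
    CONVENTIONS = 1
    DOMAIN_DOC = 2
    ENGINEERING_PLAN = 3


def resolve_conflict(statements):
    """Return the (level, text) pair from the highest-priority document.

    `statements` is a list of (DocLevel, text) pairs that disagree.
    """
    if not statements:
        raise ValueError("no statements to resolve")
    return min(statements, key=lambda pair: pair[0])


# Hypothetical conflict: the plan contradicts the conventions doc.
level, text = resolve_conflict([
    (DocLevel.ENGINEERING_PLAN, "Phase 2 adds a gRPC endpoint"),
    (DocLevel.CONVENTIONS, "UDS framing only; no gRPC"),
])
print(level.name, "wins:", text)
```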


Vocabulary Precision

Problem: Vague language produces wide probability distributions in LLM output. The agent fills ambiguity with training data defaults.

Fix: Use mechanism-focused terms that name the property and the mechanism, not just the outcome.

| Vague (wide distribution) | Precise (narrow distribution) |
|---|---|
| "separate things so they don't affect each other" | "fault containment via process isolation" |
| "it should be fast" | "hot-path latency budget: sub-millisecond" |
| "keep it safe" | "capability-based security at the runtime boundary" |
| "the system handles different sources" | "source-agnostic pipeline" |

Reference: See ../reference/ambiguous_language_dictionary.md for a comprehensive catalog of probabilistically wide language patterns to avoid in binding contexts.


Decision Tracking

Every architecture domain doc maintains two sections:

Decided -- Numbered entries. Each has: title, description, rationale (WHY block where alternatives exist), session reference.

Open Questions -- Numbered entries with status annotations:

  • Discovery: Phase N -- will be answered during implementation
  • Post-V1 -- explicitly deferred
  • RESOLVED: Decision N -- answered, cross-referenced to the decision
  • Blocked on: [dependency] -- waiting for another decision

Open questions are not defects. Unmarked ambiguity is.
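
One way to make "unmarked ambiguity" mechanically detectable is to check that every Open Questions entry carries one of the four status annotations listed above. The sketch below assumes the annotation is available as a plain string in exactly the forms shown; the `classify_status` helper and the pattern names are hypothetical.

```python
import re

# One pattern per status annotation listed above (exact wording assumed).
STATUS_PATTERNS = {
    "discovery": re.compile(r"^Discovery: Phase \d+$"),
    "post_v1": re.compile(r"^Post-V1$"),
    "resolved": re.compile(r"^RESOLVED: Decision \d+$"),
    "blocked": re.compile(r"^Blocked on: .+$"),
}


def classify_status(annotation):
    """Return the status name, or None if the annotation is unrecognized."""
    text = annotation.strip()
    for name, pattern in STATUS_PATTERNS.items():
        if pattern.match(text):
            return name
    return None


# An entry whose status is None is unmarked ambiguity and should be reported.
for sample in ["Discovery: Phase 3", "RESOLVED: Decision 12", "TBD"]:
    print(sample, "->", classify_status(sample))
```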


Inline Ambiguity Markers

Problem: An agent reads a paragraph that sounds definitive but contains an unresolved assumption.

Fix: [OQ-N] markers in the spec body link to Open Questions. The marker interrupts assumption formation at the exact point of ambiguity.
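
A marker only helps if it points at a tracked question. The following sketch, assuming markers have the literal form `[OQ-N]` with a decimal `N`, lists markers in a spec body that have no matching Open Questions entry. The `dangling_markers` function and the sample text are hypothetical.

```python
import re

# Matches inline markers such as [OQ-3] and captures the number.
OQ_MARKER = re.compile(r"\[OQ-(\d+)\]")


def dangling_markers(spec_body, tracked_ids):
    """Return sorted marker numbers that have no tracked Open Question."""
    found = {int(number) for number in OQ_MARKER.findall(spec_body)}
    return sorted(found - set(tracked_ids))


body = "Retries use exponential backoff [OQ-3]. Queue depth is bounded [OQ-7]."
print(dangling_markers(body, tracked_ids={3}))
```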


WHY Blocks

Problem: Agents optimize. Without rationale, they "improve" the design by removing constraints they don't understand.

Fix: WHY blocks explain why a decision was made and why alternatives were rejected.

Coverage rule: Required where a reasonable agent might choose differently. Self-evident decisions don't need WHY blocks.

Feedback loop: Agent guesses wrong -> FINDINGS.md captures the guess -> postmortem reviews -> wrong guess becomes a WHY block or TONIC entry -> next agent doesn't guess on that one.


TONIC Errors

Problem: TONIC stands for Technically Obvious, Not Intended Choice. The agent picks the ecosystem default instead of the project's intentional non-default choice.

Fix: State what to use AND what NOT to use, with why. The conventions doc carries a TONIC prevention table:

| Use | Do NOT Use | Why |
|---|---|---|
| prost (raw protobuf) | tonic (gRPC framework) | No gRPC on this side; UDS framing only |
| chi router | gin | chi is stdlib-compatible; gin uses custom context |
| Direct API calls | LangChain | Scoped role doesn't need orchestration framework |
| CatBoost | XGBoost | Native categorical feature handling without encoding |

Each entry predicts an exact mistake and pre-corrects it. The TONIC table grows organically: you start with the choices you know will trip agents, cold validation runs surface more, cold code runs surface even more. In practice it ends up with 15-25 entries.

The pattern: whenever (1) a reasonable ecosystem default exists, (2) the project made a specific non-default choice, and (3) the docs don't explicitly exclude the default, an agent will gravitate toward the default. Explicitly forbid it.
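
The table can also be enforced mechanically. As a rough sketch, the example entries above are encoded as a lookup from the forbidden name to its sanctioned replacement, and a dependency list is checked against it. Matching on a lowercase package name is an assumption made for illustration, and `tonic_violations` is a hypothetical helper.

```python
# Forbidden name -> (sanctioned replacement, reason), copied from the table above.
TONIC_TABLE = {
    "tonic": ("prost (raw protobuf)", "No gRPC on this side; UDS framing only"),
    "gin": ("chi router", "chi is stdlib-compatible; gin uses custom context"),
    "langchain": ("direct API calls", "Scoped role doesn't need orchestration framework"),
    "xgboost": ("CatBoost", "Native categorical feature handling without encoding"),
}


def tonic_violations(dependencies):
    """Return one message per dependency that the TONIC table forbids."""
    violations = []
    for dep in dependencies:
        entry = TONIC_TABLE.get(dep.strip().lower())
        if entry is not None:
            use, why = entry
            violations.append(f"{dep}: use {use} instead ({why})")
    return violations


for message in tonic_violations(["chi", "gin", "prost"]):
    print(message)
```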


The Project Constitution

Short document (~10-15 articles) of immutable architectural laws. Different from conventions (which tell HOW to write code) and WHY blocks (which explain individual decisions). The constitution tells the agent what the system CANNOT do.

Test: If violating this principle would require redesigning multiple subsystems, it's constitutional.

The constitution is derived, not constructed upfront. It crystallizes when you notice certain principles are load-bearing across multiple domains. You discover your immutable laws; you don't invent them on day one.

When an immutable law has a legitimate exception, document it inline with the article. Don't bury exceptions in separate files.


The Conventions Document

Locks implementation choices to prevent inconsistency across models and sessions:

  • Canonical dependencies with explicit prohibition of alternatives
  • Language-specific conventions (naming, error handling, module structure)
  • Contested language idioms with WHY blocks for the position taken
  • Cross-language conventions (IPC format, timestamps, error taxonomy)
  • Terminology enforcement (correct terms mapped to forbidden synonyms)

Artifact Ordering

Artifacts trail the thinking. Domain docs emerge first from design discussions. Conventions solidify as patterns emerge. The glossary locks down as terms settle. The constitution crystallizes last, when load-bearing principles become apparent. You don't define your terms then ideate, and you don't write a constitution then design. You ideate, and the artifacts follow.


Cascading Consistency

Problem: A decision in one domain has implications across others. One stale cross-reference can cause a module to be built against wrong assumptions.

Fix: After every change, propagate across all cross-references. Three levels:

  • Level 1: Manual (5 domains) -- identify and update affected domains
  • Level 2: Dependency Matrix (10 domains) -- inline cascade tags, structured tracking
  • Level 3: Graph (15+ domains) -- Neo4j-backed deterministic dependency tracking (see ../tooling/graph/)
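
To illustrate what propagation means at any of these levels, the sketch below walks a cross-reference map breadth-first and lists every domain that must be re-checked after a change. The map format, the domain names, and the `affected_domains` helper are hypothetical; the Neo4j-backed tooling is not shown here.

```python
from collections import deque


def affected_domains(changed, referenced_by):
    """Return all domains that directly or transitively reference `changed`.

    `referenced_by` maps a domain to the domains whose docs cross-reference it.
    """
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in referenced_by.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    seen.discard(changed)
    return sorted(seen)


# Hypothetical cross-reference map for a small project.
cross_refs = {
    "ipc": ["ingest", "runtime"],
    "ingest": ["storage"],
    "storage": ["ingest"],  # cycles are tolerated
}
print(affected_domains("ipc", cross_refs))
```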

The Validation Pipeline

PHASE 1: IDEATE        Discuss, explore, capture verbatim. User says "Forge it" to
                       trigger formalization of the current ideation state.
PHASE 2: FORMALIZE     Domain docs, conventions, glossary, constitution, engineering plan.
PHASE 3: STRUCTURAL    Graph analysis + dictionary lint. Fix mechanically detectable issues.
                       (Optional tooling -- see ../tooling/ and ../reference/)
PHASE 4: COLD VALIDATE Fresh session reads docs. Produces disposable plan. Surfaces questions.
                       Hot session answers + fixes docs. Repeat until questions dry up.
                       Run again with alternate model.
                       (See cold_validation_protocol.md)
PHASE 5: PLAN CHECK    Evaluate disposable plan against architecture docs. Throw plan away.
PHASE 6: COLD CODE     Fresh session builds from spec. Code is disposable validation probe.
                       Evaluate adherence. Fix docs. Run with alternate model.
                       (See cold_code_run_protocol.md)
PHASE 7: SIGNAL        Convergence across models. No more spec-driven deviations.
PHASE 8: IMPLEMENT     Code generation is translation, not design. FINDINGS.md captures
                       discoveries. Spec drift reconciliation over time.

Cold Validation Rules

A cold question is proof of a doc defect. If the cold session asked, the docs failed to communicate. The hot session must answer AND fix the doc.

Hot session resistance. The hot session will claim questions "don't need a doc fix." It's wrong. The hot session has context the docs don't carry. If the cold session asked, the docs are ambiguous.

The disposable plan is a forcing function. Producing a plan forces the model to confront every ambiguity. The questions are the actual output. The plan is a probe.

Open questions don't affect review quality. A tracked OQ with a status is the opposite of ambiguity -- it's the spec being honest about what it doesn't know yet. Some decisions can't be made until implementation.

See cold_validation_protocol.md for the full protocol.


Semantic Review

Problem: Structural validation catches relationship problems. It doesn't catch linguistic ambiguity -- "should" in a Decided entry, "handle gracefully" without defining what graceful means.

Fix: A semantic review pass scans binding contexts for probabilistically wide language. Flag and report, don't auto-fix -- context determines whether usage is genuinely ambiguous.

Run before cold validation. Different models flag different patterns.

See semantic_review.md for the protocol. See ../reference/ambiguous_language_dictionary.md for the detection vocabulary.
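
A flag-and-report pass can be sketched as a simple scan. The two terms below come from the examples in this section; a real pass would presumably load its vocabulary from the dictionary file, whose format is not assumed here. The `flag_wide_language` helper is hypothetical, and it deliberately reports findings without rewriting anything.

```python
import re

# Example terms taken from this section; not the full detection vocabulary.
WIDE_TERMS = ["should", "handle gracefully"]


def flag_wide_language(lines):
    """Return (line_number, term) pairs for each wide term found. No auto-fix."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for term in WIDE_TERMS:
            if re.search(rf"\b{re.escape(term)}\b", line, re.IGNORECASE):
                findings.append((number, term))
    return findings


decided = [
    "Decided 4: The router should retry failed sends.",
    "Decided 5: Send timeout is 50 ms.",
    "Decided 6: Handle gracefully any malformed frame.",
]
for number, term in flag_wide_language(decided):
    print(f"line {number}: review use of '{term}'")
```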


Implementation Discipline

Phase Sequencing

Complete Phase N before starting Phase N+1.

Exit Criteria

Every phase has verifiable exit criteria: specific files, passing tests, integration points, scope boundaries.

FINDINGS.md

Every spec ambiguity, contradiction, and guess encountered during implementation is documented here. The postmortem reviews FINDINGS.md and feeds fixes back into the spec.

Spec Drift Reconciliation

When code-driven changes accumulate to a threshold (a release, a milestone, or when the spec and code no longer agree), batch-assess against the spec. The spec was written for machines to read; machines can check it.


Edge Case Analysis

After the architecture is substantively complete and cross-domain dependencies are mapped:

  1. Query interaction surfaces between domains
  2. For each surface: what goes wrong when timing, resources, state, or data are unexpected?
  3. Classify each edge case:

| Level | When | Runtime Cost |
|---|---|---|
| CODE | Common (>1%), predictable, auto-recoverable | On the critical path |
| DEGRADE | Uncommon (<1%), detectable, partially recoverable | Triggered by threshold |
| ALERT | Rare, detectable, needs human intervention | Zero until triggered |
| ACCEPT | Theoretical, impractical to prevent | Zero |

CODE-level cases become exit criteria. Don't code around ACCEPT-level cases when doing so adds measurable hot-path latency for scenarios that occur less than once per million operations.
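
A small sketch of how a classified catalog could feed the exit criteria: each edge case records its interaction surface and level, and only CODE-level cases are promoted. The `EdgeCase` record, the `exit_criteria` helper, and the sample cases are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    CODE = "CODE"
    DEGRADE = "DEGRADE"
    ALERT = "ALERT"
    ACCEPT = "ACCEPT"


@dataclass(frozen=True)
class EdgeCase:
    surface: str       # interaction surface between two domains
    description: str   # what goes wrong
    level: Level       # classification from the table above


def exit_criteria(cases):
    """Return a checklist line for every CODE-level edge case."""
    return [f"{c.surface}: {c.description}" for c in cases if c.level is Level.CODE]


catalog = [
    EdgeCase("ingest/storage", "write arrives before schema load", Level.CODE),
    EdgeCase("runtime/ipc", "peer socket disappears mid-frame", Level.DEGRADE),
    EdgeCase("storage/disk", "simultaneous failure of all replicas", Level.ACCEPT),
]
for line in exit_criteria(catalog):
    print(line)
```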


Anti-Patterns

Methodology-Level

  • Silent guessing: Document every assumption in FINDINGS.md
  • Building all phases at once: Complete Phase N before starting Phase N+1
  • Skipping cascade updates: One stale cross-reference can misalign an entire module
  • Declaring "good enough" prematurely: The most common failure mode, especially solo

Common Project-Level (put these in your conventions doc)

  • Blanket lint suppression: Override specific lints with reasons, never suppress all
  • Hardcoded strings: Use localization keys
  • Delta metrics: Use absolute values
  • Bare timestamps: Name the event: created_at, not timestamp

Readiness Criteria

Documentation is ready when:

  1. Cold sessions stop asking basic questions and start asking edge cases
  2. Cross-model implementations converge on the same architecture
  3. Divergence points to genuinely open decisions, not spec gaps
  4. Code artifacts generate without drift from the spec
  5. Cascading consistency is verified
  6. TONIC errors are eliminated
  7. Graph validation passes clean (if applicable)
  8. Semantic grooming is current (if applicable)
  9. Edge cases are cataloged and classified