kmlaborat/AnchorGen

AnchorGen

Large-scale, agent-driven editing decomposes into three independent problems:

Locate   — determine where to act
Generate — determine what to produce
Apply    — commit the change safely

AnchorGen is the Generate stage. It is a generic transformation engine: given a source string and a task description, it produces a result string — nothing more.

source → transform → result

AnchorGen does not know or care what kind of transformation is being requested. Code editing, translation, structured extraction, and ontology slot-filling are all the same operation from AnchorGen's perspective.


The Three Stages

| Stage | Question | Owner | Examples |
|---|---|---|---|
| Locate | Where do I act? | Orchestrator-specific | FastContext, RAG, Sliding Bisection, scope maps, grep |
| Generate | What do I produce? | AnchorGen (this repo) | Local LLMs, Claude, GPT, FastApply, PLaMo Translate, NuExtract |
| Apply | How do I commit it safely? | AnchorEdit / AnchorScope | Hash-verified anchor replacement |

AnchorGen has no opinion on Locate or Apply. It is composed with them by an orchestrator (e.g. a future pi-anchorgen), but does not depend on either.


Why a Generic Engine?

The same shape — source → task → result — covers:

{ "type": "edit", "instruction": "add a NaN check" }
{ "type": "translate", "from": "ja", "to": "en" }
{ "type": "extract", "schema": { "...": "..." } }
{ "type": "fill_slots", "ontology": { "...": "..." } }

A code edit, a translation, a structured extraction, and an ontology slot-filling operation are not different problems requiring different engines. They are the same problem — produce a result from a source given a task — with different task payloads and different backends.
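The four payloads above differ only in their "type" field and its accompanying data, so a backend can route on that field alone. The following is a minimal sketch under stated assumptions: `Task`, `kind`, and `build_prompt` are hypothetical names that do not appear in AnchorGen, and the enum is a hand-written stand-in for `serde_json::Value` so that the block compiles with only the standard library.

```rust
/// Hypothetical typed view of the task payloads shown above.
/// AnchorGen itself carries the task as an untyped `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
pub enum Task {
    Edit { instruction: String },
    Translate { from: String, to: String },
    Extract { schema: String },
    FillSlots { ontology: String },
}

impl Task {
    /// The string that would appear in the payload's "type" field.
    pub fn kind(&self) -> &'static str {
        match self {
            Task::Edit { .. } => "edit",
            Task::Translate { .. } => "translate",
            Task::Extract { .. } => "extract",
            Task::FillSlots { .. } => "fill_slots",
        }
    }
}

/// Every task kind goes through the same signature: (source, task) -> text.
/// This stand-in only builds the prompt a backend might receive (an assumption;
/// real backends are free to consume the task however they like).
pub fn build_prompt(source: &str, task: &Task) -> String {
    let header = match task {
        Task::Edit { instruction } => format!("Edit: {}", instruction),
        Task::Translate { from, to } => format!("Translate {} -> {}", from, to),
        Task::Extract { schema } => format!("Extract using schema {}", schema),
        Task::FillSlots { ontology } => format!("Fill slots for ontology {}", ontology),
    };
    format!("{}\n---\n{}", header, source)
}
```

The point of the sketch is that only the `match` arms vary per task kind; the function signature does not.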


Core API

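use async_trait::async_trait;

/// Input to a single generation call.
pub struct GenerationInput {
    /// The text to transform.
    pub source: String,
    /// Free-form task payload, e.g. { "type": "edit", ... }.
    pub task: serde_json::Value,
}

/// Output of a single generation call.
pub struct GenerationOutput {
    /// The produced text.
    pub result: String,
}

/// A backend that turns a source and a task into a result.
#[async_trait]
pub trait Generator {
    async fn generate(
        &self,
        input: GenerationInput,
    ) -> Result<GenerationOutput, GenerationError>;
}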

See docs/SPEC.md for the full specification, including GenerationError, the Task Convention, and Guarantees.
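To make the trait shape concrete, here is a hedged sketch of a toy backend with one success path and one error path. It simplifies the API above in three ways, all of them assumptions made so the block compiles without external crates: `generate` is synchronous, `task` is a plain `String` rather than `serde_json::Value`, and `GenerationError` is a hypothetical one-variant enum (the real type is defined in docs/SPEC.md). `CommentOutGenerator` and the `"comment_out"` task name are invented for illustration.

```rust
use std::fmt;

/// Simplified stand-in: `task` is a plain string here.
pub struct GenerationInput {
    pub source: String,
    pub task: String,
}

#[derive(Debug, PartialEq)]
pub struct GenerationOutput {
    pub result: String,
}

/// Hypothetical error type; not the one specified in docs/SPEC.md.
#[derive(Debug, PartialEq)]
pub enum GenerationError {
    UnsupportedTask(String),
}

impl fmt::Display for GenerationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GenerationError::UnsupportedTask(t) => write!(f, "unsupported task: {}", t),
        }
    }
}

/// Synchronous simplification of the async trait shown above.
pub trait Generator {
    fn generate(&self, input: GenerationInput) -> Result<GenerationOutput, GenerationError>;
}

/// Toy backend: prefixes every line of `source` with `// `.
pub struct CommentOutGenerator;

impl Generator for CommentOutGenerator {
    fn generate(&self, input: GenerationInput) -> Result<GenerationOutput, GenerationError> {
        // Reject any task this backend does not understand.
        if input.task != "comment_out" {
            return Err(GenerationError::UnsupportedTask(input.task));
        }
        let result = input
            .source
            .lines()
            .map(|line| format!("// {}", line))
            .collect::<Vec<_>>()
            .join("\n");
        Ok(GenerationOutput { result })
    }
}
```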


How AnchorGen Fits With AnchorScope and AnchorEdit

Orchestrator (e.g. pi-anchorgen)
  ↓ read content                                    (AnchorScope, or built-in read)
  ↓ Generator::generate(source, task) → result       (AnchorGen)
  ↓ anchoredit_apply(file, anchor, content=result)    (AnchorEdit → AnchorScope)

Each layer is independently testable and replaceable:

  • AnchorScope verifies — exact byte-level matching, hash-verified writes
  • AnchorEdit applies — a verified edit, given an anchor and replacement
  • AnchorGen generates — a replacement, given a source and a task

None of the three depends on the others. An orchestrator composes them.
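The read, generate, apply flow can be sketched as three plain functions composed by a caller. Everything in this block is an assumption made for illustration: `content_hash`, `apply_verified`, `run_pipeline`, and `ApplyError` are hypothetical names, `DefaultHasher` is only a standard-library stand-in for whatever hash AnchorScope uses, and the real `anchoredit_apply` interface is not reproduced here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, PartialEq)]
pub enum ApplyError {
    Stale,           // file changed between read and apply
    AnchorNotFound,  // anchor text does not occur in the file
    AnchorAmbiguous, // anchor text occurs more than once
}

/// Stand-in content hash (assumption: not the hash AnchorScope uses).
pub fn content_hash(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Apply stage: replace `anchor` with `replacement`, but only if the file
/// still hashes to the value recorded at read time and the anchor is unique.
pub fn apply_verified(
    current_file: &str,
    anchor: &str,
    hash_at_read: u64,
    replacement: &str,
) -> Result<String, ApplyError> {
    if content_hash(current_file) != hash_at_read {
        return Err(ApplyError::Stale);
    }
    match current_file.matches(anchor).count() {
        0 => Err(ApplyError::AnchorNotFound),
        1 => Ok(current_file.replacen(anchor, replacement, 1)),
        _ => Err(ApplyError::AnchorAmbiguous),
    }
}

/// Orchestrator: the generate step is any function from source to result,
/// so it can be swapped without touching read or apply.
pub fn run_pipeline<G>(file: &str, anchor: &str, generate: G) -> Result<String, ApplyError>
where
    G: Fn(&str) -> String,
{
    let hash_at_read = content_hash(file); // read
    let result = generate(anchor); // generate
    apply_verified(file, anchor, hash_at_read, &result) // apply
}
```

Because `generate` is passed in as a closure, each stage in this sketch can be tested on its own, which mirrors the independence claim above.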

Practical note: In large repositories, orchestrators will often invoke Locate before Generate and pass only the relevant scope as source. This is not required by AnchorGen itself — passing an entire file as source is valid — but it typically improves scalability and reduces generation cost. Early prototypes that skip Locate tend to hit backend-specific size limits (token limits, context windows) sooner than pipelines that localize first.
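As an illustration of localizing first, the sketch below passes only a line range to the generate step and splices the result back into the file. It assumes a Locate step has already produced the range; `generate_in_scope` is a hypothetical helper, not part of AnchorGen, and the generate step is modelled as a plain closure.

```rust
use std::ops::Range;

/// Run `generate` on the lines in `scope` only, then splice the result back.
/// Returns `None` if the range does not fit the file.
/// Note: `str::lines` drops a trailing newline, so the output has none.
pub fn generate_in_scope<G>(file: &str, scope: Range<usize>, generate: G) -> Option<String>
where
    G: Fn(&str) -> String,
{
    let lines: Vec<&str> = file.lines().collect();
    if scope.start > scope.end || scope.end > lines.len() {
        return None;
    }

    // Only this slice is sent to the backend, not the whole file.
    let source = lines[scope.clone()].join("\n");
    let result = generate(&source);

    // Reassemble: untouched prefix, generated lines, untouched suffix.
    let mut out: Vec<&str> = Vec::new();
    out.extend_from_slice(&lines[..scope.start]);
    out.extend(result.lines());
    out.extend_from_slice(&lines[scope.end..]);
    Some(out.join("\n"))
}
```

With a three-line file and the scope `1..2`, the closure sees only the middle line, which is the cost reduction the note describes.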


A Note on AnchorScope v1

AnchorScope v1 maintained persistent scope maps and multi-level anchor identities for navigating large files. Both were removed in v2, not because the idea was wrong, but because they belong to the Locate stage rather than to Generate or Apply. The ideas remain valid for Locate-stage use cases such as RAG, where edits don't invalidate the map. See docs/SPEC.md Section 2 and Section 9 for details.


Reference Implementation

EchoGenerator — returns source unchanged, ignoring task. Exists to validate the trait shape.

pub struct EchoGenerator;

#[async_trait]
impl Generator for EchoGenerator {
    async fn generate(
        &self,
        input: GenerationInput,
    ) -> Result<GenerationOutput, GenerationError> {
        Ok(GenerationOutput { result: input.source })
    }
}

Planned Adapters

| Adapter | Backend | Task kind(s) | Status |
|---|---|---|---|
| LocalLlmGenerator | Local model (Qwen, Gemma, etc.) | edit, general-purpose | Planned |
| ClaudeGenerator | Anthropic API | edit, general-purpose | Planned |
| GptGenerator | OpenAI API | edit, general-purpose | Planned |
| FastApplyGenerator | Fast Apply model | edit (diff-style) | Prototype exists as pi-fa-merge |
| PlamoTranslateGenerator | PLaMo Translate | translate | Planned |
| NuExtractGenerator | NuExtract model | extract | Planned |

Adding an adapter never requires changing the core trait.
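One way to see why the trait stays fixed is a registry of boxed generators keyed by task kind: a new adapter is a new registration, not a trait change. This is a sketch under assumptions, not AnchorGen's design. `Registry` and `UppercaseGenerator` are hypothetical, errors are plain strings, and the trait is a synchronous simplification that takes `(source, task)` as string slices.

```rust
use std::collections::HashMap;

/// Synchronous, string-based simplification of the Generator trait.
pub trait Generator {
    fn generate(&self, source: &str, task: &str) -> Result<String, String>;
}

/// Returns the source unchanged, like the reference implementation.
pub struct EchoGenerator;

impl Generator for EchoGenerator {
    fn generate(&self, source: &str, _task: &str) -> Result<String, String> {
        Ok(source.to_string())
    }
}

/// Hypothetical toy adapter added later; the trait above is untouched.
pub struct UppercaseGenerator;

impl Generator for UppercaseGenerator {
    fn generate(&self, source: &str, _task: &str) -> Result<String, String> {
        Ok(source.to_uppercase())
    }
}

/// Hypothetical orchestrator-side registry mapping task kinds to adapters.
pub struct Registry {
    adapters: HashMap<String, Box<dyn Generator>>,
}

impl Registry {
    pub fn new() -> Self {
        Registry { adapters: HashMap::new() }
    }

    /// Adding an adapter is a data change, not an API change.
    pub fn register(&mut self, kind: &str, adapter: Box<dyn Generator>) {
        self.adapters.insert(kind.to_string(), adapter);
    }

    /// Route a call to whichever adapter is registered for `kind`.
    pub fn generate(&self, kind: &str, source: &str, task: &str) -> Result<String, String> {
        match self.adapters.get(kind) {
            Some(adapter) => adapter.generate(source, task),
            None => Err(format!("no adapter registered for task kind '{}'", kind)),
        }
    }
}
```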

pi-fa-merge was built independently, before AnchorGen's Generator trait was finalized, as a standalone pi extension combining FastApply-style merging with anchoredit_apply. It validated the source → task → result shape in practice — including the scalability tradeoff described above — before AnchorGen's abstraction was written down. It is expected to be reintegrated as FastApplyGenerator once AnchorGen's core is implemented.


Status

v2.0.0 — Specification complete. Reference implementation (EchoGenerator) in progress. v1 (coupled to AnchorScope's old pipe command) is archived in v1/.

License

MIT License

About

A generator tool for the “Anchor” series
