ContinuumAI

Solving amnesia for coding agents. An open-weight stack with a learned-from-failure skill loop, measured at roughly 80 % of the top frontier model's score on Terminal-Bench 2.1 — at a fraction of the inference cost.

Read the launch post

→ BLOG.md — AI agents are amnesiacs. First post in a three-part series. Covers the amnesia problem, the skill-generation loop, the Terminal-Bench 2.1 measurement, and what comes next.

Headline result

Stack	TB-2.1 score	Lift
Claude Opus 4.8 (TB-2.1 leaderboard #2, frontier reference)	78.9 %	—
GLM-5.1 baseline (no skill)	59.4 %	—
GLM-5.1 + GLM-authored skill	62.8 %	+3.4 %
GLM-5.1 + Opus-authored skill	64.0 %	+4.6 %

89 long-horizon coding and systems tasks; K=3 attempts per task; 87 measured under every condition. Skills are generated only from failed runs, not from worked solutions — no solution leakage between skill authoring and measurement.

The series

Post	Topic	Status
Post 1	The amnesia problem and the accuracy measurement	Live in BLOG.md
Post 2	Cost economics: per-attempt and per-passed-task dollars	Coming
Post 3	Inside individual SKILL.md files — what each one teaches, and the regression cases	Coming after Post 2

The mechanism in one paragraph

After each agent session ends, a Reflector LLM call reads the session trajectory and extracts the single load-bearing lesson from the failure. A SkillCreator LLM call writes that lesson as a short Markdown file (SKILL.md). The next related session loads the file before starting. The library grows by one per session — no model weights touched, no fine-tuning, no RL, no vector store, no human curation step. Skills are plain Markdown files that work with any model your routing layer (OpenRouter, AWS Bedrock, a direct provider) hands the next request to.

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
assets		assets
BLOG.md		BLOG.md
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ContinuumAI

Read the launch post

Headline result

The series

The mechanism in one paragraph

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

ContinuumAI

Read the launch post

Headline result

The series

The mechanism in one paragraph

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages