Stop AI coding agents from saying "done" when your repo has no evidence.
Devflow Native is a repo-local evidence gate, handoff layer, and reviewed mistake-memory loop for Codex, Claude Code, and shell sessions. It does not write code for you. It records what agents changed, what they actually verified, which mistakes they keep repeating, and what the next session should pick up.
Agent: done.
Devflow: not yet.
- no gate evidence recorded
- no review evidence recorded
- repeated mistake candidate: skipped the repo's PowerShell-safe command rule
Next: run the configured gate or record why it was skipped.
AI coding agents are getting better at generating code. The next bottleneck is often trust and continuity:
- What changed in the last session?
- Which tests, typechecks, builds, or reviews actually ran?
- What failed or was skipped?
- Which repo docs and project rules should the next agent trust?
- Is it really safe to say the task is done?
- Did a short maintainer command such as `ㄱㄱ` or `끝내` mean continue, finish, review, or hand off?
Long context, chat history, and session compaction help, but they are not the same as project-local workflow state. Devflow keeps that state in the repo so a new Codex, Claude Code, shell, or human review session can resume without rediscovering everything from scratch.
Devflow does not just remember that a session happened. It records repeated agent mistakes such as shell mismatch, unsafe commands, skipped setup, encoding issues, path handling failures, or wrong finish claims.
A mistake can stay local as evidence, become a promotion candidate after it
repeats, and only then be reviewed into AGENTS.md, a Devflow skill, or a hook
rule. The next agent session starts with that repo-specific lesson instead of
repeating the same failure.
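The lifecycle described above can be sketched as a small state model. This is a minimal illustration, not Devflow's actual schema: the `MistakeRecord` fields, the `REPEAT_THRESHOLD` value, and the `stageOf` and `promote` helpers are all hypothetical names chosen for the example.

```typescript
// Hypothetical model of the mistake lifecycle. Field names and the repeat
// threshold are assumptions for illustration, not Devflow's real schema.
type MistakeStage = "local-evidence" | "promotion-candidate" | "promoted";
type PromotionTarget = "agents-md" | "skill" | "hook";

interface MistakeRecord {
  id: string;
  observations: number; // how many times the mistake has been observed
  reviewApproved: boolean; // true only after review evidence is recorded
  promotedTo?: PromotionTarget;
}

// Assumption: "repeats" means the mistake was observed at least twice.
const REPEAT_THRESHOLD = 2;

function stageOf(m: MistakeRecord): MistakeStage {
  if (m.promotedTo !== undefined) return "promoted";
  return m.observations >= REPEAT_THRESHOLD ? "promotion-candidate" : "local-evidence";
}

// Promotion requires both repetition and recorded review approval.
function promote(m: MistakeRecord, target: PromotionTarget): MistakeRecord {
  if (stageOf(m) !== "promotion-candidate") {
    throw new Error(`${m.id}: not a promotion candidate`);
  }
  if (!m.reviewApproved) {
    throw new Error(`${m.id}: review evidence required before promotion`);
  }
  return { ...m, promotedTo: target };
}
```

The point of the two checks in `promote` is that repetition alone never edits a durable file; a review decision has to exist first.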
Devflow is not trying to replace the tools around it.
| Tool layer | What it owns |
|---|---|
| Codex / Claude Code | Run coding agents inside the repo. |
| Claude hooks / Codex skills | Add host-specific automation and instructions. |
| Superpowers | Teach workflow habits such as TDD, debugging, planning, and review. |
| TaskMaster-style tools | Track tasks and agent work queues. |
| Devflow Native | Record repo-local evidence, block unsafe finish claims, promote reviewed mistake rules, and generate next-session handoffs. |
In plain terms: Devflow asks the repo to remember enough evidence that the next agent does not have to guess where the previous one stopped, and the current agent cannot honestly claim "done" without proof.
- Creates a `.devflow/config.json` project contract with gates and review policy.
- Installs and checks local Codex/Claude plugin, hook, and MCP harness files.
- Shows repo status, changed files, work/session state, gates, and latest handoff.
- Records review evidence and gate evidence before work is called done.
- Provides `finish --dry-run` to check whether a task can honestly be claimed complete.
- Generates the prompt the next agent session should continue from.
- Captures repeated agent mistakes, aggregates repeated observations, and promotes durable repo-local rules, skills, or hooks only after review evidence.
The important path is agent-native setup. Devflow ships local harness files for
both Codex and Claude Code: plugin manifests, skills/commands, hooks, and MCP
configuration. A global npm install gives you the devflow CLI; the agent
experience starts when a target repo has the Devflow harness installed and the
agent host is restarted or reloaded if required.
Open Codex or Claude Code in the target repo and paste:
Install Devflow Native for this repository.
Inspect the repo first. Preserve existing AGENTS.md, CLAUDE.md, README, tests,
and project rules. Use npx devflow-native@latest if devflow is not already
installed.
Initialize the Devflow scaffold when missing, install only missing Codex/Claude
harness files, run doctor/status/harness health, and tell me exactly what files
changed and whether I need to restart the agent host.
Maintainer shorthand is part of the intended workflow. The prompt hook maps short commands into workflow intent and gives the agent the next Devflow actions to run:
- `ㄱㄱ`, `진행해`, `계속`, `continue`, `next`, `go` -> continue/start from `devflow status --json` and `devflow prompt latest`.
- `끝내`, `마무리`, `완료`, `finish`, `done` -> run `devflow finish --guided` and follow any review or gate blockers before claiming completion.
- `다음 세션 프롬프트 줘`, `여기까지`, `handoff` -> create a handoff with `devflow status --json` and `devflow prompt next`.
- `리뷰`, `pr`, `pull request` -> request and record review evidence.
- `html`, `리포트`, `보드` -> inspect status first and generate an artifact only when it is explicitly useful.
When a prompt contains mixed signals, Devflow uses this priority:
finish > handoff > review/pr > artifact > continue. The agent still has to
inspect repo state, run required gates, and record review evidence before
claiming that work is complete.
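The priority rule can be illustrated with a small classifier. This is a sketch under stated assumptions, not the hook's real implementation: the keyword lists are copied from the shorthand mapping in this README, while `classifyIntent`, `matches`, and the matching strategy (word boundaries for ASCII keywords, substring matching for Korean ones) are hypothetical choices for the example.

```typescript
type Intent = "finish" | "handoff" | "review" | "artifact" | "continue";

// Keyword groups listed in the stated priority order:
// finish > handoff > review/pr > artifact > continue.
const INTENT_KEYWORDS: ReadonlyArray<readonly [Intent, readonly string[]]> = [
  ["finish", ["끝내", "마무리", "완료", "finish", "done"]],
  ["handoff", ["다음 세션 프롬프트 줘", "여기까지", "handoff"]],
  ["review", ["리뷰", "pr", "pull request"]],
  ["artifact", ["html", "리포트", "보드"]],
  ["continue", ["ㄱㄱ", "진행해", "계속", "continue", "next", "go"]],
];

// Assumed matching strategy: ASCII keywords must match whole words so that
// "pr" does not fire inside "prompt"; Korean keywords match as substrings.
function matches(text: string, keyword: string): boolean {
  if (/^[a-z ]+$/.test(keyword)) {
    return new RegExp(`\\b${keyword}\\b`).test(text);
  }
  return text.includes(keyword);
}

// Returns the highest-priority intent found, or undefined when nothing matches.
function classifyIntent(prompt: string): Intent | undefined {
  const text = prompt.toLowerCase();
  for (const [intent, keywords] of INTENT_KEYWORDS) {
    if (keywords.some((k) => matches(text, k))) return intent;
  }
  return undefined;
}
```

Because the groups are scanned in priority order, a prompt that mentions both a PR and a handoff resolves to `handoff`, and any finish keyword wins over everything else.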
What to expect after Quick Try:
- `devflow` works as a CLI through `npx` or a global npm install.
- The target repo gets local `.devflow/` project state and, when confirmed, local `plugins/devflow/` harness files for Codex and Claude Code.
- Codex can install Devflow from this repo's `.agents/plugins/marketplace.json` and load its bundled skills, MCP server, and hooks after restart or a new thread.
- Claude Code can install Devflow from the same repo marketplace and load its bundled skills, commands, MCP server, and hooks after restart or `/reload-plugins`.
- Future sessions in that repo receive compact Devflow context automatically.
Host-native install commands:
codex plugin marketplace add Sungblab/devflow-native
codex plugin add devflow@devflow-native-local
claude plugin marketplace add Sungblab/devflow-native
claude plugin install devflow@devflow-native-local
MCP-only fallback:
codex mcp add devflow -- npx --yes devflow-native@latest mcp stdio
claude mcp add devflow -- npx --yes devflow-native@latest mcp stdio
The fallback exposes tools only. The full Devflow experience requires the plugin or repo-local harness so skills, commands, and hooks are loaded too.
npx devflow-native@latest --help
npx devflow-native@latest init --confirm
npx devflow-native@latest harness install --confirm
npx devflow-native@latest harness health
npx devflow-native@latest status --simple
`harness install --confirm` keeps generated `plugins/devflow/` harness files local by default by adding them to `.gitignore`. Use `--repo-visible` only when the target repository should commit those plugin files as part of its public development workflow.
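Keeping harness files local comes down to an idempotent `.gitignore` update. The helper below is a hypothetical sketch of that idea, not the code Devflow runs; `ensureIgnored` is an illustrative name and it works on the file content as a string so it can be tried without touching disk.

```typescript
// Returns .gitignore content that contains `entry`, adding it only if missing.
// Hypothetical helper for illustration; not taken from the Devflow source.
function ensureIgnored(gitignore: string, entry: string): string {
  const lines = gitignore.split(/\r?\n/);
  // An exact line match means the path is already ignored: change nothing.
  if (lines.some((line) => line.trim() === entry)) return gitignore;
  // Add a separating newline only when the file does not already end with one.
  const needsNewline = gitignore.length > 0 && !gitignore.endsWith("\n");
  return `${gitignore}${needsNewline ? "\n" : ""}${entry}\n`;
}
```

Running the helper twice yields the same content as running it once, which is the property an installer needs when it is re-run on a repo that already has the harness.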
For repeated local use, a global install is still fine:
npm install -g devflow-native
devflow harness health
A global npm install is not the same thing as registering an agent plugin. It only makes the `devflow` command available everywhere. Each Codex or Claude Code environment still needs the Devflow harness/plugin registration that the Quick Try prompt or `devflow harness install --confirm` verifies.
To update an existing install:
devflow update
npm install -g devflow-native@latest
devflow --version
devflow harness health
For one-off use without changing a global install, run:
npx devflow-native@latest --version
npx devflow-native@latest update
- It is not an autonomous coding agent.
- It does not replace Codex, Claude Code, Superpowers, git, tests, or PR review.
- It does not treat HTML dashboards or generated artifacts as source of truth.
- It is not tied to one agent runtime or one workflow methodology.
Codex or Claude Code opens the repo
-> Devflow session-start hook injects compact repo context
Maintainer says "continue" or "next"
-> Devflow prompt hook classifies workflow intent
-> the agent starts from status, active work, and handoff state
Maintainer says "finish" or "review"
-> Devflow finish flow checks docs impact, gates, risks, and next prompt
-> completion evidence is recorded in .devflow/state/events.jsonl
An agent repeats a repo-specific mistake
-> Devflow detects or records the mistake in .devflow/mistakes.json
-> repeated high-confidence observations become promotion candidates
-> promote --dry-run shows patch candidates without editing durable files
-> review evidence gates promote --apply into AGENTS.md, a Devflow skill, or a hook/config rule
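The flow above records completion evidence in `.devflow/state/events.jsonl`. JSON Lines stores one JSON value per line, so an append-only log can be read back by parsing each non-empty line. The sketch below assumes a hypothetical `DevflowEvent` shape; the real event schema is not documented here and may differ.

```typescript
// Hypothetical event shape for illustration; the real schema may differ.
interface DevflowEvent {
  type: "gate" | "review" | "finish";
  name: string;
  status: "passed" | "failed" | "skipped";
  at: string; // ISO 8601 timestamp
}

// Parses JSON Lines text: one JSON object per non-empty line.
function parseJsonl(text: string): DevflowEvent[] {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as DevflowEvent);
}

// In an append-only log the newest evidence is last, so later lines win.
function latestStatus(
  events: DevflowEvent[],
  type: DevflowEvent["type"],
  name: string,
): DevflowEvent["status"] | undefined {
  let status: DevflowEvent["status"] | undefined;
  for (const e of events) {
    if (e.type === type && e.name === name) status = e.status;
  }
  return status;
}
```

Reading the latest matching line, rather than any matching line, means a gate that failed and was then re-run successfully counts as passed, and a gate that passed earlier but failed on the last run does not.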
Runtime state such as .devflow/state/ and .devflow/next-prompt.md is local
by default. Public project contracts such as .devflow/config.json can be
committed when a repository wants to adopt Devflow as part of its workflow.
Devflow Native is dogfooded against OpenCairn, a larger Windows/PowerShell monorepo, to keep the product honest on mature-repo adoption instead of only greenfield demos.
Latest local smoke, run from C:\Users\Sungbin\Documents\GitHub\opencairn-monorepo:
devflow harness inspect --json
devflow harness health --json
devflow gates run docs-check --work local-work --json
devflow finish --json
Observed result on 2026-05-29:
- Codex and Claude harness targets reported `ready`.
- Harness health reported `status: ok`; plugin manifests, MCP config, hook scripts, and `review.required` passed.
- `docs-check` passed when recorded against `local-work`.
- `finish` kept `canClaimDone: false` until review evidence was recorded, which is the intended guardrail.
This is a smoke record, not a performance benchmark. It proves the current harness can install, inspect, run hooks, record gate evidence, and block unsafe finish claims in a real repository.
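The guardrail observed in this smoke run can be sketched as a pure check over recorded evidence. Everything below is an assumption made for illustration: the `Evidence` and `FinishResult` shapes and the `checkFinish` function are hypothetical, only the `canClaimDone` field name comes from the output quoted above, and this sketch treats a skipped gate as a blocker even though the real finish flow may accept a recorded skip reason.

```typescript
// Hypothetical evidence shape; not Devflow's actual state format.
interface Evidence {
  gates: Record<string, "passed" | "failed" | "skipped" | undefined>;
  reviewRecorded: boolean;
}

interface FinishResult {
  canClaimDone: boolean;
  blockers: string[];
}

// A task is claimable only when every required gate passed and, if the
// project requires review, review evidence has been recorded.
function checkFinish(
  requiredGates: string[],
  reviewRequired: boolean,
  evidence: Evidence,
): FinishResult {
  const blockers: string[] = [];
  for (const gate of requiredGates) {
    const status = evidence.gates[gate];
    if (status === undefined) blockers.push(`no gate evidence recorded: ${gate}`);
    else if (status !== "passed") blockers.push(`gate ${status}: ${gate}`);
  }
  if (reviewRequired && !evidence.reviewRecorded) {
    blockers.push("no review evidence recorded");
  }
  return { canClaimDone: blockers.length === 0, blockers };
}
```

With `docs-check` passed but no review recorded, the sketch returns `canClaimDone: false` with a single review blocker, which mirrors the behavior reported in the smoke run.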
devflow --help
devflow update
devflow doctor --platform windows-powershell --json
devflow status --simple
devflow finish --guided
devflow prompt latest
devflow mistakes detect --platform windows-powershell --command "node script.mjs << 'EOF'" --stderr "ParserError: Missing file specification after redirection operator." --record --json
devflow mistakes promote --id powershell-bash-heredoc-redirection --target agents --dry-run --json
devflow mistakes review --id powershell-bash-heredoc-redirection --status approved --summary "Repeated PowerShell heredoc correction is repo-relevant." --json
devflow mistakes promote --id powershell-bash-heredoc-redirection --target skill --apply --json
devflow mistakes rules --json
devflow harness health
devflow mcp stdio
- Quickstart
- Release Checklist
- Product Plan
- Architecture
- Repeated Mistake Loop
- Harness
- Research Boundary
- Open Source Promotion Plan
- Roadmap
packages/core shared product model, local state, gates, handoff contracts
packages/cli terminal command surface over core contracts
packages/mcp MCP handler and stdio transport over the same contracts
packages/adapters agent/session history adapters
plugins/devflow dogfood Codex and Claude Code plugin drafts
docs product, architecture, roadmap, examples, and public notes
.devflow dogfood project contract; runtime state is gitignored
The current v0.1 foundation release includes the npm package, CLI, MCP handler, repo-local Codex/Claude plugin drafts, hooks, finish guard, and the first review-gated repeated-mistake promotion loop. Hosted sync, richer artifact generation, broader adapter coverage, and more detector families are later work.
Research notes, paper drafts, evaluation fixtures, and non-public data live in a separate private repository. This public repository contains product implementation and public product documentation only.