Build a production-ready AI agent in 60 seconds. Input a description → output a system prompt + tool definitions + test cases. Export to Cursor, Hermes Agent, MCP, ChatGPT GPT, OpenAI Functions, Anthropic Tools.
The hardest part of building agents isn't the framework — it's the spec. A shallow system prompt loses to a well-structured one by 40-60% on real benchmarks. Most builders skip this step because writing rigorous prompts + tool schemas takes hours.
AgentForge does it in 60 seconds. Powered by Xiaomi MiMo V2.5 (reasoning + structured-output flagship), it converts a 1-3 sentence description into:
- ✅ System prompt — production-ready (400-1500 words), role + behavior + constraints + output format + edge case handling + anti-hallucination clauses
- ✅ Tool definitions — OpenAI/Anthropic/MCP-compatible JSON Schema, only the tools that make sense for the agent
- ✅ Test cases — 4-8 input/expected-behavior pairs covering happy path + edge cases + adversarial inputs
- ✅ Multi-format export — Cursor Rules, Hermes Skill, MCP Server boilerplate (TypeScript), Claude Code prompt, ChatGPT custom GPT, OpenAI Functions JSON, Anthropic Tools JSON
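The three generated artifacts can be pictured as one typed object. The sketch below is only an assumption about that shape, not AgentForge's actual schema: `AgentSpec`, `ToolDefinition`, `TestCase`, and `validateSpec` are hypothetical names, and the numeric bounds are the ones quoted in the list above.

```typescript
// Hypothetical shape of a generated spec; field names are illustrative only.
interface ToolDefinition {
  name: string;
  description: string;
  // JSON Schema object describing the tool's arguments.
  parameters: Record<string, unknown>;
}

interface TestCase {
  input: string;
  expectedBehavior: string;
  kind: "happy_path" | "edge_case" | "adversarial";
}

interface AgentSpec {
  systemPrompt: string;
  tools: ToolDefinition[];
  testCases: TestCase[];
}

// Checks the bounds quoted in the feature list: 400-1500 words, 4-8 test cases.
// Also rejects duplicate tool names (an extra assumption, not stated in the README).
function validateSpec(spec: AgentSpec): string[] {
  const problems: string[] = [];
  const words = spec.systemPrompt.trim().split(/\s+/).filter(Boolean).length;
  if (words < 400 || words > 1500) {
    problems.push(`system prompt has ${words} words, expected 400-1500`);
  }
  if (spec.testCases.length < 4 || spec.testCases.length > 8) {
    problems.push(`got ${spec.testCases.length} test cases, expected 4-8`);
  }
  const seen = new Set<string>();
  for (const tool of spec.tools) {
    if (seen.has(tool.name)) problems.push(`duplicate tool name: ${tool.name}`);
    seen.add(tool.name);
  }
  return problems;
}
```

An empty array from `validateSpec` means the spec falls inside the advertised bounds; it says nothing about prompt quality.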
Try the presets (Customer Service Bot, Code Reviewer, Trading Sniffer, Hermes Skill Maker, ...) or describe your own agent in plain Indonesian/English.
- Frontend: Next.js 14 (App Router) + Tailwind CSS + TypeScript
- Reasoning model: Xiaomi MiMo V2.5 Pro (configurable via env)
- API: OpenAI-compatible streaming chat completions
- Deploy: Vercel Edge Runtime (60s max duration)
- Storage: localStorage (no DB, no auth, fully client-side state)
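As a rough illustration of the "OpenAI-compatible streaming chat completions" item, the sketch below builds a streaming request from the three env vars and pulls text deltas out of a server-sent-events chunk. It is a minimal sketch under assumptions, not the project's actual route handler: `buildChatRequest`, `extractDeltas`, and the `ProviderEnv` type are hypothetical names, and the fallback base URL and model are assumed to match the defaults this README documents.

```typescript
// Hypothetical helpers; names and structure are illustrative, not AgentForge's code.
interface ProviderEnv {
  OPENAI_API_KEY?: string;
  OPENAI_BASE_URL?: string;
  MODEL?: string;
}

interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildChatRequest(env: ProviderEnv, description: string): ChatRequest {
  if (!env.OPENAI_API_KEY) throw new Error("OPENAI_API_KEY is not set");
  // Assumed defaults, taken from the values shown in this README.
  const baseUrl = (env.OPENAI_BASE_URL ?? "https://api.xiaomimimo.com/v1").replace(/\/+$/, "");
  const model = env.MODEL ?? "xiaomi/mimo-v2.5-pro";
  return {
    url: `${baseUrl}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify({
        model,
        stream: true, // ask the provider for server-sent events
        messages: [{ role: "user", content: description }],
      }),
    },
  };
}

// Extracts text deltas from one chunk of server-sent events.
// Assumes the chunk holds whole `data:` lines; a real reader must buffer partial lines.
function extractDeltas(sseChunk: string): string[] {
  const deltas: string[] = [];
  for (const line of sseChunk.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue;
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const parsed = JSON.parse(payload);
    const text = parsed?.choices?.[0]?.delta?.content;
    if (typeof text === "string") deltas.push(text);
  }
  return deltas;
}
```

The `url` and `init` fields can be passed straight to `fetch(url, init)`. Keeping request construction separate from the network call makes the provider switch (MiMo, OpenAI, or a gateway) a matter of env vars only.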
Requires Node.js 18+.
git clone https://github.com/agentforge-dev/agentforge.git
cd agentforge
npm install
# Configure model provider (any OpenAI-compatible endpoint works)
cat > .env.local <<EOF
# Xiaomi MiMo V2.5 (recommended)
OPENAI_API_KEY=mimo-XXXXXX
OPENAI_BASE_URL=https://api.xiaomimimo.com/v1
MODEL=xiaomi/mimo-v2.5-pro
# Or OpenAI:
# OPENAI_API_KEY=sk-XXXXXX
# OPENAI_BASE_URL=https://api.openai.com/v1
# MODEL=gpt-4o
# Or Anthropic-compatible (via gateway):
# OPENAI_API_KEY=sk-ant-XXXXXX
# OPENAI_BASE_URL=https://anthropic-gateway.example/v1
# MODEL=claude-sonnet-4.5
EOF
npm run dev
# → http://localhost:3025
- Fork this repo
- Click Deploy to Vercel (or `vercel deploy` from CLI)
- Add env vars in Vercel project settings:
  - `OPENAI_API_KEY`: your MiMo / OpenAI / Anthropic-compat key
  - `OPENAI_BASE_URL`: endpoint URL (defaults to MiMo)
  - `MODEL`: model id (defaults to `xiaomi/mimo-v2.5-pro`)
- Done. Vercel handles edge runtime + global CDN automatically.
| Format | File | Use with |
|---|---|---|
| Hermes Skill | `SKILL.md` | Hermes Agent skill registry |
| Cursor Rules | `.cursorrules` | Cursor IDE project rules |
| MCP Server | `server.ts` | Model Context Protocol server stub |
| Claude Code | `agent.md` | Claude Code persona file |
| ChatGPT GPT | `gpt.txt` | Custom GPT instructions field |
| OpenAI Functions | `request.json` | `tools` parameter for `/v1/chat/completions` |
| Anthropic Tools | `request.json` | `tools` parameter for Messages API |
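The last two rows differ mainly in how each API wraps the same JSON Schema. The sketch below shows one way to map a neutral tool definition to both; `ToolDefinition`, `toOpenAITools`, and `toAnthropicTools` are hypothetical names rather than AgentForge's exporter, and `read_file` is borrowed from the example further down.

```typescript
// Hypothetical neutral tool shape; not necessarily AgentForge's internal one.
interface ToolDefinition {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

// OpenAI chat completions: each tool is wrapped as { type: "function", function: {...} }.
function toOpenAITools(tools: ToolDefinition[]) {
  return tools.map((t) => ({
    type: "function" as const,
    function: { name: t.name, description: t.description, parameters: t.parameters },
  }));
}

// Anthropic Messages API: the JSON Schema goes under `input_schema`.
function toAnthropicTools(tools: ToolDefinition[]) {
  return tools.map((t) => ({
    name: t.name,
    description: t.description,
    input_schema: t.parameters,
  }));
}

const readFileTool: ToolDefinition = {
  name: "read_file",
  description: "Fetch file content for context",
  parameters: {
    type: "object",
    properties: { path: { type: "string" } },
    required: ["path"],
  },
};

// Either array can be placed in the `tools` field of the corresponding request body.
const openAIRequestTools = toOpenAITools([readFileTool]);
const anthropicRequestTools = toAnthropicTools([readFileTool]);
```

Keeping one neutral definition and converting at export time avoids maintaining two hand-written schemas that can drift apart.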
Input: "Code review agent for Python/TypeScript PRs. Check security (SQL injection, XSS), performance, code style. Structured output: severity + line + suggestion."
Output (Hermes Skill format, abridged):
---
name: code-reviewer
description: Python/TypeScript PR reviewer with security/perf/style checks, structured output
---
# CodeReviewer
## System prompt
You are a senior security-aware code reviewer. Analyze diffs in Python or TypeScript.
For each issue, output a structured finding:
- severity: critical | high | medium | low | info
- file: path
- line: line number
- category: security | performance | style | correctness
- issue: <description>
- suggestion: <concrete fix>
- confidence: 0.0-1.0
Security focus: SQL injection (parameterized queries), XSS (escaping/sanitization),
SSRF, path traversal, hardcoded secrets, unsafe deserialization, ...
## Tools
- `read_file(path: string)` — fetch file content for context
- `run_static_analysis(path: string, lang: 'py'|'ts')` — invoke ruff/eslint
- `git_blame(path: string, line: number)` — see who introduced the line
## Test cases
1. Input: `<diff with SQL string concat>` → flag as critical SQL injection
2. Input: `<clean diff>` → empty findings array
3. Input: `<TS file with eval>` → flag as high security
...
This project itself was built using AI agent tooling, the same workflow AgentForge optimizes:
- Hermes Agent — workflow orchestration (skills, terminal, file ops)
- Claude Code (Sonnet 4.7) — TypeScript / React authoring
- 9router — local OpenAI-compatible LLM proxy for dev
- MiMo V2.5 — runtime spec generation (will become primary post-grant)
- Cursor — fine-grained code edits
- Vercel — preview deployments
- Tailwind CSS + shadcn/ui patterns — UI
- Multi-format export
- Streaming generation UX
- Live demo on Vercel
- Spec versioning + diff viewer (git-style)
- Auto-evaluator: run `testCases` against spec on N models, score
- Multi-agent compositions (orchestrator + workers as a graph)
- 1-click deploy: spec → Cloudflare Worker / Vercel function
- Skill marketplace: share + remix specs (community CC0/MIT)
MIT — see LICENSE
Built with ❤️ by Klein · powered by Xiaomi MiMo V2.5