Merged
3 changes: 3 additions & 0 deletions .github/workflows/ci.yml
@@ -27,3 +27,6 @@ jobs:

- name: Build
  run: npm run build

- name: Test
  run: npm test
31 changes: 31 additions & 0 deletions README.ko.md
@@ -1,5 +1,11 @@
# DUUL

[![npm version](https://img.shields.io/npm/v/@planningo/duul.svg)](https://www.npmjs.com/package/@planningo/duul)
[![npm downloads](https://img.shields.io/npm/dm/@planningo/duul.svg)](https://www.npmjs.com/package/@planningo/duul)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![CI](https://github.com/Planningo/duul/actions/workflows/ci.yml/badge.svg)](https://github.com/Planningo/duul/actions/workflows/ci.yml)
[![Node](https://img.shields.io/node/v/@planningo/duul.svg)](https://nodejs.org)

**D**ual-phase **U**pfront-plan & **U**nit-verify **L**oop: an MCP server that uses LLMs as reviewers for development plans and code. Supports OpenAI, Anthropic, Google, OpenRouter, and OpenAI-compatible providers.

> [English README](./README.md)
@@ -227,6 +233,31 @@ npm run build

---

## Cost & Performance

Figures measured while running DUUL on this repo (42 reviewer calls, gpt-5.4, prompt caching enabled). They vary with project size and review complexity, so treat them as a rough budgeting guide.

| Tool | Avg tokens/call | Avg cost/call | Cache hit rate |
|------|----------------:|--------------:|---------------:|
| `plan_review` | 100,966 | **$0.065** | 79% |
| `code_review` | 179,837 | **$0.122** | 79% |
| **Overall avg** | 132,890 | **$0.088** | 79% |

A typical task (1–3 plan rounds + 1–2 code rounds) usually costs about **$0.30–$0.50** in reviewer fees.

**What drives cost down:**
- Anthropic / OpenAI prompt caching (about 30% savings on repeated sessions; cache reads are priced at 0.1× the input rate)
- Per-tool model override (`reviewer_config.model = { code: "claude-opus-4" }` strengthens code review only)
- Optional: a file-read budget (`DUUL_MAX_REVIEWER_BYTES`) to enforce a cost ceiling

**Measure it yourself:**
```bash
node scripts/token-report.mjs --plan max20 --all-time
```
Combines `~/.duul/usage.jsonl` (logging is enabled when `DUUL_DEBUG_TOKEN=1` is set in the MCP env) with `~/.claude/projects/<encoded-cwd>/*.jsonl` and shows a combined Claude Code + reviewer breakdown.

---

## How It Works

### Full Review Loop
31 changes: 31 additions & 0 deletions README.md
@@ -1,5 +1,11 @@
# DUUL

[![npm version](https://img.shields.io/npm/v/@planningo/duul.svg)](https://www.npmjs.com/package/@planningo/duul)
[![npm downloads](https://img.shields.io/npm/dm/@planningo/duul.svg)](https://www.npmjs.com/package/@planningo/duul)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![CI](https://github.com/Planningo/duul/actions/workflows/ci.yml/badge.svg)](https://github.com/Planningo/duul/actions/workflows/ci.yml)
[![Node](https://img.shields.io/node/v/@planningo/duul.svg)](https://nodejs.org)

**D**ual-phase **U**pfront-plan & **U**nit-verify **L**oop — an MCP server that uses LLMs as peer reviewers for development plans and code. Supports OpenAI, Anthropic, Google, OpenRouter, and any OpenAI-compatible provider.

> [한국어 README](./README.ko.md)
@@ -227,6 +233,31 @@ Each review request can include a `reviewer_config` object to override provider

---

## Cost & Performance

Measured from real DUUL usage in this repo (42 reviewer calls, gpt-5.4, prompt caching enabled). Treat these figures as a rough budgeting guide; your numbers will vary with project size and review complexity.

| Tool | Avg tokens/call | Avg cost/call | Cache hit rate |
|------|----------------:|--------------:|---------------:|
| `plan_review` | 100,966 | **$0.065** | 79% |
| `code_review` | 179,837 | **$0.122** | 79% |
| **Combined avg** | 132,890 | **$0.088** | 79% |
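
The combined row is a call-weighted average of the two tool rows, which can be checked with a short sketch. The per-tool call counts are not stated here, so the snippet infers them from the reported averages; the names `inferPlanCalls`, `PLAN`, and `CODE` are illustrative only and are not part of DUUL.

```typescript
// Figures copied from the table above.
const TOTAL_CALLS = 42;
const PLAN = { tokens: 100_966, cost: 0.065 };
const CODE = { tokens: 179_837, cost: 0.122 };
const COMBINED_TOKENS = 132_890;

// Assumption: the combined row is a call-weighted mean. Search for the
// plan_review call count whose weighted token average lands closest to
// the reported combined average.
function inferPlanCalls(): number {
  let best = 0;
  let bestErr = Number.POSITIVE_INFINITY;
  for (let p = 0; p <= TOTAL_CALLS; p++) {
    const avg = (p * PLAN.tokens + (TOTAL_CALLS - p) * CODE.tokens) / TOTAL_CALLS;
    const err = Math.abs(avg - COMBINED_TOKENS);
    if (err < bestErr) {
      bestErr = err;
      best = p;
    }
  }
  return best;
}

const planCalls = inferPlanCalls();
const codeCalls = TOTAL_CALLS - planCalls;
const weightedCost = (planCalls * PLAN.cost + codeCalls * CODE.cost) / TOTAL_CALLS;

console.log(planCalls, codeCalls, weightedCost.toFixed(3)); // 25 17 0.088
```

Under that assumption the table is internally consistent: 25 `plan_review` calls and 17 `code_review` calls reproduce both the combined token average and the $0.088 combined cost.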

A typical task (1–3 plan rounds + 1–2 code rounds) usually lands around **$0.30–$0.50** in reviewer cost.
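
For budgeting, the per-call averages can be turned into a rough per-task estimate. This is a minimal sketch that assumes every round costs exactly the table average; `estimateTaskCost` is a hypothetical helper, not a DUUL API.

```typescript
// Per-call averages copied from the table above (USD).
const AVG_COST_PER_CALL = { plan_review: 0.065, code_review: 0.122 } as const;

// Hypothetical helper: estimated reviewer cost for one task, assuming
// each round is billed at the average for its tool.
function estimateTaskCost(planRounds: number, codeRounds: number): number {
  return (
    planRounds * AVG_COST_PER_CALL.plan_review +
    codeRounds * AVG_COST_PER_CALL.code_review
  );
}

const lowEstimate = estimateTaskCost(1, 1); // 1 plan round + 1 code round
const highEstimate = estimateTaskCost(3, 2); // 3 plan rounds + 2 code rounds

console.log(lowEstimate.toFixed(3), highEstimate.toFixed(3)); // 0.187 0.439
```

Averages alone give roughly $0.19 to $0.44, which overlaps the typical range quoted above; individual calls can cost more or less than the average, so treat the output as a planning figure rather than a bound.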

**What drives cost down:**
- Anthropic / OpenAI prompt caching (~30% cost reduction on iterative sessions; cache reads are billed at 0.1× the input rate)
- Per-tool model override (e.g. `reviewer_config.model = { code: "claude-opus-4" }` uses a stronger model for code review only)
- Optional file-read budget (`DUUL_MAX_REVIEWER_BYTES`) to enforce a hard cost ceiling
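
To see how the cache-read discount interacts with the hit rate, the sketch below computes the effective price of input tokens relative to an uncached call. It assumes "cache hit rate" means the fraction of input tokens served from cache, and it models input tokens only: output tokens and any cache-write surcharge are ignored. `effectiveInputMultiplier` is an illustrative name, not a DUUL function, and the result is not an attempt to reproduce the ~30% figure above, whose measurement basis is not spelled out here.

```typescript
// Relative input cost when a fraction `hitRate` of input tokens is read
// from cache at `cacheReadRate` times the normal input price.
// Assumption: remaining tokens are billed at the full input rate.
function effectiveInputMultiplier(hitRate: number, cacheReadRate = 0.1): number {
  if (hitRate < 0 || hitRate > 1) {
    throw new RangeError("hitRate must be between 0 and 1");
  }
  return (1 - hitRate) + hitRate * cacheReadRate;
}

// 79% hit rate and 0.1x cache reads, as reported above.
const multiplier = effectiveInputMultiplier(0.79);
console.log(multiplier.toFixed(3)); // 0.289
```

With these inputs, cached sessions pay about 29% of the uncached input price; the saving on a full bill is smaller once output tokens are included.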

**Measure your own usage:**
```bash
node scripts/token-report.mjs --plan max20 --all-time
```
The script reads `~/.duul/usage.jsonl` (set `DUUL_DEBUG_TOKEN=1` in your MCP env to enable logging) and `~/.claude/projects/<encoded-cwd>/*.jsonl` to produce a combined Claude Code + reviewer breakdown.

---

## How It Works

### Full Review Loop