Merged
1 change: 1 addition & 0 deletions AGENTS.md
@@ -43,6 +43,7 @@
**Common scripts (see `package.json` for all):**

- `pnpm run:[cli-name-here]`
- `pnpm ai:usage` (summarize Claude/Codex usage logs for a repo)
- `pnpm typecheck`
- `pnpm lint` (use `pnpm lint:fix` if errors are auto-fixable)
- `pnpm format` / `pnpm format:check`
30 changes: 29 additions & 1 deletion README.md
@@ -1,6 +1,6 @@
# cli-agent-sandbox

A minimal TypeScript CLI sandbox for testing agent workflows and safe web scraping. This is a single-package repo built with [`@openai/agents`](https://github.com/openai/openai-agents-js), and it includes a guestbook demo, a Finnish name explorer CLI, a publication scraping pipeline with a Playwright-based scraper for JS-rendered pages, an ETF backtest CLI, an agent evals CLI, and agent tools scoped to `tmp` with strong safety checks.
A minimal TypeScript CLI sandbox for testing agent workflows and safe web scraping. This is a single-package repo built with [`@openai/agents`](https://github.com/openai/openai-agents-js), and it includes a guestbook demo, a Finnish name explorer CLI, a publication scraping pipeline with a Playwright-based scraper for JS-rendered pages, an ETF backtest CLI, an agent evals CLI, an AI usage summary CLI, and agent tools scoped to `tmp` with strong safety checks.

## Quick Start

@@ -13,6 +13,7 @@ A minimal TypeScript CLI sandbox for testing agent workflows and safe web scraping
7. (Optional) Explore Finnish name stats: `pnpm run:name-explorer -- --mode ai|stats`
8. (Optional) Run publication scraping: `pnpm run:scrape-publications -- --url="https://example.com"`
9. (Optional) Run ETF backtest: `pnpm run:etf-backtest -- --isin=IE00B5BMR087` (requires Python setup below)
10. (Optional) Summarize AI usage: `pnpm ai:usage --since 7d`

### Python Setup (for ETF backtest)

@@ -34,6 +35,7 @@ pip install numpy pandas torch
| `pnpm run:name-explorer` | Explore Finnish name statistics (AI Q&A or stats) |
| `pnpm run:scrape-publications` | Scrape publication links and build a review page |
| `pnpm run:etf-backtest` | Run ETF backtest + feature optimizer (requires Python) |
| `pnpm ai:usage` | Summarize Claude/Codex token usage for a repo |
| `pnpm typecheck` | Run TypeScript type checking |
| `pnpm lint` | Run ESLint for code quality |
| `pnpm lint:fix` | Run ESLint and auto-fix issues |
@@ -100,6 +102,24 @@ pnpm run:agent-evals -- --suite=example
pnpm run:agent-evals -- --all
```

## AI Usage

The `ai:usage` CLI summarizes Claude and Codex token usage for a repo from local logs and estimates costs using `ai-usage.pricing.json`.

Usage:

```bash
pnpm ai:usage
pnpm ai:usage --since 24h
pnpm ai:usage --since 30d --repo /path/to/repo
pnpm ai:usage --json
```

Notes:

- Defaults to the last 7 days for the current git repo (or `cwd` when not in a git repo).
- Log sources: `~/.claude/projects/<encoded-repo>/` and `$CODEX_HOME/sessions` or `~/.codex/sessions`.

## Tools

File tools are sandboxed to the `tmp/` directory with path validation to prevent traversal and symlink attacks. The `fetchUrl` tool adds SSRF protections and HTML sanitization, and `runPython` executes whitelisted Python scripts from a configured directory.
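The traversal half of that path validation can be sketched in a few lines (an illustrative sketch only; the real tools also resolve symlinks before validating):

```typescript
import path from "node:path";

// Resolve the candidate against the sandbox root and verify it did not
// escape. This covers "../" traversal only, not symlink escapes.
function isInsideSandbox(root: string, candidate: string): boolean {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, candidate);
  return resolved === resolvedRoot || resolved.startsWith(resolvedRoot + path.sep);
}
```

Resolving before comparing is what defeats sequences like `a/../../etc`, which pass naive prefix checks on the raw string.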
@@ -123,6 +143,14 @@ File tools are sandboxed to the `tmp/` directory with path validation to prevent
```
src/
├── cli/
│ ├── ai-usage/
│ │ ├── main.ts # AI usage CLI entry point
│ │ ├── README.md # AI usage CLI docs
│ │ ├── ai-usage.pricing.json # Model pricing lookup
│ │ ├── constants.ts # CLI constants
│ │ ├── types/ # CLI schemas
│ │ │ └── schemas.ts # CLI args + pricing schemas
│ │ └── clients/ # Pipeline + log readers + aggregation + formatting
│ ├── agent-evals/
│ │ ├── main.ts # Agent evals CLI entry point
│ │ ├── README.md # Agent evals CLI docs
1 change: 1 addition & 0 deletions package.json
@@ -9,6 +9,7 @@
"run:scrape-publications": "tsx src/cli/scrape-publications/main.ts",
"run:etf-backtest": "tsx src/cli/etf-backtest/main.ts",
"run:agent-evals": "tsx src/cli/agent-evals/main.ts",
"ai:usage": "tsx src/cli/ai-usage/main.ts",
**Copilot AI** (Jan 29, 2026): The PR description contains contradictory statements about dependencies. It first says "added. (New dependencies for logging and argument parsing.)" but then later states "No new dependencies and no network calls." Looking at the package.json, no new dependencies were actually added. Consider updating the PR description to remove the contradictory statement about new dependencies being added.
"scaffold:cli": "tsx scripts/scaffold-cli.ts",
"node:tsx": "node --disable-warning=ExperimentalWarning --import tsx",
"typecheck": "tsc --noEmit",
75 changes: 75 additions & 0 deletions src/cli/ai-usage/README.md
@@ -0,0 +1,75 @@
# AI Usage CLI

Summarize Claude and Codex token usage for a repo, including estimated costs from
`ai-usage.pricing.json`.

## Run

```bash
# Default: last 7 days for current git repo (or cwd if not in git)
pnpm ai:usage

# With options
pnpm ai:usage --since 24h
pnpm ai:usage --since 30d --repo /path/to/repo
pnpm ai:usage --json
pnpm ai:usage --debug
```

## Arguments

- `--since` (optional): time window to include. One of `1h`, `24h`, `7d`, `30d`.
- `--repo` (optional): path to repo to match against log cwd.
- `--json` (optional): emit JSON instead of the summary + table.
- `--debug` (optional): verbose logging about discovery and filtering.
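Since `--since` accepts only four fixed windows, resolving one to a cutoff timestamp can be a simple lookup (a sketch; the CLI's actual parsing may differ):

```typescript
type SinceWindow = "1h" | "24h" | "7d" | "30d";

// Length of each supported window in milliseconds.
const WINDOW_MS: Record<SinceWindow, number> = {
  "1h": 60 * 60 * 1000,
  "24h": 24 * 60 * 60 * 1000,
  "7d": 7 * 24 * 60 * 60 * 1000,
  "30d": 30 * 24 * 60 * 60 * 1000,
};

// Resolve a --since value to the earliest timestamp to include.
function sinceCutoff(window: SinceWindow, now: Date = new Date()): Date {
  return new Date(now.getTime() - WINDOW_MS[window]);
}
```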

## Log Sources

- **Claude:** `~/.claude/projects/<encoded-repo>/` JSONL logs
- **Codex:** `$CODEX_HOME/sessions` or `~/.codex/sessions` (YYYY/MM/DD folders)

Only entries whose `cwd` matches the repo path are counted.
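The `<encoded-repo>` folder name is the repo path with path separators replaced by dashes, as in the helper used by the test suite in this PR. A minimal sketch:

```typescript
// Encode a repo path into Claude's project folder name,
// e.g. "/Users/me/repo" -> "-Users-me-repo".
function encodeRepoPath(repoPath: string): string {
  return repoPath.replace(/\\/g, "/").replace(/\//g, "-");
}
```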

## Output

- Summary by provider and by model.
- Markdown table with input/output/cache tokens, totals, and estimated cost.
- If a model is missing from `ai-usage.pricing.json`, cost is `0` and a warning is printed.
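With `per_1m_tokens` pricing, the estimated cost is each token count divided by one million and multiplied by its rate, summed across categories. A sketch of that calculation (hypothetical type names; unknown models fall back to zero as described above):

```typescript
interface ModelPricing {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite?: number; // Codex entries in the pricing file carry no cache-write rate
}

interface TokenCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Estimate USD cost for one model; unknown models cost 0 (the caller warns).
function estimateCost(tokens: TokenCounts, pricing?: ModelPricing): number {
  if (!pricing) return 0;
  const perM = (count: number, rate = 0) => (count / 1_000_000) * rate;
  return (
    perM(tokens.input, pricing.input) +
    perM(tokens.output, pricing.output) +
    perM(tokens.cacheRead, pricing.cacheRead) +
    perM(tokens.cacheWrite, pricing.cacheWrite)
  );
}
```

At an input rate of $5 per million tokens, for example, one million input tokens alone would contribute $5.00.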

## Internals

- `UsagePipeline` owns repo resolution, log collection, aggregation, and formatting.
- `OutputFormatter` returns strings (summary/table/JSON); `main.ts` prints the report.

## Flow

```mermaid
flowchart TD
A[Discover logs] --> B[Filter by repo + since]
B --> C[Aggregate tokens]
C --> D[Apply pricing]
D --> E[Render summary + table]
```
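The "Aggregate tokens" step amounts to grouping per-entry records by provider and model and summing token counts. A sketch under an assumed record shape:

```typescript
interface UsageRecord {
  provider: string; // "claude" | "codex"
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface Totals {
  input: number;
  output: number;
}

// Sum token counts per provider/model pair.
function aggregate(records: UsageRecord[]): Map<string, Totals> {
  const totals = new Map<string, Totals>();
  for (const record of records) {
    const key = `${record.provider}/${record.model}`;
    const entry = totals.get(key) ?? { input: 0, output: 0 };
    entry.input += record.inputTokens;
    entry.output += record.outputTokens;
    totals.set(key, entry);
  }
  return totals;
}
```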

## Example Result

```text
AI Usage Summary (Last 30d)

By Provider:
claude: 216,462,575 tokens ($224.87)
codex: 73,995,660 tokens ($82.01)

By Model:
claude-opus-4-5-20251101: 216,462,575 tokens ($224.87)
gpt-5.2-codex: 73,370,636 tokens ($81.89)
gpt-5.1-codex-mini: 625,024 tokens ($0.12)

| Provider | Model | Input | Output | Cache R | Cache W | Total | Est. Cost |
|----------|--------------------------|------------|---------|-------------|------------|-------------|-----------|
| claude | claude-opus-4-5-20251101 | 267,949 | 47,358 | 204,118,277 | 12,028,991 | 216,462,575 | $224.87 |
| codex | gpt-5.2-codex | 37,768,935 | 691,877 | 34,909,824 | 0 | 73,370,636 | $81.89 |
| codex | gpt-5.1-codex-mini | 430,304 | 5,280 | 189,440 | 0 | 625,024 | $0.12 |
|----------|--------------------------|------------|---------|-------------|------------|-------------|-----------|
| TOTAL | | 38,467,188 | 744,515 | 239,217,541 | 12,028,991 | 290,458,235 | $306.89 |
```
38 changes: 38 additions & 0 deletions src/cli/ai-usage/ai-usage.pricing.json
@@ -0,0 +1,38 @@
{
"unit": "per_1m_tokens",
"models": {
"claude-opus-4-5-20251101": {
"input": 5.0,
"output": 25.0,
"cacheRead": 0.5,
"cacheWrite": 10.0
},
"claude-sonnet-4": {
"input": 3.0,
"output": 15.0,
"cacheRead": 0.3,
"cacheWrite": 6.0
},
"claude-haiku-4-5": {
"input": 1.0,
"output": 5.0,
"cacheRead": 0.1,
"cacheWrite": 2.0
},
"gpt-5.2-codex": {
"input": 1.75,
"cacheRead": 0.175,
"output": 14.0
},
"gpt-5-mini": {
"input": 0.25,
"cacheRead": 0.025,
"output": 2.0
},
"gpt-5.1-codex-mini": {
"input": 0.25,
"output": 2.0,
"cacheRead": 0.025
}
}
}
86 changes: 86 additions & 0 deletions src/cli/ai-usage/clients/claude-log-reader.test.ts
@@ -0,0 +1,86 @@
import fs from "node:fs/promises";
import path from "node:path";
import { TMP_ROOT } from "~tools/utils/fs";
import { afterEach, describe, expect, it } from "vitest";

import { ClaudeLogReader } from "./claude-log-reader";

const mockLogger = {
debug: () => {
/* empty */
},
} as never;

const encodeRepoPath = (repoPath: string) =>
repoPath.replace(/\\/g, "/").replace(/\//g, "-");

const since = new Date("2024-01-01T00:00:00.000Z");
const repoPath = "/repo";

const buildEntry = (overrides: Record<string, unknown> = {}) => ({
type: "assistant",
timestamp: "2025-01-01T00:00:00.000Z",
message: {
model: "claude-3",
usage: {
input_tokens: 1,
output_tokens: 2,
cache_creation_input_tokens: 0,
cache_read_input_tokens: 0,
},
},
...overrides,
});

const writeLogFile = async (lines: unknown[]) => {
await fs.mkdir(TMP_ROOT, { recursive: true });
const baseDir = await fs.mkdtemp(path.join(TMP_ROOT, "vitest-claude-"));
const projectDir = path.join(baseDir, encodeRepoPath(repoPath));
await fs.mkdir(projectDir, { recursive: true });
const filePath = path.join(projectDir, "session.jsonl");
const content = lines.map((line) => JSON.stringify(line)).join("\n");
await fs.writeFile(filePath, content, "utf8");
return baseDir;
};

describe("ClaudeLogReader cwd filtering", () => {
let baseDir = "";

afterEach(async () => {
if (baseDir) {
await fs.rm(baseDir, { recursive: true, force: true });
baseDir = "";
}
});

it("skips entries missing cwd", async () => {
baseDir = await writeLogFile([buildEntry()]);
const reader = new ClaudeLogReader({
logger: mockLogger,
basePath: baseDir,
});
const records = await reader.getUsage({ since, repoPath });
expect(records).toHaveLength(0);
});

it("skips entries with mismatched cwd", async () => {
baseDir = await writeLogFile([buildEntry({ cwd: "/other" })]);
const reader = new ClaudeLogReader({
logger: mockLogger,
basePath: baseDir,
});
const records = await reader.getUsage({ since, repoPath });
expect(records).toHaveLength(0);
});

it("keeps entries with matching cwd", async () => {
baseDir = await writeLogFile([buildEntry({ cwd: "/repo/project" })]);
const reader = new ClaudeLogReader({
logger: mockLogger,
basePath: baseDir,
});
const records = await reader.getUsage({ since, repoPath });
expect(records).toHaveLength(1);
expect(records[0]?.inputTokens).toBe(1);
});
});