5 changes: 2 additions & 3 deletions .claude/settings.json
@@ -5,13 +5,12 @@
"defaultMode": "default",
"allow": [
"Bash(pnpm lint:*)",
"Bash(pnpm lint:fix:*)",
"Bash(pnpm typecheck:*)",
"Bash(pnpm build:*)",
"Bash(pnpm format:*)",
"Bash(pnpm format:check:*)",
"Bash(pnpm test:*)",
"Bash(pnpm agents:check:*)",
"Bash(pnpm agents:sync:*)"
"Bash(pnpm test:*)"
],
"ask": [
"Bash(pnpm install:*)",
1 change: 1 addition & 0 deletions .codex/skills/docs-sync/SKILL.md
@@ -36,6 +36,7 @@ Documentation files to consider:
- When adding new commands, include both the command and a brief explanation
- Do not introduce instructions that conflict with `AGENTS.md`
- Do not edit `CLAUDE.md` directly; update `AGENTS.md` instead
- Mermaid: wrap node text in quotes like `A["Label"]` and `B{"Question?"}` to avoid parse issues with punctuation

## Output Requirements

5 changes: 5 additions & 0 deletions .npmrc
@@ -0,0 +1,5 @@
# Don't use caret (^) or tilde (~) in package versions
save-prefix=

# Always save exact versions
save-exact=true
194 changes: 114 additions & 80 deletions AGENTS.md
@@ -1,81 +1,115 @@
## Repository overview

- **Name:** cli-agent-sandbox
- **Purpose:** Minimal TypeScript CLI sandbox for testing agent workflows.
- **Entry points:** `src/cli/guestbook/main.ts`, `src/cli/name-explorer/main.ts`, `src/cli/scrape-publications/main.ts`.
- **Framework:** Uses `@openai/agents` with file tools scoped to `tmp`.

## Setup

1. Install Node.js and pnpm.
2. Install dependencies: `pnpm install`

## Environment

- Set `OPENAI_API_KEY` (export it or use a `.env`) to run the guestbook, name explorer (AI mode), and publication scraper.

## Common commands

Available pnpm scripts for development and testing:

| Command | Description |
| ------------------------------ | ------------------------------------------------- |
| `pnpm run:guestbook` | Run the interactive guestbook CLI demo |
| `pnpm run:name-explorer` | Explore Finnish name statistics (AI Q&A or stats) |
| `pnpm run:scrape-publications` | Scrape publication links and build a review page |
| `pnpm typecheck` | Run TypeScript type checking |
| `pnpm lint` | Run ESLint for code quality |
| `pnpm format` | Format code with Prettier |
| `pnpm format:check` | Check code formatting |
| `pnpm test` | Run Vitest test suite |

## Project layout

| Path | Description |
| ----------------------------------------- | ----------------------------------------------- |
| `src/cli/guestbook/main.ts` | Guestbook CLI entry point |
| `src/cli/guestbook/README.md` | Guestbook CLI docs |
| `src/cli/name-explorer/main.ts` | Name Explorer CLI entry point |
| `src/cli/name-explorer/README.md` | Name Explorer CLI docs |
| `src/cli/scrape-publications/main.ts` | Publication scraping CLI entry point |
| `src/cli/scrape-publications/README.md` | Publication scraping CLI docs |
| `src/clients/*` | Publication scraping pipeline clients |
| `src/utils/parse-args.ts` | Shared CLI argument parsing helper |
| `src/utils/question-handler.ts` | Shared CLI prompt + validation helper |
| `src/tools/index.ts` | Tool exports |
| `src/tools/fetch-url/fetch-url-tool.ts` | Safe HTTP fetch tool with SSRF protection |
| `src/tools/read-file/read-file-tool.ts` | Agent tool for reading files under `tmp` |
| `src/tools/write-file/write-file-tool.ts` | Agent tool for writing files under `tmp` |
| `src/tools/list-files/list-files-tool.ts` | Agent tool for listing files under `tmp` |
| `src/tools/utils/fs.ts` | Path safety utilities |
| `src/tools/utils/html-processing.ts` | HTML sanitization + extraction helpers |
| `src/tools/utils/url-safety.ts` | URL safety + SSRF protection helpers |
| `src/tools/utils/test-utils.ts` | Shared test helpers |
| `src/tools/*/*.test.ts` | Vitest tests for tools and safety utils |
| `src/types/index.ts` | Zod schemas for publication pipeline |
| `eslint.config.ts` | ESLint configuration |
| `prettier.config.ts` | Prettier configuration |
| `tsconfig.json` | TypeScript configuration |
| `vitest.config.ts` | Vitest configuration |
| `tmp/` | Runtime scratch space for tool + scraper output |

## Tools

File tools provide operations sandboxed to the `tmp/` directory with path validation. The `fetchUrl` tool adds SSRF protection and sanitizes HTML content before conversion.

| Tool | Location | Parameters | Description |
| ----------- | ----------------------------------------- | ---------------------------------------------------------------------------------------- | ------------------------------------------------------- |
| `fetchUrl` | `src/tools/fetch-url/fetch-url-tool.ts` | `url`, `timeoutMs?`, `maxBytes?`, `maxRedirects?`, `maxChars?`, `etag?`, `lastModified?` | Fetches URLs safely and returns sanitized Markdown/text |
| `readFile` | `src/tools/read-file/read-file-tool.ts` | `path` (string) | Reads file content from `tmp` |
| `writeFile` | `src/tools/write-file/write-file-tool.ts` | `path`, `content` (strings) | Writes content to file in `tmp` |
| `listFiles` | `src/tools/list-files/list-files-tool.ts` | `path` (string, optional) | Lists files under `tmp` |

## Agent notes

- Use pnpm for scripts and dependency changes.
- Keep changes small and focused; update tests when behavior changes.
- Do not run git operations that change repo state: no `git commit`, `git push`, or opening PRs.
- Read-only git commands are allowed (e.g., `git status`, `git diff`, `git log`).
- Do not read `.env` files or any other secrets.
# AGENTS.md — Operating Guide for AI Agents

## 0) TL;DR (Agent quick start)

**Goal:** Make small, safe, test-covered changes in this TypeScript CLI sandbox.

**Repo:** `cli-agent-sandbox` — minimal TypeScript CLI sandbox built with `@openai/agents` and tool sandboxing under `tmp/`.

1. Start at `src/cli/<cli>/main.ts` and the matching `src/cli/<cli>/README.md`.
2. Follow the pipeline classes under `src/cli/<cli>/clients/*` and schemas under `src/cli/<cli>/types/*`.
3. Reuse shared helpers: `src/utils/parse-args.ts`, `src/utils/question-handler.ts`, `src/clients/logger.ts`.
4. Keep changes minimal; add/update **Vitest** tests (`*.test.ts`) when behavior changes.
5. Run: `pnpm typecheck`, `pnpm lint`, `pnpm test` (and `pnpm format:check` if formatting changed).
6. All runtime artifacts go under `tmp/` (never commit them).

**Scratch space:** Use `tmp/` for generated HTML/markdown/JSON/reports.

---

## 1) Fast map (where to look first)

- Entry points: `src/cli/*/main.ts`
- Shared clients: `src/clients/*`
- Shared helpers: `src/utils/*`
- Agent tools: `src/tools/*`

---

## 2) Setup & commands

- Install deps: `pnpm install`
- Set `OPENAI_API_KEY` via env or `.env` (humans do this; agents must not read secrets)
- If a task requires Playwright, follow the repo README for system deps

**Common scripts (see `package.json` for all):**

- `pnpm run:[cli-name-here]`
- `pnpm typecheck`
- `pnpm lint` (use `pnpm lint:fix` if errors are auto-fixable)
- `pnpm format` / `pnpm format:check`
- `pnpm test`

---

## 3) Hard rules (security & repo safety)

### MUST NOT

- **Do not read** `.env` files or any secrets.
- **Do not run** git commands that change repo state: `git commit`, `git push`, PR creation.
- **Do not bypass** SSRF protections or URL/path safety utilities.

### Allowed

- Read-only git commands: `git status`, `git diff`, `git log`.
- Writing runtime artifacts under `tmp/`.

---

## 4) Agent tools (runtime tool catalog)

All file tools are sandboxed to `tmp/` using path validation (`src/tools/utils/fs.ts`).

### File tools

- **`readFile`** (`src/tools/read-file/read-file-tool.ts`)
- Reads a file under `tmp/`.
- Params: `{ path: string }` (path is **relative to `tmp/`**)
- **`writeFile`** (`src/tools/write-file/write-file-tool.ts`)
- Writes a file under `tmp/`.
- Params: `{ path: string, content: string }` (path is **relative to `tmp/`**)
- **`listFiles`** (`src/tools/list-files/list-files-tool.ts`)
- Lists files/dirs under `tmp/`.
- Params: `{ path?: string }` (defaults to `tmp/` root)
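
The sandboxing described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `src/tools/utils/fs.ts`: the helper name `resolveInsideTmp` and its signature are hypothetical, and the sketch does not handle symlinks that point outside the sandbox.

```typescript
import * as path from "node:path";

// Hypothetical helper for illustration; the repo's real validation lives in
// src/tools/utils/fs.ts and may differ. Symlinks are NOT resolved here.
const resolveInsideTmp = (tmpRoot: string, relativePath: string): string => {
  const root = path.resolve(tmpRoot);
  const target = path.resolve(root, relativePath);
  const rel = path.relative(root, target);

  // A target outside the root yields a relative path that is ".." itself,
  // starts with "../", or (across Windows drives) is absolute.
  const escapes =
    rel === ".." || rel.startsWith(`..${path.sep}`) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Path escapes sandbox: ${relativePath}`);
  }
  return target;
};
```

Checking the relative path rather than a string prefix avoids accepting sibling directories such as `tmp-other/` that merely share the `tmp` prefix.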

### Safe web fetch tool

- **`fetchUrl`** (`src/tools/fetch-url/fetch-url-tool.ts`)
- SSRF protection + redirect validation + HTML sanitization + markdown/text conversion.
- Params: `{ url, timeoutMs?, maxBytes?, maxRedirects?, maxChars?, etag?, lastModified? }`
- Output: sanitized content, metadata, and warnings.
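
As a rough idea of what a URL pre-check for SSRF protection can look like, here is a minimal sketch. It is an assumption-laden illustration, not the logic in `src/tools/utils/url-safety.ts`: the names `isBlockedIPv4` and `isUrlAllowed` are hypothetical, IPv6 literals are simply rejected, and no DNS resolution is performed (a real implementation would also need to validate resolved addresses and each redirect hop).

```typescript
// Hypothetical helpers for illustration only; not the repo's url-safety API.
const isBlockedIPv4 = (host: string): boolean => {
  const parts = host.split(".");
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) {
    return false; // Not a dotted-quad IPv4 literal.
  }
  const [a, b] = parts.map(Number);
  if (a === undefined || b === undefined) {
    return false;
  }
  return (
    a === 0 || // "this network"
    a === 10 || // 10.0.0.0/8 private
    a === 127 || // loopback
    (a === 169 && b === 254) || // link-local
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 private
    (a === 192 && b === 168) // 192.168.0.0/16 private
  );
};

const isUrlAllowed = (raw: string): boolean => {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false;
  }
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false;
  }
  const host = url.hostname.toLowerCase();
  // Conservative choice for this sketch: reject all IPv6 literals.
  if (host === "localhost" || host.startsWith("[")) {
    return false;
  }
  return !isBlockedIPv4(host);
};
```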

---

## 5) Coding conventions (how changes should look)

- Initialize `Logger` in CLI entry points and pass it into clients/pipelines via constructor options.
- Prefer shared helpers in `src/utils` over custom parsing or prompt logic.
- Prefer shared helpers in `src/utils` (`parse-args`, `question-handler`) over custom logic.
- Prefer TypeScript path aliases over deep relative imports: `~tools/*`, `~clients/*`, `~utils/*`.
- Use Zod schemas for CLI args and tool IO.
- For HTTP fetching in code, prefer `Fetch` (sanitized) or `PlaywrightScraper` for JS-heavy pages.
- When adding tools that touch files, use `src/tools/utils/fs.ts` for path validation.
- Comments should capture invariants or subtle behavior, not restate code.
- Prefer a class over a function when state/lifecycle or shared dependencies make it appropriate.
- Avoid `index.ts` barrel exports; use explicit module paths.
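
The first convention (create the logger in the entry point, pass it in via constructor options) might look like the sketch below. The `Logger` shape, `PipelineOptions`, and `ExamplePipeline` are hypothetical stand-ins; the actual `Logger` in `src/clients/logger.ts` may expose a different API.

```typescript
// Hypothetical shapes for illustration; not the repo's real Logger or pipeline.
type Logger = {
  info: (message: string) => void;
};

type PipelineOptions = {
  logger: Logger;
};

class ExamplePipeline {
  private readonly logger: Logger;

  constructor(options: PipelineOptions) {
    // The pipeline never constructs its own logger, so the entry point
    // (or a test) fully controls where output goes.
    this.logger = options.logger;
  }

  run(items: string[]): number {
    this.logger.info(`Processing ${items.length} items`);
    return items.length;
  }
}
```

Injecting the logger this way keeps output configuration in one place and lets tests substitute a recording logger without patching globals.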

### Comment guidance (short)

- Use comments for intent/tradeoffs, contracts (inputs/outputs, invariants, side effects, errors), non-obvious behavior (ordering, caching, perf), or domain meanings.
- Avoid `@param`/`@returns` boilerplate and step-by-step narration that repeats the signature or body.
- Rule of thumb: each comment should say something the types cannot.

---

## 6) Definition of Done (before finishing)

- [ ] Change is minimal and localized
- [ ] Tests added/updated if behavior changed (`pnpm test`)
- [ ] Typecheck passes (`pnpm typecheck`)
- [ ] Lint passes (`pnpm lint`)
- [ ] Formatting is clean (`pnpm format:check` or `pnpm format`)
- [ ] No secrets accessed, no unsafe file/network behavior introduced
- [ ] Any generated artifacts are in `tmp/` only

---
54 changes: 24 additions & 30 deletions README.md
@@ -21,6 +21,7 @@ A minimal TypeScript CLI sandbox for testing agent workflows and safe web scrapi
| `pnpm run:scrape-publications` | Scrape publication links and build a review page |
| `pnpm typecheck` | Run TypeScript type checking |
| `pnpm lint` | Run ESLint for code quality |
| `pnpm lint:fix` | Run ESLint and auto-fix issues |
| `pnpm format` | Format code with Prettier |
| `pnpm format:check` | Check code formatting |
| `pnpm test` | Run Vitest test suite |
@@ -78,48 +79,41 @@ src/
│ │ ├── main.ts # Name Explorer CLI entry point
│ │ └── README.md # Name Explorer CLI docs
│ └── scrape-publications/
│ ├── main.ts # Publication scraping CLI
│ └── README.md # Publication scraping docs
│ ├── main.ts # Publication scraping CLI entry point
│ ├── README.md # Publication scraping docs
│ ├── clients/ # Publication-specific clients
│ │ ├── publication-pipeline.ts # Pipeline orchestration
│ │ ├── publication-scraper.ts # Link discovery + selector inference
│ │ └── review-page-generator.ts # Review HTML generator
│ └── types/
│ └── index.ts # Publication Zod schemas
├── clients/
│ ├── fetch.ts # HTTP fetch + sanitization helpers
│ ├── logger.ts # Console logger
│ ├── playwright-scraper.ts # Playwright-based scraper for JS-rendered pages
│ ├── publication-pipeline.ts # Pipeline orchestration
│ ├── publication-scraper.ts # Link discovery + selector inference
│ └── review-page-generator.ts # Review HTML generator
│ ├── fetch.ts # Shared HTTP fetch + sanitization
│ ├── logger.ts # Shared console logger
│ └── playwright-scraper.ts # Playwright-based web scraper
├── utils/
│ ├── parse-args.ts # Shared CLI arg parsing helper
│ └── question-handler.ts # Shared CLI prompt + validation helper
├── tools/
│ ├── fetch-url/
│ │ ├── fetch-url-tool.ts # Safe fetch tool
│ │ └── fetch-url-tool.test.ts # Fetch tool tests
│ ├── index.ts # Tool exports
│ ├── list-files/
│ │ ├── list-files-tool.ts # List tool implementation
│ │ └── list-files-tool.test.ts # List tool tests
│ ├── read-file/
│ │ ├── read-file-tool.ts # Read tool implementation
│ │ └── read-file-tool.test.ts # Read tool tests
│ ├── write-file/
│ │ ├── write-file-tool.ts # Write tool implementation
│ │ └── write-file-tool.test.ts # Write tool tests
│ ├── index.ts # Tool exports
│ ├── fetch-url/ # Safe fetch tool
│ ├── list-files/ # List files tool
│ ├── read-file/ # Read file tool
│ ├── write-file/ # Write file tool
│ └── utils/
│ ├── fs.ts # Path safety utilities
│ ├── html-processing.ts # HTML sanitization + extraction helpers
│ ├── html-processing.test.ts # HTML processing tests
│ ├── url-safety.ts # SSRF protection helpers
│ ├── url-safety.test.ts # URL safety tests
│ └── test-utils.ts # Shared test helpers
└── types/
└── index.ts # Zod schemas for publication pipeline
tmp/ # Runtime scratch space (tool I/O)
│ ├── fs.ts # Path safety utilities
│ ├── html-processing.ts # HTML sanitization + extraction helpers
│ ├── url-safety.ts # SSRF protection helpers
│ └── test-utils.ts # Shared test helpers
tmp/ # Runtime scratch space (tool I/O)
```

## CLI conventions

- When using `Logger`, initialize it in the CLI entry point and pass it into clients/pipelines via constructor options.
- Prefer shared helpers in `src/utils` (`parse-args`, `question-handler`) over custom argument parsing or prompt logic.
- Use the TypeScript path aliases for shared modules: `~tools/*`, `~clients/*`, `~utils/*`.
Example: `import { readFileTool } from "~tools/read-file/read-file-tool";`

## Security

48 changes: 48 additions & 0 deletions eslint.config.ts
@@ -18,6 +18,7 @@ export default defineConfig(
eslint.configs.recommended,
...tseslint.configs.recommended,
...tseslint.configs.recommendedTypeChecked,
...tseslint.configs.strictTypeChecked,
...tseslint.configs.stylisticTypeChecked,
],
rules: {
@@ -39,8 +40,55 @@ export default defineConfig(
allowConstantLoopConditions: true,
},
],
// Enforce arrow functions over function declarations
"func-style": ["error", "expression"],
"@typescript-eslint/no-floating-promises": [
"error",
{ ignoreVoid: true },
],
"@typescript-eslint/switch-exhaustiveness-check": "error",
"@typescript-eslint/no-non-null-assertion": "error",
"@typescript-eslint/consistent-type-exports": "error",
"@typescript-eslint/consistent-type-definitions": ["error", "type"],
"@typescript-eslint/restrict-template-expressions": [
"error",
{
allowAny: false,
allowBoolean: true,
allowNever: false,
allowNullish: false,
allowNumber: true,
allowRegExp: false,
},
],
"prefer-const": "error",
"no-var": "error",
// --- Async correctness ---
"@typescript-eslint/await-thenable": "error",

// --- Safer error handling ---
"@typescript-eslint/only-throw-error": "error",

// --- Better modern TS patterns ---
"@typescript-eslint/prefer-nullish-coalescing": "error",
"@typescript-eslint/prefer-optional-chain": "error",
eqeqeq: ["error", "smart"],
curly: ["error", "all"],
"import/no-default-export": "error",
"import/consistent-type-specifier-style": ["error", "prefer-top-level"],
// Enforce path aliases for cross-module imports
"@typescript-eslint/no-restricted-imports": [
"error",
{
patterns: [
{
group: ["../../*", "../../../*", "../../../../*"],
message:
"Use path aliases (e.g. ~tools/...) instead of ../../ imports.",
},
],
},
],
},
},
{
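
For orientation, the snippet below sketches code written in the style several of the newly added rules push toward (`func-style` expressions, `consistent-type-definitions` with `type`, `switch-exhaustiveness-check`, `prefer-optional-chain`, `prefer-nullish-coalescing`, `curly`). The `Shape` type and both functions are hypothetical examples, not code from this repository.

```typescript
// Hypothetical example types and functions; not part of the repo.
// `type` aliases rather than `interface` (consistent-type-definitions).
type Shape =
  | { kind: "circle"; radius: number }
  | { kind: "square"; side: number };

// Arrow-function expressions rather than declarations (func-style).
const area = (shape: Shape): number => {
  // Every union member is handled (switch-exhaustiveness-check).
  switch (shape.kind) {
    case "circle": {
      return Math.PI * shape.radius ** 2;
    }
    case "square": {
      return shape.side ** 2;
    }
  }
};

const displayName = (user?: {
  profile?: { name?: string | null };
}): string => {
  // Optional chaining plus `??`: only null/undefined fall back,
  // so an empty string is preserved.
  return user?.profile?.name ?? "anonymous";
};
```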
4 changes: 3 additions & 1 deletion package.json
@@ -10,6 +10,7 @@
"node:tsx": "node --disable-warning=ExperimentalWarning --import tsx",
"typecheck": "tsc --noEmit",
"lint": "eslint .",
"lint:fix": "eslint . --fix",
"format": "prettier --write .",
"format:check": "prettier --check .",
"test": "vitest"
@@ -28,6 +29,7 @@
"devDependencies": {
"@eslint/compat": "2.0.1",
"@eslint/js": "9.39.2",
"@ianvs/prettier-plugin-sort-imports": "4.7.0",
"@openai/agents": "0.3.7",
"@types/jsdom": "27.0.0",
"@types/node": "25.0.6",
@@ -39,7 +41,7 @@
"jiti": "2.6.1",
"jsdom": "27.4.0",
"marked": "17.0.1",
"node-html-markdown": "^2.0.0",
"node-html-markdown": "2.0.0",
"playwright": "1.57.0",
"prettier": "3.7.4",
"sanitize-html": "2.17.0",