diff --git a/.gitignore b/.gitignore
index 57dcfc8..6bc3537 100644
--- a/.gitignore
+++ b/.gitignore
@@ -75,3 +75,6 @@ my-app/
 
 # ROFL config (user-generated)
 rofl.yaml
+
+# Benchmark results
+benchmarks/**/results/
diff --git a/benchmarks/ama-bench/.env.example b/benchmarks/ama-bench/.env.example
new file mode 100644
index 0000000..ab945b1
--- /dev/null
+++ b/benchmarks/ama-bench/.env.example
@@ -0,0 +1,8 @@
+# Embedding API key (OpenRouter or OpenAI)
+API_KEY=your-api-key
+
+# AMA-Bench options (these reference files in AMA-Bench/configs/)
+# Available LLM configs: gpt-5.2.yaml, qwen3-32B.yaml
+LLM_CONFIG=gpt-5.2.yaml
+JUDGE_CONFIG=llm_judge.yaml
+SUBSET=openend
diff --git a/benchmarks/ama-bench/README.md b/benchmarks/ama-bench/README.md
new file mode 100644
index 0000000..9075f5b
--- /dev/null
+++ b/benchmarks/ama-bench/README.md
@@ -0,0 +1,244 @@
+# AMA-Bench Benchmark for @ekai/mindmap
+
+Benchmarks the `@ekai/mindmap` package against [AMA-Bench](https://github.com/ekailabs/AMA-Bench), an evaluation framework for Associative Memory Ability in AI agents.
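From the Python side, each benchmark episode reduces to a handful of HTTP calls against the bridge. Below is a minimal, hypothetical client sketch of that round-trip: the endpoint names (`/construct`, `/retrieve`, `/reset`) and payload fields (`episodeId`, `items`, `question`) follow the bridge server in this patch, but the helper names are invented for illustration; the real adapter is `src/method/contexto_method.py` in the AMA-Bench fork.

```python
"""Hypothetical sketch of one episode's round-trip against the bridge."""
import json
from urllib import request

BRIDGE_URL = "http://localhost:3456"  # default BRIDGE_PORT


def construct_payload(episode_id, trajectory):
    # /construct expects the episode id plus the trajectory items to index.
    return {
        "episodeId": episode_id,
        "items": [
            {"id": str(i), "role": t["role"], "content": t["content"]}
            for i, t in enumerate(trajectory)
        ],
    }


def retrieve_payload(episode_id, question):
    # /retrieve searches the mindmap built for this episode.
    return {"episodeId": episode_id, "question": question}


def post(path, payload):
    # Every bridge endpoint is a JSON-in / JSON-out POST.
    req = request.Request(
        f"{BRIDGE_URL}{path}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())


# Typical episode flow (requires a running bridge, so left commented here):
#   post("/construct", construct_payload("ep-1", trajectory))
#   ctx = post("/retrieve", retrieve_payload("ep-1", "What did Alice say?"))
#   post("/reset", {"episodeId": "ep-1"})
```

The returned `/retrieve` body carries a `context` string that the LLM prompt is built from, plus a `totalCandidates` count.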
+
+## Architecture
+
+```
+AMA-Bench (Python)                   Bridge Server (TypeScript)
+┌──────────────────────┐             ┌───────────────────────────┐
+│ run.py               │             │ server.ts                 │
+│ └─ ContextoMethod    │    HTTP     │ └─ @ekai/mindmap          │
+│    │                 │────────────▶│    ├─ mindmap.add()       │
+│    │ construct       │ /construct  │    │   (embed+cluster)    │
+│    │ retrieve        │ /retrieve   │    └─ mindmap.search()    │
+│    ▼                 │             │        (beam search)      │
+│ LLM generates answer │             │                           │
+│ Judge scores answer  │             │ reads configs/default.json│
+└──────────────────────┘             └───────────────────────────┘
+
+ekailabs/AMA-Bench repo                ekailabs/contexto repo
+  src/method/contexto_method.py          benchmarks/ama-bench/src/server.ts
+  configs/contexto.yaml                  benchmarks/ama-bench/configs/default.json
+```
+
+Two repos:
+- **[ekailabs/AMA-Bench](https://github.com/ekailabs/AMA-Bench)** — Python benchmark framework + `contexto` method (thin HTTP client)
+- **[ekailabs/contexto](https://github.com/ekailabs/contexto)** — Bridge server wrapping `@ekai/mindmap` + all config
+
+## Prerequisites
+
+- [Bun](https://bun.sh) >= 1.0
+- Python >= 3.9
+- pnpm
+- `huggingface-cli` (`pip install huggingface_hub`)
+- An API key for [OpenRouter](https://openrouter.ai) or OpenAI (used for embeddings + LLM)
+
+## Running Locally
+
+### 1. Clone both repos
+
+```bash
+git clone https://github.com/ekailabs/contexto.git
+git clone https://github.com/ekailabs/AMA-Bench.git
+```
+
+They should be siblings:
+
+```
+parent/
+├── contexto/
+└── AMA-Bench/
+```
+
+### 2. Install dependencies
+
+```bash
+# Install contexto workspace (includes the bridge)
+cd contexto
+pnpm install
+
+# Install AMA-Bench Python deps
+cd ../AMA-Bench
+pip install -r requirements.txt
+
+# Download the dataset
+huggingface-cli download AMA-bench/AMA-bench --repo-type dataset --local-dir dataset
+```
+
+Or run the setup script, which does all of the above:
+
+```bash
+cd contexto/benchmarks/ama-bench
+bash scripts/setup.sh
+```
+
+### 3. Configure the bridge
+
+Create `contexto/benchmarks/ama-bench/.env`:
+
+```bash
+# Embedding API key (used by the bridge for mindmap embeddings)
+API_KEY=your-openrouter-or-openai-key
+```
+
+Tune mindmap parameters in `contexto/benchmarks/ama-bench/configs/default.json`:
+
+```json
+{
+  "provider": "openrouter",
+  "embedModel": "openai/text-embedding-3-small",
+  "mindmap": {
+    "similarityThreshold": 0.5,
+    "maxDepth": 4,
+    "maxChildren": 10,
+    "rebuildInterval": 50
+  },
+  "search": {
+    "maxResults": 10,
+    "maxTokens": 4000,
+    "beamWidth": 3,
+    "minScore": 0.0
+  }
+}
+```
+
+### 4. Configure AMA-Bench LLM
+
+AMA-Bench needs an LLM config for answer generation and a judge config for scoring. Create these in `AMA-Bench/configs/`:
+
+```yaml
+# AMA-Bench/configs/openrouter.yaml
+provider: "openai"
+api_key: "your-openrouter-key"
+model: "openai/gpt-4o"
+base_url: "https://openrouter.ai/api/v1"
+max_tokens: 16000
+temperature: 0.0
+```
+
+```yaml
+# AMA-Bench/configs/llm_judge_openrouter.yaml
+provider: "openai"
+api_key: "your-openrouter-key"
+model: "openai/gpt-4o"
+base_url: "https://openrouter.ai/api/v1"
+max_tokens: 16000
+temperature: 0.0
+```
+
+### 5. Run
+
+```bash
+cd contexto/benchmarks/ama-bench
+bash scripts/run.sh
+```
+
+This will:
+1. Start the bridge server (reads `configs/default.json` + `API_KEY` from `.env`)
+2. Run AMA-Bench with the `contexto` method (208 episodes, ~35s each)
+3. Evaluate answers with the LLM judge
+4. Save results to `AMA-Bench/results/`
+5. Shut down the bridge
+
+Override defaults (config paths are resolved from the AMA-Bench directory, since `run.sh` changes into it before launching `run.py`):
+
+```bash
+LLM_CONFIG=configs/openrouter.yaml \
+JUDGE_CONFIG=configs/llm_judge_openrouter.yaml \
+SUBSET=openend \
+bash scripts/run.sh
+```
+
+### 6. Parameter sweep (optional)
+
+Grid-search over mindmap/search params to find the optimal config:
+
+```bash
+bash scripts/sweep.sh
+```
+
+Edit `configs/sweep.json` to change ranges. Results are saved to `results/sweep_<timestamp>/sweep_summary.csv`, ranked by accuracy.
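The summary CSV can be post-processed directly. As a sketch, ranking it by accuracy in Python (the column names come from `scripts/sweep.sh`; the sample rows below are invented for illustration):

```python
# Rank a sweep summary by accuracy, highest first,
# mirroring what `sort -t, -k6 -rn` does in scripts/sweep.sh.
import csv
import io

# Invented sample data in the sweep_summary.csv format.
sample = """similarityThreshold,maxDepth,beamWidth,minScore,maxResults,accuracy
0.5,4,3,0.0,10,0.61
0.3,4,5,0.1,20,0.58
0.65,6,3,0.0,10,0.64
"""


def rank(csv_text):
    # Parse rows as dicts keyed by the header, then sort numerically.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda r: float(r["accuracy"]), reverse=True)


best = rank(sample)[0]
print(f"best: similarityThreshold={best['similarityThreshold']} "
      f"accuracy={best['accuracy']}")
```

For a real run, replace `sample` with the contents of `results/sweep_<timestamp>/sweep_summary.csv`.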
+
+## Running with Docker Compose [WIP]
+
+No local Bun or Python needed. Everything runs in containers.
+
+```bash
+cd contexto/benchmarks/ama-bench/docker
+cp .env.example .env   # set API_KEY
+docker compose up --build
+```
+
+The `bridge` container runs the server; the `runner` image clones AMA-Bench and downloads the dataset at build time, then its container runs the benchmark.
+
+### CI
+
+```yaml
+- name: Run AMA-Bench
+  working-directory: benchmarks/ama-bench/docker
+  env:
+    API_KEY: ${{ secrets.API_KEY }}
+  run: docker compose up --build --abort-on-container-exit
+```
+
+## Configuration Reference
+
+### Tree construction (`mindmap`)
+
+| Parameter | Default | Description |
+|---|---|---|
+| `similarityThreshold` | 0.5 | Min cosine similarity to cluster items together |
+| `maxDepth` | 4 | Max tree nesting depth |
+| `maxChildren` | 10 | Max direct children per node |
+| `rebuildInterval` | 50 | Items added before a full tree rebuild |
+
+### Retrieval (`search`)
+
+| Parameter | Default | Description |
+|---|---|---|
+| `maxResults` | 10 | Max items returned |
+| `maxTokens` | 4000 | Token budget cap for results |
+| `beamWidth` | 3 | Branches explored per tree level |
+| `minScore` | 0.0 | Min cosine similarity to include a result |
+
+## Bridge API
+
+| Endpoint | Method | Description |
+|---|---|---|
+| `/health` | GET | Health check, returns `{ status, activeEpisodes }` |
+| `/construct` | POST | Add trajectory items to a mindmap instance |
+| `/retrieve` | POST | Search the mindmap for relevant context |
+| `/reset` | POST | Clear a mindmap instance for an episode |
+
+## File Structure
+
+```
+contexto/benchmarks/ama-bench/        # Bridge + config + scripts
+├── src/server.ts                     # Bridge server wrapping @ekai/mindmap
+├── package.json
+├── tsconfig.json
+├── .env                              # API_KEY (not committed)
+├── configs/
+│   ├── default.json                  # Mindmap + search parameters
+│   └── sweep.json                    # Parameter sweep ranges
+├── scripts/
+│   ├── setup.sh                      # One-time setup
+│   ├── run.sh                        # Run benchmark
+│   └── sweep.sh                      # Run parameter sweep
+├── docker/
+│   ├── docker-compose.yml
+│   ├── Dockerfile                    # AMA-Bench runner
+│   ├── bridge.Dockerfile             # Bridge server
+│   └── .env.example
+└── results/                          # Benchmark outputs (gitignored)
+
+AMA-Bench/                            # Fork of AMA-Bench
+├── src/method/contexto_method.py     # Python method adapter (thin HTTP client)
+├── configs/
+│   ├── contexto.yaml                 # Method config (bridge_url only)
+│   ├── openrouter.yaml               # LLM config for answer generation
+│   └── llm_judge_openrouter.yaml     # LLM config for judge scoring
+├── dataset/                          # Downloaded via huggingface-cli
+└── results/                          # Benchmark outputs
+```
diff --git a/benchmarks/ama-bench/configs/default.json b/benchmarks/ama-bench/configs/default.json
new file mode 100644
index 0000000..e631169
--- /dev/null
+++ b/benchmarks/ama-bench/configs/default.json
@@ -0,0 +1,16 @@
+{
+  "provider": "openrouter",
+  "embedModel": "openai/text-embedding-3-small",
+  "mindmap": {
+    "similarityThreshold": 0.5,
+    "maxDepth": 4,
+    "maxChildren": 10,
+    "rebuildInterval": 50
+  },
+  "search": {
+    "maxResults": 10,
+    "maxTokens": 4000,
+    "beamWidth": 3,
+    "minScore": 0.0
+  }
+}
diff --git a/benchmarks/ama-bench/configs/sweep.json b/benchmarks/ama-bench/configs/sweep.json
new file mode 100644
index 0000000..1ac4ff8
--- /dev/null
+++ b/benchmarks/ama-bench/configs/sweep.json
@@ -0,0 +1,7 @@
+{
+  "similarityThreshold": [0.3, 0.5, 0.65, 0.8],
+  "maxDepth": [3, 4, 6],
+  "beamWidth": [2, 3, 5],
+  "minScore": [0.0, 0.1, 0.3],
+  "maxResults": [5, 10, 20]
+}
diff --git a/benchmarks/ama-bench/docker/Dockerfile b/benchmarks/ama-bench/docker/Dockerfile
new file mode 100644
index 0000000..6229dff
--- /dev/null
+++ b/benchmarks/ama-bench/docker/Dockerfile
@@ -0,0 +1,22 @@
+FROM --platform=linux/amd64 python:3.11-slim
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    git curl make build-essential && \
+    rm -rf /var/lib/apt/lists/*
+
+WORKDIR /app
+
+# Clone AMA-Bench fork (contexto_method.py and configs already in repo)
+ARG AMA_BENCH_REPO=https://github.com/ekailabs/AMA-Bench.git
+RUN git clone ${AMA_BENCH_REPO} /app/AMA-Bench
+
+# Install Python dependencies
+RUN pip install --no-cache-dir -r /app/AMA-Bench/requirements.txt
+
+# Download dataset
+RUN pip install --no-cache-dir huggingface_hub && \
+    huggingface-cli download AMA-bench/AMA-bench --repo-type dataset --local-dir /app/AMA-Bench/dataset
+
+WORKDIR /app/AMA-Bench
+
+ENTRYPOINT ["python", "src/run.py"]
diff --git a/benchmarks/ama-bench/docker/bridge.Dockerfile b/benchmarks/ama-bench/docker/bridge.Dockerfile
new file mode 100644
index 0000000..aea0c9d
--- /dev/null
+++ b/benchmarks/ama-bench/docker/bridge.Dockerfile
@@ -0,0 +1,17 @@
+FROM oven/bun:1-slim
+
+# curl is used by the compose healthcheck and is not in the slim image
+RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /app
+
+COPY package.json ./
+COPY src/ ./src/
+
+# Install @ekai/mindmap from npm (swap workspace ref)
+RUN sed -i 's/"workspace:\*"/"latest"/' package.json && \
+    bun install
+
+EXPOSE 3456
+
+CMD ["bun", "src/server.ts"]
diff --git a/benchmarks/ama-bench/docker/docker-compose.yml b/benchmarks/ama-bench/docker/docker-compose.yml
new file mode 100644
index 0000000..c879954
--- /dev/null
+++ b/benchmarks/ama-bench/docker/docker-compose.yml
@@ -0,0 +1,44 @@
+services:
+  bridge:
+    build:
+      context: ..
+      dockerfile: docker/bridge.Dockerfile
+    ports:
+      - "3456:3456"
+    environment:
+      - BRIDGE_PORT=3456
+      - API_KEY=${API_KEY}
+    healthcheck:
+      test: ["CMD", "curl", "-sf", "http://localhost:3456/health"]
+      interval: 5s
+      timeout: 3s
+      retries: 10
+
+  runner:
+    build:
+      context: .
+      dockerfile: Dockerfile
+    depends_on:
+      bridge:
+        condition: service_healthy
+    environment:
+      - CONTEXTO_BRIDGE_URL=http://bridge:3456
+    command:
+      - --llm-server
+      - api
+      - --llm-config
+      - configs/${LLM_CONFIG:-gpt-5.2.yaml}
+      - --subset
+      - ${SUBSET:-openend}
+      - --method
+      - contexto
+      - --method-config
+      - configs/contexto.yaml
+      - --test-dir
+      - dataset/test
+      - --judge-config
+      - configs/${JUDGE_CONFIG:-llm_judge.yaml}
+      - --evaluate
+      - "True"
+    volumes:
+      - ../results:/app/AMA-Bench/results
diff --git a/benchmarks/ama-bench/package.json b/benchmarks/ama-bench/package.json
new file mode 100644
index 0000000..064598e
--- /dev/null
+++ b/benchmarks/ama-bench/package.json
@@ -0,0 +1,16 @@
+{
+  "name": "@ekai/ama-bench-bridge",
+  "version": "0.1.0",
+  "private": true,
+  "description": "HTTP bridge between AMA-Bench Python framework and @ekai/mindmap",
+  "type": "module",
+  "scripts": {
+    "start": "bun src/server.ts"
+  },
+  "dependencies": {
+    "@ekai/mindmap": "workspace:*"
+  },
+  "devDependencies": {
+    "@types/bun": "latest"
+  }
+}
diff --git a/benchmarks/ama-bench/scripts/run.sh b/benchmarks/ama-bench/scripts/run.sh
new file mode 100755
index 0000000..195014d
--- /dev/null
+++ b/benchmarks/ama-bench/scripts/run.sh
@@ -0,0 +1,75 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+BENCH_DIR="$(dirname "$SCRIPT_DIR")"
+AMA_BENCH="$(cd "$BENCH_DIR/../../.." && pwd)/AMA-Bench"
+BRIDGE_PORT="${BRIDGE_PORT:-3456}"
+
+# Load .env if present
+if [ -f "$BENCH_DIR/.env" ]; then
+  set -a
+  source "$BENCH_DIR/.env"
+  set +a
+fi
+
+# Validate required vars
+if [ -z "${API_KEY:-}" ]; then
+  echo "ERROR: API_KEY not set. Export it or add it to .env"
+  exit 1
+fi
+
+# Parse arguments (passed through to run.py)
+LLM_CONFIG="${LLM_CONFIG:-$AMA_BENCH/configs/openrouter.yaml}"
+SUBSET="${SUBSET:-openend}"
+JUDGE_CONFIG="${JUDGE_CONFIG:-$AMA_BENCH/configs/llm_judge_openrouter.yaml}"
+METHOD_CONFIG="${METHOD_CONFIG:-$AMA_BENCH/configs/contexto.yaml}"
+EXTRA_ARGS="${*:-}"
+
+echo "=== Running AMA-Bench with contexto method ==="
+echo "AMA-Bench dir: $AMA_BENCH"
+
+# Start bridge server in background
+echo "[1/3] Starting contexto bridge server on port $BRIDGE_PORT..."
+cd "$BENCH_DIR"
+API_KEY="$API_KEY" BRIDGE_PORT="$BRIDGE_PORT" bun src/server.ts &
+BRIDGE_PID=$!
+
+# Ensure cleanup on exit
+cleanup() {
+  echo ""
+  echo "Shutting down bridge server (PID: $BRIDGE_PID)..."
+  kill $BRIDGE_PID 2>/dev/null || true
+  wait $BRIDGE_PID 2>/dev/null || true
+}
+trap cleanup EXIT
+
+# Wait for bridge to be ready
+echo "Waiting for bridge server..."
+for i in $(seq 1 30); do
+  if curl -s "http://localhost:$BRIDGE_PORT/health" > /dev/null 2>&1; then
+    echo "Bridge server ready."
+    break
+  fi
+  if [ "$i" -eq 30 ]; then
+    echo "ERROR: Bridge server failed to start within 30 seconds."
+    exit 1
+  fi
+  sleep 1
+done
+
+# Run AMA-Bench
+echo "[2/3] Running AMA-Bench evaluation..."
+cd "$AMA_BENCH"
+python src/run.py \
+  --llm-server api \
+  --llm-config "$LLM_CONFIG" \
+  --subset "$SUBSET" \
+  --method contexto \
+  --method-config "$METHOD_CONFIG" \
+  --test-dir dataset/test \
+  --judge-config "$JUDGE_CONFIG" \
+  --evaluate True \
+  $EXTRA_ARGS
+
+echo "[3/3] Benchmark complete. Results saved in $AMA_BENCH/results/"
diff --git a/benchmarks/ama-bench/scripts/setup.sh b/benchmarks/ama-bench/scripts/setup.sh
new file mode 100755
index 0000000..5e5f6b7
--- /dev/null
+++ b/benchmarks/ama-bench/scripts/setup.sh
@@ -0,0 +1,42 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+BENCH_DIR="$(dirname "$SCRIPT_DIR")"
+AMA_BENCH="$(cd "$BENCH_DIR/../../.." && pwd)/AMA-Bench"
+
+echo "=== AMA-Bench Setup for Contexto ==="
+echo "AMA-Bench: $AMA_BENCH"
+
+# 1. Clone AMA-Bench (if not already present)
+if [ ! -d "$AMA_BENCH" ]; then
+  echo "[1/4] Cloning AMA-Bench..."
+  git clone https://github.com/ekailabs/AMA-Bench.git "$AMA_BENCH"
+else
+  echo "[1/4] AMA-Bench already exists, skipping clone."
+fi
+
+# 2. Install Python dependencies
+echo "[2/4] Installing Python dependencies..."
+cd "$AMA_BENCH"
+pip install -r requirements.txt
+cd ..
+
+# 3. Download dataset
+if [ ! -d "$AMA_BENCH/dataset" ]; then
+  echo "[3/4] Downloading AMA-Bench dataset..."
+  huggingface-cli download AMA-bench/AMA-bench --repo-type dataset --local-dir "$AMA_BENCH/dataset"
+else
+  echo "[3/4] Dataset already downloaded."
+fi
+
+# 4. Install bridge dependencies
+echo "[4/4] Installing bridge dependencies..."
+cd "$BENCH_DIR" && pnpm install
+
+echo ""
+echo "=== Setup complete ==="
+echo "Next steps:"
+echo "  1. Edit configs/default.json (provider, embedModel, mindmap params)"
+echo "  2. Export API_KEY=your-api-key"
+echo "  3. Run: bash scripts/run.sh"
diff --git a/benchmarks/ama-bench/scripts/sweep.sh b/benchmarks/ama-bench/scripts/sweep.sh
new file mode 100755
index 0000000..8f4da81
--- /dev/null
+++ b/benchmarks/ama-bench/scripts/sweep.sh
@@ -0,0 +1,131 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+BENCH_DIR="$(dirname "$SCRIPT_DIR")"
+AMA_BENCH="$(cd "$BENCH_DIR/../../.." && pwd)/AMA-Bench"
+BRIDGE_PORT="${BRIDGE_PORT:-3456}"
+SWEEP_CONFIG="${SWEEP_CONFIG:-$BENCH_DIR/configs/sweep.json}"
+DEFAULT_CONFIG="$BENCH_DIR/configs/default.json"
+TIMESTAMP=$(date +%Y%m%d_%H%M%S)
+RESULTS_DIR="$BENCH_DIR/results/sweep_$TIMESTAMP"
+
+LLM_CONFIG="${LLM_CONFIG:-$AMA_BENCH/configs/openrouter.yaml}"
+JUDGE_CONFIG="${JUDGE_CONFIG:-$AMA_BENCH/configs/llm_judge_openrouter.yaml}"
+
+echo "=== Contexto Parameter Sweep ==="
+echo "Sweep config: $SWEEP_CONFIG"
+echo "Results dir: $RESULTS_DIR"
+mkdir -p "$RESULTS_DIR"
+
+# Save original default.json to restore later
+cp "$DEFAULT_CONFIG" "$RESULTS_DIR/_default.json.bak"
+
+# Read param arrays from sweep config
+read -ra SIM_THRESHOLDS <<< "$(jq -r '.similarityThreshold | join(" ")' "$SWEEP_CONFIG")"
+read -ra MAX_DEPTHS <<< "$(jq -r '.maxDepth | join(" ")' "$SWEEP_CONFIG")"
+read -ra BEAM_WIDTHS <<< "$(jq -r '.beamWidth | join(" ")' "$SWEEP_CONFIG")"
+read -ra MIN_SCORES <<< "$(jq -r '.minScore | join(" ")' "$SWEEP_CONFIG")"
+read -ra MAX_RESULTS <<< "$(jq -r '.maxResults | join(" ")' "$SWEEP_CONFIG")"
+
+TOTAL=$(( ${#SIM_THRESHOLDS[@]} * ${#MAX_DEPTHS[@]} * ${#BEAM_WIDTHS[@]} * ${#MIN_SCORES[@]} * ${#MAX_RESULTS[@]} ))
+echo "Total configs: $TOTAL"
+
+# Ensure cleanup on exit
+BRIDGE_PID=""
+cleanup() {
+  echo ""
+  if [ -n "$BRIDGE_PID" ]; then
+    echo "Shutting down bridge server..."
+    kill $BRIDGE_PID 2>/dev/null || true
+    wait $BRIDGE_PID 2>/dev/null || true
+  fi
+  # Restore original config
+  cp "$RESULTS_DIR/_default.json.bak" "$DEFAULT_CONFIG"
+}
+trap cleanup EXIT
+
+# Run sweep
+echo "[1/2] Running parameter sweep..."
+COUNT=0
+SUMMARY="$RESULTS_DIR/sweep_summary.csv"
+echo "similarityThreshold,maxDepth,beamWidth,minScore,maxResults,accuracy" > "$SUMMARY"
+
+for st in "${SIM_THRESHOLDS[@]}"; do
+for md in "${MAX_DEPTHS[@]}"; do
+for bw in "${BEAM_WIDTHS[@]}"; do
+for ms in "${MIN_SCORES[@]}"; do
+for mr in "${MAX_RESULTS[@]}"; do
+  COUNT=$((COUNT + 1))
+  echo ""
+  echo "[$COUNT/$TOTAL] st=$st md=$md bw=$bw ms=$ms mr=$mr"
+
+  # Write config for this combo (fixed values mirror configs/default.json)
+  cat > "$DEFAULT_CONFIG" <<EOF
+{
+  "provider": "openrouter",
+  "embedModel": "openai/text-embedding-3-small",
+  "mindmap": {
+    "similarityThreshold": $st,
+    "maxDepth": $md,
+    "maxChildren": 10,
+    "rebuildInterval": 50
+  },
+  "search": {
+    "maxResults": $mr,
+    "maxTokens": 4000,
+    "beamWidth": $bw,
+    "minScore": $ms
+  }
+}
+EOF
+
+  # Restart the bridge so it picks up the new config
+  if [ -n "$BRIDGE_PID" ]; then
+    kill $BRIDGE_PID 2>/dev/null || true
+    wait $BRIDGE_PID 2>/dev/null || true
+  fi
+  cd "$BENCH_DIR"
+  BRIDGE_PORT=$BRIDGE_PORT bun src/server.ts &
+  BRIDGE_PID=$!
+
+  for i in $(seq 1 30); do
+    if curl -s "http://localhost:$BRIDGE_PORT/health" > /dev/null 2>&1; then
+      break
+    fi
+    [ "$i" -eq 30 ] && echo "ERROR: Bridge timeout" && exit 1
+    sleep 1
+  done
+
+  # Run benchmark
+  cd "$AMA_BENCH"
+  OUTPUT=$(python src/run.py \
+    --llm-server api \
+    --llm-config "$LLM_CONFIG" \
+    --subset openend \
+    --method contexto \
+    --method-config configs/contexto.yaml \
+    --test-dir dataset/test \
+    --judge-config "$JUDGE_CONFIG" \
+    --evaluate True 2>&1) || true
+
+  # Parse accuracy from output
+  ACCURACY=$(echo "$OUTPUT" | grep -i "overall" | grep -oE '[0-9]+\.[0-9]+' | head -1 || echo "0.0")
+  echo "  -> Accuracy: $ACCURACY"
+  echo "$st,$md,$bw,$ms,$mr,$ACCURACY" >> "$SUMMARY"
+
+done
+done
+done
+done
+done
+
+# Print ranked results (keep the header out of the numeric sort)
+echo ""
+echo "============================================================"
+echo "SWEEP RESULTS (ranked by accuracy)"
+echo "============================================================"
+{ head -1 "$SUMMARY"; tail -n +2 "$SUMMARY" | sort -t, -k6 -rn | head -10; }
+
+echo ""
+echo "[2/2] Sweep complete. Results: $SUMMARY"
diff --git a/benchmarks/ama-bench/src/server.ts b/benchmarks/ama-bench/src/server.ts
new file mode 100644
index 0000000..d321ec9
--- /dev/null
+++ b/benchmarks/ama-bench/src/server.ts
@@ -0,0 +1,129 @@
+import {
+  createMindmap,
+  memoryStorage,
+  type Mindmap,
+  type MindmapConfig,
+  type SearchOptions,
+} from '@ekai/mindmap';
+import { readFileSync } from 'fs';
+import { resolve, dirname } from 'path';
+import { fileURLToPath } from 'url';
+
+// --- Config ---
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const configPath = resolve(__dirname, '../configs/default.json');
+const config = JSON.parse(readFileSync(configPath, 'utf-8'));
+
+const provider = config.provider ?? 'openrouter';
+const embedModel = config.embedModel ?? 'openai/text-embedding-3-small';
+const apiKey = process.env.API_KEY ?? '';
+const mindmapConfig: Partial<MindmapConfig> = config.mindmap ?? {};
+const searchDefaults: SearchOptions = config.search ?? {};
+
+if (!apiKey) {
+  console.warn('[bridge] WARNING: API_KEY env var not set. Embedding calls will fail.');
+}
+
+// --- Types ---
+
+interface ConstructRequest {
+  episodeId: string;
+  items: Array<{ id: string; role: string; content: string; metadata?: Record<string, unknown> }>;
+}
+
+interface RetrieveRequest {
+  episodeId: string;
+  question: string;
+  searchOptions?: SearchOptions;
+}
+
+interface ResetRequest {
+  episodeId: string;
+}
+
+// --- State ---
+
+const mindmaps = new Map<string, Mindmap>();
+
+// --- Handlers ---
+
+async function handleConstruct(body: ConstructRequest) {
+  const { episodeId, items } = body;
+
+  mindmaps.delete(episodeId);
+
+  const mindmap = createMindmap({
+    provider,
+    apiKey,
+    embedModel,
+    storage: memoryStorage(),
+    config: mindmapConfig,
+  });
+
+  await mindmap.add(items);
+  mindmaps.set(episodeId, mindmap);
+
+  const state = await mindmap.getState();
+  return { success: true, totalItems: state.stats.totalItems };
+}
+
+async function handleRetrieve(body: RetrieveRequest) {
+  const { episodeId, question, searchOptions } = body;
+
+  const mindmap = mindmaps.get(episodeId);
+  if (!mindmap) {
+    throw new Error(`No mindmap found for episode ${episodeId}. Call /construct first.`);
+  }
+
+  const opts = { ...searchDefaults, ...searchOptions };
+  const result = await mindmap.search(question, opts);
+  const context = result.items.map((si) => si.item.content).join('\n\n');
+
+  return { context, totalCandidates: result.totalCandidates };
+}
+
+function handleReset(body: ResetRequest) {
+  mindmaps.delete(body.episodeId);
+  return { success: true };
+}
+
+// --- Server ---
+
+const PORT = parseInt(process.env.BRIDGE_PORT ?? '3456', 10);
+
+const server = Bun.serve({
+  port: PORT,
+  async fetch(req) {
+    const url = new URL(req.url);
+    const path = url.pathname;
+
+    if (req.method === 'GET' && path === '/health') {
+      return Response.json({ status: 'ok', activeEpisodes: mindmaps.size });
+    }
+
+    if (req.method !== 'POST') {
+      return Response.json({ error: 'Method not allowed' }, { status: 405 });
+    }
+
+    try {
+      const body = await req.json();
+
+      if (path === '/construct') {
+        return Response.json(await handleConstruct(body as ConstructRequest));
+      } else if (path === '/retrieve') {
+        return Response.json(await handleRetrieve(body as RetrieveRequest));
+      } else if (path === '/reset') {
+        return Response.json(handleReset(body as ResetRequest));
+      }
+
+      return Response.json({ error: 'Not found' }, { status: 404 });
+    } catch (err) {
+      const message = err instanceof Error ? err.message : 'Unknown error';
+      console.error(`[bridge] Error on ${path}:`, message);
+      return Response.json({ error: message }, { status: 500 });
+    }
+  },
+});
+
+console.log(`[bridge] Mindmap bridge server listening on http://localhost:${server.port}`);
diff --git a/benchmarks/ama-bench/tsconfig.json b/benchmarks/ama-bench/tsconfig.json
new file mode 100644
index 0000000..28579a2
--- /dev/null
+++ b/benchmarks/ama-bench/tsconfig.json
@@ -0,0 +1,12 @@
+{
+  "compilerOptions": {
+    "target": "ESNext",
+    "module": "ESNext",
+    "moduleResolution": "bundler",
+    "types": ["bun"],
+    "strict": true,
+    "skipLibCheck": true,
+    "noEmit": true
+  },
+  "include": ["src"]
+}
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index 63a8c86..fb87cad 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -39,6 +39,16 @@ importers:
         specifier: ^25.0.3
         version: 25.0.3(typescript@5.9.3)
 
+  benchmarks/ama-bench:
+    dependencies:
+      '@ekai/mindmap':
+        specifier: workspace:*
+        version: link:../../packages/mindmap
+    devDependencies:
+      '@types/bun':
+        specifier: latest
+        version: 1.3.12
+
   packages/contexto:
     dependencies:
       openclaw:
@@ -1779,6 +1789,9 @@ packages:
   '@types/body-parser@1.19.6':
     resolution: {integrity: sha512-HLFeCYgz89uk22N5Qg3dvGvsv46B8GLvKKo1zKG4NybA8U2DiEO3w9lqGg29t/tfLRJpJ6iQxnVw4OnB7MoM9g==}
 
+  '@types/bun@1.3.12':
+    resolution: {integrity: sha512-DBv81elK+/VSwXHDlnH3Qduw+KxkTIWi7TXkAeh24zpi5l0B2kUg9Ga3tb4nJaPcOFswflgi/yAvMVBPrxMB+A==}
+
   '@types/chai@5.2.3':
     resolution: {integrity: sha512-Mw558oeA9fFbv65/y4mHtXDs9bPnFMZAL/jxdPFUpOHHIXX91mcgEHbS5Lahr+pwZFR8A7GQleRWeI6cGFC2UA==}
 
@@ -2420,6 +2433,9 @@ packages:
   buffer@5.7.1:
     resolution: {integrity: sha512-EHcyIPBQ4BSGlvjB16k5KgAJ27CIsHY/2JBmCRReo48y9rQ3MaUzWX3KVlBa4U7MyX02HdVj0K7C3WaB3ju7FQ==}
 
+  bun-types@1.3.12:
+    resolution: {integrity: sha512-HqOLj5PoFajAQciOMRiIZGNoKxDJSr6qigAttOX40vJuSp6DN/CxWp9s3C1Xwm4oH7ybueITwiaOcWXoYVoRkA==}
+
   bytes@3.1.2:
     resolution: {integrity: sha512-/Nf7TyzTx6S3yRJObOAV7956r8cr2+Oj8AC5dt8wSP3BQAoeX58NoHyCU8P8zGkNXStjTSi6fzO6F0pBdcYbEg==}
     engines: {node: '>= 0.8'}
@@ -7946,6 +7962,10 @@ snapshots:
       '@types/connect': 3.4.38
       '@types/node': 20.19.37
 
+  '@types/bun@1.3.12':
+    dependencies:
+      bun-types: 1.3.12
+
   '@types/chai@5.2.3':
     dependencies:
       '@types/deep-eql': 4.0.2
 
@@ -8654,6 +8674,10 @@ snapshots:
       base64-js: 1.5.1
       ieee754: 1.2.1
 
+  bun-types@1.3.12:
+    dependencies:
+      '@types/node': 20.19.37
+
   bytes@3.1.2: {}
 
   cac@6.7.14: {}
diff --git a/pnpm-workspace.yaml b/pnpm-workspace.yaml
index eccc335..607f971 100644
--- a/pnpm-workspace.yaml
+++ b/pnpm-workspace.yaml
@@ -1,2 +1,3 @@
 packages:
-  - 'packages/**'
\ No newline at end of file
+  - 'packages/**'
+  - 'benchmarks/ama-bench'
\ No newline at end of file