From 47fa138f5cecfa53c5c144d009b90faa14b71879 Mon Sep 17 00:00:00 2001
From: Steve <1407088+stevepridemore@users.noreply.github.com>
Date: Sun, 10 May 2026 22:42:45 -0400
Subject: [PATCH 1/4] install: curl-pipeable primary/secondary device
 installers + GHCR image
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Refactors the install path from "clone the repo, npm install, build,
compose up, copy slash commands by hand" into two one-line installers
that work without cloning or building anything locally:

    curl -fsSL .../scripts/install-primary.sh | bash -s v0.3.0
    curl -fsSL .../scripts/install-secondary.sh | bash -s v0.3.0

Plus PowerShell mirrors (install-primary.ps1 / install-secondary.ps1) so
Windows users without bash can install without WSL/Git Bash.

What changed

- .github/workflows/release.yml — publishes the MCP image to GHCR on
  every v* tag push. CI keeps doing the existing typecheck + tests.
- Dockerfile is now multi-stage: builds dist/ inside the container so
  the GHCR image doesn't depend on a host-side `npm run build`. Bundles
  prompts/ and scripts/sync-dream-skill.py for the entrypoint.
- docker/entrypoint.sh — seeds ~/graph-memory/prompts/ from the baked-in
  copy on first run (idempotent), and generates a self-signed TLS cert
  at ~/graph-memory/certs/ if TLS_CERT is set but the file doesn't exist
  yet. Without the cert step a fresh install crashes immediately on
  startup with ENOENT trying to read server.crt — caught during Tier 1
  local testing.
- docker-compose.yml now defaults to the GHCR image
  (ghcr.io/stevepridemore/graph-memory-mcp:${MCP_IMAGE_TAG:-latest}).
  docker-compose.dev.yml is a small override that switches back to a
  local `build: .` for developers; opt-in via
  `docker compose -f docker-compose.yml -f docker-compose.dev.yml`.
- skills/ — vendors the 12 graph-memory-specific slash commands so they
  can be released alongside the server. Existing skills referenced
  ~/.claude/graph-memory/... (a path that doesn't exist; the real data
  root is ~/graph-memory/) and Windows-only ~/AppData paths. Fixed to
  use ~/graph-memory/.tmp/ for scratch and platform-agnostic install
  notes for ffmpeg etc.
- scripts/sync-dream-skill.py — new --prompts-dir flag lets the
  installer point at the entrypoint-seeded copy. New --os
  {auto,windows,unix} flag picks the correct path separator for the
  host OS (was hard-coded to Windows backslashes).
- scripts/test-install-local.sh — Tier-1 local test harness. Patches
  the install script to use file:// URLs, runs it under a sandboxed
  $HOME, and verifies the expected files exist with no personal-data
  leaks.
- README — Quick start replaced with three explicitly labeled install
  paths: Primary Device (runs the containers + dream + maintenance),
  Secondary Device (HTTP+OAuth client only), and Developer (from
  source). Each shows both bash and PowerShell command lines.
- docs/SCHEMA.md → docs/GRAPH_SCHEMA_REFERENCE.md so the deep-dive ref
  doesn't shadow the new agent-facing /GRAPH_SCHEMA.md at the root.
- .dockerignore added to keep the multi-stage build context lean.

Tier 1 local testing passed end-to-end: install script dry-run produces
all expected files in a sandbox, multi-stage build succeeds in 53s,
entrypoint seeds prompts + generates certs idempotently, docker compose
up brings the stack healthy in ~18s, /health returns 200 over HTTPS.
The TLS-cert crash was found and fixed before this commit.
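The sandboxed-$HOME pattern that scripts/test-install-local.sh relies on can be sketched in a few lines. This is illustrative only — the fake installer, file names, and leak check below are stand-ins, not the real harness:

```shell
#!/usr/bin/env bash
# Sketch of the Tier-1 technique: run an installer with HOME pointed at a
# throwaway directory, then assert on what it wrote. The installer here is
# a hypothetical stand-in, not the real install-primary.sh.
set -euo pipefail

SANDBOX="$(mktemp -d)"

# Stand-in installer: writes the data dir and an .env template, roughly
# mirroring the real script's first step.
cat > "$SANDBOX/fake-install.sh" <<'EOF'
#!/usr/bin/env bash
mkdir -p "$HOME/graph-memory"
printf 'NEO4J_PASSWORD=changeme\n' > "$HOME/graph-memory/.env"
EOF
chmod +x "$SANDBOX/fake-install.sh"

# Run under the sandboxed HOME so nothing touches the real home dir.
HOME="$SANDBOX" "$SANDBOX/fake-install.sh"

# Verify the expected files exist and no personal data (here: the current
# username) leaked into any of them.
test -f "$SANDBOX/graph-memory/.env"
! grep -rq "$(whoami)" "$SANDBOX/graph-memory"
echo "sandboxed install OK"
```

The real harness additionally rewrites the script's download URLs to file:// so the run needs no network, per the test-install-local.sh description above.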
--- .dockerignore | 35 +++ .github/workflows/release.yml | 41 ++++ CLAUDE.md | 16 +- Dockerfile | 34 ++- GRAPH_SCHEMA.md | 202 ++++++++++++++++++ README.md | 86 +++++--- docker-compose.dev.yml | 15 ++ docker-compose.yml | 7 +- docker/entrypoint.sh | 49 +++++ docs/ARCHITECTURE.md | 2 +- docs/{SCHEMA.md => GRAPH_SCHEMA_REFERENCE.md} | 0 scripts/install-primary.ps1 | 123 +++++++++++ scripts/install-primary.sh | 139 ++++++++++++ scripts/install-secondary.ps1 | 69 ++++++ scripts/install-secondary.sh | 83 +++++++ scripts/sync-dream-skill.py | 87 ++++++-- scripts/test-install-local.sh | 163 ++++++++++++++ skills/graph-ask/SKILL.md | 176 +++++++++++++++ skills/graph-backup/SKILL.md | 33 +++ skills/graph-boost/SKILL.md | 23 ++ skills/graph-bootstrap/SKILL.md | 55 +++++ skills/graph-briefing/SKILL.md | 45 ++++ skills/graph-capture/SKILL.md | 142 ++++++++++++ skills/graph-dream/SKILL.md | 35 +++ skills/graph-find/SKILL.md | 21 ++ skills/graph-stats/SKILL.md | 31 +++ skills/graph/SKILL.md | 31 +++ skills/ingest-audio/SKILL.md | 127 +++++++++++ skills/ingest/SKILL.md | 98 +++++++++ 29 files changed, 1907 insertions(+), 61 deletions(-) create mode 100644 .dockerignore create mode 100644 .github/workflows/release.yml create mode 100644 GRAPH_SCHEMA.md create mode 100644 docker-compose.dev.yml create mode 100755 docker/entrypoint.sh rename docs/{SCHEMA.md => GRAPH_SCHEMA_REFERENCE.md} (100%) create mode 100644 scripts/install-primary.ps1 create mode 100755 scripts/install-primary.sh create mode 100644 scripts/install-secondary.ps1 create mode 100755 scripts/install-secondary.sh create mode 100755 scripts/test-install-local.sh create mode 100644 skills/graph-ask/SKILL.md create mode 100644 skills/graph-backup/SKILL.md create mode 100644 skills/graph-boost/SKILL.md create mode 100644 skills/graph-bootstrap/SKILL.md create mode 100644 skills/graph-briefing/SKILL.md create mode 100644 skills/graph-capture/SKILL.md create mode 100644 skills/graph-dream/SKILL.md create mode 100644 
skills/graph-find/SKILL.md create mode 100644 skills/graph-stats/SKILL.md create mode 100644 skills/graph/SKILL.md create mode 100644 skills/ingest-audio/SKILL.md create mode 100644 skills/ingest/SKILL.md diff --git a/.dockerignore b/.dockerignore new file mode 100644 index 0000000..134d155 --- /dev/null +++ b/.dockerignore @@ -0,0 +1,35 @@ +# Keep the build context lean. The multi-stage Dockerfile only needs +# package*.json, tsconfig.json, src/, schema/, prompts/, and scripts/. + +node_modules +dist +coverage + +# Local state and secrets — never bake into the image +.env +.env.* +!.env.example +.mcp.json +.claude +.git +.github + +# Docs and internal — not needed at runtime +docs +*.md +!README.md +!CLAUDE.md +!GRAPH_SCHEMA.md + +# Local backup / data dirs (in case anyone keeps one inside the repo) +backups +graph-memory + +# Test output +test-results + +# Editor / OS +.DS_Store +Thumbs.db +.vscode +.idea diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml new file mode 100644 index 0000000..6b16eb3 --- /dev/null +++ b/.github/workflows/release.yml @@ -0,0 +1,41 @@ +name: Release + +# Publishes the graph-memory-mcp container image to GHCR on every v* tag push, +# tagged with both the version and `latest`. The image is what the +# install-primary.sh script pulls — no clone or local build required for users. + +on: + push: + tags: + - 'v*' + +permissions: + contents: read + packages: write + +jobs: + publish-image: + name: Publish container image + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v5 + + - uses: docker/setup-buildx-action@v3 + + - name: Log in to GHCR + uses: docker/login-action@v3 + with: + registry: ghcr.io + username: ${{ github.actor }} + password: ${{ secrets.GITHUB_TOKEN }} + + - name: Build and push + uses: docker/build-push-action@v6 + with: + context: . 
+ push: true + tags: | + ghcr.io/stevepridemore/graph-memory-mcp:${{ github.ref_name }} + ghcr.io/stevepridemore/graph-memory-mcp:latest + cache-from: type=gha + cache-to: type=gha,mode=max diff --git a/CLAUDE.md b/CLAUDE.md index 818ca78..356c70e 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -38,15 +38,13 @@ You MAY write to the graph during conversation for **high-confidence, explicit** | Mentioned in context but not stated directly | **don't write** — let dream handle at 0.3 | ### Use specific relationship types — not just RELATED_TO -| Relationship | When to use | -|-------------|-------------| -| WORKS_ON | Person → Project | -| USES / USES_TECH | Person/Project → Technology | -| KNOWS_ABOUT | Person → Concept/Technology | -| PREFERS | Person → Preference | -| DECIDED_FOR | Person/Project → Decision | -| PARTICIPATED_IN | Person → Event | -| RELATED_TO | Only when nothing else fits | + +The full vocabulary (node types and edge verbs) lives in +[`GRAPH_SCHEMA.md`](GRAPH_SCHEMA.md) at the project root. Read that +file before writing edges so you pick a specific verb (`ABOUT`, +`PART_OF`, `IMPLEMENTS`, `INSPIRED_BY`, `DEPENDS_ON`, etc.) over +generic `RELATED_TO`. `RELATED_TO` is the fallback only — use it when +no specific verb fits. 
### Always include provenance - `source_type`: `"conversation"` diff --git a/Dockerfile b/Dockerfile index 13eed7b..a20a87e 100644 --- a/Dockerfile +++ b/Dockerfile @@ -1,3 +1,15 @@ +# ---- Build stage: compile TypeScript --------------------------------------- +FROM node:22-bookworm-slim AS build + +WORKDIR /build + +COPY package*.json tsconfig.json ./ +RUN npm ci + +COPY src/ ./src/ +RUN npm run build + +# ---- Runtime stage --------------------------------------------------------- FROM node:22-bookworm-slim # Debian-slim is required (not alpine) because onnxruntime-node — used by @@ -6,19 +18,32 @@ FROM node:22-bookworm-slim WORKDIR /app -# wget for healthcheck, ca-certificates for HTTPS model download from huggingface.co +# wget for healthcheck, ca-certificates for HTTPS model download from huggingface.co, +# python3 for the prompt/skill path-substitution helper that the install script runs, +# openssl for self-signed TLS cert generation on first run (see docker/entrypoint.sh). RUN apt-get update && apt-get install -y --no-install-recommends \ - wget ca-certificates \ + wget ca-certificates python3 openssl \ && rm -rf /var/lib/apt/lists/* # Production dependencies only COPY package*.json ./ RUN npm ci --omit=dev && npm cache clean --force -# Pre-built JavaScript (run `npm run build` before `docker compose build`) -COPY dist/ ./dist/ +# Compiled JavaScript from the build stage +COPY --from=build /build/dist ./dist COPY schema/ ./schema/ +# Canonical prompts and helper scripts. The entrypoint copies prompts/ out to +# the host-mounted /root/graph-memory/prompts/ on first start so the user's +# scheduled tasks can read them from a stable host path. +COPY prompts/ ./prompts/ +COPY scripts/sync-dream-skill.py ./scripts/sync-dream-skill.py + +# Entrypoint shim: seeds the host data dir on first run, then exec's the +# server. Always runs (idempotent — no-op if the target already exists). 
+COPY docker/entrypoint.sh /usr/local/bin/entrypoint.sh +RUN chmod +x /usr/local/bin/entrypoint.sh + ENV MCP_TRANSPORT=http ENV MCP_PORT=3847 @@ -27,4 +52,5 @@ EXPOSE 3847 HEALTHCHECK --interval=30s --timeout=5s --start-period=30s \ CMD wget -qO- --no-check-certificate https://127.0.0.1:3847/health || exit 1 +ENTRYPOINT ["/usr/local/bin/entrypoint.sh"] CMD ["node", "dist/mcp-server/index.js"] diff --git a/GRAPH_SCHEMA.md b/GRAPH_SCHEMA.md new file mode 100644 index 0000000..2fb1775 --- /dev/null +++ b/GRAPH_SCHEMA.md @@ -0,0 +1,202 @@ +# Graph Schema (vocabulary) + +Concise reference for the node types and edge verbs used in the memory +graph. Optimized for the agent (Claude) to scan when writing edges, and +for humans skimming the project. + +For the full reference (weight math, decay functions, validity windows, +init Cypher, example queries) see +[`docs/GRAPH_SCHEMA_REFERENCE.md`](docs/GRAPH_SCHEMA_REFERENCE.md). + +## Conventions + +- Every node carries the `:Entity` label plus its specific type label. +- Edges are directed in storage; symmetric edges are stored once and + queried without an arrow (see "Directionality" below). +- Prefer a specific verb over `RELATED_TO`. `RELATED_TO` is the + fallback for "connected somehow but nothing else fits." +- All nodes share these common properties: `id`, `name`, `subtype`, + `confidence` (0.0–1.0), `times_mentioned`, `first_seen`, `last_seen`, + `tenant_id`, optionally `embedding` (384-dim). + +## Status legend + +- ✅ **In use** — appears in the current graph and is documented here. +- 🆕 **Proposed** — added during the 2026-05-10 vocabulary expansion. + Dream + in-conversation writes should start using these; a future + retyping campaign will backfill existing `RELATED_TO` edges. +- 🤔 **Consolidation candidate** — overlaps with another type and may + be merged in a future cleanup. Use the canonical sibling unless the + distinction is meaningful for your case. 
+ +## Node types + +| Type | Use when | Notes | +|---|---|---| +| `Person` ✅ | A human (user, colleague, family, contact) | Subtypes: `individual`, `contact`, `group` | +| `Organization` ✅ | A company, agency, team, or institution | E.g. FBBE, Anthropic | +| `Project` ✅ | A bounded body of work, initiative, or codebase | Subtypes: `active`, `paused`, `completed`, `abandoned` | +| `Feature` ✅ 🤔 | A sub-component of a Project | Overlaps with `Project`; use only when the feature has its own lifecycle worth tracking separately. Otherwise prefer Project + describe the feature in properties. | +| `Concept` ✅ | An abstract idea, pattern, framework, or technology category | The fallback for "named thing that isn't an instance" | +| `Technology` ✅ 🤔 | A specific tool, language, library, or platform | Overlaps with `Concept`. Currently a categorized Concept; the live graph uses both. Convention going forward: prefer `Technology` for concrete tech (React, Neo4j); `Concept` for abstractions (LLM-wiki-pattern, MVC). | +| `Decision` ✅ | A choice made or position taken | Often emitted by the dream extractor from "we decided…" / "I chose…" statements. | +| `Reasoning` ✅ | The why behind a Decision | Pairs with Decision via `LED_TO`; lighter than Decision itself. | +| `Preference` ✅ | A stated user preference or rule | Person `PREFERS` Preference. Has `domain`, `key`, `value` properties. | +| `Event` ✅ | A point-in-time happening | Meeting, milestone, release, incident | +| `Fact` ✅ | A standalone piece of knowledge | Description-heavy. Best paired with `ABOUT` to whatever it's a fact *about*. | +| `Artifact` ✅ | A created/authored output (doc, file, transcript, gist) | Subtype of Object — distinct because authorship matters. Pair with `AUTHORED` / `PRODUCED`. | +| `Object` ✅ 🤔 | A "thing in the world" that isn't covered by a more specific type | Heavy overlap with `Resource` and `Infrastructure`. 
The live graph leans on this as a catch-all; consider whether your case is really `Resource`, `Infrastructure`, or `Artifact` first. | +| `Resource` ✅ 🤔 | A consumable or referenceable thing | Overlaps with `Object`. Currently rare in graph (1 node). Candidate for merge into `Object` unless it earns its keep. | +| `Infrastructure` ✅ 🤔 | A server, host, network device, deployment target | Overlaps with `Object`. Currently rare (1 node). Candidate for merge into `Object` with `subtype: infrastructure`. | +| `Alias` ✅ | An alternate name pointing at a canonical entity | Used by alias resolution; rarely created directly. | + +**Consolidation summary:** `Object` / `Resource` / `Infrastructure` +overlap heavily — the latter two have only 1–2 nodes each. A future +cleanup can collapse them into `Object` with subtypes. `Concept` / +`Technology` is a softer overlap; the rule above (concrete = Technology, +abstract = Concept) keeps both useful. + +## Edge verbs + +Verbs are grouped by purpose. Direction notation: `A → B` means the +edge points from A to B. 
+ +### People & roles + +| Verb | Direction | Use when | +|---|---|---| +| `WORKS_ON` ✅ | Person → Project | Person actively contributes to a project | +| `WORKS_AT` ✅ | Person → Organization | Employment | +| `REPORTS_TO` ✅ | Person → Person | Org-chart reporting line | +| `STAKEHOLDER_IN` ✅ | Person → Project/Decision | Has interest but isn't owner | +| `KNOWS` 🆕 | Person ↔ Person | General acquaintance (symmetric) | +| `COLLABORATES_WITH` 🆕 | Person ↔ Person | Active working relationship (symmetric) | +| `FAMILY_OF` 🆕 | Person ↔ Person | Family tie; use a `role` property (`spouse`, `sibling`, `parent`) | +| `MENTOR_OF` 🆕 | Person → Person | Mentorship/teaching | + +### Knowledge, preferences, decisions + +| Verb | Direction | Use when | +|---|---|---| +| `KNOWS_ABOUT` ✅ | Person → Concept/Technology | Subject-matter familiarity | +| `PREFERS` ✅ | Person → Preference | Stated preference | +| `DECIDED_FOR` ✅ | Person/Project → Decision | Owns or made a decision | +| `LED_TO` ✅ | Event/Decision → Outcome | Causal arrow | +| `INVOLVED_IN` ✅ | Person → Event/Project | Lighter than `PARTICIPATED_IN` | + +### Tech & dependencies + +| Verb | Direction | Use when | +|---|---|---| +| `USES` ✅ | * → Tool/Object | General usage | +| `USES_TECH` ✅ | Project → Technology | Specifically a technology dependency | +| `DEPENDS_ON` ✅ | * → * | Functional dependency | +| `IMPLEMENTS` 🆕 | Project/Class → Concept/Interface | Concrete realization of a pattern; in code, "class implements interface" | +| `EXTENDS` 🆕 | Class → Class | Code-level class inheritance (Java `extends`) | +| `INSPIRED_BY` 🆕 | Project → Concept/Project | Origin/influence without inheritance | +| `BUILDS_ON` 🆕 | * → * | Direct extension at the idea level | +| `DERIVED_FROM` 🆕 | * → * | Descended from another (forks, extracted concepts) | + +### Composition & taxonomy + +| Verb | Direction | Use when | +|---|---|---| +| `CONTAINS` 🆕 | * → * | Composition (repo contains file; project contains feature) | +| 
`PART_OF` 🆕 | * → * | Inverse of `CONTAINS`; pick one direction per fact, don't store both | +| `INSTANCE_OF` 🆕 | Object → Concept | Concrete thing of an abstract category | +| `CATEGORIZED_AS` 🆕 | * → Concept | Lighter classification when `INSTANCE_OF` is too strong | + +### Reference & description (covers the biggest current `RELATED_TO` bucket) + +| Verb | Direction | Use when | +|---|---|---| +| `ABOUT` 🆕 | Fact/Artifact → * | The fact or document is *about* its subject | +| `DESCRIBES` 🆕 | Artifact → * | Artifact describes its subject (stronger authoring intent than `ABOUT`) | +| `DOCUMENTS` 🆕 | Artifact → Project/Object | Specifically reference documentation | +| `ATTRIBUTED_TO` 🆕 | Fact/Quote → Person | Source of a statement or observation | + +### Authorship, production, governance + +| Verb | Direction | Use when | +|---|---|---| +| `PRODUCED` ✅ | Person/Project → Artifact/Event | Authorship of an output | +| `AUTHORED` 🆕 | Person → Artifact | Wrote/created a doc, post, or code | +| `CREATED` 🆕 | Person → Project/Object | Brought something into existence (broader than `AUTHORED`) | +| `AFFECTS` 🆕 | Decision → Object/Project | A decision applies to or constrains a target | +| `GOVERNS` 🆕 | Decision/Policy → * | Stronger than `AFFECTS` — the target is *bound by* the decision | + +### Lifecycle & ordering + +| Verb | Direction | Use when | +|---|---|---| +| `SUPERSEDES` ✅ | * → * | Bi-temporal replacement (with `valid_at`) | +| `REPLACES` 🆕 | Object → Object | Successor; less formal than `SUPERSEDES` (no bi-temporal contract) | +| `DEPRECATED_BY` 🆕 | Object → Object | Inverse of `REPLACES`; the source is on its way out | +| `BLOCKS` 🆕 | Issue/Task → Issue/Task | Forward dependency | +| `BLOCKED_BY` 🆕 | Issue/Task → Issue/Task | Inverse of `BLOCKS` | +| `RESOLVED_BY` 🆕 | Issue/Fact → Decision/Event | Closes a contradiction or open question | + +### Place & runtime + +| Verb | Direction | Use when | +|---|---|---| +| `LOCATED_IN` 🆕 | Person/Object → Place | 
Geographic placement | +| `DEPLOYED_TO` 🆕 | Project → Infrastructure/Place | Runtime deployment target | + +### Events & temporal + +| Verb | Direction | Use when | +|---|---|---| +| `PARTICIPATED_IN` ✅ | Person → Event/Organization | Membership / past involvement | +| `OCCURRED_DURING` ✅ | Event → Event/Time | Temporal containment | +| `TRIGGERED_BY` ✅ | Event/Decision → Event/Cause | What caused this | + +### Identity & contradiction + +| Verb | Direction | Use when | +|---|---|---| +| `ALIAS_OF` ✅ | * ↔ * | Same thing, different name (kept un-merged); symmetric | +| `CONTRADICTS` ✅ | * ↔ * | Mutually exclusive facts surfaced for human review | +| `RELATED_TO` ✅ | * → * | **Fallback only** — use a specific verb if one fits | + +## Directionality + +Neo4j stores every relationship with a direction, but you can query +without one. Three patterns: + +1. **Symmetric verbs** (`KNOWS`, `COLLABORATES_WITH`, `FAMILY_OF`, + `ALIAS_OF`, `CONTRADICTS`, `RELATED_TO`): store one edge, query + without an arrow (`MATCH (a)-[r:KNOWS]-(b)`). Direction in storage + is meaningless. +2. **Inverse-pair verbs** (`CONTAINS`/`PART_OF`, `BLOCKS`/`BLOCKED_BY`, + `REPLACES`/`DEPRECATED_BY`): only store one direction per fact — + pick the canonical (typically active voice → forward arrow) and let + queries follow either. +3. **Asymmetric verbs** (`AUTHORED`, `EXTENDS`, `INSPIRED_BY`, + `WORKS_ON`, etc.): direction is meaningful and unique. Only one edge. + +Never store both directions of the same fact — it doubles storage and +the weights drift out of sync over time. + +## When to write to the graph during conversation + +See the rules in [`CLAUDE.md`](CLAUDE.md). Short version: + +- Write only for **high-confidence, explicit** statements (weight 0.7). +- Always include provenance: `source_type: "conversation"` and the + current `source_session` if available. +- Never write secrets (API keys, passwords, tokens). Note existence + only. 
+- Defer ambiguous or inferred context to the dream process at weight + 0.3. + +## Adding a new verb or node type + +1. Decide whether it's rare enough to warrant `RELATED_TO` (or + `Object`), or whether it deserves a name. +2. If it deserves a name, add it here with status 🆕, direction, and a + one-sentence "use when." +3. Update `~/.claude/GRAPH_SCHEMA.md` if the new type is genuinely + universal (applies in any project), not just this one. +4. After it sees real use across multiple sessions, change status to ✅. + diff --git a/README.md b/README.md index 1324fc1..d638b58 100644 --- a/README.md +++ b/README.md @@ -62,7 +62,7 @@ Every node and edge carries: - `embedding` (nodes) — 384-dim vector for semantic search - `valid_at` / `invalid_at` / `ingested_at` (edges) — bi-temporal tracking -Full schema in [`docs/SCHEMA.md`](docs/SCHEMA.md). +Concise vocabulary in [`GRAPH_SCHEMA.md`](GRAPH_SCHEMA.md). Full reference (weights, decay, validity windows, init Cypher) in [`docs/GRAPH_SCHEMA_REFERENCE.md`](docs/GRAPH_SCHEMA_REFERENCE.md). ## Tools @@ -93,38 +93,71 @@ Slash-command wrappers (`/graph`, `/graph-ask`, `/graph-search`, `/graph-stats`, - **[cloudflared](https://github.com/cloudflare/cloudflared)** + a Cloudflare account — only needed for the multi-device / claude.ai web setup described in [`docs/REMOTE.md`](docs/REMOTE.md). Local-only deployments don't need it. - **Python 3.10+** — required only by MarkItDown and by `scripts/sync-dream-skill.py`. -## Quick start (single-machine, local only) +## Install -For just running the graph on your laptop with stdio access from Claude Code: +graph-memory has exactly one "primary device" — the machine that **runs the two Docker containers** (Neo4j + the MCP server) **and** runs the nightly dream + weekly maintenance scheduled tasks. Every other device is a "secondary device" that talks to the primary over HTTPS + OAuth — secondaries don't run their own containers and don't run their own dream process. 
Pick the install path that matches the role of the device you're sitting at right now. + +### Install — Primary Device (this device runs the containers) + +Use this on the machine that will host Neo4j + the MCP server. This is also where the nightly dream and weekly maintenance scheduled tasks run, so the Claude Code transcripts you want extracted should live on this device. + +**Linux / macOS / Windows with Git Bash or WSL:** ```bash -git clone -cd graph-memory -cp .env.example .env -# Edit .env — set NEO4J_PASSWORD to anything ≥8 chars, -# GRAPH_MEMORY_HOME to your data root (e.g. C:\Users\you\graph-memory or ~/graph-memory), -# and CLAUDE_PROJECTS_DIR to your Claude transcripts folder -# (e.g. C:\Users\you\.claude\projects or ~/.claude/projects) -npm install -npm run build -docker compose up -d +curl -fsSL https://raw.githubusercontent.com/stevepridemore/graph-memory/v0.3.0/scripts/install-primary.sh \ + | bash -s v0.3.0 +# edit ~/graph-memory/.env (NEO4J_PASSWORD, GRAPH_MEMORY_HOME, CLAUDE_PROJECTS_DIR) +cd ~/graph-memory && docker compose up -d ``` -Then add to `.mcp.json` (project-local) or `~/.claude/.mcp.json` (global). A ready-to-copy template is included at [`.mcp.json.example`](.mcp.json.example): +**Windows PowerShell (no bash needed):** -```json -{ - "mcpServers": { - "graph-memory": { - "command": "docker", - "args": ["exec", "-i", "-e", "MCP_TRANSPORT=stdio", - "graph-memory-mcp", "node", "/app/dist/mcp-server/index.js"] - } - } -} +```powershell +$v = 'v0.3.0' +iwr "https://raw.githubusercontent.com/stevepridemore/graph-memory/$v/scripts/install-primary.ps1" -UseBasicParsing -OutFile $env:TEMP\gm-install.ps1 +& $env:TEMP\gm-install.ps1 -Version $v +# edit $HOME\graph-memory\.env +cd $HOME\graph-memory; docker compose up -d +``` + +Verify with `/graph-stats` in any Claude Code session. + +Optional: see [`docs/REMOTE.md`](docs/REMOTE.md) for the Cloudflare Tunnel + Access setup that lets secondary devices and claude.ai web reach this graph remotely. 
+ +### Install — Secondary Device (this device just talks to the primary) + +Use this on every additional laptop, work computer, or phone. No Docker, no Neo4j — just the slash commands and an MCP client config pointed at the primary device's Cloudflare Tunnel URL. The primary device must already have the tunnel set up per [`docs/REMOTE.md`](docs/REMOTE.md). + +**Linux / macOS / Windows with Git Bash or WSL:** + +```bash +curl -fsSL https://raw.githubusercontent.com/stevepridemore/graph-memory/v0.3.0/scripts/install-secondary.sh \ + | bash -s v0.3.0 your-tunnel-host.example.com +``` + +**Windows PowerShell (no bash needed):** + +```powershell +$v = 'v0.3.0' +iwr "https://raw.githubusercontent.com/stevepridemore/graph-memory/$v/scripts/install-secondary.ps1" -UseBasicParsing -OutFile $env:TEMP\gm-install.ps1 +& $env:TEMP\gm-install.ps1 -Version $v -TunnelHost your-tunnel-host.example.com +``` + +First `/graph-stats` call triggers the OAuth browser flow once; subsequent calls use the cached bearer token. + +### Install — Developer (build from source) + +Use this if you want to modify graph-memory itself. Requires Node 22+ and Docker. + +```bash +git clone https://github.com/stevepridemore/graph-memory +cd graph-memory +cp .env.example .env # edit as above +npm install && npm run build +docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d ``` -Verify with `/graph-stats` in any Claude Code conversation. +The `docker-compose.dev.yml` override switches the MCP service from the published GHCR image to a local `build: .` so your edits get picked up on rebuild. ## Multi-device / claude.ai web access @@ -206,7 +239,8 @@ Currently steady-state. 
Active development is opportunistic; the system runs una ## Documentation - [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) — system design, data flows, component responsibilities -- [`docs/SCHEMA.md`](docs/SCHEMA.md) — node types, relationship types, decay functions, weight semantics +- [`GRAPH_SCHEMA.md`](GRAPH_SCHEMA.md) — concise vocabulary (node types + edge verbs) for both agent and humans +- [`docs/GRAPH_SCHEMA_REFERENCE.md`](docs/GRAPH_SCHEMA_REFERENCE.md) — full reference (decay functions, weight semantics, validity windows, example queries) - [`docs/MCP_SERVER.md`](docs/MCP_SERVER.md) — every MCP tool with input/output schemas - [`docs/DREAM_PROCESS.md`](docs/DREAM_PROCESS.md) — extraction pipeline, manifest format, changelog structure - [`docs/SKILLS.md`](docs/SKILLS.md) — slash command definitions diff --git a/docker-compose.dev.yml b/docker-compose.dev.yml new file mode 100644 index 0000000..a1576d2 --- /dev/null +++ b/docker-compose.dev.yml @@ -0,0 +1,15 @@ +# Dev override: builds the graph-memory-mcp image locally from the working +# tree instead of pulling from GHCR. Use this when iterating on server code. +# +# Usage from a clone: +# docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d +# +# The build re-runs whenever you change source files. For a one-line dev +# command, you can alias this: +# alias dgm='docker compose -f docker-compose.yml -f docker-compose.dev.yml' +# dgm up -d --build + +services: + graph-memory-mcp: + build: . + image: graph-memory-mcp:dev diff --git a/docker-compose.yml b/docker-compose.yml index 5570a27..bce27c2 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -41,8 +41,11 @@ services: max-file: "5" graph-memory-mcp: - build: . - image: graph-memory-mcp:latest + # Production default: pull the pre-built image from GHCR. Pin a specific + # tag via MCP_IMAGE_TAG in .env (e.g. MCP_IMAGE_TAG=v0.3.0); defaults to + # `latest`. 
To build locally from source instead, use the dev override: + # docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d + image: ghcr.io/stevepridemore/graph-memory-mcp:${MCP_IMAGE_TAG:-latest} container_name: graph-memory-mcp restart: unless-stopped ports: diff --git a/docker/entrypoint.sh b/docker/entrypoint.sh new file mode 100755 index 0000000..b00844c --- /dev/null +++ b/docker/entrypoint.sh @@ -0,0 +1,49 @@ +#!/usr/bin/env bash +# Container entrypoint. Two first-run setup tasks, then exec's the MCP server: +# +# 1. Seed the host-mounted data dir with canonical prompts (idempotent — +# skipped if ~/graph-memory/prompts/ already exists). +# 2. Generate a self-signed TLS cert at ~/graph-memory/certs/ if the server +# was configured to use TLS (env TLS_CERT + TLS_KEY) but the files +# aren't there yet. The MCP server reads these synchronously and crashes +# hard if they're missing, so a fresh install otherwise fails to boot. +# +# To pick up updated prompts after upgrading the image, delete or move +# ~/graph-memory/prompts/ on the host before restarting. To rotate the +# self-signed cert, delete ~/graph-memory/certs/. + +set -euo pipefail + +DATA_PROMPTS=/root/graph-memory/prompts +BAKED_PROMPTS=/app/prompts + +if [ -d "$BAKED_PROMPTS" ] && [ ! -d "$DATA_PROMPTS" ]; then + echo "[entrypoint] seeding $DATA_PROMPTS from $BAKED_PROMPTS" + mkdir -p "$(dirname "$DATA_PROMPTS")" + cp -r "$BAKED_PROMPTS" "$DATA_PROMPTS" +fi + +# Self-signed TLS cert for the HTTP/MCP transport. Only generate if TLS_CERT +# is set (the server is asking for TLS) and the file doesn't exist yet. For +# Cloudflare Tunnel deployments the tunnel terminates TLS — but the upstream +# (this server) still listens on HTTPS, so the cert is required either way. +if [ -n "${TLS_CERT:-}" ] && [ ! 
-f "$TLS_CERT" ]; then + CERT_DIR="$(dirname "$TLS_CERT")" + KEY_FILE="${TLS_KEY:-${CERT_DIR}/server.key}" + echo "[entrypoint] generating self-signed TLS cert at $TLS_CERT" + mkdir -p "$CERT_DIR" + # 10-year self-signed cert with SAN for localhost — Cloudflare Tunnel + # doesn't validate the upstream cert, so SAN coverage doesn't matter + # operationally, but localhost is the smallest reasonable default. + openssl req -x509 -newkey rsa:2048 -nodes \ + -keyout "$KEY_FILE" \ + -out "$TLS_CERT" \ + -days 3650 \ + -subj "/CN=graph-memory-mcp" \ + -addext "subjectAltName=DNS:localhost,IP:127.0.0.1" \ + >/dev/null 2>&1 + chmod 600 "$KEY_FILE" "$TLS_CERT" + echo "[entrypoint] cert generated ($(openssl x509 -in "$TLS_CERT" -noout -enddate))" +fi + +exec "$@" diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 4f9b3c9..a80471a 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -394,7 +394,7 @@ Dream process extracts 8 entities and 15 relationships from a transcript ├── docker-compose.yml # Neo4j + MCP container definitions ├── docs/ │ ├── ARCHITECTURE.md # This file -│ ├── SCHEMA.md +│ ├── GRAPH_SCHEMA_REFERENCE.md │ ├── MCP_SERVER.md │ ├── DREAM_PROCESS.md │ ├── REMOTE.md # Cloudflare Tunnel + OAuth setup diff --git a/docs/SCHEMA.md b/docs/GRAPH_SCHEMA_REFERENCE.md similarity index 100% rename from docs/SCHEMA.md rename to docs/GRAPH_SCHEMA_REFERENCE.md diff --git a/scripts/install-primary.ps1 b/scripts/install-primary.ps1 new file mode 100644 index 0000000..6c9a8fc --- /dev/null +++ b/scripts/install-primary.ps1 @@ -0,0 +1,123 @@ +# graph-memory primary-device installer (PowerShell, for Windows users +# without bash/WSL/Git Bash). Mirrors scripts/install-primary.sh. +# +# Usage: +# iwr -UseBasicParsing https://raw.githubusercontent.com/stevepridemore/graph-memory/v0.3.0/scripts/install-primary.ps1 ` +# | iex +# NOTE: piping straight to `iex` only works without arguments.
For tagged installs: +# +# $v = 'v0.3.0' +# iwr "https://raw.githubusercontent.com/stevepridemore/graph-memory/$v/scripts/install-primary.ps1" -UseBasicParsing -OutFile $env:TEMP\gm-install.ps1 +# & $env:TEMP\gm-install.ps1 -Version $v + +param( + [Parameter(Mandatory=$true)] + [string]$Version +) + +$ErrorActionPreference = 'Stop' + +$Repo = 'stevepridemore/graph-memory' +$Raw = "https://raw.githubusercontent.com/$Repo/$Version" +if ($Version -eq 'latest') { + $Tarball = "https://github.com/$Repo/archive/refs/heads/main.zip" + $TarballPrefix = 'graph-memory-main' +} else { + $Tarball = "https://github.com/$Repo/archive/refs/tags/$Version.zip" + $TarballPrefix = "graph-memory-$($Version.TrimStart('v'))" +} + +Write-Host "[install-primary] graph-memory $Version" + +# 0. Pre-flight: Docker must be installed and the daemon must be running. +# Without it, `docker compose up` later will fail with a less helpful error. +if (-not (Get-Command docker -ErrorAction SilentlyContinue)) { + Write-Host '' + Write-Host '[install-primary] ERROR: docker is not installed on this device.' -ForegroundColor Red + Write-Host '' + Write-Host ' graph-memory runs Neo4j + the MCP server as Docker containers.' + Write-Host ' Install Docker Desktop for Windows from:' + Write-Host ' https://www.docker.com/products/docker-desktop/' + Write-Host ' Then re-run this installer.' + exit 1 +} + +# `docker info` exits non-zero (and writes to stderr) when the daemon is down, +# even if the CLI is installed. Suppress stderr noise; check exit code only. +$null = & docker info 2>&1 +if ($LASTEXITCODE -ne 0) { + Write-Host '' + Write-Host '[install-primary] ERROR: Docker is installed but the daemon is not running.' -ForegroundColor Red + Write-Host '' + Write-Host ' Start Docker Desktop and wait for the "Docker Desktop is running"' + Write-Host ' notification, then re-run this installer.' + exit 1 +} +Write-Host '[install-primary] docker: OK' + +# 1. 
Data directory + compose + env template +$DataDir = Join-Path $HOME 'graph-memory' +New-Item -ItemType Directory -Force -Path $DataDir | Out-Null +Write-Host "[install-primary] data dir: $DataDir" + +Invoke-WebRequest "$Raw/docker-compose.yml" -OutFile (Join-Path $DataDir 'docker-compose.yml') -UseBasicParsing +Write-Host "[install-primary] wrote docker-compose.yml" + +$EnvPath = Join-Path $DataDir '.env' +if (-not (Test-Path $EnvPath)) { + Invoke-WebRequest "$Raw/.env.example" -OutFile $EnvPath -UseBasicParsing + Write-Host "[install-primary] wrote .env (TEMPLATE -- edit before starting)" +} else { + Write-Host "[install-primary] kept existing .env" +} + +# 2. Slash commands -- extract skills/ from the release zip into ~/.claude/skills/ +$SkillsDir = Join-Path $HOME '.claude\skills' +New-Item -ItemType Directory -Force -Path $SkillsDir | Out-Null +$TmpZip = Join-Path $env:TEMP "graph-memory-$Version.zip" +$TmpExtract = Join-Path $env:TEMP "graph-memory-$Version-extract" +Invoke-WebRequest $Tarball -OutFile $TmpZip -UseBasicParsing +if (Test-Path $TmpExtract) { Remove-Item -Recurse -Force $TmpExtract } +Expand-Archive -Path $TmpZip -DestinationPath $TmpExtract -Force +$SrcSkills = Join-Path $TmpExtract "$TarballPrefix\skills" +if (Test-Path $SrcSkills) { + Copy-Item -Recurse -Force -Path (Join-Path $SrcSkills '*') -Destination $SkillsDir + $Count = (Get-ChildItem $SrcSkills -Directory).Count + Write-Host "[install-primary] installed slash commands to $SkillsDir ($Count skills)" +} else { + Write-Warning "skills/ not found in release tarball; skipping slash commands" +} +Remove-Item $TmpZip +Remove-Item -Recurse $TmpExtract + +# 3. 
MCP client config (stdio mode talking to the local docker container) +$ClaudeDir = Join-Path $HOME '.claude' +New-Item -ItemType Directory -Force -Path $ClaudeDir | Out-Null +$McpJson = Join-Path $ClaudeDir '.mcp.json' +if (-not (Test-Path $McpJson)) { + Invoke-WebRequest "$Raw/.mcp.json.example" -OutFile $McpJson -UseBasicParsing + Write-Host "[install-primary] wrote ~/.claude/.mcp.json" +} else { + Write-Host "[install-primary] kept existing ~/.claude/.mcp.json" +} + +Write-Host '' +Write-Host '[install-primary] next steps:' +Write-Host '' +Write-Host " 1. Edit ${EnvPath}:" +Write-Host ' - NEO4J_PASSWORD (>=8 chars)' +Write-Host " - GRAPH_MEMORY_HOME=$DataDir" +Write-Host " - CLAUDE_PROJECTS_DIR=$(Join-Path $HOME '.claude\projects')" +Write-Host '' +Write-Host ' 2. Start the containers:' +Write-Host " cd $DataDir; docker compose up -d" +Write-Host '' +Write-Host ' 3. (Optional) Install scheduled tasks:' +Write-Host ' docker exec graph-memory-mcp python3 /app/scripts/sync-dream-skill.py `' +Write-Host " --user-home '$HOME' --prompts-dir /root/graph-memory/prompts --os windows" +Write-Host '' +Write-Host ' 4. In any Claude Code session, run /graph-stats to verify.' +Write-Host '' +Write-Host ' For multi-device access (claude.ai web, secondary laptops): see docs/REMOTE.md' +Write-Host ' for the optional Cloudflare Tunnel setup. Without it, this install is' +Write-Host ' local-only -- accessible only from this device.' diff --git a/scripts/install-primary.sh b/scripts/install-primary.sh new file mode 100755 index 0000000..9f65867 --- /dev/null +++ b/scripts/install-primary.sh @@ -0,0 +1,139 @@ +#!/usr/bin/env bash +# graph-memory primary-device installer. +# +# Sets up graph-memory on this machine as the "primary device" — the one that +# runs the Neo4j + MCP Docker containers and (optionally) the nightly dream +# and weekly maintenance scheduled tasks. No git clone, no Node.js, no build +# step needed — pulls pre-built artifacts from the GitHub release. 
+# +# Usage: +# curl -fsSL https://raw.githubusercontent.com/stevepridemore/graph-memory//scripts/install-primary.sh \ +# | bash -s +# +# Example: +# curl -fsSL https://raw.githubusercontent.com/stevepridemore/graph-memory/v0.3.0/scripts/install-primary.sh \ +# | bash -s v0.3.0 +# +# Then: edit ~/graph-memory/.env, cd ~/graph-memory, docker compose up -d. + +set -euo pipefail + +VERSION="${1:-}" +if [ -z "$VERSION" ]; then + echo "usage: install-primary.sh (e.g. v0.3.0 or 'latest' for main)" >&2 + exit 2 +fi + +REPO="stevepridemore/graph-memory" +RAW="https://raw.githubusercontent.com/$REPO/$VERSION" +TARBALL="https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz" +# 'latest' isn't a tag — fall back to main branch tarball +if [ "$VERSION" = "latest" ]; then + TARBALL="https://github.com/$REPO/archive/refs/heads/main.tar.gz" + TARBALL_PREFIX="graph-memory-main" +else + TARBALL_PREFIX="graph-memory-${VERSION#v}" +fi + +echo "[install-primary] graph-memory $VERSION" + +# 0. Pre-flight: Docker must be installed and the daemon must be running. +# Without it, `docker compose up` later will fail with a less helpful error. +if ! command -v docker >/dev/null 2>&1; then + cat >&2 </dev/null 2>&1; then + cat >&2 <&2 +fi + +# 3. MCP client config (stdio mode talking to the local docker container) +mkdir -p "$HOME/.claude" +if [ ! -f "$HOME/.claude/.mcp.json" ]; then + curl -fsSL "$RAW/.mcp.json.example" -o "$HOME/.claude/.mcp.json" + echo "[install-primary] wrote ~/.claude/.mcp.json (stdio to graph-memory-mcp container)" +else + echo "[install-primary] kept existing ~/.claude/.mcp.json" +fi + +cat < $TunnelHost" + +# 1. 
Slash commands +$SkillsDir = Join-Path $HOME '.claude\skills' +New-Item -ItemType Directory -Force -Path $SkillsDir | Out-Null +$TmpZip = Join-Path $env:TEMP "graph-memory-$Version.zip" +$TmpExtract = Join-Path $env:TEMP "graph-memory-$Version-extract" +Invoke-WebRequest $Tarball -OutFile $TmpZip -UseBasicParsing +if (Test-Path $TmpExtract) { Remove-Item -Recurse -Force $TmpExtract } +Expand-Archive -Path $TmpZip -DestinationPath $TmpExtract -Force +$SrcSkills = Join-Path $TmpExtract "$TarballPrefix\skills" +if (Test-Path $SrcSkills) { + Copy-Item -Recurse -Force -Path (Join-Path $SrcSkills '*') -Destination $SkillsDir + $Count = (Get-ChildItem $SrcSkills -Directory).Count + Write-Host "[install-secondary] installed slash commands to $SkillsDir ($Count skills)" +} else { + Write-Warning 'skills/ not found in release tarball' +} +Remove-Item $TmpZip +Remove-Item -Recurse $TmpExtract + +# 2. MCP client config -- point at the primary's tunnel host +$ClaudeDir = Join-Path $HOME '.claude' +New-Item -ItemType Directory -Force -Path $ClaudeDir | Out-Null +$McpJson = Join-Path $ClaudeDir '.mcp.json' +if (Test-Path $McpJson) { + $Backup = "$McpJson.bak-$(Get-Date -UFormat %s)" + Copy-Item $McpJson $Backup + Write-Host "[install-secondary] existing .mcp.json backed up to $Backup" +} +$Template = Invoke-WebRequest "$Raw/.mcp.json.remote.example" -UseBasicParsing +$Body = $Template.Content -replace 'your-host\.example', $TunnelHost +Set-Content -Path $McpJson -Value $Body -Encoding UTF8 +Write-Host "[install-secondary] wrote ~/.claude/.mcp.json (HTTP+OAuth -> https://$TunnelHost/mcp)" + +Write-Host '' +Write-Host '[install-secondary] done.' +Write-Host '' +Write-Host ' In any Claude Code session, run /graph-stats. The first call will open' +Write-Host ' your browser to complete the OAuth flow with Cloudflare Access. After' +Write-Host ' that, the bearer token is cached and subsequent calls are silent.' 
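Both secondary installers end with the same templating step: back up any existing client config, then stamp the tunnel host into the downloaded template. A minimal Python sketch of that logic, assuming only what the scripts themselves show (the literal `your-host.example` placeholder comes from the `sed`/`-replace` calls; the function name and epoch-stamped backup suffix are illustrative):

```python
import time
from pathlib import Path


def write_remote_mcp_config(template_text: str, tunnel_host: str,
                            claude_dir: Path) -> Path:
    """Back up an existing .mcp.json, then write the host-substituted template."""
    claude_dir.mkdir(parents=True, exist_ok=True)
    dst = claude_dir / ".mcp.json"
    if dst.exists():
        # Same backup scheme as the installers: epoch-stamped .bak sibling.
        backup = dst.with_name(f".mcp.json.bak-{int(time.time())}")
        backup.write_bytes(dst.read_bytes())
    dst.write_text(template_text.replace("your-host.example", tunnel_host))
    return dst
```

Re-running the installer with a new host is therefore safe: the previous config survives as a `.bak-<epoch>` sibling rather than being overwritten silently.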
diff --git a/scripts/install-secondary.sh b/scripts/install-secondary.sh new file mode 100755 index 0000000..b6e6480 --- /dev/null +++ b/scripts/install-secondary.sh @@ -0,0 +1,83 @@ +#!/usr/bin/env bash +# graph-memory secondary-device installer. +# +# Sets up this device as a "secondary" — uses the graph remotely through the +# primary device's Cloudflare Tunnel. No Docker, no Neo4j, no source clone. +# Just installs the slash commands and writes an MCP client config pointed at +# the tunnel URL. +# +# The primary device must already have: +# - graph-memory running (see install-primary.sh) +# - Cloudflare Tunnel + OAuth set up per docs/REMOTE.md +# +# Usage: +# curl -fsSL https://raw.githubusercontent.com/stevepridemore/graph-memory/<version>/scripts/install-secondary.sh \ +# | bash -s <version> <tunnel-host> +# +# Example: +# curl -fsSL https://raw.githubusercontent.com/stevepridemore/graph-memory/v0.3.0/scripts/install-secondary.sh \ +# | bash -s v0.3.0 graph.example.com + +set -euo pipefail + +VERSION="${1:-}" +HOST="${2:-}" +if [ -z "$VERSION" ] || [ -z "$HOST" ]; then + echo "usage: install-secondary.sh <version> <tunnel-host>" >&2 + echo " e.g.: install-secondary.sh v0.3.0 graph.example.com" >&2 + exit 2 +fi + +REPO="stevepridemore/graph-memory" +RAW="https://raw.githubusercontent.com/$REPO/$VERSION" +TARBALL="https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz" +if [ "$VERSION" = "latest" ]; then + TARBALL="https://github.com/$REPO/archive/refs/heads/main.tar.gz" + TARBALL_PREFIX="graph-memory-main" +else + TARBALL_PREFIX="graph-memory-${VERSION#v}" +fi + +echo "[install-secondary] graph-memory $VERSION → $HOST" + +# 1.
Slash commands +mkdir -p "$HOME/.claude/skills" +TMP=$(mktemp -d) +trap 'rm -rf "$TMP"' EXIT +curl -fsSL "$TARBALL" -o "$TMP/repo.tar.gz" +tar -xzf "$TMP/repo.tar.gz" -C "$TMP" +SRC_SKILLS="$TMP/$TARBALL_PREFIX/skills" +if [ -d "$SRC_SKILLS" ]; then + cp -r "$SRC_SKILLS"/* "$HOME/.claude/skills/" + echo "[install-secondary] installed slash commands to ~/.claude/skills/ ($(ls "$SRC_SKILLS" | wc -l) skills)" +else + echo "[install-secondary] WARNING: skills/ not found in tarball" >&2 +fi + +# 2. MCP client config — point at the primary's tunnel host +mkdir -p "$HOME/.claude" +if [ -f "$HOME/.claude/.mcp.json" ]; then + BACKUP="$HOME/.claude/.mcp.json.bak-$(date +%s)" + cp "$HOME/.claude/.mcp.json" "$BACKUP" + echo "[install-secondary] existing .mcp.json backed up to $BACKUP" +fi +curl -fsSL "$RAW/.mcp.json.remote.example" \ + | sed "s|your-host.example|$HOST|g" \ + > "$HOME/.claude/.mcp.json" +echo "[install-secondary] wrote ~/.claude/.mcp.json (HTTP+OAuth → https://$HOST/mcp)" + +cat </SKILL.md, substituting absolute +Sync canonical scheduled-task prompts to live SKILL.md files under +~/.claude/scheduled-tasks//SKILL.md, substituting absolute Windows paths for the portable ~/ placeholders. Run after editing any canonical prompt. +Source prompt files are resolved relative to --prompts-dir (default +"prompts" — i.e. the repo's prompts/ when run from the repo root). The +installer for non-developer users passes --prompts-dir ~/graph-memory/prompts +so the substitution reads from the entrypoint-seeded copy on the host. + Currently syncs: - - prompts/dream-nightly.md → scheduled-tasks/nightly-graph-dream/SKILL.md - - prompts/weekly-maintenance.md → scheduled-tasks/weekly-graph-maintenance/SKILL.md + - /dream-nightly.md → scheduled-tasks/nightly-graph-dream/SKILL.md + - /weekly-maintenance.md → scheduled-tasks/weekly-graph-maintenance/SKILL.md Usage: python3 scripts/sync-dream-skill.py [--user-home ] [--task ] + [--prompts-dir ] Without --task, syncs all known tasks. 
With --task , syncs only that one. """ @@ -25,12 +31,13 @@ PATH_RE = re.compile(r"~/(graph-memory|\.claude/projects)/[\w./<>*-]*") -# Each entry: (task_id, source_prompt_path, frontmatter_block) +# Each entry: (task_id, source_prompt_filename, frontmatter_block) +# Filenames are resolved against --prompts-dir at runtime. # The frontmatter is what Claude Code's scheduled-task runner expects in SKILL.md. TASKS = [ ( "nightly-graph-dream", - "prompts/dream-nightly.md", + "dream-nightly.md", """--- name: nightly-graph-dream description: Nightly graph memory dream process — ingest transcripts and documents, update knowledge graph, run decay maintenance. Hook errors (check-pending.js MODULE_NOT_FOUND) at session start are expected and harmless in remote sessions. @@ -40,7 +47,7 @@ ), ( "weekly-graph-maintenance", - "prompts/weekly-maintenance.md", + "weekly-maintenance.md", """--- name: weekly-graph-maintenance description: Weekly graph memory maintenance — backup, health analysis, and prune (gated on backup success and a sanity check on prune count). @@ -51,23 +58,42 @@ ] -def winify(home_dir: str): - """Return a substitution callable that turns ~// into - \\\\ (Windows backslashes throughout).""" +def make_substituter(home_dir: str, use_backslashes: bool): + """Return a substitution callable that expands ~// to an + absolute path under home_dir, using the OS-appropriate separator. + + - On Windows, use_backslashes=True produces e.g. C:\\Users\\you\\graph-memory\\... + - On macOS/Linux, use_backslashes=False keeps forward slashes throughout + and produces e.g. /Users/you/graph-memory/... or /home/you/graph-memory/... 
+ """ + + sep = "\\" if use_backslashes else "/" def _sub(match): s = match.group(0) if s.startswith("~/graph-memory/"): - s = f"{home_dir}\\graph-memory\\" + s[len("~/graph-memory/") :] + rest = s[len("~/graph-memory/") :] + s = f"{home_dir}{sep}graph-memory{sep}" + rest elif s.startswith("~/.claude/projects/"): - s = f"{home_dir}\\.claude\\projects\\" + s[len("~/.claude/projects/") :] - return s.replace("/", "\\") + rest = s[len("~/.claude/projects/") :] + s = f"{home_dir}{sep}.claude{sep}projects{sep}" + rest + # Normalize internal separators + if use_backslashes: + return s.replace("/", "\\") + return s.replace("\\", "/") return _sub -def sync_one(task_id: str, src_path: str, frontmatter: str, user_home: str) -> int: - src = Path(src_path).resolve() +def sync_one( + task_id: str, + src_filename: str, + frontmatter: str, + user_home: str, + prompts_dir: Path, + use_backslashes: bool, +) -> int: + src = (prompts_dir / src_filename).resolve() dst = Path( os.path.join(user_home, ".claude", "scheduled-tasks", task_id, "SKILL.md") ) @@ -82,8 +108,8 @@ def sync_one(task_id: str, src_path: str, frontmatter: str, user_home: str) -> i lines = content.splitlines(keepends=True) body = "".join(lines[1:]).lstrip("\n") - # Substitute portable paths with absolute ones - body = PATH_RE.sub(winify(user_home), body) + # Substitute portable paths with absolute ones (OS-specific separators) + body = PATH_RE.sub(make_substituter(user_home, use_backslashes), body) dst.parent.mkdir(parents=True, exist_ok=True) # IMPORTANT: write LF line endings even on Windows. Claude Desktop's @@ -107,7 +133,23 @@ def main(): default=None, help="Sync only the named task (e.g. nightly-graph-dream, weekly-graph-maintenance). Default: all.", ) + parser.add_argument( + "--prompts-dir", + default="prompts", + help="Directory containing canonical prompt .md files. Default: 'prompts' (repo-relative). 
The installer passes '~/graph-memory/prompts' for end-user installs.", + ) + parser.add_argument( + "--os", + choices=["auto", "windows", "unix"], + default="auto", + help="Path separator style. 'auto' (default) detects from the runtime platform. Use 'windows' to force backslashes (e.g. when running this script inside Linux Docker on behalf of a Windows host) or 'unix' to force forward slashes.", + ) args = parser.parse_args() + prompts_dir = Path(os.path.expanduser(args.prompts_dir)).resolve() + if args.os == "auto": + use_backslashes = os.name == "nt" + else: + use_backslashes = args.os == "windows" targets = TASKS if args.task is None else [t for t in TASKS if t[0] == args.task] if args.task is not None and not targets: @@ -116,8 +158,15 @@ def main(): return 2 rc = 0 - for task_id, src_path, frontmatter in targets: - rc |= sync_one(task_id, src_path, frontmatter, args.user_home) + for task_id, src_filename, frontmatter in targets: + rc |= sync_one( + task_id, + src_filename, + frontmatter, + args.user_home, + prompts_dir, + use_backslashes, + ) return rc diff --git a/scripts/test-install-local.sh b/scripts/test-install-local.sh new file mode 100755 index 0000000..0411d0f --- /dev/null +++ b/scripts/test-install-local.sh @@ -0,0 +1,163 @@ +#!/usr/bin/env bash +# Tier-1 local test: runs install-primary.sh against the working tree without +# touching GitHub. Patches the install script's URLs to file:// and uses a +# sandboxed $HOME so it cannot disturb your real ~/.claude or ~/graph-memory. +# +# Run this INSIDE the test environment (WSL clone, Linux VM, etc.). The repo +# must be reachable via REPO_DIR (default: $PWD if it looks like the repo, +# else /mnt/c/Users//Documents/Projects/graph-memory). +# +# Usage: +# bash scripts/test-install-local.sh [REPO_DIR] + +set -euo pipefail + +REPO_DIR="${1:-${REPO_DIR:-$PWD}}" +if [ ! 
-f "$REPO_DIR/scripts/install-primary.sh" ]; then + echo "ERROR: scripts/install-primary.sh not found under $REPO_DIR" >&2 + echo "Pass the repo path: bash test-install-local.sh /path/to/graph-memory" >&2 + exit 2 +fi + +SANDBOX="$(mktemp -d -t gm-install-test.XXXXXX)" +FAKE_HOME="$SANDBOX/home" +TARBALL="$SANDBOX/release.tar.gz" +PATCHED="$SANDBOX/install-primary-patched.sh" +mkdir -p "$FAKE_HOME" + +echo "[test] repo dir : $REPO_DIR" +echo "[test] sandbox : $SANDBOX" +echo "[test] fake HOME: $FAKE_HOME" + +# 1. Build a release tarball from the working tree. Uses `tar` (not +# `git archive`) so we capture uncommitted files too — that's how we test +# new things before they're committed. Excludes .git, node_modules, dist, +# and coverage to match what GitHub's release tarball would actually ship. +TARBALL_PREFIX="graph-memory-0.3.0-test" +echo "[debug] before tar: ls $(dirname "$REPO_DIR")/$(basename "$REPO_DIR")/skills:" +ls "$(dirname "$REPO_DIR")/$(basename "$REPO_DIR")/skills" 2>&1 | head -5 +echo "[debug] direct: ls $REPO_DIR/skills:" +ls "$REPO_DIR/skills" 2>&1 | head -5 +( cd "$(dirname "$REPO_DIR")" && tar -czf "$TARBALL" \ + --transform "s|^$(basename "$REPO_DIR")|${TARBALL_PREFIX}|" \ + --exclude="$(basename "$REPO_DIR")/.git" \ + --exclude="$(basename "$REPO_DIR")/node_modules" \ + --exclude="$(basename "$REPO_DIR")/dist" \ + --exclude="$(basename "$REPO_DIR")/coverage" \ + --exclude="$(basename "$REPO_DIR")/docs/internal" \ + "$(basename "$REPO_DIR")" ) +echo "[test] tarball : $(du -h "$TARBALL" | cut -f1) at $TARBALL" +# Sanity: skills/ must be in the tarball or the verification will fail spuriously. +# Note: avoid `tar -tzf | grep -q` here. With `set -o pipefail`, grep -q's early +# exit triggers SIGPIPE on tar, the pipeline returns non-zero, and the check +# fires a false negative. Materialize the listing first, then grep on the file. +TARBALL_LIST="$SANDBOX/tarball.txt" +tar -tzf "$TARBALL" > "$TARBALL_LIST" +if ! 
grep -q "^${TARBALL_PREFIX}/skills/graph/SKILL.md$" "$TARBALL_LIST"; then + echo "ERROR: skills/ not in tarball — fix the build" >&2 + echo "[debug] looked for: ${TARBALL_PREFIX}/skills/graph/SKILL.md" >&2 + echo "[debug] skills entries actually present:" >&2 + grep "/skills/" "$TARBALL_LIST" | head -10 >&2 || echo "[debug] (none)" >&2 + exit 1 +fi + +# 2. Patch the install script to use file:// URLs for everything. +# - $RAW points at a directory served by file:// (curl supports that) +# - $TARBALL points at the local tar file +cp "$REPO_DIR/scripts/install-primary.sh" "$PATCHED" +sed -i \ + -e 's|https://raw.githubusercontent.com/$REPO/$VERSION|file://'"$REPO_DIR"'|g' \ + -e 's|https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz|file://'"$TARBALL"'|g' \ + -e 's|https://github.com/$REPO/archive/refs/heads/main.tar.gz|file://'"$TARBALL"'|g' \ + -e 's|graph-memory-${VERSION#v}|'"$TARBALL_PREFIX"'|g' \ + "$PATCHED" + +# 3. Drop the Docker pre-flight check for this test — we're only validating +# the file-layout half of the installer here, not docker compose up. +# (To do an end-to-end docker test, run without this neutering on a host +# that has Docker installed.) +if [ "${SKIP_DOCKER_CHECK:-1}" = "1" ]; then + sed -i 's|^if ! command -v docker.*|if false; \&\&|' "$PATCHED" + # The previous sed leaves the file syntactically broken; use a more careful + # patch instead — comment out the entire pre-flight block. 
+ cp "$REPO_DIR/scripts/install-primary.sh" "$PATCHED" + sed -i \ + -e 's|https://raw.githubusercontent.com/$REPO/$VERSION|file://'"$REPO_DIR"'|g' \ + -e 's|https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz|file://'"$TARBALL"'|g' \ + -e 's|https://github.com/$REPO/archive/refs/heads/main.tar.gz|file://'"$TARBALL"'|g' \ + -e 's|graph-memory-${VERSION#v}|'"$TARBALL_PREFIX"'|g' \ + "$PATCHED" + # Now neutralize the docker block by replacing the conditional with a true: + python3 - "$PATCHED" <<'PYEOF' +import re, sys, pathlib +p = pathlib.Path(sys.argv[1]) +t = p.read_text() +# Replace the whole pre-flight Docker block from `# 0. Pre-flight` up to the +# `echo "[install-primary] docker: OK"` line with a stubbed-out echo. +t = re.sub( + r"# 0\. Pre-flight.*?echo \"\[install-primary\] docker: OK\"\n", + 'echo "[install-primary] docker: SKIPPED (test mode)"\n', + t, + count=1, + flags=re.S, +) +p.write_text(t) +PYEOF +fi +chmod +x "$PATCHED" + +# 4. Run the patched installer with HOME redirected to the sandbox +echo +echo "[test] === running patched installer ===" +HOME="$FAKE_HOME" bash "$PATCHED" v0.3.0-test +INSTALL_RC=$? 
+ +echo +echo "[test] === verification ===" + +# Expected files +PASS=0 +FAIL=0 +check() { + if [ -e "$1" ]; then + echo " ✓ $1" + PASS=$((PASS+1)) + else + echo " ✗ MISSING: $1" + FAIL=$((FAIL+1)) + fi +} + +check "$FAKE_HOME/graph-memory/docker-compose.yml" +check "$FAKE_HOME/graph-memory/.env" +check "$FAKE_HOME/.claude/.mcp.json" +check "$FAKE_HOME/.claude/skills/graph/SKILL.md" +check "$FAKE_HOME/.claude/skills/graph-stats/SKILL.md" +check "$FAKE_HOME/.claude/skills/ingest/SKILL.md" + +# Skill content sanity — none of the personal-data leaks should be present +LEAKS=$(grep -rilE 'doublec|\bsprid\b|AppData' "$FAKE_HOME/.claude/skills/" 2>/dev/null || true) +if [ -z "$LEAKS" ]; then + echo " ✓ skills are clean of personal-data / AppData refs" + PASS=$((PASS+1)) +else + echo " ✗ skills contain leaks:" + echo "$LEAKS" | sed 's/^/ /' + FAIL=$((FAIL+1)) +fi + +# .env should be the template (still has placeholder values) +if grep -q "replace-with-a-strong-password" "$FAKE_HOME/graph-memory/.env"; then + echo " ✓ .env is template (user must edit before docker compose up)" + PASS=$((PASS+1)) +else + echo " ✗ .env doesn't look like the template" + FAIL=$((FAIL+1)) +fi + +echo +echo "[test] PASS=$PASS FAIL=$FAIL installer_rc=$INSTALL_RC" +echo "[test] sandbox kept for inspection at: $SANDBOX" +echo "[test] cleanup: rm -rf $SANDBOX" + +[ "$FAIL" -eq 0 ] && [ "$INSTALL_RC" -eq 0 ] diff --git a/skills/graph-ask/SKILL.md b/skills/graph-ask/SKILL.md new file mode 100644 index 0000000..f71a040 --- /dev/null +++ b/skills/graph-ask/SKILL.md @@ -0,0 +1,176 @@ +--- +name: graph-ask +description: Ask any natural language question about the memory graph. You generate Cypher directly and execute it. Use when the user has a complex or ad-hoc question that the standard graph tools don't cover. +argument-hint: [natural language question] +--- + +The user wants to ask a natural language question about the memory graph. 
+You will generate the Cypher query yourself and execute it via the graph_cypher MCP tool. + +Arguments: $ARGUMENTS + +## GRAPH SCHEMA + +All entity nodes carry both :Entity and their type label (e.g., :Entity:Person). +All entities have: id (STRING), name (STRING), subtype (STRING), confidence (FLOAT), +times_mentioned (INTEGER), first_seen (DATETIME), last_seen (DATETIME), source_file (STRING). + +Node labels (8 types): +- Person: role, relationship_to_user, organization, email +- Project: status, stack, description, directory, start_date +- Preference: domain, key, value, times_confirmed +- Concept: category, user_expertise, description +- Decision: what, why, context, reversible (BOOLEAN), status, decided_date +- Fact: domain, content, source, verified (BOOLEAN) +- Event: description, event_date (DATETIME), duration, outcome, location, status +- Object: object_type, description, status, url, version + +Relationship types (17 total, all have weight FLOAT and last_confirmed DATETIME): +- WORKS_ON (Person->Project): role, since +- PREFERS (Person->Preference): strength +- KNOWS_ABOUT (Person->Concept): depth +- DEPENDS_ON (Project->Project): dependency_type +- USES_TECH (Project->Concept): role +- DECIDED_FOR (Decision->any) +- SUPERSEDES (Decision->Decision): reason, superseded_date +- CONTRADICTS (any->any): description, detected_date, resolved (BOOLEAN), resolution +- RELATED_TO (any->any): relationship_type (similar_to, part_of, enables, impacts, depends_on, alternative_to, derived_from, implements, extends, configured_by) +- ALIAS_OF (any->any) +- PARTICIPATED_IN (Person->Event): role +- OCCURRED_DURING (Event->Project) +- PRODUCED (Event->Decision|Object|Fact) +- TRIGGERED_BY (Event->Event) +- USES (Project|Person->Object): purpose +- HOSTED_ON (Object->Object) +- PRODUCED_BY (Object->Project|Event) + +--- + +## QUERY TEMPLATES + +Use these tested templates for common question patterns. Fill in the `$param` values and adjust as needed. 
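As a concrete illustration, answering "what do you know about neo4j?" with T1 below comes down to one parameterized tool call. A hedged sketch (the exact `graph_cypher` argument shape is an assumption, not documented here):

```python
# Hypothetical payload builder for a graph_cypher call using template T1.
# Keeping $name as a bound parameter, instead of splicing the user's text
# into the query string, avoids quoting bugs and Cypher injection.
T1_QUERY = (
    "MATCH (n:Entity)-[r]-(m:Entity) "
    "WHERE toLower(n.name) CONTAINS toLower($name) AND r.weight > 0.3 "
    "RETURN n.name AS entity, type(r) AS relation, m.name AS connected, "
    "m.type AS connected_type, r.weight AS weight "
    "ORDER BY r.weight DESC LIMIT 50"
)


def t1_call(entity_name: str) -> dict:
    """Build the (assumed) argument payload for the graph_cypher MCP tool."""
    return {"query": T1_QUERY, "params": {"name": entity_name}}
```

The same pattern applies to every template: substitute only structural pieces (labels, limits) into the query text and pass user-supplied values as parameters.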
+ +### T1 — Find everything connected to an entity +```cypher +MATCH (n:Entity)-[r]-(m:Entity) +WHERE toLower(n.name) CONTAINS toLower($name) + AND r.weight > 0.3 +RETURN n.name AS entity, type(r) AS relation, m.name AS connected, m.type AS connected_type, r.weight AS weight +ORDER BY r.weight DESC LIMIT 50 +``` +*Use for: "what do you know about X?", "tell me about Y"* + +### T2 — List entities of a type +```cypher +MATCH (n:Entity:$Type) +WHERE n.confidence > 0.3 +RETURN n.name AS name, n.confidence AS confidence, n.subtype AS subtype +ORDER BY n.confidence DESC, n.times_mentioned DESC LIMIT 50 +``` +*Replace `$Type` with Person, Project, Concept, Decision, Fact, Event, Object, or Preference.* +*Use for: "list all projects", "what people do I know?", "show my decisions"* + +### T3 — Path between two entities +```cypher +MATCH path = shortestPath((a:Entity)-[*..4]-(b:Entity)) +WHERE toLower(a.name) CONTAINS toLower($nameA) + AND toLower(b.name) CONTAINS toLower($nameB) +RETURN [n IN nodes(path) | n.name] AS path_nodes, + [r IN relationships(path) | type(r)] AS relations +LIMIT 5 +``` +*Use for: "how is X related to Y?", "connect X and Y"* + +### T4 — Recent entities (added in last N days) +```cypher +MATCH (n:Entity) +WHERE n.first_seen > datetime() - duration({days: $days}) +RETURN n.name AS name, n.type AS type, n.confidence AS confidence, n.first_seen AS added +ORDER BY n.first_seen DESC LIMIT 50 +``` +*Use for: "what did you learn recently?", "new entities this week"* + +### T5 — Strong preferences and decisions +```cypher +MATCH (p:Entity:Person)-[r:PREFERS]->(pref:Entity:Preference) +WHERE r.weight > 0.5 +RETURN p.name AS person, pref.name AS preference, pref.properties.value AS value, r.weight AS strength +ORDER BY r.weight DESC LIMIT 30 + +UNION + +MATCH (d:Entity:Decision) +WHERE d.confidence > 0.5 +RETURN "Decision" AS person, d.name AS preference, d.properties.what AS value, d.confidence AS strength +ORDER BY d.confidence DESC LIMIT 20 +``` +*Use for: 
"what are my preferences?", "what decisions have been made?"* + +### T6 — Technologies used by a project +```cypher +MATCH (proj:Entity:Project)-[r:USES_TECH]->(tech:Entity) +WHERE toLower(proj.name) CONTAINS toLower($projectName) + AND r.weight > 0.2 +RETURN proj.name AS project, tech.name AS technology, tech.type AS tech_type, r.weight AS weight +ORDER BY r.weight DESC LIMIT 30 +``` +*Use for: "what tech does X use?", "stack for project Y"* + +### T7 — Contradictions +```cypher +MATCH (a:Entity)-[r:CONTRADICTS]->(b:Entity) +WHERE r.resolved = false +RETURN a.name AS entity_a, b.name AS entity_b, r.properties.description AS conflict, r.properties.detected_date AS detected +ORDER BY r.properties.detected_date DESC LIMIT 20 +``` +*Use for: "what contradictions exist?", "conflicting facts"* + +### T8 — Entities by source session +```cypher +MATCH (n:Entity)-[r]-() +WHERE r.source_session = $sessionId +RETURN DISTINCT n.name AS name, n.type AS type, n.confidence AS confidence +ORDER BY n.confidence DESC LIMIT 50 +``` +*Use for: "what was extracted from session X?", "show last dream output"* + +### T9 — Weakest / stale entities (candidates for pruning) +```cypher +MATCH (n:Entity) +WHERE n.confidence < 0.25 + AND n.last_seen < datetime() - duration({days: 60}) +RETURN n.name AS name, n.type AS type, n.confidence AS confidence, n.last_seen AS last_seen +ORDER BY n.confidence ASC LIMIT 30 +``` +*Use for: "what might be pruned?", "stale graph entries"* + +### T10 — Full-text search across names and properties +```cypher +MATCH (n:Entity) +WHERE toLower(n.name) CONTAINS toLower($term) + OR toLower(toString(n.properties)) CONTAINS toLower($term) +RETURN n.name AS name, n.type AS type, n.confidence AS confidence +ORDER BY n.confidence DESC LIMIT 50 +``` +*Use for: broad keyword searches when you're not sure of the entity name* + +--- + +## Steps + +1. Match the user's question to the closest template above (or compose from scratch if none fit) +2. 
Fill in parameters; adjust filters if the user asks for weak/all connections +3. Call `graph_cypher` with the query +4. Present results: + - Show the Cypher in a code block + - Show results as a readable table or list + - If empty: suggest a broader term or related template +5. If the query fails: read the error, fix the Cypher, and retry once + +## Rules +- Only read-only Cypher (MATCH / RETURN / WITH / WHERE / ORDER BY / LIMIT / SKIP / UNWIND) +- Default LIMIT 50 unless user asks for more +- Default weight > 0.3 filter unless user asks for weak or all connections +- Always show the Cypher you generated + +If no question is provided, ask the user what they'd like to know. diff --git a/skills/graph-backup/SKILL.md b/skills/graph-backup/SKILL.md new file mode 100644 index 0000000..55cb0c5 --- /dev/null +++ b/skills/graph-backup/SKILL.md @@ -0,0 +1,33 @@ +--- +name: graph-backup +description: Export the memory graph to a timestamped JSONL backup file. Use before risky operations or on demand. +triggers: + - /graph-backup +--- + +# Graph Backup + +Export the memory graph to a backup file. + +## Steps + +1. Call `graph_export` with no arguments (or pass `label` for a named backup, e.g. `label: "pre-prune"`). + +2. Report the result to the user: + - Backup file path + - Node and edge counts + - File size + - How many old backups were pruned + - How many are retained + +3. If the export fails, report the error clearly and suggest checking whether the Docker container is running (`docker ps`). + +## Example output + +``` +Backup complete: + File: /root/graph-memory/backups/backup-2026-05-05T22-00-00.jsonl + 297 nodes, 418 edges + 42 KB + Retained: 7 backups, pruned: 0 +``` diff --git a/skills/graph-boost/SKILL.md b/skills/graph-boost/SKILL.md new file mode 100644 index 0000000..cb91b2a --- /dev/null +++ b/skills/graph-boost/SKILL.md @@ -0,0 +1,23 @@ +--- +name: graph-boost +description: Reinforce or weaken a specific relationship in the memory graph. 
Use when the user wants to manually adjust edge weights. +argument-hint: [weaken] [entity-from] [entity-to] [relation-type] [--reason "why"] +--- + +The user wants to manually adjust an edge weight in the memory graph. + +Arguments: $ARGUMENTS + +Parse the arguments: +- If first argument is "weaken", this is a weaken operation. Otherwise it's a boost. +- Next two positional arguments are the from and to entity names +- Next positional argument is the relationship type (WORKS_ON, PREFERS, KNOWS_ABOUT, etc.) +- `--reason` optional: why the adjustment is being made + +Steps: +1. For boost: call `graph_boost` MCP tool with from_name, to_name, relation, and reason +2. For weaken: call `graph_weaken` MCP tool with from_name, to_name, relation, and reason +3. Report the previous and new weight +4. If the edge wasn't found, suggest using `/graph` to find the correct entity names + +If insufficient arguments, ask the user to specify the entities and relationship type. diff --git a/skills/graph-bootstrap/SKILL.md b/skills/graph-bootstrap/SKILL.md new file mode 100644 index 0000000..8cef378 --- /dev/null +++ b/skills/graph-bootstrap/SKILL.md @@ -0,0 +1,55 @@ +--- +name: graph-bootstrap +description: Run the one-time bootstrap to populate the memory graph from existing memory files and conversation transcripts. Use when first setting up graph-memory or to catch up on historical data. +argument-hint: [--memory-only] [--transcripts-only] [--dry-run] +--- + +The user wants to run the graph memory bootstrap process. This is a one-time bulk import +of existing knowledge into the graph. You ARE the bootstrap process -- run it inline. + +Arguments: $ARGUMENTS + +Parse the arguments: +- `--memory-only` optional: only process memory .md files, skip transcripts +- `--transcripts-only` optional: only process conversation transcripts, skip memory files +- `--dry-run` optional: describe what you would extract without calling graph write tools + +Steps: +1. 
Check for lock file at ~/graph-memory/processed/dream.lock + - If it exists and was created less than 2 hours ago, report "Dream/bootstrap process already running" and exit + - Otherwise, create the lock file with: {"pid": 0, "timestamp": "<current ISO-8601 timestamp>", "source": "manual-bootstrap"} +2. Read ~/graph-memory/config.json for parameters (defaults: chunk_size_lines=500, max_transcripts_per_run=10) + +3. Process memory files (unless --transcripts-only): + a. Find all .md files in ~/.claude/projects/*/memory/ (skip MEMORY.md index files) + b. Read each file, parse YAML frontmatter (name, description, type) + c. Extract entities and relationships based on content + d. Check existing entities via graph_entities before creating (boost if exists) + e. Call graph_relate in batch mode with source_type: "memory-file" + f. Use specific relationship types (WORKS_ON, USES_TECH, KNOWS_ABOUT, PREFERS, etc.) + +4. Process conversation transcripts (unless --memory-only): + a. Read manifest.json, find unprocessed JSONL files in ~/.claude/projects/ + b. For each transcript (oldest first, up to max_transcripts_per_run): + - Validate format (check first 5 lines for expected fields) + - Read content (chunk large files at 500 lines) + - Extract entities from user and assistant text blocks only + - Call graph_relate in batch mode with source_type: "conversation" + - **Update manifest.json immediately** after each transcript + c. If max_transcripts_per_run reached, note remaining count + +5. Call graph_decay to apply time-based maintenance + +6. Write changelog to ~/graph-memory/logs/bootstrap-YYYY-MM-DD.md + +7. Delete the lock file at ~/graph-memory/processed/dream.lock + +8. Report summary: memory files processed, transcripts processed, entities created/updated, edges created + +If --dry-run, describe what you would extract but do NOT call any graph write tools. Skip the lock. + +IMPORTANT: Never merge entities unless highly confident. Flag suspected duplicates in the changelog.
+IMPORTANT: Always delete the lock file when done, even if you encountered errors. +IMPORTANT: NEVER extract API keys, passwords, tokens, secrets, or credentials into graph entities. +IMPORTANT: Use specific relationship types -- not just RELATED_TO for everything. +IMPORTANT: Memory files are higher signal than transcripts -- process them first. diff --git a/skills/graph-briefing/SKILL.md b/skills/graph-briefing/SKILL.md new file mode 100644 index 0000000..1b9e7d6 --- /dev/null +++ b/skills/graph-briefing/SKILL.md @@ -0,0 +1,45 @@ +--- +name: graph-briefing +description: Generate a session briefing from the memory graph — recent changes, unresolved contradictions, relevant context for the current project. Use at the start of a session to catch up, or when switching projects. +argument-hint: [--full] +--- + +Generate a structured session briefing from the memory graph. + +Arguments: $ARGUMENTS + +Steps: +1. Call `graph_stats` to get overall health and last dream run timestamp +2. Call `graph_query` with project_context set to the current working directory, context_level: "minimal" + to get entities relevant to the current project +3. Call `graph_contradictions` to find any unresolved conflicts +4. 
If --full flag: also call `graph_ingest` with action: "status" to check pending ingests, + and call `graph_query` with current_only: false to show recently superseded facts + +Present the briefing in this format: + +## Session Briefing + +**Project:** [current project name from directory] +**Last dream run:** [timestamp] ([how long ago]) +**Graph:** [node count] entities, [edge count] relationships + +### What's Changed Recently +- [List entities/edges updated since last session, if any] +- [Recently superseded facts — what changed and when] + +### Unresolved Contradictions +- [Any CONTRADICTS edges with resolved=false] +- [Or "None — graph is consistent"] + +### Relevant to Current Project +- [Top entities connected to active project, by effective weight] +- [Recent events related to this project] +- [Key decisions still active for this project] + +### Pending +- [Unprocessed transcripts count] +- [Pending ingest documents] +- [Or "All caught up"] + +If the graph is empty or Neo4j is not running, say so and suggest running /graph-dream or the bootstrap. diff --git a/skills/graph-capture/SKILL.md b/skills/graph-capture/SKILL.md new file mode 100644 index 0000000..1a5291d --- /dev/null +++ b/skills/graph-capture/SKILL.md @@ -0,0 +1,142 @@ +--- +name: graph-capture +description: End-of-session catch-up for the memory graph. Reviews the current conversation and writes any new entities, edges, decisions, or facts that weren't already captured in flight. Use before closing a long claude.ai or Desktop conversation, or any time you want to commit recent context to long-term memory. +argument-hint: [--dry-run] [--topic <topic>] [--since-message <n>] +--- + +The user wants to capture the current conversation to the memory graph. Many conversations contribute knowledge that the dream process can't see (claude.ai web and Desktop chats live server-side, not in the local transcript store the dream walks).
This command is the manual catch-up: review what we just talked about, find what hasn't already been written to the graph, and write it. + +Arguments: $ARGUMENTS + +Parse: +- `--dry-run` — describe what you would write without calling write tools. Useful to review before committing. +- `--topic <topic>` — focus capture on entities/relationships related to a specific topic mentioned in the conversation. Skip unrelated material. Default: capture everything substantive. +- `--since-message <n>` — only consider messages from that point onward. Default: whole conversation. + +## Steps + +### 1. Inventory candidate entities and relationships + +Walk the conversation (or the slice indicated by `--since-message` / `--topic`) and list every distinct candidate that meets the "worth writing" bar: + +- **People** named with role, organization, or relationship context — not just casual references +- **Projects** worked on, evaluated, or referenced with meaningful context +- **Technologies / Concepts** the user used, evaluated, decided about, or expressed a preference toward +- **Preferences** explicitly stated ("I prefer X", "always use Y") or strongly implied through repeated choice +- **Decisions** made with reasoning ("we decided X because Y") — both explicit and clear inferred decisions +- **Facts** about infrastructure, processes, configuration, or the user's environment +- **Events** — meetings, deployments, incidents, milestones with dates or outcomes +- **Objects** — specific repos, servers, databases, tools, containers +- **Reasoning traces** — *how* a problem was solved, especially if there were dead ends or alternatives considered. Capture the trace, not just the outcome. + +Skip: +- Conversation mechanics ("let me read that file", "running the test now") +- Trivial mentions without significance +- API keys, passwords, tokens, connection strings, or any secret value (note existence only, never the value) + +### 2.
Check what's already in the graph (search-first, three-way branch) + +For every candidate, run **both** lookups before deciding what to do — a single exact-name check misses near-duplicates: + +1. **`graph_search`** with the candidate's name + a brief paraphrase as the query. Semantic similarity will find entities under different names ("Cloudflare Tunnel for graph-memory" vs "Cloudflare Tunnel (graph-memory)" vs "graph-memory tunnel" all hit each other at high similarity) +2. **`graph_entities`** with the candidate's exact name string for the strict match + +Combine the results and classify the candidate into one of **three** states (not two): + +| State | Trigger | Action | +|---|---|---| +| **A. Not in graph** | Both lookups empty (or top hits are clearly different concepts) | Queue for creation | +| **B. In graph, conversation aligns or extends** | Existing entity matches, conversation reinforces or adds new edges to it | Reuse the existing entity. Queue any new edges. If nothing new beyond a re-mention, `graph_boost` (+0.05) on the strongest existing edge | +| **C. In graph, conversation contradicts** | Existing entity matches, but the conversation says the existing facts are now wrong (e.g. user corrected something, or talked about a deprecated state) | **Invalidate first, then create.** See "Handling corrections" below | + +When `graph_search` returns a top hit at score ≥ 0.85 with a name that's a near-paraphrase (different word order, parenthesization, prepositions, hyphenation), **always treat as state B or C — never create a duplicate.** If you're unsure, surface the candidate to the user: *"This looks like the existing entity 'X' — should I merge or create new?"* + +### 2b. Handling corrections (state C) + +If the conversation contradicts an existing fact, **do not** just write a new contradicting edge alongside the old one — that strengthens the old edge by bumping its `last_confirmed`, and queries will return both. The right pattern: + +1. 
Identify the specific old edges that are now wrong (use `graph_query` on the entity to enumerate) +2. **Invalidate** each wrong edge by calling `graph_relate` with the SAME from/to/relation but adding `valid_at` set to the time the old fact stopped being true (the migration date, the moment of correction, etc.) — `graph_relate` then sets `invalid_at` on the predecessor and creates the new "version" — see the supersession behaviour described in `dream-nightly.md` +3. If the entity should be entirely retired (the concept no longer applies), `graph_delete` it and skip step 2 +4. Only after invalidation, create the new entities and edges representing the corrected reality + +For Decision-vs-Decision succession, use `SUPERSEDES` **in one direction only**: `(new_decision)-[SUPERSEDES]->(old_decision)`. Never write the reverse. SUPERSEDES is read as "the new decision replaces the old one." + +For entity-level rename or re-categorization (same thing, different name), use `ALIAS_OF` — link the redundant entity to the canonical one rather than maintaining two parallel records. + +### 3. Plan the relationships + +Connect the captured entities to each other and to existing graph entities the conversation touched on. Don't create islands — if the user mentioned that Project X depends on Library Y, write the `DEPENDS_ON` edge even if both already existed individually. + +Use canonical relationship types (`WORKS_ON`, `WORKS_AT`, `REPORTS_TO`, `STAKEHOLDER_IN`, `PREFERS`, `KNOWS_ABOUT`, `DEPENDS_ON`, `USES_TECH`, `USES`, `DECIDED_FOR`, `CONTRADICTS`, `RELATED_TO`, `ALIAS_OF`, `PARTICIPATED_IN`, `OCCURRED_DURING`, `LED_TO`, `INVOLVED_IN`, etc.) over inventing new types. For `RELATED_TO`, pick a `relationship_type` subtype: `similar_to`, `part_of`, `enables`, `impacts`, `depends_on`, `alternative_to`, `derived_from`, `implements`, `extends`, `configured_by`. 
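As a sanity check on the one-way SUPERSEDES convention above, a read-only query in the style of this release's T-templates can list succession pairs; the `:Decision` label here is assumed from the node types used elsewhere in these skills:

```cypher
MATCH (new:Entity:Decision)-[s:SUPERSEDES]->(old:Entity:Decision)
RETURN new.name AS current_decision, old.name AS superseded_decision
ORDER BY new.name LIMIT 20
```

A pair that appears in both directions means the rule was violated and one of the two edges should be removed.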
+ +If a genuinely new pattern emerges that doesn't fit anything in the canonical set, surface it to the user and ask before inventing — don't accumulate ad-hoc synonyms. + +### 4. Apply weight guidelines + +| Origin | Starting weight | +|---|---| +| User explicitly stated | 0.7 | +| Strongly inferred from conversation context | 0.5 | +| Loosely inferred / single passing mention | 0.3 | +| User confirmed something already in the graph | use `graph_boost` (+0.15) instead | +| User corrected something the graph had wrong | use `graph_weaken` (-0.3) on the wrong edge, then `graph_relate` for the corrected fact | + +Capture aliases and identifying details inline in the edge `evidence` string when the source mentioned them. Examples: "Andrew McElroy (nickname Tripp, initials AHM)", "Domino 14.5.1 (the version that introduced DominoIQ)". This makes future fuzzy matching much better. + +### 5. Write to the graph (unless --dry-run) + +Use `graph_relate` in **batch mode** for efficiency. Always include provenance: + +- `source_type`: `"conversation"` +- `source_session`: a stable identifier — Claude Code session id if you have it, otherwise something like `"claude-ai-capture-<date>"` so audit log entries can be grouped + +**Critical: don't accidentally re-confirm wrong edges.** Calling `graph_relate` with the same `(from, to, relation)` triple as an existing edge will MERGE — it bumps the existing edge's `last_confirmed` to "now", which on the decay curve treats it as freshly reinforced. If your goal is to mark a fact as no-longer-true, use the supersession pattern from step 2b (set `valid_at` on the new replacement so the old one gets `invalid_at`), or call `graph_weaken` to drop weight, or `graph_delete` to remove the entity entirely. Never update a wrong edge by writing a contradicting one alongside.
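As a rough illustration, one batch element with full provenance might look like the following. The exact payload shape is an assumption; the field names mirror the parameters referenced in these skills (`from_name`/`to_name` from `/graph-boost`, `source_type`/`source_session` from this step), and the entities are the hypothetical ones from the dry-run example:

```json
{
  "batch": [
    {
      "from_name": "Andrew McElroy",
      "relation": "STAKEHOLDER_IN",
      "to_name": "DominoIQ POC",
      "weight": 0.6,
      "evidence": "Andrew McElroy (nickname Tripp, initials AHM) named as a POC stakeholder",
      "source_type": "conversation",
      "source_session": "claude-ai-capture-2026-05-10"
    }
  ]
}
```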
+ +For `--dry-run`, instead of calling write tools, print a human-readable summary of what would be written: +``` +Would create: + - Andrew McElroy (Person) — Assistant General Counsel at FBBE + - DominoIQ POC October Timeline (Decision) — confidence 0.7 +Would link: + - Andrew McElroy -[STAKEHOLDER_IN, weight 0.6]-> DominoIQ POC + - ... +Would boost (already in graph): + - Tara Newman: confirmed mentioned in DominoIQ context — +0.05 +``` + +### 6. Validate after writing + +After the batch, call `graph_validate` with the `source_session` you used. For high-severity issues: +- **Generic or reference-language names** (e.g. "the server", "this project"): delete with `graph_delete` and re-extract more carefully +- **Near-duplicates that are clearly the same**: link with `graph_relate ALIAS_OF` + +For medium/low severity issues: report them to the user as "flagged for review." + +### 7. Report back + +Tell the user what was captured: + +``` +Captured to graph: + - 4 new entities (2 Person, 1 Decision, 1 Project) + - 7 new edges + - 3 existing entities boosted + - 1 flagged for review: "the deployment" — too generic, suggest a more specific name +``` + +If `--dry-run` was used, end with: "Run `/graph-capture` (no flags) to commit." + +## Rules + +1. **Capture, don't speculate.** Only write what the conversation actually established. If you're guessing what the user meant, don't write it. +2. **Search before creating — use semantic similarity, not just exact match.** `graph_search` is the primary lookup; `graph_entities` is the strict-match secondary. Names that differ only in word order, parens, or prepositions ("Cloudflare Tunnel for graph-memory" vs "Cloudflare Tunnel (graph-memory)") are duplicates. If unsure, ask the user to confirm before creating. +3. 
**Corrections require invalidation, not just contradiction.** If the conversation says the graph has something wrong, identify the specific old edges and invalidate them via the supersession pattern (or `graph_weaken` / `graph_delete`). Writing a new contradicting edge while leaving the wrong one intact is worse than not capturing — it leaves the graph in a state where queries return both versions, with the wrong one freshly reinforced via `last_confirmed`. +4. **SUPERSEDES is one-directional.** `(new)-[SUPERSEDES]->(old)`, never the reverse, never both. +5. **Never extract secrets.** API keys, passwords, tokens, connection strings, private keys, signed URLs — none of these go into entity properties or edge evidence. Note that a credential exists ("Project X uses an API key for service Y") but never the value. +6. **Stay silent if there's nothing new.** If after reviewing the conversation you find nothing worth capturing (because everything was already written in flight, or because the conversation was purely mechanical), just say "Nothing new to capture — the conversation's facts are already in the graph." +7. **Dry-run for long conversations.** If the conversation is large (50+ messages) and you're about to write 20+ entities, run the dry-run path internally and present the plan to the user for confirmation before committing the batch — saves them from a runaway capture they didn't expect. + +## Why this exists + +The nightly dream process extracts knowledge from Claude Code transcripts (`~/.claude/projects/*/*.jsonl`) but cannot see claude.ai web conversations or Claude Desktop chats — those live server-side or in Electron app data, not in the local file store the dream walks. This skill closes that loop: explicit, on-demand capture before a conversation is lost or scrolled away. Run it at the end of any substantive conversation in claude.ai or Desktop where you spent real thought and want it preserved. 
diff --git a/skills/graph-dream/SKILL.md b/skills/graph-dream/SKILL.md new file mode 100644 index 0000000..1c687ce --- /dev/null +++ b/skills/graph-dream/SKILL.md @@ -0,0 +1,35 @@ +--- +name: graph-dream +description: Manually run the graph memory dream process to extract entities from recent conversations and ingested documents. Use when the user wants to update the graph now rather than waiting for the scheduled run. +argument-hint: [--dry-run] [--ingest-only] +--- + +The user wants to manually trigger the graph memory dream process. This runs inline +in the current session — you ARE the dream process. No separate script needed. + +Arguments: $ARGUMENTS + +Parse the arguments: +- `--dry-run` optional: describe what you would do without actually calling graph tools +- `--ingest-only` optional: only process the ingest drop folder, skip conversation transcripts + +For the full dream process instructions, read the file: +~/graph-memory/prompts/dream-nightly.md +(seeded into the data dir by the docker container's entrypoint on first start) + +Follow those instructions exactly. The key steps are: +1. Read manifest at ~/graph-memory/processed/manifest.json +2. Find unprocessed transcripts in ~/.claude/projects/ +3. Check ~/graph-memory/ingest/pending/ for documents +4. If nothing pending, report "No pending work" and exit +5. For each unprocessed transcript: extract entities, call graph_relate/graph_boost/graph_weaken +6. For each ingest document: same extraction process +7. Call graph_decay for maintenance +8. Write changelog to ~/graph-memory/logs/YYYY-MM-DD.md +9. Update manifest.json +10. Move ingest files from pending/ to completed/ + +If --dry-run, describe what you would extract but do NOT call any graph write tools. +If --ingest-only, skip transcript processing entirely. + +IMPORTANT: Never merge entities unless highly confident. Flag suspected duplicates instead. 
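Step 4's "nothing pending" check can be sketched as a small shell probe (the paths are the data-dir layout described above; manifest parsing is left to the dream instructions themselves):

```shell
#!/usr/bin/env sh
# Count documents waiting in the ingest drop folder; 0 pending plus an
# up-to-date manifest means the dream can report "No pending work" and exit.
count_pending_ingest() {
  pending_dir="${1:-$HOME/graph-memory/ingest/pending}"
  # .meta.json sidecars don't count as documents in their own right
  find "$pending_dir" -type f ! -name '*.meta.json' 2>/dev/null | wc -l | tr -d ' '
}
```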
diff --git a/skills/graph-find/SKILL.md b/skills/graph-find/SKILL.md new file mode 100644 index 0000000..e080633 --- /dev/null +++ b/skills/graph-find/SKILL.md @@ -0,0 +1,21 @@ +--- +name: graph-find +description: Find specific patterns in the memory graph — contradictions, stale entities, orphans, or strongest connections. Use when the user wants to audit or review their knowledge graph. +argument-hint: [contradictions|stale|orphans|strongest] +--- + +The user wants to find specific patterns in the memory graph. + +Arguments: $ARGUMENTS + +Based on the argument: + +**contradictions** — Call `graph_contradictions` MCP tool. Show each pair of conflicting facts with their evidence and detection date. Offer to resolve them. + +**stale** — Call `graph_entities` MCP tool with min_confidence: 0.0, sort_by: "confidence", limit: 20. Show the weakest entities that may need reinforcement or pruning. + +**orphans** — Call `graph_entities` MCP tool with sort_by: "last_seen", limit: 50. Filter the response for entities where edge_count is 0. Show these disconnected entities. + +**strongest** — Call `graph_entities` MCP tool with sort_by: "confidence", limit: 20. Show the highest-confidence entities — the core of the knowledge graph. + +If no argument is provided, show a brief menu of what's available and ask what they'd like to find. diff --git a/skills/graph-stats/SKILL.md b/skills/graph-stats/SKILL.md new file mode 100644 index 0000000..919612f --- /dev/null +++ b/skills/graph-stats/SKILL.md @@ -0,0 +1,31 @@ +--- +name: graph-stats +description: Show memory graph health dashboard — node counts, edge counts, contradictions, stale entities, pending ingests, and last dream run. Use when the user asks about graph status, health, or size. +--- + +The user wants to see the memory graph status. + +Steps: +1. Call the `graph_stats` MCP tool (no arguments needed) +2. Call the `graph_ingest` MCP tool with action: "status" to get ingest queue info +3. 
Present a clean dashboard: + + ## Memory Graph Dashboard + + **Nodes** — total and breakdown by type (Person, Project, Preference, Concept, Decision, Fact, Event, Object) + **Edges** — total and breakdown by type + + **Health** + - Average edge weight + - Orphaned nodes (no connections) + - Unresolved contradictions + - Stale nodes (confidence < 0.2) + + **Ingest Queue** + - Pending documents + - Recently completed + + **Last dream run** — timestamp and what changed + +If there are unresolved contradictions, highlight them and offer to show details. +If there are stale nodes, mention they may need review or will be pruned. diff --git a/skills/graph/SKILL.md b/skills/graph/SKILL.md new file mode 100644 index 0000000..a68d0f3 --- /dev/null +++ b/skills/graph/SKILL.md @@ -0,0 +1,31 @@ +--- +name: graph +description: Query the memory graph to explore entities, relationships, and knowledge. Use when the user asks what the graph knows, wants to explore connections, or asks about entities in their knowledge base. +argument-hint: [entity-name] [--hops N] [--type Type] [--related] +--- + +The user wants to query the memory graph. + +Arguments: $ARGUMENTS + +Parse the arguments: +- First positional argument(s) are entity names to query (required) +- `--hops N` optional: max traversal depth (default 2) +- `--type` optional: filter results by node type (Person, Project, Preference, Concept, Decision, Fact, Event, Object) +- `--related` optional: show all entities related to the query entity + +Steps: +1. Call the `graph_query` MCP tool with: + - entities: the entity name(s) from the arguments + - max_hops: from --hops flag or default 2 + - entity_types: from --type flag if provided + - project_context: the current working directory +2. 
Present the results in a readable format: + - Group by entity type + - Show relationship types and weights + - Highlight the strongest connections + - Note any effective_weight vs raw weight differences (project affinity) + - List associated source files the user can read for full context + +If the query returns no results, suggest checking entity names or trying broader terms. +If no entity name is provided, ask the user what they want to look up. diff --git a/skills/ingest-audio/SKILL.md b/skills/ingest-audio/SKILL.md new file mode 100644 index 0000000..2feb216 --- /dev/null +++ b/skills/ingest-audio/SKILL.md @@ -0,0 +1,127 @@ +--- +name: ingest-audio +description: Transcribe a local audio or video file using Whisper and ingest it into the memory graph. Use when the user has a local MP3, WAV, M4A, MP4, or similar audio/video file they want to add to their knowledge graph. +argument-hint: [--model base] [--topic "hint1, hint2"] [--author "name"] [--now] +--- + +The user wants to transcribe a local audio or video file and ingest it into the graph memory system. + +Arguments: $ARGUMENTS + +## Step 1: Parse arguments + +- First positional argument: local file path (required) +- `--model`: Whisper model size. Default: `base`. Options: `tiny`, `base`, `small`, `medium`, `large` + - `tiny`: fastest, least accurate (~39MB) + - `base`: good balance, recommended default (~74MB) + - `small`: noticeably better accuracy (~244MB) + - `medium`: high accuracy, slow on CPU (~769MB) + - `large`: best accuracy, very slow on CPU (~1.5GB) +- `--topic`: topic hints for metadata (comma-separated) +- `--author`: speaker or creator name +- `--now`: process immediately inline after transcription instead of queuing + +If no file path is provided, ask the user for one. + +## Step 2: Verify Whisper is installed + +Run: `whisper --help` + +If this fails, report: +``` +Whisper is not installed. 
Install it with: + pip install openai-whisper + +Note: This downloads model weights (~74MB for 'base') on first run. +ffmpeg is also required: + - Windows: winget install ffmpeg + - macOS: brew install ffmpeg + - Linux: apt install ffmpeg (or your distro's equivalent) +``` +Then stop. + +## Step 3: Verify the file exists and is a supported format + +Supported: `.mp3`, `.wav`, `.m4a`, `.mp4`, `.ogg`, `.flac`, `.webm`, `.mkv`, `.avi`, `.mov` + +If the file doesn't exist or the format isn't supported, report the issue and stop. + +## Step 4: Transcribe with Whisper + +Output directory: `~/graph-memory/.tmp/graph-audio-ingest/` + +Run: +``` +whisper "<file>" --model <model> --output_format txt --output_dir "~/graph-memory/.tmp/graph-audio-ingest/" +``` + +Whisper outputs `<basename>.txt` in the output directory. + +Note: This may take several minutes for longer files on CPU. Inform the user that transcription is running. + +If transcription fails, report the error and stop. + +## Step 5: Read and validate the transcript + +Read the output `<basename>.txt` file. If it's empty or very short (under 20 words), warn the user: +- "Transcript appears empty or very short. The audio may be silent, too quiet, or in a different language." +- Suggest trying `--model small` or `--model medium` for better accuracy. + +## Step 6: Prepare the output markdown file + +Write a markdown file to `~/graph-memory/.tmp/graph-audio-ingest/<basename>.md`: + +```markdown +# <title> + +**Source:** audio transcription +**Original file:** <path> +**Transcribed with:** Whisper <model> +**Date:** <date> +> + +## Transcript + +<transcript text> +``` + +Write the `.meta.json` sidecar alongside it: +```json +{ + "source": "audio", + "author": "<author>", + "date": "<date>", + "topic_hints": ["<hint>"], + "original_file": "<path>", + "whisper_model": "<model>" +} +``` +Only include fields that have values. + +## Step 7: Queue or process immediately + +**If `--now` is NOT set (default):** +1. Copy the `.md` file to `~/graph-memory/ingest/pending/<basename>.md` using Bash +2.
Write the meta object to `~/graph-memory/ingest/pending/<basename>.md.meta.json` +3. Report: + - Original file and model used + - Approximate word count of the transcript + - "Queued -- will be processed on the next dream run (`/graph-dream`)" + +Note: Do NOT use the `graph_ingest` MCP tool for queuing -- it runs inside Docker and only sees its own mounted volumes. Write directly to the pending dir on the host. + +**If `--now` IS set:** +1. Read the markdown content +2. Extract entities and relationships inline (you reason about the content) +3. For each candidate entity: call `graph_entities` to check if it already exists +4. Call `graph_relate` for new entities and relationships (batch mode preferred) +5. Call `graph_boost` for reinforcements of existing knowledge +6. Report: entities created, edges created, key topics found + +## Notes + +- Whisper runs entirely locally -- free, no API key needed, no data leaves your machine +- First run for a given model downloads the weights (e.g. ~74MB for `base`) +- For YouTube or online video URLs, use `/ingest <url>` (uses youtube_transcript_api, no local compute needed) +- For non-YouTube video platforms, use `/yt-dlp <url>` which tries subtitles first before falling back to Whisper +- ffmpeg must be installed for Whisper: `winget install ffmpeg` (Windows), `brew install ffmpeg` (macOS), `apt install ffmpeg` (Linux) diff --git a/skills/ingest/SKILL.md b/skills/ingest/SKILL.md new file mode 100644 index 0000000..43353dd --- /dev/null +++ b/skills/ingest/SKILL.md @@ -0,0 +1,98 @@ +--- +name: ingest +description: Ingest a file or URL into the memory graph. Handles local files (text, PDF, DOCX, XLSX, images, etc.) and URLs (web pages, YouTube, Wikipedia, RSS). Use when the user wants to add any document or web content to their knowledge graph. +argument-hint: [--now] [--source "type"] [--author "name"] [--topic "hint1, hint2"] +--- + +The user wants to ingest a document or URL into the graph memory system.
+ +Arguments: $ARGUMENTS + +## Step 1: Parse arguments + +- First positional argument: file path or URL (required) +- `--now`: process immediately inline (Claude extracts entities in this session) +- `--source`: document type label, e.g. "article", "meeting notes", "YouTube transcript" +- `--author`: creator/author name +- `--topic`: topic hints for better extraction (comma-separated string) + +If no argument is provided, ask the user for a file path or URL. + +## Step 2: Detect input type + +**URL** (starts with `http://` or `https://`): +- YouTube URLs (`youtube.com` or `youtu.be`) → MarkItDown fetches transcript + metadata +- All other URLs (web pages, Wikipedia, RSS, Bing) → MarkItDown converts to markdown +- Go to Step 3A + +**Local file**: +- Native text formats (`.md`, `.txt`, `.srt`, `.vtt`, `.json`, `.html`, `.csv`) → queue directly, skip MarkItDown +- Binary/rich formats (`.pdf`, `.docx`, `.doc`, `.xlsx`, `.xls`, `.pptx`, `.ppt`, `.epub`, `.ipynb`, `.msg`, `.eml`, `.zip`, images) → convert via MarkItDown first +- Go to Step 3B + +## Step 3A: URL ingestion via MarkItDown + +1. Run MarkItDown on the URL: + ``` + markitdown "<url>" -o "~/graph-memory/.tmp/graph-ingest-tmp.md" + ``` +2. If it fails, report the error and stop +3. Read the output file to check it has content (not empty) +4. Derive a filename from the URL: slugify the domain + path, e.g. `youtube-com-watch-dQw4w9WgXcQ.md` +5. Continue to Step 4 using the temp `.md` file as the file to queue + +## Step 3B: Local file ingestion + +**Native text file:** use the file path as-is, go to Step 4. + +**Binary/rich file:** +1. Run MarkItDown on the local file: + ``` + markitdown "<file>" -o "~/graph-memory/.tmp/graph-ingest-tmp.md" + ``` +2. If it fails, report the error with the MarkItDown output and stop +3. Read the output to verify it has meaningful content +4. Derive a filename: take the original basename and replace the extension with `.md` + - e.g. `report.pdf` → `report.md` +5.
Continue to Step 4 using the temp `.md` file + +## Step 4: Queue or process immediately + +Build the meta object from flags: +```json +{ + "source": "<source type>", + "author": "<author>", + "date": "<date>", + "topic_hints": ["<hint>"] +} +``` +Only include fields that have values. + +**If `--now` is NOT set (default -- queue for later):** +1. Copy the file directly to `~/graph-memory/ingest/pending/` using Bash +2. Write the meta object to `~/graph-memory/ingest/pending/<filename>.meta.json` + (Only include fields that have values -- omit empty keys) +3. Report: + - What was queued (filename, source type) + - For URLs: the page title if MarkItDown extracted one + - "Will be processed on the next dream run (`/graph-dream`)" + +Note: Do NOT use the `graph_ingest` MCP tool for queuing -- it runs inside the Docker container and only sees its own mounted volumes. Write directly to the pending dir on the host instead. + +**If `--now` IS set (immediate inline processing):** +1. Read the file content +2. Extract entities and relationships (you reason about the content): + - People, projects, technologies, preferences, decisions, facts, events + - Focus on knowledge, not formatting or navigation elements +3. For each candidate entity: call `graph_entities` with a search to check if it already exists +4. Call `graph_relate` for new entities and relationships (use batch mode for efficiency) +5. Call `graph_boost` for reinforcements of existing knowledge +6. Report: source, entities created, edges created, any notable findings + +## Notes + +- MarkItDown handles: PDF, DOCX, XLSX, PPTX, EPUB, MSG/EML, IPYNB, CSV, ZIP, images, HTML, YouTube, Wikipedia, RSS +- For non-YouTube video platforms (Vimeo, TikTok, Loom, etc.) use `/yt-dlp <url>` instead +- For local audio files (MP3, WAV, M4A) use `/ingest-audio <file>` instead +- MarkItDown should be on PATH after `pip install "markitdown[pdf,docx,xlsx,pptx]"`. If it isn't, ask the user to confirm the install location.
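Condensed, the default queue path above amounts to something like this sketch (filenames and the meta content are illustrative; a real run builds the meta object from the flags):

```shell
#!/usr/bin/env sh
# Drop a converted markdown file plus its meta sidecar into the pending queue.
queue_for_dream() {
  src="$1"                            # converted .md (or native text) file
  pending="${2:-$HOME/graph-memory/ingest/pending}"
  base="$(basename "${src%.*}").md"   # report.pdf -> report.md
  mkdir -p "$pending"
  cp "$src" "$pending/$base"
  # sidecar: only keys that have values (hard-coded here for the sketch)
  printf '{"source": "article"}\n' > "$pending/$base.meta.json"
  echo "$pending/$base"
}
```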
From df9405ad084dba6c6334686c32c2f650ff7479ff Mon Sep 17 00:00:00 2001
From: Steve <1407088+stevepridemore@users.noreply.github.com>
Date: Sun, 10 May 2026 22:43:20 -0400
Subject: [PATCH 2/4] ci: TEMP — also build install-test branch image for
 dogfood
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Remove this commit before merging install-test → main.
---
 .github/workflows/release.yml | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index 6b16eb3..caea883 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -8,6 +8,10 @@ on:
   push:
     tags:
       - 'v*'
+    # TEMP: also build on the install-test branch so we can dogfood the
+    # release flow end-to-end before tagging. Remove before merging to main.
+    branches:
+      - 'install-test'

 permissions:
   contents: read
@@ -34,8 +38,12 @@ jobs:
         with:
           context: .
           push: true
+          # Branch builds (install-test) tag the image with the branch name
+          # only — don't move :latest. Tag builds tag both the version and
+          # :latest. github.ref_name resolves to the tag name on tag pushes
+          # and the branch name on branch pushes.
          tags: |
            ghcr.io/stevepridemore/graph-memory-mcp:${{ github.ref_name }}
-           ghcr.io/stevepridemore/graph-memory-mcp:latest
+           ${{ startsWith(github.ref, 'refs/tags/v') && 'ghcr.io/stevepridemore/graph-memory-mcp:latest' || '' }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

From dba198f6d630f4212777b7002bf1c73038e70c27 Mon Sep 17 00:00:00 2001
From: Steve <1407088+stevepridemore@users.noreply.github.com>
Date: Sun, 10 May 2026 22:54:49 -0400
Subject: [PATCH 3/4] install: branch-name fallback for installer version arg

For testing v0.3.0 against a feature branch before tagging. Treats
anything not matching v* or 'latest' as a branch name and pulls from
refs/heads/<branch>.
A real user install with a tag like v0.3.0 hits the same tag URL as before.
---
 scripts/install-primary.sh   | 27 +++++++++++++++++++--------
 scripts/install-secondary.sh | 21 ++++++++++++++-------
 2 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/scripts/install-primary.sh b/scripts/install-primary.sh
index 9f65867..038203f 100755
--- a/scripts/install-primary.sh
+++ b/scripts/install-primary.sh
@@ -26,14 +26,25 @@ fi
 REPO="stevepridemore/graph-memory"
 RAW="https://raw.githubusercontent.com/$REPO/$VERSION"
-TARBALL="https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz"
-# 'latest' isn't a tag — fall back to main branch tarball
-if [ "$VERSION" = "latest" ]; then
-  TARBALL="https://github.com/$REPO/archive/refs/heads/main.tar.gz"
-  TARBALL_PREFIX="graph-memory-main"
-else
-  TARBALL_PREFIX="graph-memory-${VERSION#v}"
-fi
+
+# Resolve $VERSION to the right tarball URL. Tags (v*) come from refs/tags;
+# 'latest' resolves to refs/heads/main; anything else is assumed to be a
+# branch name. This makes pre-release dogfooding on a branch work without
+# any further config.
+case "$VERSION" in
+  v*)
+    TARBALL="https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz"
+    TARBALL_PREFIX="graph-memory-${VERSION#v}"
+    ;;
+  latest)
+    TARBALL="https://github.com/$REPO/archive/refs/heads/main.tar.gz"
+    TARBALL_PREFIX="graph-memory-main"
+    ;;
+  *)
+    TARBALL="https://github.com/$REPO/archive/refs/heads/$VERSION.tar.gz"
+    TARBALL_PREFIX="graph-memory-$VERSION"
+    ;;
+esac

 echo "[install-primary] graph-memory $VERSION"

diff --git a/scripts/install-secondary.sh b/scripts/install-secondary.sh
index b6e6480..37e7628 100755
--- a/scripts/install-secondary.sh
+++ b/scripts/install-secondary.sh
@@ -30,13 +30,20 @@ fi
 REPO="stevepridemore/graph-memory"
 RAW="https://raw.githubusercontent.com/$REPO/$VERSION"
-TARBALL="https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz"
-if [ "$VERSION" = "latest" ]; then
-  TARBALL="https://github.com/$REPO/archive/refs/heads/main.tar.gz"
-  TARBALL_PREFIX="graph-memory-main"
-else
-  TARBALL_PREFIX="graph-memory-${VERSION#v}"
-fi
+case "$VERSION" in
+  v*)
+    TARBALL="https://github.com/$REPO/archive/refs/tags/$VERSION.tar.gz"
+    TARBALL_PREFIX="graph-memory-${VERSION#v}"
+    ;;
+  latest)
+    TARBALL="https://github.com/$REPO/archive/refs/heads/main.tar.gz"
+    TARBALL_PREFIX="graph-memory-main"
+    ;;
+  *)
+    TARBALL="https://github.com/$REPO/archive/refs/heads/$VERSION.tar.gz"
+    TARBALL_PREFIX="graph-memory-$VERSION"
+    ;;
+esac

 echo "[install-secondary] graph-memory $VERSION → $HOST"

From 00cd68b05f63347fae2889d48c2e64f5aae323e5 Mon Sep 17 00:00:00 2001
From: Steve <1407088+stevepridemore@users.noreply.github.com>
Date: Sun, 10 May 2026 22:58:41 -0400
Subject: [PATCH 4/4] Revert "ci: TEMP — also build install-test branch image
 for dogfood"
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This reverts commit df9405ad084dba6c6334686c32c2f650ff7479ff.
---
 .github/workflows/release.yml | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/.github/workflows/release.yml b/.github/workflows/release.yml
index caea883..6b16eb3 100644
--- a/.github/workflows/release.yml
+++ b/.github/workflows/release.yml
@@ -8,10 +8,6 @@ on:
   push:
     tags:
       - 'v*'
-    # TEMP: also build on the install-test branch so we can dogfood the
-    # release flow end-to-end before tagging. Remove before merging to main.
-    branches:
-      - 'install-test'

 permissions:
   contents: read
@@ -38,12 +34,8 @@ jobs:
         with:
          context: .
          push: true
-         # Branch builds (install-test) tag the image with the branch name
-         # only — don't move :latest. Tag builds tag both the version and
-         # :latest. github.ref_name resolves to the tag name on tag pushes
-         # and the branch name on branch pushes.
          tags: |
            ghcr.io/stevepridemore/graph-memory-mcp:${{ github.ref_name }}
-           ${{ startsWith(github.ref, 'refs/tags/v') && 'ghcr.io/stevepridemore/graph-memory-mcp:latest' || '' }}
+           ghcr.io/stevepridemore/graph-memory-mcp:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max
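The version-resolution rule that patch 3/4 adds to both installers can be exercised on its own. A sketch under the same assumptions as the scripts (the `resolve_tarball` helper name is ours for illustration; the installers inline this logic):

```shell
#!/bin/sh
# Standalone sketch of the installers' version-to-tarball rule:
# v* → refs/tags, 'latest' → refs/heads/main, anything else → a branch.
resolve_tarball() {
  REPO="stevepridemore/graph-memory"
  case "$1" in
    v*)     echo "https://github.com/$REPO/archive/refs/tags/$1.tar.gz" ;;
    latest) echo "https://github.com/$REPO/archive/refs/heads/main.tar.gz" ;;
    *)      echo "https://github.com/$REPO/archive/refs/heads/$1.tar.gz" ;;
  esac
}

resolve_tarball v0.3.0        # tag release
resolve_tarball latest        # tip of main
resolve_tarball install-test  # pre-release dogfooding branch
```

Note the case order matters: `v*` must be tried before the catch-all, or every tag would be treated as a branch name.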