Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
130 changes: 97 additions & 33 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,9 @@ Modular logs and telemetry collector. Plugin-kernel architecture.
HypAware captures conversations and traffic from local AI clients (Claude
Code, Codex), raw Anthropic / OpenAI API traffic, and OpenTelemetry
logs / traces / metrics into a local query cache and optional Parquet
exports. Everything runs on the local machine — no central server is
required for V1.
exports. It runs fully local by default, no central server required, and a
host can optionally join a fleet (`hyp join`) to forward its recordings to a
central server.

> Part of **[HypStack](https://hypstack.ai/)**, an open-source stack for AI observability.

Expand Down Expand Up @@ -59,7 +60,30 @@ Other init flags:
| `--from-file <config.json>`| Skip the picker and load a known-good config |
| `--bin <path>` | Override the binary path the daemon installer uses |

## Where things live
## Joining a centrally-managed fleet (`hyp join`)

`hyp join` enrolls a host in a fleet so its recordings are forwarded to a
central sink:

```sh
hyp join <url> [token]
hyp join <url> --token-file <path> # read the token from a file (recommended for MDM)
echo "<token>" | hyp join <url> # or from stdin
hyp join <url> <token> --no-daemon # write the seed only, skip daemon install
```

It writes a central-enrollment config (mode `0600`) to a dedicated layer under
`config-control/`, never to your local `hypaware-config.json`, so joining
augments an existing install rather than replacing it, then installs and starts
the daemon (unless `--no-daemon` is passed).

The policy token is a multi-use fleet-wide credential. Prefer `--token-file`
or stdin over a positional argument, which would otherwise land in shell
history and process listings. Other flags: `--bin <path>` overrides the binary
the daemon installer records, and `--no-daemon` writes the seed without
installing or restarting the daemon.

## Files and directories

| Path | Contents |
|------------------------------------------------|----------------------------------------------------------|
Expand All @@ -86,6 +110,33 @@ hyp query sql "select count(*) from logs"
Use `hyp query schema <dataset>` to see the columns available on each
dataset, and `hyp query status` to inspect cache freshness per dataset.

## Building and querying the activity graph

Alongside the row datasets, HypAware can project captured activity into a
node/edge **activity graph**: which sessions ran in which app, against which
model, using which tools, touching which files. The projection is
deterministic (exact-key matching, no models), and the `context-graph` plugins
are active by default.

Projection is a manual, cheap-to-rerun step. Build or refresh the graph from
what has been captured, then walk it from a seed node:

```sh
hyp graph project # project captured data into the node/edge graph
hyp graph compact # merge duplicate rows (optional housekeeping)
hyp graph neighbors <node> --depth 2 # walk out from a seed node
```

`hyp graph neighbors` takes a `node_id`, natural key, or label as the seed,
plus `--depth`, `--direction out|in|both`, `--type <node_type>`, `--edge-type
<type>` (repeatable), and `--limit`. The graph is also plain data: the `node`
and `edge` datasets are queryable through `hyp query sql` like any other
dataset.

Claude Code and Codex additionally get a `hypaware-graph` skill (and a
`graph_neighbors` tool) so an assistant can project and walk the graph on your
behalf.

## Attaching and detaching AI clients

Attach a single client (idempotent — running twice is a no-op):
Expand Down Expand Up @@ -115,6 +166,46 @@ scripting. Claude writes only HypAware-related keys to
`~/.claude/settings.json`; Codex writes a `hypaware` provider entry to
`~/.codex/config.toml`. Unrelated keys in either file are preserved.

## Opting a folder out of recording (`.hypignore`)

A `.hypignore` file declares a data-usage policy for a directory subtree. It
currently carries exactly one class, `ignore`: AI gateway exchanges (Claude /
Codex) whose working directory is at or under a directory containing a
`.hypignore` are never written to the local cache. The live LLM call is
untouched; only persistence is suppressed.

Resolution is gitignore-style: from an exchange's `cwd`, HypAware walks up the
directory tree, and any `.hypignore` found on the way governs. Dropping one
file at a repo root covers the whole repo, but the file works anywhere in the
ancestor chain, including outside a git repo.

Manage it with the CLI (idempotent; hand-authoring the dotfile is optional):

```sh
hyp ignore # write a .hypignore at the repo root (or cwd if not in a repo)
hyp ignore <path> # ignore a specific directory subtree
hyp ignore --check # report whether cwd is ignored, which file governs, and
# how many already-cached rows from the scope remain
hyp unignore # remove the governing .hypignore, re-enabling recording
```

`hyp ignore` writes a self-documenting file (a comment header plus the `ignore`
token); an empty or comment-only `.hypignore` also means `ignore`. Pass
`--json` to `hyp ignore` for a machine-readable result.

Two things to know:

- **Prospective only.** A `.hypignore` gates future recording and backfills.
Rows already captured before the file existed are left in place; `hyp ignore
--check` surfaces that residual count.
- **Folder matching needs a `cwd`.** Only the Claude and Codex pathways supply
one, so `.hypignore` is a no-op for the `raw-anthropic` / `raw-openai`
proxy and OTEL sources.

To pause recording for just the current Claude session (in-memory, reversible,
not committed) use the `/hypaware-ignore` and `/hypaware-unignore` skills
instead.

## Daemon lifecycle

```sh
Expand All @@ -130,33 +221,6 @@ hyp daemon uninstall # remove the service file (config + recordings are kept)
content and target paths without touching the filesystem — useful for
verifying what `hyp init` will install.

## V1 scope

V1 ships with:

- Bundled plugins (no external plugin install required for the default
flow): `@hypaware/ai-gateway`, `@hypaware/claude`, `@hypaware/codex`,
`@hypaware/otel`, `@hypaware/local-fs`, `@hypaware/format-parquet`,
`@hypaware/format-jsonl`, plus Anthropic and OpenAI upstreams.
- Local capture for Claude Code, Codex, raw Anthropic API, raw OpenAI
API, and OTEL logs / traces / metrics.
- Local query (`hyp query sql`) across captured AI gateway messages,
logs, traces, and metrics.
- Local Parquet export through the `local-fs` sink and the
`format-parquet` encoder.
- Claude Code attach and Codex attach (idempotent, reversible).
- Persistent macOS / Linux user daemon (`launchd` LaunchAgent or
`systemd --user` unit).

## Out of scope for V1

- The central / enterprise sink.
- Gascity (multi-tenant aggregation).
- Extracting the kernel or bundled plugins into separate repos.
- Migrating an existing Collectivus install.
- An external first-party plugin registry. Only bundled plugins are
required for the default V1 path.

## Troubleshooting

`hyp status` is the entry point for any "is HypAware working?" question.
Expand Down Expand Up @@ -208,7 +272,7 @@ npm run typecheck # if a typecheck script is present
npm pack --dry-run # verify the published file set
```

Re-run the V1 smoke battery and confirm every one is green:
Re-run the smoke battery and confirm every one is green:

```sh
hyp smoke package_bin_boot
Expand Down Expand Up @@ -265,8 +329,8 @@ hypaware-core/
format-jsonl/ # @hypaware/format-jsonl
claude/ # @hypaware/claude
codex/ # @hypaware/codex
central/ # @hypaware/central (out of scope for V1)
gascity/ # @hypaware/gascity (out of scope for V1)
central/ # @hypaware/central (bundled, opt-in via `hyp join`)
gascity/ # @hypaware/gascity (bundled, opt-in)
bin/
hypaware.js # CLI entrypoint (bound to both `hypaware` and `hyp`)
```
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
},
"skills": [
{ "name": "hypaware-query", "clients": ["claude"] },
{ "name": "hypaware-reference", "clients": ["claude"] },
{ "name": "hypaware-ignore", "clients": ["claude"] },
{ "name": "hypaware-unignore", "clients": ["claude"] },
{ "name": "hypaware-ai-adoption-report", "clients": ["claude"] },
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,163 @@
---
name: hypaware-reference
description: Explain what HypAware is, what it captures, how its data flows, which hyp command to run, config and paths, joining a fleet, and the activity graph. Use for product and CLI orientation - "what is HypAware", "what can it capture", "how do I detach codex", "how do I join a server", "where does my data go". For querying recorded data use hypaware-query; for analyses use the hypaware-ai-*-report skills.
---

# HypAware Reference

Orientation for questions about HypAware itself: what it is, what it captures,
how data moves, and which command does what. This is a conceptual map, not a
flag reference. For the exact flags and behavior on *this* install, run
`hyp --help` and `hyp <command> --help`; for "is it working right now?" run
`hyp status` (add `--json` for the stable machine shape). Those live sources
win over anything summarized here.

Do not answer data questions from this skill. It names no dataset columns, JSON
paths, or SQL - that all lives in the **hypaware-query** skill, which is the
single source of truth for the recorded data format.

## What HypAware is

A modular logs and telemetry collector with a plugin-kernel architecture. It
captures conversations and traffic from local AI clients (Claude Code, Codex),
raw Anthropic / OpenAI API traffic, and OpenTelemetry logs / traces / metrics
into a local query cache and optional Parquet exports.

It runs fully local by default, with no central server required. A host can
optionally join a fleet (`hyp join`) to forward its recordings to a central
HypAware server. HypAware is part of HypStack, an open-source stack for AI
observability.

## What it captures (sources)

Any subset of these can be enabled at `hyp init`:

- `claude` - Claude Code conversations
- `codex` - Codex conversations
- `raw-anthropic` - raw Anthropic API traffic
- `raw-openai` - raw OpenAI API traffic
- `otel` - OpenTelemetry logs / traces / metrics

A `.hypignore` file opts a directory subtree out of recording, but it only
gates the `claude` and `codex` pathways (they supply a working directory to
match against). It is a no-op for the `raw-anthropic` / `raw-openai` proxy and
OTEL sources. See the **hypaware-ignore** and **hypaware-unignore** skills for
managing it, or `hyp ignore --help`.

## How data flows (invariants)

- **Capture always lands in the local cache first.** Every source writes only
to the intrinsic Iceberg-backed local query cache. Storage and query are
intrinsic to the kernel, not plugin-provided.
- **Sinks are export targets, not the write path.** Configured sinks (for
example local-fs Parquet) receive *scheduled exports out of* the cache.
Sources never see sinks.
- **One source, one table.** Each dataset table has exactly one producer and is
named after it.
- **Config is explicit.** The written config enumerates the chosen plugins;
there is no implicit "use defaults" mode.

## CLI command map

The CLI is bound to both `hyp` and `hypaware`. Core commands:

- `hyp status` - health snapshot: config path, daemon state, active plugins,
sources/sinks, attach state, retention, cache size, recent errors,
diagnostics with repair lines. The entry point for any "is it working?"
question.
- `hyp init` - interactive walkthrough (or `--yes` non-interactive) that picks
sources, an export strategy, and a retention window, writes the config,
installs the daemon, and attaches selected clients.
- `hyp query` - SQL over the local cache (and remote hosts via `--remote`).
Use the **hypaware-query** skill for this.
- `hyp graph` - build and walk the activity graph: `project` materializes a
node/edge graph from captured data, `compact` folds duplicate rows, and
`neighbors <node>` walks out from a seed node. Contributed by the default-on
`@hypaware/context-graph` plugin.
- `hyp attach` / `hyp detach` - wire a client (`claude`, `codex`) into the
local gateway, or remove only HypAware-managed settings. Idempotent and
reversible. `hyp unattach` is an alias of `detach`.
- `hyp ignore` / `hyp unignore` - write or remove a `.hypignore` for a folder.
- `hyp daemon` - lifecycle for the persistent user daemon: `install`, `start`,
`status`, `restart`, `stop`, `uninstall` (launchd on macOS, systemd `--user`
on Linux).
- `hyp join` - enroll the host in a centrally-managed fleet.
- `hyp remote` - manage remote HypAware query targets (`add`, `login`, `list`).
- `hyp collect` - manage collected JSONL tables (`list`, `remove`).
- `hyp backfill` - import client history from registered backfill providers.
- `hyp sink` - manage sink instances (`force`, `maintain`).
- `hyp config` - inspect or validate the config.
- `hyp mcp` - serve this host's read verbs as an MCP server over stdio.
- `hyp smoke` - run a hermetic smoke flow.
- `hyp version` - print the version.

`logs`, `metrics`, and `traces` commands are contributed by the `@hypaware/otel`
plugin, so they appear only when it is enabled. Plugin-contributed commands
vary by install - trust `hyp --help` for the live inventory, and
`hyp <command> --help` for exact subcommands and flags.

## Config and paths

`HYP_HOME` defaults to `~/.hyp`; override by exporting it before invoking the
CLI or daemon.

- `<HYP_HOME>/hypaware-config.json` - active config, rewritten by `hyp init`
- `<HYP_HOME>/hypaware/cache/` - local query cache (Iceberg-backed)
- `<HYP_HOME>/hypaware/sinks/<name>/outbox/` - failed export rows awaiting retry
- `<HYP_HOME>/hypaware/dev-telemetry/` - daemon self-telemetry
- `<HYP_HOME>/hypaware/logs/daemon.{out,err}.log` - daemon stdout / stderr
- `<HYP_HOME>/exports/` - local Parquet exports (when the local-fs sink is on)

`hyp join` writes a separate central-enrollment layer under `config-control/`
(mode `0600`), never into the local `hypaware-config.json`, so joining augments
an existing install rather than replacing it.

## Availability

Available on every install (default flow, no extra config):

- Local capture for Claude Code, Codex, raw Anthropic API, raw OpenAI API, and
OTEL logs / traces / metrics.
- Local query over captured messages, logs, traces, and metrics.
- The activity graph: `hyp graph project` builds a node/edge graph from captured
data and `hyp graph neighbors` walks it (also queryable via `hyp query`).
- Local Parquet export.
- Claude Code and Codex attach (idempotent, reversible).
- A persistent macOS / Linux user daemon.

Opt-in (enabled by explicit config, not the default flow):

- **Central forwarding.** `hyp join <url> <token>` enrolls the host and turns on
the `@hypaware/central` sink, which forwards cache partitions to a central
HypAware server. It is fine to explain how to join a server; the receiving
server is a separate deployment the user points the host at.
- **Additional bundled plugins** (for example an S3 sink and AI-enrichment
plugins). Some are opt-in specifically because enabling them lets captured
content leave the machine, so they must be a deliberate config choice. Trust
`hyp --help` and the written config for the live set on this install.

Not provided by the CLI: there is no first-party plugin registry. Third-party
plugins can be installed from npm or git, but HypAware does not curate a
registry.

## Hand-offs

- Query or inspect recorded data - use the **hypaware-query** skill.
- Run an analysis (spend, adoption, security, improvement) - use the
**hypaware-ai-spend-report**, **-adoption-report**, **-security-report**, or
**-improvement-report** skills.
- Opt a folder out of recording - use **hypaware-ignore** /
**hypaware-unignore**, or pause just the current session with the
`/hypaware-ignore` and `/hypaware-unignore` skills.
- "Is it working?" or diagnose a problem - `hyp status` (with `--json` for the
stable shape); its `diagnostics:` section carries `repair:` lines to run.

## Guardrails

- Treat `hyp --help`, `hyp <command> --help`, and `hyp status --json` as the
authoritative source for exact commands, flags, and state on this install.
Use the map above for orientation and conceptual answers only.
- Never invent flags or promise a capability you cannot confirm on this
install. When unsure, run the relevant `--help` before answering.
- Do not name dataset columns, JSON paths, or SQL here. Defer every data-format
detail to the **hypaware-query** skill.
1 change: 1 addition & 0 deletions hypaware-core/plugins-workspace/claude/src/index.js
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,7 @@ export async function activate(ctx) {
// @ref LLP 0011#interactive-walkthrough [implements]: contributes client skills the first-run walkthrough installs
for (const skillName of [
'hypaware-query',
'hypaware-reference',
'hypaware-ignore',
'hypaware-unignore',
'hypaware-ai-adoption-report',
Expand Down
1 change: 1 addition & 0 deletions hypaware-core/plugins-workspace/codex/hypaware.plugin.json
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,7 @@
},
"skills": [
{ "name": "hypaware-query", "clients": ["codex"] },
{ "name": "hypaware-reference", "clients": ["codex"] },
{ "name": "hypaware-ai-adoption-report", "clients": ["codex"] },
{ "name": "hypaware-ai-improvement-report", "clients": ["codex"] },
{ "name": "hypaware-ai-security-report", "clients": ["codex"] },
Expand Down
Loading