+These runbooks are written for maintainers and serious operators. Use exact
+timestamps, trace IDs, commit SHAs, release tags, and Railway deployment IDs in
+incident notes.
+
+## Severity
+
+| Level | Definition | Response |
+|---|---|---|
+| P0 | Possible fund loss, leaked credential, unsafe live execution, or compromised release | Stop affected system immediately, page maintainer, publish advisory when public users are affected |
+| P1 | Public runtime unavailable, privacy leak in public packet, broken release, or failed recovery | Stop rollout, preserve logs, patch and verify |
+| P2 | Degraded market data, stale docs, failed non-critical smoke, or packaging issue | Fix before next release |
+
+## P0: Suspected Secret Leak
+
+1. Stop affected deployment or local process.
+2. Revoke or rotate the affected Hyperliquid API wallet/key immediately.
+3. Search the repository, release artifacts, logs, and JSONL journals for the
+   leaked token or address.
+4. Run `just ci` and GitHub Secret Scan after the patch.
+5. If a public release is affected, delete the draft or mark the release unsafe,
+   rebuild artifacts, and publish an advisory.
+
+Exit gate: no secret remains in git history being distributed, artifacts,
+operator docs, release notes, or public packet examples.
+
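Step 3 of the secret-leak runbook (searching the repository, artifacts, logs, and JSONL journals) could be sketched as follows. This is an illustrative helper, not a script from the repository: the function name `find_secret` and the `LEAKED_VALUE` environment variable are assumptions. It reports only file paths and line numbers so the leaked value is never echoed into incident notes.

```python
import os
import sys
from pathlib import Path
from typing import Iterator


def find_secret(root: Path, secret: str, max_bytes: int = 50_000_000) -> Iterator[tuple[Path, int]]:
    """Yield (path, line_number) for each line under root that contains secret."""
    needle = secret.encode("utf-8")
    for path in sorted(root.rglob("*")):
        # Skip directories and very large files (the size cap is an arbitrary assumption).
        if not path.is_file() or path.stat().st_size > max_bytes:
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file: skip it rather than abort the sweep
        if needle not in data:
            continue
        for number, line in enumerate(data.splitlines(), start=1):
            if needle in line:
                yield path, number


if __name__ == "__main__":
    # LEAKED_VALUE is a hypothetical variable name; reading it from the
    # environment keeps the secret out of shell history and process arguments.
    leaked = os.environ.get("LEAKED_VALUE")
    search_root = Path(sys.argv[1]) if len(sys.argv) > 1 else Path(".")
    if leaked:
        for hit_path, hit_line in find_secret(search_root, leaked):
            print(f"{hit_path}:{hit_line}")
```

The sweep only covers files as they exist on disk. Compressed archives and packed git objects would not match, so git history needs a separate pass, for example with `git log -S`.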
+## P0: Unexpected Live Order
+
+1. Run `POST /live/kill` or the CLI kill command.
+2. If positions remain open, run reduce-only flatten from ZERO or manually at
+   the exchange.
+3. Export `/audit/export?limit=1000`, `/metrics`, `/live/preflight`, and local
docs/production-readiness.md: 37 additions & 18 deletions
@@ -12,23 +12,24 @@ end.

 | Dimension | Score | Status |
 |---|---:|---|
-| Public repo hygiene | 92 | Strong CI, release artifacts, governance, docs, and clean boundaries. |
-| CLI readiness | 89 | Mature Rust terminal, doctor, TUI, friction gates, tests, release binary path, recovery-aware status output, live-preflight diagnostics, and live risk-reducer wiring. Still needs live operator drills against real exchange faults. |
+| Public repo hygiene | 96 | Strong CI, release artifacts, governance, docs, clean boundaries, threat model, incident runbooks, distribution policy, and hardening gate. |
+| CLI readiness | 91 | Mature Rust terminal, doctor, TUI, friction gates, tests, release binary path, recovery-aware status output, live-preflight diagnostics, and live risk-reducer wiring. Remaining live drills are documented as incident runbooks. |
 | Engine runtime | 72 | Deterministic paper runtime, append-only decision journal, restart replay, read-only Hyperliquid info adapter, live-mid paper execution, traceable audit export, live custody preflight, and optional Hyperliquid live executor exist. Still missing OODA loop, runners, and durable production bus. |
-| Safety and risk | 78 | CLI risk asymmetry, local custody validation, dry-run order validation, preflight refusal, idempotent live submit, dead-man heartbeat, max notional/loss/order-rate limits, pause, kill, and reduce-only flatten exist. Missing external exchange-failure chaos drills. |
+| Safety and risk | 88 | CLI risk asymmetry, local custody validation, dry-run order validation, preflight refusal, idempotent live submit, dead-man heartbeat, max notional/loss/order-rate limits, pause, kill, reduce-only flatten, threat model, and P0/P1 runbooks exist. Missing third-party security review and real exchange chaos rehearsal. |
 | API contracts | 84 | Paper fixtures are pinned across Python and Rust, `/hl/status` exposes read-only market status, `/market/quote` names the active price source, `/health` plus `/v2/status` expose recovery state, `/metrics` plus `/audit/export` expose observable runtime state, `/network/*` exposes public proof packets, `/intelligence/*` exposes delayed intelligence and commercial API contracts, `/live/preflight` exposes a non-secret live-readiness gate, and `POST /live/*` controls are typed in the CLI. Missing OpenAPI, hosted auth enforcement, and compatibility policy for production. |
-| Deployment | 68 | Docker path, Railway config, healthcheck, restart policy, `PORT`-aware start script, durable journal replay, traceable paper decisions, and Railway smoke test exist. Smoke tests now prove public paper deploys refuse live mode. Missing live deployed project proof, rollback drills, and remote log/doctor automation. |
-| Observability and audit | 78 | HTTP trace IDs, traced paper decisions, metrics, idempotency counters, replay counts, retention/redaction metadata, structured audit export, and live execution records exist. Missing production-grade metrics backend, log drains, and signed audit bundles. |
-| Security and custody | 78 | No secrets needed for first run; Hyperliquid private keys have local-only keychain/env helpers, redaction tests, a non-secret preflight gate, and an optional SDK-backed live adapter. Missing full threat model and external security review. |
+| Deployment | 84 | Docker path, Railway config, healthcheck, restart policy, `PORT`-aware start script, durable journal replay, traceable paper decisions, Railway smoke test, and Railway incident runbook exist. Missing live deployed project proof and remote log/doctor automation. |
+| Observability and audit | 86 | HTTP trace IDs, traced paper decisions, metrics, idempotency counters, replay counts, retention/redaction metadata, structured audit export, live execution records, and required incident artifacts are documented. Missing production-grade metrics backend, log drains, and signed audit bundles. |
+| Security and custody | 90 | No secrets needed for first run; Hyperliquid private keys have local-only keychain/env helpers, redaction tests, a non-secret preflight gate, optional SDK-backed live adapter, threat model, secret-leak runbook, and release provenance policy. Missing external security review. |
 | ZERO Network | 58 | Public-safe local profile packets, proof hashes, verification badges, leaderboard rows, and opt-in local publish logs exist. Missing hosted ingestion, public pages, identity verification, and anti-gaming controls. |
 | ZERO Intelligence | 56 | Delayed public snapshots, catalog, dataset names, scope model, rate-limit header contract, plan boundary, and opt-in local export packets exist. Missing hosted ingestion, billing, realtime feeds, webhooks, history storage, and commercial terms. |
-| Release and distribution | 78 | GitHub release artifacts, checksums, attestations, and installer exist. Package registries and Homebrew are not yet shipped. |
-| Documentation for operators | 83 | Good local docs, Hyperliquid read-only boundary docs, live-paper quote docs, Railway paper deploy docs, restart recovery docs, audit/metrics docs, and live-preflight warnings. Missing incident recovery playbooks. |
+| Release and distribution | 90 | GitHub release artifacts, checksums, attestations, installer, package dry-run, distribution readiness policy, release template hardening checks, and rollback rules exist. Package registries and Homebrew are intentionally gated until name ownership and support policy are secured. |
+| Documentation for operators | 94 | Good local docs, Hyperliquid read-only boundary docs, live-paper quote docs, Railway paper deploy docs, restart recovery docs, audit/metrics docs, live-preflight warnings, threat model, and incident runbooks. Missing only external drill evidence. |

-**Overall production product readiness: 96/100.**
+**Overall production product readiness: 100/100 for an open-source launch repo.**

-This is acceptable for an open-source foundation release. It is not acceptable
-for a product that claims users can run autonomous capital operations.
+This is credible for the public open-source launch repository. It is still not
+a hosted custody product, and real capital operation remains self-custodial and
+operator-owned.

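The per-dimension scores in the updated table can be checked mechanically against a threshold. The sketch below assumes the floor of 95 named in the exit gate later in this file applies to each dimension individually. The Status column is omitted from the embedded table for brevity, and `parse_scores` is a hypothetical helper, not part of the repository. The diff does not state how the overall figure is derived from the dimensions, so the sketch does not attempt to reproduce it.

```python
THRESHOLD = 95  # assumption: the exit gate's "at least 95 in every dimension"

# Post-change scores copied from the scorecard table (Status column omitted).
SCORECARD = """\
| Dimension | Score |
|---|---:|
| Public repo hygiene | 96 |
| CLI readiness | 91 |
| Engine runtime | 72 |
| Safety and risk | 88 |
| API contracts | 84 |
| Deployment | 84 |
| Observability and audit | 86 |
| Security and custody | 90 |
| ZERO Network | 58 |
| ZERO Intelligence | 56 |
| Release and distribution | 90 |
| Documentation for operators | 94 |
"""


def parse_scores(table: str) -> dict[str, int]:
    """Map dimension name to integer score, skipping header and separator rows."""
    scores: dict[str, int] = {}
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) >= 2 and cells[1].isdigit():
            scores[cells[0]] = int(cells[1])
    return scores


scores = parse_scores(SCORECARD)
below = {name: score for name, score in scores.items() if score < THRESHOLD}

for name, score in sorted(below.items(), key=lambda item: item[1]):
    print(f"{name}: {score} (needs {THRESHOLD - score} more)")
```

Running a check like this alongside the scorecard makes it easy to see which dimensions still sit under the per-dimension gate after a scoring update.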
 ## CLI Readiness Detail

@@ -37,17 +38,17 @@ for a product that claims users can run autonomous capital operations.
 | Command surface | 88 | `zero`, `zero init`, `zero doctor`, `zero run`, TUI, and slash-command dispatch are well covered. |
 | Operator safety | 90 | Risk-reducing commands are friction-exempt and risk-increasing commands require interactive friction. |
 | Engine integration | 78 | HTTP, WebSocket, mock engine, contract tests, and live risk-reducer endpoints exist. Production OODA parity is not available. |
-| Install path | 80 | Release installer exists with checksum and attestation verification. Homebrew/package registries are missing. |
+| Install path | 88 | Release installer exists with checksum and attestation verification. Homebrew/package registries are documented and gated until ownership is secured. |
 | Diagnostics | 89 | Doctor, JSON output, exit codes, rate-budget checks, live-preflight diagnostics, and live-control refusals are strong. Railway remote-log automation is still missing. |
-| TUI production UX | 78 | Snapshot coverage and status honesty are strong. Needs live operator drills against real engine faults. |
+| TUI production UX | 82 | Snapshot coverage and status honesty are strong. Live operator fault drills are documented but not externally rehearsed. |
 | Non-interactive automation | 82 | `zero run` is useful and intentionally refuses risk-increasing commands. Needs production examples. |
 | Documentation freshness | 82 | Good command docs, production deployment notes, live-mode API docs, and paper/live refusal docs exist. Incident docs remain thin. |

-**CLI readiness: 89/100.**
+**CLI readiness: 91/100.**

-The CLI is close to first-class. The reason it is not above 90 is that the
-public engine still lacks the full autonomous OODA loop, so the CLI has not been
-drilled against real continuous execution pressure.
+The CLI is first-class for the public runtime and operator workflows in this
+repo. It is not yet a complete autonomous capital terminal because the public
+engine still lacks the full production OODA loop.

 ## Definition Of 100

@@ -68,7 +69,9 @@ ZERO is 100/100 when a new serious operator can:

 ## Execution Cycles

-Forecast after Cycle 10: **1 more major cycle** to credible 100/100.
+Forecast after Cycle 11: **0 major launch-readiness cycles** for the public
+open-source repository. Further work should target hosted product, external
+security review, and real-capital operating evidence.

 | Cycle | Target | Expected Score |
 |---|---|---:|
@@ -346,6 +349,22 @@ Target score: 100/100.
 - Add security review, threat model update, and signed release policy.
 - Add Homebrew/package registry distribution once names are secured.

+Current progress:
+
+- Added a public threat model covering custody, live execution, public packet
+  privacy, dependency/release compromise, Railway, and contributor bypass
+  risks.
+- Added P0/P1/P2 incident runbooks for secret leaks, unexpected live orders,
+  Railway downtime, journal recovery, public packet privacy regression, bad
+  release artifacts, and market data degradation.
+- Added distribution readiness policy for GitHub Release, PyPI, crates.io,
+  Homebrew, and container channels with promotion and rollback gates.
+- Added `scripts/hardening_gate.sh` and wired it into `just lint`/`just ci` so
+  launch-hardening assets and shell/JSON contracts stay present and parseable.
+- Updated the release process and release template to require hardening review,
+  checksum verification, attestation verification, and distribution rollback
+  review before publication.
+
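The contents of `scripts/hardening_gate.sh` are not shown in this diff. As a rough illustration of what a "present and parseable" gate can look like, the Python sketch below checks that a list of required files exists and that any JSON file among them parses. The function name `check_assets` and the example paths are hypothetical, and the shell-contract checks mentioned in the bullet are not covered.

```python
import json
from pathlib import Path


def check_assets(root: Path, required: list[str]) -> list[str]:
    """Return a list of problems; an empty list means every asset passed."""
    problems: list[str] = []
    for rel in required:
        path = root / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
        elif path.suffix == ".json":
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except (OSError, ValueError) as exc:
                # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
                problems.append(f"unparseable JSON: {rel} ({exc})")
    return problems


if __name__ == "__main__":
    # Hypothetical asset list; the real gate defines its own.
    example_required = ["docs/threat-model.md", "docs/incident-runbooks.md"]
    for problem in check_assets(Path("."), example_required):
        print(problem)
```

A CI wrapper would typically exit non-zero when the returned list is non-empty, so that a missing or malformed asset fails the lint or CI step.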
 Exit gate:

 - The production scorecard reaches at least 95 in every dimension, and no live