diff --git a/crates/sprout-core/src/backup/NIP-SB.md b/crates/sprout-core/src/backup/NIP-SB.md new file mode 100644 index 000000000..2e5419135 --- /dev/null +++ b/crates/sprout-core/src/backup/NIP-SB.md @@ -0,0 +1,906 @@ +NIP-SB +====== + +Steganographic Key Backup +-------------------------- + +`draft` `optional` + +Your Nostr identity is a single private key. If you lose it, you lose everything — your name, your messages, your connections. There's no "forgot my password" button, no customer support, no recovery email. The key IS the identity. + +This NIP lets you back up your key to any Nostr relay using just a password. The backup is split into multiple pieces — real chunks, parity blobs for fault tolerance, and dummy blobs to obscure the count — each stored as a separate Nostr event signed by a different throwaway key. To recover, you just need your password and your public key (which is your Nostr identity — you know it, or you can look it up). + +NIP-SB provides two distinct privacy properties: + +1. **Cryptographic unlinkability (the security property).** No field in any blob references the user's real pubkey. The throwaway signing keys, d-tags, and ciphertext are derived from a one-way function of `password ‖ pubkey ‖ index`. An attacker who obtains a blob — even one they suspect is a NIP-SB backup — cannot determine which user it belongs to and cannot link it to other blobs in the same backup set. Crucially, each password guess is bound to a specific pubkey, preventing the batch-cracking accumulation attack that NIP-49 warned about: an attacker cannot test one password against multiple users simultaneously. This property holds under standard computational assumptions (pseudorandomness of scrypt and HKDF). It does not depend on cover traffic, relay population, or deployment conditions. + +2. 
**Steganographic cover (environment-dependent).** NIP-SB blobs share the same event structure (`kind:30078`, constant-size content, standard alt tag) as other application-specific data. The degree to which blobs blend into ambient relay traffic depends on the relay's `kind:30078` population — specifically, the distribution of content lengths, d-tag formats, pubkey reuse patterns, and publication timing among non-backup events. This property has not been empirically validated. On a relay with diverse, high-volume `kind:30078` traffic, steganographic cover may be strong. On a relay with sparse or structurally uniform traffic, a statistical classifier may identify probable backup blobs. Steganographic cover provides defense-in-depth but the core security properties hold without it. + +Note: against an active relay operator with connection logs, blobs published or queried from the same IP address within a short time window can be correlated into backup sets and potentially associated with a client identity, even though the operator cannot link the blobs to a specific Nostr pubkey without the password. Per-blob isolation holds at the database layer but not at the network layer. Implementations that require protection against active relay operators SHOULD publish blobs over separate connections with substantial time separation, ideally via Tor or a relay proxy. See §Security Analysis for detailed adversary-class analysis. + +## Versions + +This NIP is versioned to allow future algorithm upgrades without breaking existing implementations. + +Currently defined versions: + +| Version | Status | Description | +|---------|--------|-------------| +| `1` | Active | scrypt KDF, HKDF-SHA256, XChaCha20-Poly1305, kind:30078 | + +Blobs do not carry an on-wire version indicator — the version is implicit in the constants and algorithms used. 
Future versions will use different scrypt parameters, HKDF info strings, or event kinds, ensuring that v1 blobs are never misinterpreted by a v2 implementation. Implementations SHOULD document which version(s) they support. + +## Motivation + +[NIP-49](49.md) provides password-encrypted key export (`ncryptsec1`) but explicitly warns against publishing to relays: *"cracking a key may become easier when an attacker can amass many encrypted private keys."* This warning is well-founded: with NIP-49, an attacker who dumps a relay can grep for `ncryptsec1` and instantly build a list of every user's encrypted backup, then try one password against all blobs simultaneously — the cost is `|passwords| × 1 scrypt`, tested against all targets in parallel. + +This NIP substantially mitigates the accumulation problem through cryptographic unlinkability: no field in any blob contains or reveals the user's real pubkey. While the KDF inputs include the pubkey, the outputs (throwaway signing keys, d-tags, ciphertext) are computationally unlinkable to it without the password. An attacker who obtains a blob cannot determine which user it belongs to and cannot link it to other blobs in the same backup set. Crucially, each password guess is bound to a specific pubkey, so an attacker who wants to test a password against all users must pay `|users|×` the cost of a single-target attack — eliminating the cheap batch-cracking that NIP-49 is vulnerable to. This property holds under standard computational assumptions (pseudorandomness of scrypt and HKDF) — it does not depend on cover traffic or relay population. + +As a secondary benefit, NIP-SB blobs share the same event structure as other `kind:30078` application data (Cashu wallets, app settings, drafts), providing steganographic cover that makes it harder for a passive relay-dump adversary to identify which events are backup blobs at all. 
This cover is environment-dependent — it improves with ambient `kind:30078` traffic volume and has not been empirically validated against a statistical classifier (see §Limitations). The security argument does not depend on steganographic cover: even if an attacker can classify blobs as probable backups, the unlinkability property prevents them from determining whose backups they are or batch-cracking them. + +### Prior Art + +| System | Pattern | Gap | +|--------|---------|-----| +| NIP-49 | Single identifiable `ncryptsec1` blob | Accumulation-vulnerable, linkable to user | +| BIP-38 | Single identifiable `6P…` blob | Same | +| SLIP-39 | 2-level Shamir, PBKDF2 Feistel | Shares linkable by shared `id` field, no accumulation resistance | +| Kintsugi ([arXiv:2507.21122](https://arxiv.org/abs/2507.21122)) | Decentralized threshold OPRF key recovery | Requires dedicated recovery node infrastructure, no deniability | +| Apollo ([arXiv:2507.19484](https://arxiv.org/abs/2507.19484)) | Indistinguishable shares in social circle | Requires trustees, not relay-native | +| PASSAT ([arXiv:2102.13607](https://arxiv.org/abs/2102.13607)) | XOR secret sharing across cloud storage | No steganography, no throwaway keys, shares linkable | +| NIP-59 | Throwaway keys for gift wrap | Messaging, not backup | +| Shufflecake ([arXiv:2310.04589](https://arxiv.org/abs/2310.04589)) | Plausible deniability for disk volumes | Local disk only | +| **This NIP** | Per-blob throwaway keys + password-derived tags + variable N + RS parity + dummy blobs + constant-size blobs | Novel combination | + +### Design Principles + +1. **No bootstrap problem** — everything derives from `password ‖ pubkey`. No salt to store, no chicken-and-egg. The user knows their pubkey at recovery time (it is the identity they are trying to recover). +2. **Constant-size blobs** — every blob is the same byte length regardless of payload type (real chunk, parity, or dummy). 
An attacker cannot infer N, P, or D from content sizes. +3. **Per-blob isolation** — each real and parity blob has its own scrypt derivation, its own throwaway keypair, its own d-tag. Compromise of one blob's metadata reveals nothing about others. +4. **Per-user uniqueness** — the user's pubkey is mixed into every derivation via length-prefixed concatenation (injective encoding). Identical passwords for different users produce completely unrelated blobs. No cross-user interference, no d-tag collisions. +5. **Fault tolerance** — Reed-Solomon parity (P=2) tolerates loss of up to 2 blobs. Dummy blobs obscure the real chunk count. +6. **No new crypto** — scrypt (NIP-49 parameters), HKDF-SHA256, XChaCha20-Poly1305, Reed-Solomon over GF(2^8). All battle-tested. +7. **Just Nostr events** — `kind:30078` parameterized replaceable events. No special relay protocol extensions needed (though relays may require authorization for `kind:30078` writes — see §Relay Requirements). + +## Encoding Conventions + +- **Strings to bytes**: All string-to-bytes conversions use UTF-8 encoding. The NFKC-normalized password is UTF-8 encoded before concatenation. +- **Concatenation (`‖`)**: Raw byte concatenation with no delimiters. Where noted, a 2-byte big-endian length prefix is prepended to variable-length fields to ensure injective encoding (see `base` construction in §Step 1). +- **`pubkey_bytes`**: The 32-byte raw x-only public key (as used throughout Nostr per [BIP-340](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki)), NOT hex-encoded. +- **`to_string(i)`**: The ASCII decimal representation of the blob index `i`, with no leading zeros or padding. Examples: `"0"`, `"1"`, `"15"`. UTF-8 encoded (ASCII is a subset of UTF-8). +- **`relay_url_bytes`**: The UTF-8 encoding of the normalized relay URL (see §Relay URL Normalization). Used as a domain separator in HKDF `info` strings to produce relay-scoped d-tags and signing keys. 
+- **Hex encoding**: Lowercase hexadecimal, no `0x` prefix. Used for d-tags and pubkeys in JSON. +- **Base64**: RFC 4648 standard alphabet (`A-Z`, `a-z`, `0-9`, `+`, `/`) with `=` padding. NOT URL-safe alphabet. The `content` field of each blob event is base64-encoded and MUST decode to exactly 56 bytes. This produces 76 base64 characters including one trailing `=` padding character (`56 mod 3 = 2`, so padding is required). Implementations MUST accept both padded and unpadded base64 on input, and MUST produce padded base64 on output. + +## Terminology + +- **backup password**: User-chosen password used to derive all backup parameters. MUST be normalized to NFKC before use. Combined with the user's pubkey (via length-prefixed concatenation) before hashing, so that identical passwords for different users produce completely unrelated blobs. +- **blob**: A single `kind:30078` event containing encrypted data. Each blob is signed by a different throwaway keypair and is indistinguishable from any other `kind:30078` application data. A backup set contains three types of blobs: real chunks, parity blobs, and dummy blobs — all identical in format and size. +- **chunk**: A fragment of the raw 32-byte private key. Chunks are padded to constant size before encryption. +- **N**: The number of real chunk blobs in a backup set. Derived deterministically from the password and pubkey. Range: 3–16. Unknown to an attacker without the password. +- **P**: The number of parity blobs. Fixed at 2. Parity blobs contain Reed-Solomon erasure-coding data computed across all N chunks, enabling recovery of up to 2 missing chunks. +- **D**: The number of dummy blobs. Derived deterministically from the password and pubkey. Range: 4–12. Dummy blobs contain encrypted HKDF-derived filler (deterministic, not nsec material) and are indistinguishable from real and parity blobs. +- **parity blob**: A blob containing Reed-Solomon parity data computed across all N padded chunks. 
Enables reconstruction of up to P missing chunks during recovery. +- **dummy blob**: A blob containing encrypted HKDF-derived filler bytes (deterministic, independent of nsec). Published alongside real and parity blobs to obscure the total number of real chunks. Discarded during recovery. +- **throwaway keypair**: A secp256k1 keypair used to sign a single blob. Deterministically derived from the password, pubkey, blob index, and relay URL, so it is re-derived rather than stored when a blob is recovered or re-published. Has no visible relationship to the user's real identity and is not shared between blobs or between relays. +- **enc_key**: A 32-byte symmetric key derived from the password and pubkey, shared across all blobs in a backup set. Used for XChaCha20-Poly1305 encryption. +- **d-tag**: The NIP-33 `d` parameter uniquely identifying a parameterized replaceable event. Each blob's d-tag is derived from its per-blob key material and is indistinguishable from random data. + +## Limitations + +This NIP provides relay-based steganographic backup and recovery of a Nostr private key. It has the following limitations: + +- **Limited fault tolerance**: Reed-Solomon parity (P=2) tolerates loss of up to 2 of the N+P real and parity blobs. Loss of more than 2 of them makes the backup unrecoverable (missing dummy blobs do not count). Multi-relay publication and periodic health checks are strongly recommended. +- **No post-quantum security**: scrypt and XChaCha20-Poly1305 are not quantum-resistant. +- **Password strength is the security floor**: weak passwords make the backup crackable regardless of the steganographic properties. Implementations MUST enforce minimum entropy (see §Password Requirements). +- **No automatic relay discovery**: the user must know which relay(s) hold their backup blobs. There is no relay discovery mechanism in this NIP. +- **Relay retention not guaranteed**: events from throwaway keypairs may be garbage-collected by relays that do not recognize them. Multi-relay publication and periodic health checks are recommended. 
+- **Deniability is probabilistic, not absolute**: against a passive relay-dump adversary, backup blobs share the kind, content length, and tag layout of other `kind:30078` data, although this has not been validated against a statistical classifier. Against an active relay operator with timing and network metadata, the steganographic cover is weaker. Deniability improves as the relay's ambient `kind:30078` population grows. +- **No key rotation or migration**: this NIP provides backup and recovery only. It does not provide key rotation, key migration, or ongoing key management. +- **Chunks are byte slices, not independent shares**: unlike Shamir's Secret Sharing, each chunk is a contiguous slice of the private key that is padded and encrypted under the shared `enc_key`, not an information-theoretically independent share. A captured blob reveals only ciphertext; recovering that slice of the key requires `enc_key`. +- **Cross-relay blob count correlation**: N and D are derived from `password ‖ pubkey` without relay URL input, so the total blob count (N+P+D) is identical across all relays for the same user. An attacker with dumps from multiple relays could use this as a weak correlation signal (22 possible values in range 9–30). This does not reveal the user's identity but may help group backup sets across relays. 
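
The "22 possible values" figure in the last item can be checked by enumerating every (N, D) pair. The following is a minimal Python sketch that uses only the ranges stated in this document; the helper name `total_blob_counts` is illustrative, and the pair counts ignore the slight modular bias in how N and D are derived.

```python
from collections import Counter

# Ranges as stated in this document.
MIN_CHUNKS, MAX_CHUNKS = 3, 16
MIN_DUMMIES, MAX_DUMMIES = 4, 12
PARITY_BLOBS = 2


def total_blob_counts() -> Counter:
    """Count how many (N, D) pairs produce each total N + P + D."""
    counts = Counter()
    for n in range(MIN_CHUNKS, MAX_CHUNKS + 1):
        for d in range(MIN_DUMMIES, MAX_DUMMIES + 1):
            counts[n + PARITY_BLOBS + d] += 1
    return counts


counts = total_blob_counts()
print(sorted(counts))          # the distinct totals an observer could see
print(counts.most_common(3))   # totals produced by the most (N, D) pairs
```

The totals are not equally likely: the extremes come from a single (N, D) pair each, while mid-range totals come from several, so a mid-range count is a weaker grouping signal than an extreme one.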
+ +## Overview + +``` +pw_bytes = NFKC(password) # UTF-8 encoded +base = len(pw_bytes).to_bytes(2, 'big') ‖ pw_bytes ‖ pubkey_bytes # length-prefixed for injectivity + +base ──→ scrypt(base, salt="") ──→ H ──→ N = (H[0] % 14) + 3 (3..16 real chunks) +base ──→ scrypt(base, salt="dummies") ──→ H_d ──→ D = (H_d[0] % 9) + 4 (4..12 dummy blobs) +base ──→ scrypt(base, salt="encrypt") ──→ H_enc ──→ enc_key = HKDF(H_enc, "key") +base ──→ scrypt(base, salt="cover") ──→ H_cover (for dummy blob key derivation) + +P = 2 (fixed Reed-Solomon parity blobs) + +nsec_bytes (32 bytes) split into N variable-length chunks +parity = RS(N+2, N) over GF(256), 16 parallel byte codes across padded chunks → 2 parity rows + +Total blobs = N + P + D (range: 9..30, variable per user, all indistinguishable) + +For real chunk blobs i in 0..N-1: + H_i = scrypt(base ‖ to_string(i), salt="") + d_tag_i = hex(HKDF(H_i, "d-tag" ‖ relay_url_bytes, length=32)) + signing_key_i = HKDF(H_i, "signing-key" ‖ relay_url_bytes, length=32) → reject if zero/≥n + padded_i = chunk_i ‖ HKDF(H_i, "pad", length=16 - len(chunk_i)) # deterministic padding + +For parity blobs i in N..N+1: + H_i = scrypt(base ‖ to_string(i), salt="") + d_tag_i = hex(HKDF(H_i, "d-tag" ‖ relay_url_bytes, length=32)) + signing_key_i = HKDF(H_i, "signing-key" ‖ relay_url_bytes, length=32) → reject if zero/≥n + padded_i = parity_row_{i-N} (16 bytes from RS encoding) + +For dummy blobs j in 0..D-1: + d_tag = hex(HKDF(H_cover, "dummy-d-tag-" ‖ to_string(j) ‖ relay_url_bytes, length=32)) + signing_key = HKDF(H_cover, "dummy-signing-key-" ‖ to_string(j) ‖ relay_url_bytes, length=32) + padded = HKDF(H_cover, "dummy-pad-" ‖ to_string(j), length=16) # deterministic + +For ALL blobs (real, parity, dummy): + nonce_i = random(24) + ciphertext_i = XChaCha20-Poly1305(enc_key, nonce_i, padded_i, aad=0x02) + content_i = base64(nonce_i ‖ ciphertext_i) (56 bytes constant) + +Collect all N+P+D blobs, shuffle into random order, publish with jittered delays. 
+``` + +Recovery requires only the password, the user's pubkey, and a relay URL. The client re-derives N and D (P is fixed at 2) and all d-tags, then queries all N+P+D d-tags in random order with jittered delays. Under normal conditions all queries return events; if up to 2 real or parity blobs are missing or corrupted, Reed-Solomon erasure decoding reconstructs them. Dummies are discarded. No salt storage, no bootstrap problem, no special relay API. + +## Specification + +### Constants + +``` +SCRYPT_LOG_N = 20 # scrypt cost parameter 2^20 (NIP-49 default) +SCRYPT_R = 8 +SCRYPT_P = 1 + +MIN_CHUNKS = 3 +MAX_CHUNKS = 16 +CHUNK_RANGE = 14 # MAX_CHUNKS - MIN_CHUNKS + 1 + +PARITY_BLOBS = 2 # Reed-Solomon parity blobs (tolerates 2 missing real/parity blobs) + +MIN_DUMMIES = 4 +MAX_DUMMIES = 12 +DUMMY_RANGE = 9 # MAX_DUMMIES - MIN_DUMMIES + 1 + +CHUNK_PAD_LEN = 16 # pad each chunk to this size before encryption +BLOB_CONTENT_LEN = 56 # 24-byte nonce + 32-byte ciphertext (16 padded + 16 tag) +EVENT_KIND = 30078 # NIP-78 application-specific data +``` + +### Password Requirements + +Implementations MUST normalize passwords to NFKC Unicode normalization form before any use. The NFKC-normalized, UTF-8-encoded password MUST NOT exceed 65535 bytes (the maximum representable by the 2-byte length prefix in the `base` construction). In practice, even a 1000-character passphrase is well under this limit. + +Implementations MUST enforce minimum password entropy of 80 bits. The specific entropy estimation method is implementation-defined (e.g., zxcvbn, wordlist-based calculation, or other validated estimator). Implementations MUST refuse to create a backup if the password does not meet this threshold. Implementations SHOULD recommend generated passphrases long enough to clear the threshold: seven or more words from the EFF large wordlist (~12.9 bits/word, about 90 bits for 7 words) or eight or more words from the BIP-39 English wordlist (11 bits/word, 88 bits for 8 words). Both exceed the 80-bit minimum with margin; seven BIP-39 words give only 77 bits and do not. 
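
The password rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the helper names `passphrase_bits` and `build_base` are invented here, the entropy figure applies only to passphrases whose words are drawn uniformly at random from the list, and the all-zero pubkey is a placeholder.

```python
import math
import unicodedata

MAX_PW_BYTES = 0xFFFF  # largest length a 2-byte big-endian prefix can encode


def passphrase_bits(wordlist_size: int, words: int) -> float:
    """Entropy in bits of `words` words chosen uniformly at random from a list."""
    return words * math.log2(wordlist_size)


def build_base(password: str, pubkey_bytes: bytes) -> bytes:
    """Length-prefixed `base` value: len(pw) ‖ pw ‖ pubkey, as defined in this document."""
    if len(pubkey_bytes) != 32:
        raise ValueError("pubkey_bytes must be the raw 32-byte x-only key")
    pw_bytes = unicodedata.normalize("NFKC", password).encode("utf-8")
    if len(pw_bytes) > MAX_PW_BYTES:
        raise ValueError("normalized password exceeds 65535 bytes")
    return len(pw_bytes).to_bytes(2, "big") + pw_bytes + pubkey_bytes


if __name__ == "__main__":
    print(f"EFF large, 7 words: {passphrase_bits(7776, 7):.1f} bits")
    print(f"BIP-39, 8 words:    {passphrase_bits(2048, 8):.1f} bits")
    print(f"BIP-39, 7 words:    {passphrase_bits(2048, 7):.1f} bits")
    placeholder_pubkey = bytes(32)  # hypothetical pubkey, for illustration only
    print(build_base("correct horse", placeholder_pubkey)[:2].hex())
```

Normalizing before measuring the length matters because NFKC can change the byte count: a compatibility ligature such as U+FB01 becomes the two ASCII letters `fi`.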
+ +### Relay URL Normalization + +Blob d-tags and signing keys are scoped to the relay they are published to. This ensures that the same backup published to different relays produces completely different metadata on each relay, preventing cross-relay linkability (see §Security Analysis, Adversary Classes). + +The relay URL MUST be normalized to a canonical form before use in any derivation. Normalization uses the [WHATWG URL Standard](https://url.spec.whatwg.org/) parsing and serialization algorithms: + +``` +To compute relay_url_bytes for a given relay: + +1. Parse the URL as an absolute URL using the WHATWG URL Standard + parsing algorithm with no base URL. REJECT if parsing fails. +2. Inspect the parsed URL components: + a. REJECT if the scheme is not "wss". + b. REJECT if the URL contains userinfo (username or password). + c. REJECT if the URL contains a query string (search component is non-empty). +3. Serialize the parsed URL using the WHATWG URL Standard + serialization algorithm with the "exclude fragment" flag set to true. +4. UTF-8 encode the serialized string. The result is relay_url_bytes. +``` + +The WHATWG URL Standard is implemented by JavaScript `new URL()`, Rust's `url` crate, and equivalent libraries in most languages. Parse-then-serialize is idempotent — the same input always produces the same output. The standard handles scheme and hostname lowercasing, IDNA/punycode normalization, default port elision (443 for `wss`), path normalization, and IPv6 address normalization. + +Implementations MUST use a WHATWG-conformant URL parser. Generic URL libraries that implement RFC 3986 but not the WHATWG URL Standard may produce different canonical forms and will cause recovery failures. 
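
The three rejection rules and the most common normalizations can be illustrated in Python. Python's standard library has no WHATWG-conformant parser, so the sketch below is explicitly NOT a conforming implementation: it uses `urllib.parse` and only covers plain ASCII hostnames with simple paths. The function name `normalize_relay_url_sketch` is an assumption made for illustration; a real client should call a WHATWG parser as required above.

```python
from urllib.parse import urlsplit


def normalize_relay_url_sketch(url: str) -> bytes:
    """Approximate relay_url_bytes for simple ASCII relay URLs.

    Illustration only: no IDNA, IPv6, percent-encoding, or dot-segment
    handling, all of which a WHATWG-conformant parser performs.
    """
    parts = urlsplit(url)
    if parts.scheme != "wss":
        raise ValueError("scheme must be wss")
    if parts.username is not None or parts.password is not None:
        raise ValueError("userinfo is not allowed")
    if parts.query:
        raise ValueError("query string is not allowed")
    host = parts.hostname  # urlsplit lowercases the hostname
    if not host or ":" in host or not host.isascii():
        raise ValueError("sketch supports plain ASCII hostnames only")
    port = parts.port
    # Port 443 is the default for wss and is elided from the canonical form.
    netloc = host if port in (None, 443) else f"{host}:{port}"
    path = parts.path or "/"  # an empty path serializes as "/"
    # The fragment is dropped, matching "exclude fragment" serialization.
    return f"wss://{netloc}{path}".encode("utf-8")


if __name__ == "__main__":
    for candidate in ("wss://Relay.Example.COM", "wss://relay.example.com:443/",
                      "wss://relay.example.com:8080", "ws://relay.example.com"):
        try:
            print(candidate, "->", normalize_relay_url_sketch(candidate).decode())
        except ValueError as err:
            print(candidate, "-> REJECTED:", err)
```

Because the canonical bytes feed directly into HKDF `info` strings, any difference between two parsers on an unusual URL yields different d-tags, which is why the sketch refuses inputs it cannot handle rather than guessing.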
+ +**Normalization examples:** + +| Input | Canonical form (`relay_url_bytes`) | +|-------|-----| +| `wss://Relay.Example.COM` | `wss://relay.example.com/` | +| `wss://relay.example.com/` | `wss://relay.example.com/` | +| `wss://relay.example.com:443` | `wss://relay.example.com/` | +| `wss://relay.example.com:443/` | `wss://relay.example.com/` | +| `wss://relay.example.com:8080` | `wss://relay.example.com:8080/` | +| `wss://relay.example.com/v1` | `wss://relay.example.com/v1` | +| `wss://relay.example.com/v1/` | `wss://relay.example.com/v1/` | +| `wss://relay.example.com#frag` | `wss://relay.example.com/` | +| `wss://relay.example.com?q=1` | REJECTED (query string) | +| `ws://relay.example.com` | REJECTED (not wss) | +| `wss://user:pass@relay.example.com` | REJECTED (userinfo) | + +### Step 1: Determine N and D + +``` +pw_bytes = NFKC(password) # UTF-8 encoded +base = len(pw_bytes).to_bytes(2, 'big') ‖ pw_bytes ‖ pubkey_bytes +# Length prefix ensures injective encoding: distinct (password, pubkey) pairs +# always produce distinct base values. pubkey_bytes is 32 bytes (raw x-only, not hex). + +H = scrypt( + password = base, + salt = b"", + N = 2^SCRYPT_LOG_N, + r = SCRYPT_R, + p = SCRYPT_P, + dkLen = 32 +) +N = (H[0] % CHUNK_RANGE) + MIN_CHUNKS # result in [3, 16] + +H_d = scrypt( + password = base, + salt = b"dummies", + N = 2^SCRYPT_LOG_N, + r = SCRYPT_R, + p = SCRYPT_P, + dkLen = 32 +) +D = (H_d[0] % DUMMY_RANGE) + MIN_DUMMIES # result in [4, 12] +``` + +P is fixed at `PARITY_BLOBS = 2`. The total number of blobs in a backup set is `N + P + D`, ranging from 9 to 30. + +The empty salt for N derivation is intentional — this derivation exists solely to determine N and is not used for encryption. The `"dummies"` salt provides domain separation for D derivation. Each real and parity blob receives its own full-strength scrypt derivation in Step 3b. 
The pubkey is included in `base` (with a length-prefixed password for injective encoding) to ensure per-user uniqueness: identical passwords for different users produce completely unrelated N, D values and blob chains. + +Note: `H[0] % 14` and `H_d[0] % 9` have slight modular bias. This is acceptable for this use case. Implementations MAY use rejection sampling if strict uniformity is required. + +### Step 2: Derive the Master Encryption Key + +``` +pw_bytes = NFKC(password) +base = len(pw_bytes).to_bytes(2, 'big') ‖ pw_bytes ‖ pubkey_bytes + +H_enc = scrypt( + password = base, + salt = b"encrypt", + N = 2^SCRYPT_LOG_N, + r = SCRYPT_R, + p = SCRYPT_P, + dkLen = 32 +) +enc_key = HKDF-SHA256(ikm=H_enc, salt=b"", info=b"key", length=32) +``` + +`enc_key` is shared across all blobs in the backup set. It is derived once and used for all XChaCha20-Poly1305 operations. + +### Step 3: Split the Private Key into Chunks + +The raw 32-byte private key is split into N variable-length chunks using integer division: + +``` +remainder = 32 % N +base_len = 32 // N # integer division + +# Chunks 0..(remainder-1) are (base_len + 1) bytes. +# Chunks remainder..(N-1) are base_len bytes. +# Example: N=7 → 32 = 4×5 + 3×4 → chunks 0-3 are 5 bytes, chunks 4-6 are 4 bytes. + +offset = 0 +for i in 0..N-1: + chunk_len_i = base_len + 1 if i < remainder else base_len + chunk_i = nsec_bytes[offset : offset + chunk_len_i] + offset += chunk_len_i +``` + +### Step 3b: Derive Per-Blob Scrypt Keys (H_i) + +Before padding or RS encoding, derive the per-blob scrypt key `H_i` for each real and parity blob. These keys are used for deterministic padding (Step 3c), d-tags, and signing keys (Step 4). 
+ +``` +For each blob i in 0..N+P-1 (real chunks and parity): + pw_bytes = NFKC(password) + base_i = len(pw_bytes).to_bytes(2, 'big') ‖ pw_bytes ‖ pubkey_bytes ‖ to_string(i) + + H_i = scrypt( + password = base_i, + salt = b"", + N = 2^SCRYPT_LOG_N, + r = SCRYPT_R, + p = SCRYPT_P, + dkLen = 32 + ) +``` + +Each `H_i` is fully independent: different scrypt input, different output. Compromise of any `H_i` reveals nothing about any other blob's key material or the `enc_key`. + +### Step 3c: Pad Chunks and Compute Reed-Solomon Parity + +Using the `H_i` values from Step 3b, pad each chunk deterministically and compute P=2 parity rows using 16 parallel systematic Reed-Solomon codes over GF(2^8): + +``` +# Pad each chunk to CHUNK_PAD_LEN before RS encoding. +# Padding MUST be deterministic: derived from per-blob key material so that +# re-publication produces identical padded chunks and consistent RS parity. +# Use the same padded values that will be encrypted in Step 5. +for i in 0..N-1: + pad_bytes_i = HKDF-SHA256(ikm=H_i, salt=b"", info=b"pad", length=CHUNK_PAD_LEN - len(chunk_i)) + padded_i = chunk_i ‖ pad_bytes_i + +# For each byte position b in 0..15: +# Treat padded_0[b], padded_1[b], ..., padded_{N-1}[b] as N data symbols. +# Encode using a systematic RS(N+2, N) code over GF(2^8). +# This produces 2 parity symbols for byte position b. +# parity_row_0[b] = first parity symbol +# parity_row_1[b] = second parity symbol + +# Result: parity_row_0 and parity_row_1, each 16 bytes. +# These are the plaintext payloads for the 2 parity blobs. +``` + +The RS code MUST use the following construction: GF(2^8) with the irreducible polynomial `x^8 + x^4 + x^3 + x + 1` (0x11B, the AES polynomial). Evaluation points for the N+2 codeword positions are `α^0, α^1, ..., α^{N+1}` where `α = 0x03` is a primitive element of GF(2^8) under 0x11B (i.e., `0x03` generates the full multiplicative group of order 255). The first N positions are systematic (data), the last 2 are parity. 
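
The field and evaluation points described above can be exercised with a short Python sketch. It obtains each parity symbol by interpolating the polynomial through the N data points and evaluating it at the two parity points, and it reuses the same interpolation for erasure recovery. All function names are illustrative, the code favors clarity over speed, and its outputs are not official test vectors.

```python
ALPHA = 0x03  # primitive element of GF(2^8) under the polynomial 0x11B


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_pow(a: int, e: int) -> int:
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return gf_pow(a, 254)  # a^254 = a^-1 because a^255 = 1


def points(n: int) -> list[int]:
    """Evaluation points alpha^0 .. alpha^(n+1) for the n+2 codeword positions."""
    return [gf_pow(ALPHA, k) for k in range(n + 2)]


def interpolate_at(xs: list[int], ys: list[int], x: int) -> int:
    """Lagrange-evaluate at x the polynomial through (xs, ys); subtraction is XOR."""
    total = 0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for m, xm in enumerate(xs):
            if m != j:
                num = gf_mul(num, x ^ xm)
                den = gf_mul(den, xj ^ xm)
        total ^= gf_mul(yj, gf_mul(num, gf_inv(den)))
    return total


def rs_parity(padded: list[bytes]) -> tuple[bytes, bytes]:
    """Compute the two 16-byte parity rows across N padded 16-byte chunks."""
    n = len(padded)
    pts = points(n)
    rows = (bytearray(16), bytearray(16))
    for b in range(16):
        data = [chunk[b] for chunk in padded]
        for k in range(2):
            rows[k][b] = interpolate_at(pts[:n], data, pts[n + k])
    return bytes(rows[0]), bytes(rows[1])


def rs_recover(known: dict[int, bytes], n: int) -> list[bytes]:
    """Rebuild all n data rows from any n known rows (positions 0..n+1)."""
    use = sorted(known)[:n]
    if len(use) < n:
        raise ValueError("too many erasures")
    pts = points(n)
    xs = [pts[p] for p in use]
    out = []
    for i in range(n):
        if i in known:
            out.append(known[i])
        else:
            out.append(bytes(interpolate_at(xs, [known[p][b] for p in use], pts[i])
                             for b in range(16)))
    return out


if __name__ == "__main__":
    demo = [bytes((i * 17 + b * 3) % 256 for b in range(16)) for i in range(5)]
    p0, p1 = rs_parity(demo)
    survivors = {0: demo[0], 2: demo[2], 4: demo[4], 5: p0, 6: p1}
    print(rs_recover(survivors, 5) == demo)
```

With N at most 16, the largest exponent used is 17, well below the group order of 255, so all N+2 evaluation points are distinct.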
+ +Concretely, the encoding for each byte position `b` in `0..15`: +- Let `d_0, d_1, ..., d_{N-1}` be the data symbols (byte `b` of each padded chunk). +- Evaluate the unique polynomial of degree `N-1` passing through `(α^0, d_0), (α^1, d_1), ..., (α^{N-1}, d_{N-1})` at the parity points `α^N` and `α^{N+1}`. +- `parity_row_0[b] = P(α^N)`, `parity_row_1[b] = P(α^{N+1})`. + +Erasure decoding: given any N of the N+2 symbols (data + parity) at known positions, reconstruct the degree-(N-1) polynomial via Lagrange interpolation over GF(2^8) and evaluate at the missing positions. + +Implementations MUST include test vectors (see §Implementation Notes). + +Note: The deterministic padding bytes derived here MUST be the same bytes encrypted in Step 5. Because padding is derived from per-blob key material (not random), re-publication always produces the same padded chunks and therefore the same RS parity — enabling safe partial re-publication of missing blobs without invalidating the backup set. + +### Step 3d: Derive Cover Key for Dummy Blobs + +``` +H_cover = scrypt( + password = base, + salt = b"cover", + N = 2^SCRYPT_LOG_N, + r = SCRYPT_R, + p = SCRYPT_P, + dkLen = 32 +) +``` + +`H_cover` is used to derive d-tags and signing keys for all D dummy blobs via HKDF (no per-dummy scrypt call). This keeps the scrypt budget low while producing indistinguishable dummy blob metadata. + +### Step 4: Derive D-Tags and Signing Keys + +#### Real chunk blobs (indices 0..N-1) and parity blobs (indices N..N+1) + +Using the `H_i` values from Step 3b, derive d-tags and signing keys for each real and parity blob: + +``` +For each blob i in 0..N+P-1: + d_tag_i = hex(HKDF-SHA256(ikm=H_i, salt=b"", info=b"d-tag" ‖ relay_url_bytes, length=32)) + + signing_secret_i = HKDF-SHA256(ikm=H_i, salt=b"", info=b"signing-key" ‖ relay_url_bytes, length=32) + # Interpret signing_secret_i as a 256-bit big-endian unsigned integer. 
+ # If the value is zero or ≥ secp256k1 order n, REJECT and re-derive: + # info=b"signing-key-1" ‖ relay_url_bytes, then b"signing-key-2" ‖ relay_url_bytes, etc. + # Do NOT reduce mod n (reject-and-retry avoids modular bias). + # Implementations MUST retry up to 255 times. If all attempts produce + # an invalid scalar, the backup MUST fail. + # (Probability of even one retry: ~3.7×10^-39. This will never happen.) + signing_keypair_i = keypair_from_secret(signing_secret_i) +``` + +Parity blobs (indices N and N+1) use the same derivation as real chunks. They carry real recovery data and deserve the same per-blob scrypt isolation. + +#### Dummy blobs (indices 0..D-1) + +Dummy blob keys are derived from `H_cover` via HKDF, not individual scrypt calls: + +``` +For each dummy j in 0..D-1: + d_tag_dummy_j = hex(HKDF-SHA256(ikm=H_cover, salt=b"", + info=b"dummy-d-tag-" ‖ to_string(j) ‖ relay_url_bytes, length=32)) + + signing_secret_dummy_j = HKDF-SHA256(ikm=H_cover, salt=b"", + info=b"dummy-signing-key-" ‖ to_string(j) ‖ relay_url_bytes, length=32) + # Interpret signing_secret_dummy_j as a 256-bit big-endian unsigned integer. + # If the value is zero or ≥ secp256k1 order n, REJECT and re-derive: + # info=b"dummy-signing-key-" ‖ to_string(j) ‖ b"-1" ‖ relay_url_bytes, + # then b"dummy-signing-key-" ‖ to_string(j) ‖ b"-2" ‖ relay_url_bytes, etc. + # Do NOT reduce mod n (reject-and-retry avoids modular bias). + # Implementations MUST retry up to 255 times. If all attempts produce + # an invalid scalar, the backup MUST fail. + # (Probability of even one retry: ~3.7×10^-39. This will never happen.) + signing_keypair_dummy_j = keypair_from_secret(signing_secret_dummy_j) +``` + +Dummy blobs are indistinguishable from real and parity blobs on the wire. Their d-tags and signing keys are unrelated to those of real blobs. 
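
The Step 4 derivations can be sketched with Python's standard library alone. HKDF-SHA256 is written out following RFC 5869; the helper names are illustrative, and `h_demo` is a SHA-256 placeholder standing in for a real per-blob scrypt output so that the example runs quickly.

```python
import hashlib
import hmac

# Order of the secp256k1 group.
SECP256K1_N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def hkdf_sha256(ikm: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """HKDF-SHA256 (RFC 5869): extract a PRK, then expand to `length` bytes."""
    if length > 255 * 32:
        raise ValueError("requested output too long for HKDF-SHA256")
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_d_tag(h_i: bytes, relay_url_bytes: bytes) -> str:
    """Relay-scoped d-tag for a real or parity blob (lowercase hex)."""
    return hkdf_sha256(h_i, b"d-tag" + relay_url_bytes, 32).hex()


def derive_dummy_d_tag(h_cover: bytes, j: int, relay_url_bytes: bytes) -> str:
    """Relay-scoped d-tag for dummy blob j, derived from H_cover."""
    info = b"dummy-d-tag-" + str(j).encode("ascii") + relay_url_bytes
    return hkdf_sha256(h_cover, info, 32).hex()


def derive_signing_secret(h_i: bytes, relay_url_bytes: bytes) -> bytes:
    """Reject-and-retry scalar derivation: one initial attempt plus 255 retries."""
    for attempt in range(256):
        label = b"signing-key" if attempt == 0 else b"signing-key-%d" % attempt
        secret = hkdf_sha256(h_i, label + relay_url_bytes, 32)
        if 0 < int.from_bytes(secret, "big") < SECP256K1_N:
            return secret  # valid scalar; never reduced mod n
    raise RuntimeError("no valid signing scalar found; backup must fail")


if __name__ == "__main__":
    # Placeholder for H_i; a real implementation uses the per-blob scrypt output.
    h_demo = hashlib.sha256(b"placeholder for a per-blob scrypt output").digest()
    relay = b"wss://relay.example.com/"
    print(derive_d_tag(h_demo, relay))
    print(derive_signing_secret(h_demo, relay).hex())
```

Turning the returned secret into a keypair requires a secp256k1 library, which is outside the standard library and omitted here.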
+ +### Step 5: Encrypt and Publish + +For each blob (real, parity, or dummy), prepare the 16-byte plaintext payload: + +``` +# Real chunk blobs (i in 0..N-1): +padded_i = chunk_i ‖ pad_bytes_i + # Deterministic padding from Step 3c (HKDF-derived, NOT random). + # Ensures re-publication produces identical padded chunks and + # consistent RS parity across generations. + +# Parity blobs (i in N..N+1): +padded_i = parity_row_{i-N} # 16 bytes from RS encoding (Step 3c) + +# Dummy blobs (j in 0..D-1): +padded_j = HKDF-SHA256(ikm=H_cover, salt=b"", info=b"dummy-pad-" ‖ to_string(j), length=CHUNK_PAD_LEN) + # Deterministic dummy payload; after encryption it is indistinguishable from real and parity blobs. + # Deterministic so the dummy plaintext is identical on every re-publication. +``` + +Encrypt each payload identically: + +``` +nonce = random(24) # MUST be fresh cryptographically random bytes per blob +ciphertext = XChaCha20-Poly1305.encrypt( + key = enc_key, + nonce = nonce, + plaintext = padded, # 16 bytes (chunk, parity, or dummy filler) + aad = b"\x02" # key_security_byte per NIP-49 +) +# ciphertext = 16 bytes plaintext + 16 bytes Poly1305 tag = 32 bytes +blob_content = nonce ‖ ciphertext # 24 + 32 = 56 bytes, constant for ALL blob types +``` + +All N+P+D blobs produce content of exactly 56 bytes regardless of type. After encryption, real chunks, parity blobs, and dummies are indistinguishable. + +Implementations MUST use fresh random 24-byte nonces for each blob. Deterministic nonces are not permitted. The random nonce ensures that re-running backup with the same password produces completely different ciphertext, preventing clustering attacks. + +Collect all N+P+D blobs and publish them as NIP-01 events (see §Event Structure). + +Implementations MUST shuffle all N+P+D blobs into random order before publication. Publishing in index order would reveal blob roles to a timing observer. + +Implementations SHOULD publish blobs with random delays of 100ms–2s between events to prevent timing correlation. 
Implementations MAY use longer delays (minutes, hours, or days) for stronger steganographic cover. + +Implementations SHOULD jitter `created_at` timestamps within ±1 hour of the current time on initial publication. On re-publication (health-check refresh), implementations MUST use a `created_at` strictly greater than the existing event's timestamp to ensure the relay accepts the replacement per NIP-33 semantics. + +Implementations SHOULD publish to at least 2 relays for redundancy. + +Implementations SHOULD periodically verify blob existence (for example, on login) and re-publish any missing blobs. Because chunk padding and dummy payloads are deterministic (derived from key material, not random), re-publication of individual blobs is safe — the plaintext is identical across generations, so RS parity remains consistent even if only a subset of blobs is refreshed. Only the ciphertext changes (fresh random nonce), which prevents clustering attacks. + +### Recovery + +``` +1. User provides: password, pubkey (npub or hex), relay URL(s) + # The relay URL is a derivation input, not just a query target. + # Each relay produces different d-tags and signing keys. + # If the user doesn't remember which relay, the client can + # silently try each relay in the user's relay list — wrong + # relay URLs produce d-tags that match nothing (no harm). + +2. pw_bytes = NFKC(password) + base = len(pw_bytes).to_bytes(2, 'big') ‖ pw_bytes ‖ pubkey_bytes + +3. Derive parameters: + H = scrypt(base, salt="") → N = (H[0] % 14) + 3 + H_d = scrypt(base, salt="dummies") → D = (H_d[0] % 9) + 4 + H_enc = scrypt(base, salt="encrypt") → enc_key = HKDF(H_enc, "key") + H_cover = scrypt(base, salt="cover") (for dummy d-tags) + P = 2 + +4. 
Normalize the relay URL (see §Relay URL Normalization): + relay_url_bytes = UTF-8(WHATWG_normalize(relay_url)) + + Derive d-tags and signing pubkeys for all N+P+D blobs: + + For real and parity blobs (i in 0..N+P-1): + H_i = scrypt(base ‖ to_string(i), salt="") + d_tag_i = hex(HKDF(H_i, "d-tag" ‖ relay_url_bytes)) + signing_secret_i = HKDF(H_i, info="signing-key" ‖ relay_url_bytes, length=32) + # Reject-and-retry if zero or ≥ n (identical to Step 4) + signing_pubkey_i = pubkey_from_secret(signing_secret_i) + + For dummy blobs (j in 0..D-1): + d_tag_dummy_j = hex(HKDF(H_cover, "dummy-d-tag-" ‖ to_string(j) ‖ relay_url_bytes)) + signing_secret_dummy_j = HKDF(H_cover, "dummy-signing-key-" ‖ to_string(j) ‖ relay_url_bytes) + signing_pubkey_dummy_j = pubkey_from_secret(signing_secret_dummy_j) + +5. Collect all N+P+D (d-tag, signing_pubkey) pairs. + Shuffle into random order. + +6. Query relay for each d-tag with jittered delays: + For each (d_tag, expected_pubkey) in shuffled order: + REQ { "kinds": [30078], "#d": [d_tag] } + # NOTE: query by d-tag only, not by authors. + # Validate event.pubkey == expected_pubkey client-side (reject impostors). + # Validate event.id and event.sig per NIP-01 (reject forgeries). + # + # d-tag squatting mitigation: a third party who learns a blob's d-tag + # can publish many events with the same d-tag under different pubkeys, + # potentially pushing the legitimate event out of truncated result sets. + # Implementations MUST paginate through all results (to EOSE) before + # concluding a blob is missing. If the relay truncates results and the + # expected pubkey is not found, implementations SHOULD retry with a + # more specific filter: { "kinds": [30078], "#d": [d_tag], "authors": [expected_pubkey] }. + # This fallback reveals the expected pubkey to the relay but prevents + # d-tag squatting from causing false erasures. + + Implementations SHOULD introduce random delays of 100ms–2s between + queries to prevent timing correlation. 
Implementations MAY spread + recovery queries across multiple relay connections or sessions for + stronger cover. + + Under normal conditions, all N+P+D queries return events. If a query + returns no event, that blob is marked as an erasure. Dummy blob + erasures are ignored. Real or parity blob erasures are tolerated + up to P (2) total; beyond that, recovery fails. + +7. Separate results by role (client knows which indices are real, parity, dummy): + - Discard dummy blob results (encrypted filler, not nsec material) + - Decrypt real chunk blobs and parity blobs: + + For each real/parity blob: + raw = base64_decode(event.content) # 56 bytes + nonce = raw[0:24] + ciphertext = raw[24:56] + padded = XChaCha20-Poly1305.decrypt(enc_key, nonce, ciphertext, aad=b"\x02") + +8. Reassemble the private key: + a. If all N real chunks present: + For each real chunk i in 0..N-1: + chunk_len_i = base_len + 1 if i < remainder else base_len + chunk_i = padded_i[0 : chunk_len_i] + nsec_bytes = chunk_0 ‖ chunk_1 ‖ … ‖ chunk_{N-1} + + b. If up to P (2) blobs are missing from the N+P real-and-parity set: + The RS(N+2, N) code is an MDS code: any N of the N+2 symbols + (real chunks + parity) suffice to reconstruct all N data symbols. + Missing blobs may be any combination of real and parity blobs + (e.g., 2 real missing, or 1 real + 1 parity missing, or 2 parity + missing — all are recoverable). + Use Lagrange interpolation over GF(2^8) at the known N positions + to reconstruct the degree-(N-1) polynomial, then evaluate at the + missing positions to recover the missing padded chunks. + Extract chunks from reconstructed padded blocks. + nsec_bytes = chunk_0 ‖ chunk_1 ‖ … ‖ chunk_{N-1} + + c. If more than P (2) blobs are missing from the N+P set: + Recovery MUST fail. Surface error: "Too many blobs missing + ({missing_count} missing, maximum tolerated: {P}). Check relay + URL or re-publish backup." + +9. Validate the recovered nsec_bytes: + a. 
Check nsec_bytes is a valid secp256k1 scalar: interpret as a 256-bit + big-endian unsigned integer; MUST be in range [1, n-1] where n is the + secp256k1 group order. If not → wrong password. + b. Derive pubkey from nsec_bytes. + c. If derived pubkey == provided pubkey → recovery successful. + If not → wrong password (or corrupted blob). Do not use the key. +``` + +Total scrypt calls at recovery: 4 (for N, D, enc_key, cover) + N+P (for real and parity blob tags) = N+6. +At N=8: 14 scrypt calls. At approximately 1 second each on consumer hardware: approximately 14 seconds. This is acceptable for a one-time recovery operation. Dummy blob d-tags are derived via HKDF from the cover key and add negligible cost. + +### Password Rotation + +``` +1. Enter old password → recover nsec (full recovery flow above) +2. Enter new password → run full backup flow (new N, P, D, new blobs, new throwaway keys) +3. Delete ALL old blobs (real + parity + dummy) on EACH relay: + + Re-derive old N, P, D, and H_cover from old password + pubkey. + + For each relay URL that held old blobs: + relay_url_bytes = UTF-8(WHATWG_normalize(relay_url)) + + For each old real/parity blob i in 0..(old_N+P)-1: + Re-derive old_H_i from old password + pubkey + i (Step 3b) + Re-derive old d_tag_i and old signing_keypair_i using relay_url_bytes + Publish a NIP-09 kind:5 deletion event: + { + "kind": 5, + "pubkey": old_signing_keypair_i.public_key, + "tags": [ + ["a", "30078:<old_signing_pubkey_i>:<old_d_tag_i>"] + ], + "content": "", + ... + } + signed by old_signing_keypair_i + + For each old dummy blob j in 0..old_D-1: + Re-derive old dummy signing_keypair_j and d_tag_j from old H_cover + using relay_url_bytes + Publish a NIP-09 kind:5 deletion event (same format as above) + signed by old dummy signing_keypair_j +``` + +Deletion uses NIP-09 `a`-tag targeting (referencing the parameterized replaceable event by `kind:pubkey:d-tag`). Each old blob requires its own deletion event signed by that blob's throwaway key, so there is one deletion per blob.
Because d-tags and signing keys are relay-scoped (see §Relay URL Normalization), deletion MUST be performed per-relay with the correct `relay_url_bytes` for each relay. + +This works because all signing keys are deterministically derived from `password ‖ pubkey ‖ index ‖ relay_url` (for real/parity blobs) or `password ‖ pubkey ‖ relay_url` (for dummy blobs via the cover key), so they can be reconstructed from the old password, pubkey, and relay URL at any time. + +Note: deletion is best-effort. Relays MAY ignore `kind:5` deletions, and old blobs may persist in relay archives. Since the nsec has not changed (only the backup encryption changed), old blobs still decrypt to the valid nsec with the old password. If the old password was compromised, the user SHOULD rotate their nsec entirely (a separate concern outside the scope of this NIP). + +### Memory Safety + +Implementations MUST zero sensitive memory after use. This includes: the password string, nsec bytes, enc_key, H_cover, all H_i values, all signing_secret_i values, all chunk_i values, and all parity row values. Implementations SHOULD use a dedicated zeroing primitive (e.g., `zeroize` in Rust) rather than relying on language runtime garbage collection. + +## Event Structure + +Each backup blob is a standard NIP-01 event with the following structure: + +```jsonc +{ + "id": "<event id, computed per NIP-01>", + "pubkey": "<signing_pubkey_i, hex>", + "kind": 30078, + "created_at": <unix timestamp>, + "tags": [ + ["d", "<d_tag_i, 64-character hex>"], + ["alt", "application data"] + ], + "content": "<base64 of nonce ‖ ciphertext, 56 bytes>", + "sig": "<Schnorr signature by signing_keypair_i>" +} +``` + +- `pubkey`: the throwaway signing public key for blob `i`. Computationally unlinkable to the user's real identity without the password. +- `kind`: `30078` (NIP-78 application-specific data, NIP-33 parameterized replaceable event). +- `tags[d]`: the derived d-tag for blob `i`. Indistinguishable from random 64-character hex. +- `tags[alt]`: the literal string `"application data"`.
This is the standard NIP-31 alt tag for `kind:30078` and provides steganographic cover — it is identical to any other `kind:30078` event. +- `content`: base64-encoded 56-byte blob: 24-byte random nonce followed by 32-byte authenticated ciphertext. +- `sig`: Schnorr signature by `signing_keypair_i` over the NIP-01 event hash. + +The `content` field MUST be 76 characters of base64 (56 bytes; includes one `=` padding character since `56 mod 3 = 2`). Implementations MUST reject blobs whose decoded content is not exactly 56 bytes. + +No field in any blob contains or reveals the user's real pubkey. While the user's pubkey is an input to the KDF chain, the outputs (throwaway signing keys, d-tags, ciphertext) are computationally unlinkable to it without the password. The throwaway signing keys are the only pubkeys visible to the relay. + +## Event Validation + +If a relay returns multiple events for a single `#d` query, implementations MUST apply the full validation pipeline (steps 1–7 below) to each candidate event, discard all invalid events, and then select the valid event with the highest `created_at` timestamp. This "validate-then-select" ordering is critical: a malicious relay could inject a newer event with the correct `pubkey` but malformed content; selecting by `created_at` before validation would cause the client to ignore an older valid blob and count an unnecessary erasure. Events from pubkeys other than the locally derived `signing_pubkey_i` MUST be silently discarded regardless of their `created_at` — NIP-33 replaceability is scoped by `(kind, pubkey, d-tag)`, not by `d-tag` alone. + +Before processing any `kind:30078` event as a backup blob during recovery, implementations MUST apply the following validation steps to each candidate event: + +1. Validate the event `id` and `sig` per [NIP-01](01.md). Events with invalid IDs or signatures MUST be silently discarded. +2. 
Validate that `pubkey` is a valid, non-zero secp256k1 curve point per [BIP-340](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki). +3. Validate that `event.pubkey` matches the locally derived `signing_pubkey_i` for the queried blob index `i`. Events whose pubkey does not match MUST be silently discarded. This guards against relay-injected impostor events and d-tag squatting attacks. +4. Validate that `event.kind` is `30078`. +5. Validate that the event contains a `d` tag whose value matches the locally derived `d_tag_i`. Events with a mismatched d-tag MUST be silently discarded. +6. Validate that `event.content` is valid base64 and decodes to exactly 56 bytes. Events with content of any other length MUST be silently discarded. +7. Decrypt `event.content` using XChaCha20-Poly1305 with `enc_key`, the 24-byte nonce (first 24 bytes of decoded content), and AAD `0x02`. If decryption fails (authentication tag mismatch), the event MUST be silently discarded (not selected, even if it has the highest `created_at`). +8. Among all events that pass steps 1–7, select the one with the highest `created_at`. If no events pass, the blob is an erasure. + +After reassembly (§Recovery step 8–9), validate that the recovered `nsec_bytes` produces a pubkey matching the pubkey provided by the user. If not, the recovery MUST be rejected and the recovered key MUST NOT be used. + +Events that fail any of validation steps 1–7 MUST be silently discarded. Only events that pass all seven steps are candidates for selection (step 8). Implementations MUST NOT reveal validation failure details to the relay. + +**Erasure model:** A real or parity blob is an erasure if no event passes all validation steps 1–7 (missing from relay, all candidates invalid, or all candidates fail decryption). If the total number of erasures among the N+P real-and-parity blobs exceeds P (2), recovery MUST fail. 
Implementations SHOULD surface a clear error: "Too many blobs missing or corrupted ({count} erasures, maximum tolerated: {P}). Check relay URL or re-publish backup." + +Missing or corrupted dummy blobs do not affect recovery. Implementations SHOULD re-publish missing dummies to maintain steganographic cover. + +## Security Analysis + +### Adversary Classes + +NIP-SB's privacy properties vary by adversary. The table below separates the two properties — **unlinkability** (cannot determine which user a blob belongs to) and **steganographic cover** (cannot determine that a blob is a backup at all) — for each adversary class: + +| Adversary | What they observe | Unlinkability | Steganographic cover | +|-----------|-------------------|---------------|----------------------| +| **External network observer** (ISP, state actor) | TLS-encrypted WebSocket frames to a relay | **Strong under TLS confidentiality.** Cannot see event content, pubkeys, or d-tags. However, traffic analysis (packet counts, sizes, burst timing) may reveal that a backup/recovery session is occurring, even though the observer cannot determine the user or contents. | **Environment-dependent.** TLS hides content, but traffic shape (N+P+D events in a burst) may be distinguishable from normal Nostr usage patterns. | +| **Passive relay-dump adversary** (database leak, subpoena, bulk export) | `kind:30078` events with random d-tags, throwaway pubkeys, constant-size content | **Strong (computational).** No field in any blob references the user's real pubkey. Cannot link blobs to users or to each other. Cannot batch-crack. Holds under standard assumptions (one-wayness of scrypt/HKDF). | **Environment-dependent.** Blobs share the same event structure as other `kind:30078` data. Effectiveness depends on ambient traffic distribution and has not been empirically validated against a statistical classifier. 
| +| **Active relay operator** (timing, IP, session metadata, multi-snapshot) | Event insertion timing, query patterns, IP addresses, database snapshots over time | **Strong at the data layer; degraded at the network layer.** Cannot link blob metadata to a Nostr pubkey without the password (computational). However, can correlate blobs published or queried from the same IP/session within a short time window, grouping them into backup sets and associating them with a client identity. | **Weak.** A burst of N+P+D `EVENT` messages from one connection, or N+P+D `REQ` subscriptions during recovery, is a distinctive pattern regardless of ambient traffic. Jitter and delays help but do not eliminate the signal. Use Tor or a relay proxy to mitigate. | + +*Adversary classes adapted from the taxonomy in [SoK: Plausibly Deniable Storage](https://arxiv.org/abs/2111.12809) (Chen et al., 2021), mapped from disk storage to Nostr's relay architecture.* + +The security analysis below evaluates each threat against the relevant adversary class. + +### Threat: Multi-target accumulation (NIP-49's concern) + +**Substantially mitigated.** This is the primary security property of the scheme. + +With NIP-49, an attacker who dumps a relay can grep for `ncryptsec1` and instantly build a list of every user's encrypted backup. They then try one password against all blobs simultaneously — the cost is `|passwords| × 1 scrypt`, tested against all targets in parallel. + +With this NIP, the attacker sees thousands of `kind:30078` events from unrelated throwaway pubkeys with random-looking d-tags and constant-size content. **No field in any blob contains or reveals the user's real pubkey — the KDF outputs are computationally unlinkable to it without the password.** The throwaway signing keys sever the connection between the backup and the user entirely. 
+ +The cryptographic unlinkability property means the attacker cannot: +- Determine which user any blob belongs to (no field references a real pubkey) +- Link any blob to any other blob (each has a different throwaway pubkey and an unrelated d-tag) +- Build a list of backup targets for cheap batch cracking +- Amortize a password guess across multiple users (each guess is bound to one pubkey) + +As a secondary benefit, steganographic cover means the attacker may also be unable to: +- Identify which `kind:30078` events are backup blobs (versus Cashu wallets, app settings, drafts) +- Confirm whether a specific user has a backup at all + +The steganographic benefit is environment-dependent and has not been empirically validated against a statistical classifier (see §Limitations). **The security argument does not depend on it.** Even if an attacker can classify blobs as probable backups, the unlinkability property prevents them from determining whose backups they are or batch-cracking them. + +To attack a specific user P, the attacker must already know P and then guess passwords. The rejection cost is `1× scrypt` per wrong guess (derive `H_0`, compute `d_tag_0`, check dump — if miss, stop). The verification cost for a correct guess is `(N+2)× scrypt`. Since wrong guesses dominate brute-force search, the effective per-guess cost is `1× scrypt` — same as NIP-49 for targeted attacks. **The security improvement is in the batch scenario:** to attack "any user," the cost is `|users| × |passwords| × scrypt` (each guess bound to one pubkey), multiplying the NIP-49 accumulation cost by `|users|`. + +**Active relay operator caveat:** A relay operator with connection logs can correlate blobs published or queried from the same IP address within a short time window, grouping them into backup sets and potentially associating them with a client identity — even though the operator cannot link the blobs to a specific Nostr pubkey without the password. 
Per-blob isolation holds at the database layer but degrades at the network layer. Implementations that require protection against active relay operators SHOULD publish blobs over separate connections with substantial time separation, ideally via Tor or a relay proxy. + +### Threat: Full relay database dump + +The attacker has all events but cannot identify which events are backup blobs. No field in any blob references a real user pubkey. The throwaway signing keys are unrelated to any known identity. The d-tags are indistinguishable from any other `kind:30078` application data. + +To attack a **specific known user** P: +1. `scrypt(password ‖ P)` → N (one scrypt call) +2. For i in 0..N-1: `scrypt(password ‖ P ‖ i)` → d_tag_i (N scrypt calls) +3. Search dump for events matching d_tag_i (cheap, indexed lookup) +4. If all N found: reassemble, derive enc_key, decrypt, validate + +Cost to **reject** a wrong guess for one target: `1× scrypt` (derive `H_0`, compute `d_tag_0`, check if it exists in the dump — if not, stop). This is the same rejection cost as NIP-49 for targeted attacks. The `(N+2)× scrypt` cost applies only to **verify** a correct guess (all N+P blob derivations), but correct guesses are astronomically rare in a brute-force search, so the rejection cost dominates. + +**The real security improvement is accumulation resistance, not per-target cost amplification.** With NIP-49, the attacker pays `1× scrypt` per password guess and tests that guess against *all* blobs simultaneously (batch cracking). With this NIP, each guess is bound to a specific pubkey — the attacker must pay `1× scrypt × |users|` per password guess to test all users. To attack **any user**: `|users| × |passwords| × scrypt` (rejection-dominated). For a relay with 10,000 users, the attacker's total work is multiplied by 10,000× compared to NIP-49's batch attack. 
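The accumulation argument above reduces to simple counting. The following is a minimal sketch of that arithmetic using the cost model stated in this section; the function names and the example figures (10,000 users, one million candidate passwords) are illustrative assumptions, not measurements.

```python
def nip49_batch_scrypt_calls(password_guesses: int) -> int:
    """Cost model from this section: one scrypt per guess, tested against all blobs."""
    return password_guesses


def nipsb_batch_scrypt_calls(users: int, password_guesses: int) -> int:
    """Each guess is bound to one pubkey, so it must be repeated for every user."""
    return users * password_guesses


def nipsb_verify_scrypt_calls(n: int, p: int = 2) -> int:
    """scrypt calls to derive all N+P real/parity blobs once a guess looks correct."""
    return n + p


if __name__ == "__main__":
    users, guesses = 10_000, 1_000_000  # assumed example figures
    print("NIP-49 batch:", nip49_batch_scrypt_calls(guesses))
    print("NIP-SB batch:", nipsb_batch_scrypt_calls(users, guesses))
    print("NIP-SB verify at N=8:", nipsb_verify_scrypt_calls(8))
```

Rejection of a wrong guess still costs a single scrypt call per target, so the multiplier applies only when the attacker does not already know which user to attack.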
+ +### Threat: Blob content size analysis + +**Eliminated.** All blobs are exactly 56 bytes: 24-byte random nonce + 16-byte padded-and-encrypted chunk + 16-byte Poly1305 tag. Padding is deterministic HKDF-derived bytes, encrypted alongside the chunk — indistinguishable from ciphertext after encryption with a random nonce. An attacker cannot infer N, chunk sizes, or the total key size from content lengths. + +### Threat: Content-matching / clustering attack + +**Content-based clustering eliminated; cross-relay metadata clustering eliminated by relay-scoped derivation.** Each blob uses a fresh random 24-byte nonce, so re-running backup with the same password produces completely different ciphertext. An attacker cannot cluster events by content alone. Note: within a single relay, `(pubkey, d-tag)` tuples are intentionally stable across re-publications (NIP-33 replacement semantics), and `created_at` timestamps are observable. Content-based clustering is prevented; metadata-based same-relay persistence is by design. + +The throwaway signing keys and d-tags are deterministic for a given `password ‖ pubkey ‖ index ‖ relay_url`. Because the relay URL is mixed into the HKDF `info` parameter (see §Relay URL Normalization), the same backup published to different relays produces completely different `(pubkey, d-tag)` tuples on each relay. An attacker with dumps from multiple relays cannot intersect metadata to identify the same backup set across relays. + +Within a single relay, the `(pubkey, d-tag)` tuples are stable across re-publications and health checks — this is by design, as the d-tag is the address by which the blob is found during recovery (NIP-33 parameterized replaceable events update in place). An attacker with multiple snapshots of the same relay can observe that a blob persists, but cannot link it to blobs on other relays or to the user's real identity. 
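To make the relay-scoping concrete, here is a minimal sketch of how mixing the relay URL into the HKDF `info` parameter yields unrelated d-tags per relay. It assumes an empty HKDF salt and a 32-byte output (matching the 64-character hex d-tag); the value `H_I` is a stand-in for the scrypt output, and both relay URLs are hypothetical and assumed to be already normalized.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF-SHA256 (RFC 5869): extract a PRK from ikm, then expand it with info."""
    # RFC 5869 treats an absent salt as HashLen zero bytes.
    prk = hmac.new(salt or b"\x00" * 32, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def d_tag(h_i: bytes, relay_url_bytes: bytes) -> str:
    """64-character hex d-tag, scoped to one relay through the HKDF info parameter."""
    return hkdf_sha256(h_i, b"d-tag" + relay_url_bytes).hex()


# Stand-in for H_i; a real client derives it with scrypt from password, pubkey, and index.
H_I = hashlib.sha256(b"stand-in for scrypt output H_i").digest()
RELAY_A = b"wss://relay-a.example/"  # hypothetical relay, assumed normalized
RELAY_B = b"wss://relay-b.example/"  # hypothetical relay, assumed normalized

if __name__ == "__main__":
    print("relay A:", d_tag(H_I, RELAY_A))
    print("relay B:", d_tag(H_I, RELAY_B))
```

The same `H_I` always maps to the same d-tag on a given relay, which is what lets recovery find the blob, while the two relays see values with no visible relationship to each other.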
+ +Note: N and D are derived from `password ‖ pubkey` without relay URL input, so the total blob count (N+P+D) is identical across all relays for the same user. An attacker with dumps from multiple relays could correlate backup sets by total count. However, the range is 9–30 (22 possible values), providing only weak evidence — many unrelated users will share the same total count. + +### Threat: Timing correlation + +If all N blobs are published simultaneously, an attacker could cluster events by timestamp. **Mitigation**: implementations SHOULD introduce random delays of 100ms–2s between blob publications and SHOULD jitter `created_at` timestamps within ±1 hour on initial publication. On re-publication (health-check refresh), implementations MUST use a `created_at` strictly greater than the existing event's timestamp to ensure the relay accepts the replacement per NIP-33 semantics. Failure to do so may cause the relay to silently reject the refresh, leaving stale blobs with outdated parity data. + +### Threat: Relay garbage collection of throwaway-key events + +Events from unknown pubkeys with no followers or profile are candidates for relay garbage collection. **Mitigation**: implementations SHOULD publish to at least 2 relays and SHOULD periodically verify blob existence. For corporate relays (e.g., Sprout), operators SHOULD pin `kind:30078` events to prevent GC. + +### Threat: Missing blobs + +Reed-Solomon parity (P=2) tolerates loss of up to 2 blobs from the N+P real-and-parity set. Loss of more than 2 blobs makes recovery impossible. **Mitigations**: multi-relay publication, periodic health checks on login, and relay pinning for managed deployments. + +Missing dummy blobs do not affect recovery — dummies are discarded during reassembly. However, implementations SHOULD re-publish missing dummies to maintain the full N+P+D blob set for steganographic cover. 
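The erasure rule above can be sketched as a small check. This is an illustration under assumptions: the function name and the dictionary layout (blob index mapped to decrypted payload, or `None` when no valid event was found) are hypothetical, and dummy blobs are simply never passed in because they do not affect recovery.

```python
from typing import Dict, List, Optional

PARITY_BLOBS = 2  # P is fixed at 2 in version 1


def erased_indices(blobs: Dict[int, Optional[bytes]], n: int, p: int = PARITY_BLOBS) -> List[int]:
    """Return the erased indices among real/parity blobs 0..n+p-1.

    Raises ValueError when more than p blobs are erased, since the
    Reed-Solomon code can then no longer reconstruct the data.
    """
    erased = [i for i in range(n + p) if blobs.get(i) is None]
    if len(erased) > p:
        raise ValueError(
            f"Too many blobs missing ({len(erased)} missing, maximum tolerated: {p})"
        )
    return erased


if __name__ == "__main__":
    # N=3 real chunks plus 2 parity blobs; indices 1 and 3 were not found.
    found = {0: b"a" * 16, 1: None, 2: b"c" * 16, 3: None, 4: b"e" * 16}
    print(erased_indices(found, n=3))
```

A client would run this after validation and decryption, then hand the erased positions to the Reed-Solomon decoder.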
+ +### Threat: Blob count analysis + +An attacker observing the relay database sees N+P+D events (range: 9–30) from unrelated throwaway pubkeys. The attacker cannot determine which blobs are real chunks, which are parity, and which are dummies — all three types are identical in format, size, and metadata. The variable total (driven by password-derived D) prevents the attacker from inferring N from the blob count. Even if the attacker suspects a backup exists, they cannot determine the number of real chunks without the password. + +### Threat: Recovery-time observation + +During recovery, the client queries the relay for N+P+D d-tags in random order with jittered delays. Under normal conditions, all queries return events. If some blobs have been garbage-collected or corrupted, those queries return no event or fail AEAD validation — both are treated as erasures, tolerable up to P=2 (see §Event Validation). The relay sees a variable-size batch of d-tag lookups, most or all returning `kind:30078` events. + +However, an active relay operator with network-layer visibility (IP, session, timing) may be able to correlate the query burst with a recovery attempt. **Mitigations**: implementations SHOULD jitter recovery queries with random delays of 100ms–2s. Implementations MAY spread queries across multiple relay connections, sessions, or relays. Implementations MAY use Tor or a proxy for recovery to prevent IP correlation. + +Note: even if the relay identifies a recovery attempt, the d-tags and throwaway pubkeys are unlinkable to any Nostr identity without the password. However, an active relay operator with IP/session visibility may be able to identify the *client* (by IP address or authenticated session), even though they cannot link the backup blobs to a specific Nostr pubkey. Implementations SHOULD use Tor or a proxy for recovery to mitigate this. + +### Threat: Password weakness + +Same as any password-based scheme. 
**Mitigation**: implementations MUST enforce minimum password entropy of 80 bits (see §Password Requirements). The specific entropy estimation method is implementation-defined. Implementations SHOULD recommend generated passphrases of seven or more words from a standard wordlist (e.g., EFF large wordlist at ≥90 bits for 7 words). + +### Threat: Known plaintext structure + +An attacker knows the plaintext is a 32-byte secp256k1 private key. This is irrelevant — XChaCha20-Poly1305 is IND-CPA secure regardless of plaintext structure. + +### Cost Comparison + +| | NIP-49 single blob | This NIP (N=8, P=2, D=8) | +|---|---|---| +| Attacker cost: targeted rejection (1 user) | 1× scrypt per guess | 1× scrypt per guess (early-exit on first d-tag miss) | +| Attacker cost: targeted verification | 1× scrypt per guess | (N+2)× scrypt to confirm correct guess | +| Attacker cost: batch (all users) | 1× scrypt per guess, tested against all blobs | `|users|×` scrypt per guess (each guess bound to one pubkey) | +| Attacker can identify backup blobs | Yes (`ncryptsec1` prefix) | Environment-dependent — blobs share `kind:30078` structure but steganographic cover is unvalidated (see §Limitations) | +| Attacker can confirm backup exists | Yes (blob is visible) | Environment-dependent (against passive dump adversary with diverse ambient traffic) | +| Attacker can link blobs to user | Yes (signed by user's key) | No (passive dump) — throwaway keys, no reference to real pubkey. 
Active operator may correlate via IP/timing (see §Adversary Classes) | +| Deniability | No — backup existence is provable | Yes — probabilistic, passive dump adversary only (see §Adversary Classes) | +| Fault tolerance | Single blob (robust) | Tolerates loss of up to 2 blobs (RS parity) | +| Relay storage | ~400 bytes | ~8.1 KB (N+P+D=18 × ~450 bytes/event) | +| Client complexity | Low | Medium | + +### Comparison to Prior Art + +| Property | NIP-49 | BIP-38 | Kintsugi | SLIP-39 | This NIP | +|----------|--------|--------|----------|---------|----------| +| Public ciphertext | Single identifiable blob | Single identifiable blob | Distributed across recovery nodes | Identifiable shares (shared `id` field) | N+P+D unlinkable constant-size blobs, indistinguishable from other relay data | +| Multi-target accumulation | Vulnerable | Vulnerable | Mitigated (threshold OPRF) | Vulnerable | **Substantially mitigated** | +| Backup existence detectable | Yes | Yes | Yes (requires infra) | Yes (shares identifiable) | **Environment-dependent** (against passive dump adversary with diverse ambient traffic; unvalidated — see §Limitations) | +| Offline cracking cost (1 target, rejection) | 1× scrypt per guess | 1× scrypt per guess | Threshold OPRF (no offline attack) | N/A (no password) | 1× scrypt per guess (early-exit) | +| Offline cracking cost (all users) | 1× scrypt, all blobs | 1× scrypt, all blobs | N/A | N/A | `|users|×` scrypt per guess | +| Linkability to user | Signed by user's key | Encoded with user's address | Requires recovery nodes | Shares linked by `id` | No (against passive dump adversary; active operator may correlate via IP/timing) | +| Deniability | No | No | No | No | Probabilistic (passive dump adversary only) | +| Bootstrap problem | No (salt in blob) | No (salt in blob) | Requires node registration | Requires share distribution | No (everything from password + pubkey) | +| Fault tolerance | Single blob (robust) | Single blob | Threshold (t-of-n) | 
Threshold (t-of-n) | Tolerates 2 missing blobs (RS parity) | +| Infrastructure required | None | None | Dedicated recovery nodes | Trusted share holders | **None** (standard Nostr relays) | + +## Relation to Other NIPs + +- [NIP-01](01.md): All backup blobs are valid NIP-01 events. Implementations MUST compute `event.id` and `event.sig` per NIP-01. +- [NIP-09](09.md): Password rotation uses `kind:5` deletion events signed by the old throwaway keypairs to request deletion of superseded blobs. +- [NIP-31](31.md): Blobs include an `["alt", "application data"]` tag per NIP-31, providing steganographic cover identical to any other `kind:30078` event. +- [NIP-33](33.md): Blobs use parameterized replaceable events (kind 30000–39999). The `d` tag uniquely identifies each blob within its throwaway pubkey's namespace. +- [NIP-49](49.md): This NIP uses NIP-49's scrypt parameters (`log_N=20`, `r=8`, `p=1`) and the `key_security_byte` AAD convention (`0x02`), but does NOT use the `ncryptsec1` format. NIP-49 explicitly warns against publishing encrypted keys to relays; this NIP solves that problem. +- [NIP-59](59.md): Both NIPs use throwaway keypairs for metadata privacy. NIP-59 uses them for messaging (gift wrap); this NIP uses them for backup steganography. The pattern is the same: ephemeral Nostr identities for protocol-level operations that must not be linked to real identities. +- [NIP-78](78.md): Blobs use `kind:30078` (application-specific data) for steganographic cover. The `kind:30078` namespace is shared with Cashu wallets, app settings, drafts, and other application data, making backup blobs indistinguishable from legitimate application use. +- [NIP-AB](NIP-AB.md): **NIP-AB is the primary key backup and multi-device mechanism.** It provides device-to-device key transfer via QR code + ECDH + short authentication string — fast, interactive, and cryptographically strong without password-strength dependencies. 
**This NIP (NIP-SB) is the secondary "break glass" recovery option** for when no second device is available (all devices lost, single-device user, new user who never paired). Implementations SHOULD present NIP-AB as the default backup path and NIP-SB as an optional emergency fallback. The two are complementary: NIP-AB covers the common case; NIP-SB covers the catastrophic case. + +## Implementation Notes + +### Rust + +- `scrypt` crate (RustCrypto): `scrypt::scrypt()` +- `hkdf` crate: `Hkdf::<Sha256>::new()` +- `chacha20poly1305` crate: `XChaCha20Poly1305` +- `zeroize` crate: zero sensitive memory after use; derive `Zeroize` on key structs +- `unicode-normalization` crate: NFKC normalization via `UnicodeNormalization::nfkc()` +- `zxcvbn` crate: password entropy enforcement + +### TypeScript + +- `@noble/hashes/scrypt`: `scrypt()` +- `@noble/hashes/hkdf`: `hkdf(sha256, ikm, salt, info, length)` +- `@noble/ciphers/chacha`: `xchacha20poly1305(key, nonce)` +- `String.prototype.normalize('NFKC')`: password normalization +- `zxcvbn` package: password entropy enforcement + +### Relay Requirements + +No special relay protocol extensions are required. Relays need only provide standard NIP-33 behavior: + +- Support `kind:30078` (NIP-78/NIP-33 parameterized replaceable events) +- Store events from unknown pubkeys (throwaway keys have no profile or followers) +- Support `#d` tag filtering in REQ subscriptions (standard NIP-33 behavior) + +Note: relays that enforce authorization scopes (e.g., Sprout's `MessagesWrite` scope for `kind:30078`) require clients to hold the appropriate credential. This is standard relay access control, not a NIP-SB-specific requirement. + +### Sprout-Specific Notes + +- Operators SHOULD pin `kind:30078` events to prevent garbage collection of throwaway-key events. +- Backup blobs are inert database rows: stored with `d_tag` indexed, no subscription fan-out, no WebSocket traffic unless explicitly subscribed.
+- Storage cost at N=16, P=2, D=12 (maximum): approximately 13.5 KB per user backup (30 × ~450 bytes/event). For 10,000 users: approximately 135 MB. Trivial. + +## Test Vectors + +### Reed-Solomon GF(2^8) Verification + +Field: GF(2^8) with irreducible polynomial `0x11B` (`x^8 + x^4 + x^3 + x + 1`). +Primitive element: `α = 0x03` (multiplicative order 255). + +**Primitive element verification:** +``` +α^1 = 0x03, α^2 = 0x05, α^3 = 0x0F, α^4 = 0x11, ... +α^255 = 0x01 (full cycle) +``` + +**RS encode test (N=3 data, P=2 parity):** +``` +Evaluation points: α^0=0x01, α^1=0x03, α^2=0x05, α^3=0x0F, α^4=0x11 +Data symbols: [0x42, 0xAB, 0x07] +Parity symbols: [0x62, 0x59] +Full codeword: [0x42, 0xAB, 0x07, 0x62, 0x59] +``` + +**RS decode tests (all must recover data = [0x42, 0xAB, 0x07]):** +``` +No erasures: [0x42, 0xAB, 0x07, 0x62, 0x59] → [0x42, 0xAB, 0x07] ✓ +1 erasure (pos 1): [0x42, None, 0x07, 0x62, 0x59] → [0x42, 0xAB, 0x07] ✓ +2 erasures (pos 0,2): [None, 0xAB, None, 0x62, 0x59] → [0x42, 0xAB, 0x07] ✓ +Mixed (data 1 + parity 3): [0x42, None, 0x07, None, 0x59] → [0x42, 0xAB, 0x07] ✓ +``` + +### GF(2^8) Multiplication Examples + +``` +gf_mul(0x03, 0x03) = 0x05 +gf_mul(0x03, 0x05) = 0x0F +gf_mul(0x57, 0x83) = 0xC1 (multiplication example from FIPS 197 §4.2) +gf_inv(0x03) = 0xF6 (since gf_mul(0x03, 0xF6) = 0x01) +``` + +Implementations MUST reproduce these test vectors exactly. Any deviation indicates a GF(2^8) arithmetic or RS encoding bug.
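
The vectors above can be cross-checked with a short script. The following is a minimal sketch, assuming the systematic encoding implied by the listed evaluation points: data symbols are treated as values of a degree-2 polynomial at α^0..α^2, and parity symbols are that polynomial evaluated at α^3 and α^4. The helper names `interpolate` and `recover` are illustrative only and are not defined by this NIP.

```python
GF_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8): shift-and-add, reducing by GF_POLY on overflow."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return product


def gf_pow(a: int, n: int) -> int:
    """Raise a to the n-th power by repeated multiplication."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a: int) -> int:
    """Inverse via a^254, valid because a^255 = 1 for every nonzero a."""
    return gf_pow(a, 254)


def interpolate(xs, ys, x):
    """Evaluate at x the unique polynomial through the points (xs[i], ys[i])."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                # Subtraction in GF(2^8) is XOR.
                term = gf_mul(term, gf_mul(x ^ xj, gf_inv(xi ^ xj)))
        total ^= term
    return total


POINTS = [gf_pow(0x03, i) for i in range(5)]  # alpha^0 .. alpha^4
DATA = [0x42, 0xAB, 0x07]
PARITY = [interpolate(POINTS[:3], DATA, x) for x in POINTS[3:]]


def recover(received):
    """Rebuild the 3 data symbols from any 3 surviving symbols (None = erased)."""
    known = [(POINTS[i], s) for i, s in enumerate(received) if s is not None][:3]
    xs, ys = zip(*known)
    return [interpolate(xs, ys, POINTS[k]) for k in range(3)]


assert POINTS == [0x01, 0x03, 0x05, 0x0F, 0x11]
assert PARITY == [0x62, 0x59]
assert recover([None, 0xAB, None, 0x62, 0x59]) == DATA
```

The script is expected to finish silently. An `AssertionError` would suggest a mismatch in the field polynomial, the evaluation points, or the interpolation step. The sketch favors readability over speed (linear-time `gf_pow`, no lookup tables), which is usually acceptable for test-vector checks but not for production encoding.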
+ +## References + +- [NIP-49](https://github.com/nostr-protocol/nips/blob/master/49.md) — Encrypted private key export +- [NIP-78](https://github.com/nostr-protocol/nips/blob/master/78.md) — Application-specific data +- [NIP-33](https://github.com/nostr-protocol/nips/blob/master/33.md) — Parameterized replaceable events +- [NIP-59](https://github.com/nostr-protocol/nips/blob/master/59.md) — Gift wrap / throwaway keys +- [BIP-38](https://github.com/bitcoin/bips/blob/master/bip-0038.mediawiki) — Encrypted Bitcoin private keys +- [BIP-340](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki) — Schnorr signatures for secp256k1 +- [RFC 7914](https://www.rfc-editor.org/rfc/rfc7914) — scrypt key derivation function +- [RFC 5869](https://www.rfc-editor.org/rfc/rfc5869) — HKDF +- [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119) — Key words for use in RFCs (MUST, SHOULD, MAY) +- [XChaCha20-Poly1305](https://datatracker.ietf.org/doc/html/draft-irtf-cfrg-xchacha) — Extended-nonce ChaCha20-Poly1305 +- [Apollo](https://arxiv.org/abs/2507.19484) — indistinguishable shares for social key recovery (Mishra et al., EPFL, 2025) +- [Kintsugi](https://arxiv.org/abs/2507.21122) — password-authenticated decentralized key recovery (Ma & Kleppmann, Cambridge, 2025) +- [SoK: Plausibly Deniable Storage](https://arxiv.org/abs/2111.12809) — systematization of plausible deniability (Chen et al., Stony Brook, 2021) +- [Shufflecake](https://arxiv.org/abs/2310.04589) — hidden volumes for plausible deniability (Anzuoni & Gagliardoni, ACM CCS 2023) +- [PASSAT](https://arxiv.org/abs/2102.13607) — single-password secret-shared cloud storage (2021) +- [MFKDF](https://arxiv.org/abs/2208.05586) — multi-factor key derivation with public parameters (Nair & Song, USENIX Security 2023) diff --git a/crates/sprout-core/src/backup/NIP-SB.spthy b/crates/sprout-core/src/backup/NIP-SB.spthy new file mode 100644 index 000000000..7e4f96675 --- /dev/null +++ b/crates/sprout-core/src/backup/NIP-SB.spthy 
@@ -0,0 +1,565 @@ +/* + * NIP-SB v3: Steganographic Key Backup — Tamarin Formal Verification + * + * Models and verifies CORE CRYPTOGRAPHIC PROPERTIES of NIP-SB v3 + * (KDF chain, encryption, RS parity, dummy isolation). Does NOT cover + * event-level validation, relay scoping, variable N/D, or traffic analysis. + * See "What this model does NOT prove" below for full scope limitations. + * + * All 10 lemmas verified by tamarin-prover 1.12.0 (--prove): + * + * executable_honest_backup_and_recovery (exists-trace): verified (16 steps) + * executable_recovery_1_erasure (exists-trace): verified (20 steps) + * executable_recovery_2_erasures (exists-trace): verified (16 steps) + * nsec_secrecy_without_password_compromise (all-traces): verified (47 steps) + * password_secrecy (all-traces): verified (2 steps) + * chunk_secrecy_without_password (all-traces): verified (128 steps) + * parity_secrecy_without_password (all-traces): verified (716 steps) + * password_compromise_enables_nsec_recovery(exists-trace): verified (13 steps) + * executable_password_compromise (exists-trace): verified (2 steps) + * enc_key_derivable_with_compromised_password(exists-trace):verified (6 steps) + * + * Processing time: ~150s. Run: tamarin-prover --prove NIP-SB.spthy + * + * == What this model proves (in the symbolic model, with abstractions listed below) == + * NOTE: These are properties of the REDUCED symbolic model, not the full + * protocol. The model fixes N=3/P=2/D=2, omits relay scoping, event + * validation, and variable parameters. Secrecy and compromise are proved + * in separate rule instances (not as a conditional property of one run). + * + * 1. Correctness: honest recovery from password + pubkey + relay data + * yields the original secret (all blobs present). + * 2. Correctness with 1 erasure (representative case): recovery + * succeeds when real chunk 0 is missing, using RS single-erasure + * decoding from the remaining 2 data + 2 parity symbols. + * 3. 
Correctness with 2 erasures (representative case): recovery + * succeeds when real chunks 0 and 1 are both missing, using RS + * double-erasure decoding from the remaining 1 data + 2 parity + * symbols. The model includes equations for all 3 double-erasure + * patterns (01, 02, 12) but only instantiates the 01 case in a + * rule/lemma. The other patterns are structurally symmetric. + * 4. Confidentiality: the nsec is not derivable from published blobs + * without the password, even though the pubkey is public. + * 5. Parity confidentiality: RS parity symbols are not derivable + * without the password. + * 6. Dummy isolation: dummy blob payloads (Fr() values, modeling + * HKDF-derived filler) do not leak the nsec or any chunk material. + * (Structural: dummy payloads have no equation linking them to nsec.) + * 7. Password compromise: if the password leaks, the nsec is recoverable. + * NOTE: compromise is modeled as a separate rule (Compromise_Password) + * that creates a fresh backup and immediately leaks the password. + * This proves recoverability in the compromised world but does not + * model a transition from honest to compromised for the same backup + * instance. The conditional "secrecy holds unless password leaks" + * is argued across the two separate rule instances, not as a single + * trace property. + * + * == What this model does NOT prove == + * - Unlinkability (blobs not attributable to user): this is an + * observational-equivalence property, not a trace property. Tamarin's + * trace mode cannot express "the attacker cannot distinguish which + * events belong to which user." This property is argued in the NIP + * spec's security analysis and would require diff-equivalence mode + * or a dedicated tool (e.g., ProVerif). + * - Accumulation resistance: same — requires observational equivalence. + * - Variable N and D: Tamarin cannot model password-dependent control + * flow. 
We fix N=3 (spec minimum), P=2, D=2 (reduced from spec's + * 4-12 for tractability). The variable-N/D property is argued + * separately in the NIP spec. + * - Byte-level correctness: scrypt parameters, NFKC normalization, + * chunk byte lengths, RS GF(2^8) arithmetic, and base64 encoding + * are outside Tamarin's symbolic model. + * - Steganographic indistinguishability: this is an observational + * property. The model publishes all blob types (real, parity, dummy) + * via Out() but cannot express that they are indistinguishable. + * - Event-level validation/selection attacks: NIP-01 signature checks, + * multiple returned events for one d-tag, d-tag squatting, and + * result-set truncation are not modeled. The spec's event validation + * rules (pubkey-first filtering, pagination to EOSE, authors-filter + * fallback) are argued in prose, not formally verified. + * + * == Abstractions == + * - scrypt(input, salt) → h() + * - HKDF(ikm, info) → hkdf(info, ikm) + * - XChaCha20-Poly1305 → senc/sdec (Tamarin built-in IND-CPA) + * - Reed-Solomon parity → symbolic rs_parity(part_0, part_1, part_2) + * function with equations enabling reconstruction from any 3 of 5 + * symbols (3 data + 2 parity). This models the MDS property. + * - N=3, P=2, D=2 (fixed for tractability) + * - Dummy blobs encrypt deterministic HKDF-derived values (modeled as + * Fr() for symbolic freshness — the security property is that dummy + * payloads are independent of nsec, which Fr() captures correctly). + * The spec uses HKDF(H_cover, "dummy-pad-"‖j) for deterministic + * dummy payloads; the model abstracts this as fresh values since + * Tamarin cannot distinguish HKDF-derived from random. + * - Chunk padding is deterministic in the spec (HKDF-derived) but + * omitted in the model — chunks are encrypted directly. The padding + * determinism property (safe partial re-publication) is not modeled. 
+ * - Cover key: h(<'cover', password, pk>) for dummy derivation + * - Relay URL scoping omitted — all derivations model a single relay. + * The spec's relay-scoped HKDF info strings (e.g., "d-tag" ‖ + * relay_url_bytes) are modeled as bare info strings (e.g., "d-tag"). + * Cross-relay unlinkability is argued in the spec's security analysis + * and would require a multi-session model to verify formally. + */ + +theory NIP_SB_v3 +begin + +builtins: hashing, symmetric-encryption + +functions: + pubkey/1, + hkdf/2, + /* Secret splitting */ + split_0/1, split_1/1, split_2/1, + reassemble/3, + /* Reed-Solomon parity: rs_p0 and rs_p1 are the two parity symbols. + * + * Single-erasure recovery: rs_recover_X takes the other 2 data symbols + * + both parities → recovers missing data symbol X. + * + * Double-erasure recovery: rs_recover_01_fst / rs_recover_01_snd take + * the one remaining data symbol + both parities → recover both missing + * data symbols. This models the MDS property: any 3 of 5 symbols + * reconstruct all 3 data symbols. 
*/ + rs_p0/3, rs_p1/3, + /* Single-erasure: 2 data + 2 parity → 1 missing */ + rs_recover_0/4, rs_recover_1/4, rs_recover_2/4, + /* Double-erasure: 1 data + 2 parity → 2 missing */ + rs_recover_01_fst/3, rs_recover_01_snd/3, + rs_recover_02_fst/3, rs_recover_02_snd/3, + rs_recover_12_fst/3, rs_recover_12_snd/3 + +equations: + reassemble(split_0(x), split_1(x), split_2(x)) = x, + + /* Single-erasure recovery: 2 data + 2 parity → missing data symbol */ + rs_recover_0(split_1(x), split_2(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_0(x), + rs_recover_1(split_0(x), split_2(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_1(x), + rs_recover_2(split_0(x), split_1(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_2(x), + + /* Double-erasure recovery: 1 data + 2 parity → both missing data symbols. + * Chunks 0,1 missing — recover from chunk_2 + both parities: */ + rs_recover_01_fst(split_2(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_0(x), + rs_recover_01_snd(split_2(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_1(x), + + /* Chunks 0,2 missing — recover from chunk_1 + both parities: */ + rs_recover_02_fst(split_1(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_0(x), + rs_recover_02_snd(split_1(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_2(x), + + /* Chunks 1,2 missing — recover from chunk_0 + both parities: */ + rs_recover_12_fst(split_0(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_1(x), + rs_recover_12_snd(split_0(x), rs_p0(split_0(x), split_1(x), split_2(x)), + rs_p1(split_0(x), split_1(x), split_2(x))) = split_2(x) + +/* 
======================================================================== + * Backup creation (honest user) — v3 with parity and dummies + * ======================================================================== */ + +rule User_Creates_Backup: + let + pk = pubkey(~nsec) + + /* Step 2: master encryption key */ + h_enc = h(< 'encrypt', ~password, pk >) + enc_key = hkdf('key', h_enc) + + /* Step 3: split nsec into 3 symbolic parts */ + chunk_0 = split_0(~nsec) + chunk_1 = split_1(~nsec) + chunk_2 = split_2(~nsec) + + /* Step 3b: Reed-Solomon parity */ + parity_0 = rs_p0(chunk_0, chunk_1, chunk_2) + parity_1 = rs_p1(chunk_0, chunk_1, chunk_2) + + /* Step 3c: cover key for dummy blobs */ + h_cover = h(< 'cover', ~password, pk >) + + /* Step 4: per-blob derivations — real chunks (indices 0,1,2) */ + h_0 = h(< ~password, pk, '0' >) + d_tag_0 = hkdf('d-tag', h_0) + sign_sk_0 = hkdf('signing-key', h_0) + sign_pk_0 = pubkey(sign_sk_0) + + h_1 = h(< ~password, pk, '1' >) + d_tag_1 = hkdf('d-tag', h_1) + sign_sk_1 = hkdf('signing-key', h_1) + sign_pk_1 = pubkey(sign_sk_1) + + h_2 = h(< ~password, pk, '2' >) + d_tag_2 = hkdf('d-tag', h_2) + sign_sk_2 = hkdf('signing-key', h_2) + sign_pk_2 = pubkey(sign_sk_2) + + /* Parity blobs (indices 3,4) */ + h_3 = h(< ~password, pk, '3' >) + d_tag_3 = hkdf('d-tag', h_3) + sign_sk_3 = hkdf('signing-key', h_3) + sign_pk_3 = pubkey(sign_sk_3) + + h_4 = h(< ~password, pk, '4' >) + d_tag_4 = hkdf('d-tag', h_4) + sign_sk_4 = hkdf('signing-key', h_4) + sign_pk_4 = pubkey(sign_sk_4) + + /* Dummy blobs — derived from cover key, not per-blob scrypt */ + dummy_d_0 = hkdf('dummy-d-tag-0', h_cover) + dummy_sk_0 = hkdf('dummy-signing-key-0', h_cover) + dummy_pk_0 = pubkey(dummy_sk_0) + + dummy_d_1 = hkdf('dummy-d-tag-1', h_cover) + dummy_sk_1 = hkdf('dummy-signing-key-1', h_cover) + dummy_pk_1 = pubkey(dummy_sk_1) + + /* Step 5: encrypt all blobs */ + ct_0 = senc(< 'chunk', '0', 'aad02', chunk_0 >, < enc_key, ~nonce_0 >) + ct_1 = senc(< 'chunk', '1', 
'aad02', chunk_1 >, < enc_key, ~nonce_1 >) + ct_2 = senc(< 'chunk', '2', 'aad02', chunk_2 >, < enc_key, ~nonce_2 >) + + ct_p0 = senc(< 'parity', '3', 'aad02', parity_0 >, < enc_key, ~nonce_p0 >) + ct_p1 = senc(< 'parity', '4', 'aad02', parity_1 >, < enc_key, ~nonce_p1 >) + + ct_d0 = senc(< 'dummy', 'aad02', ~dummy_payload_0 >, < enc_key, ~nonce_d0 >) + ct_d1 = senc(< 'dummy', 'aad02', ~dummy_payload_1 >, < enc_key, ~nonce_d1 >) + in + [ Fr(~nsec), Fr(~password), + Fr(~nonce_0), Fr(~nonce_1), Fr(~nonce_2), + Fr(~nonce_p0), Fr(~nonce_p1), + Fr(~nonce_d0), Fr(~nonce_d1), + Fr(~dummy_payload_0), Fr(~dummy_payload_1) ] + --[ + BackupCreated(pk, ~nsec, ~password), + HonestBackup(pk, ~nsec, ~password), + SecretIsSecret(~nsec), + PasswordIsSecret(~password) + ]-> + [ + /* Real chunk blobs */ + Out(< 'blob', sign_pk_0, d_tag_0, ~nonce_0, ct_0 >), + Out(< 'blob', sign_pk_1, d_tag_1, ~nonce_1, ct_1 >), + Out(< 'blob', sign_pk_2, d_tag_2, ~nonce_2, ct_2 >), + + /* Parity blobs — same format, indistinguishable */ + Out(< 'blob', sign_pk_3, d_tag_3, ~nonce_p0, ct_p0 >), + Out(< 'blob', sign_pk_4, d_tag_4, ~nonce_p1, ct_p1 >), + + /* Dummy blobs — same format, indistinguishable */ + Out(< 'blob', dummy_pk_0, dummy_d_0, ~nonce_d0, ct_d0 >), + Out(< 'blob', dummy_pk_1, dummy_d_1, ~nonce_d1, ct_d1 >), + + /* Pubkey is public */ + Out(pk), + + /* User remembers password (secure channel) */ + !UserKnows(~password, pk) + ] + +/* ======================================================================== + * Recovery — all blobs present (happy path) + * ======================================================================== */ + +rule User_Recovers_Full: + let + h_enc = h(< 'encrypt', password, pk >) + enc_key = hkdf('key', h_enc) + + h_0 = h(< password, pk, '0' >) + d_tag_0 = hkdf('d-tag', h_0) + sign_pk_0 = pubkey(hkdf('signing-key', h_0)) + + h_1 = h(< password, pk, '1' >) + d_tag_1 = hkdf('d-tag', h_1) + sign_pk_1 = pubkey(hkdf('signing-key', h_1)) + + h_2 = h(< password, pk, '2' >) + 
d_tag_2 = hkdf('d-tag', h_2) + sign_pk_2 = pubkey(hkdf('signing-key', h_2)) + + ct_0 = senc(< 'chunk', '0', 'aad02', chunk_0 >, < enc_key, nonce_0 >) + ct_1 = senc(< 'chunk', '1', 'aad02', chunk_1 >, < enc_key, nonce_1 >) + ct_2 = senc(< 'chunk', '2', 'aad02', chunk_2 >, < enc_key, nonce_2 >) + + recovered_nsec = reassemble(chunk_0, chunk_1, chunk_2) + recovered_pk = pubkey(recovered_nsec) + in + [ + !UserKnows(password, pk), + In(pk), + In(< 'blob', sign_pk_0, d_tag_0, nonce_0, ct_0 >), + In(< 'blob', sign_pk_1, d_tag_1, nonce_1, ct_1 >), + In(< 'blob', sign_pk_2, d_tag_2, nonce_2, ct_2 >) + ] + --[ + RecoverySucceeded(recovered_pk, recovered_nsec, password), + Eq(recovered_pk, pk) + ]-> + [ ] + +/* ======================================================================== + * Recovery with 1 erasure — chunk 0 missing, reconstructed from RS + * ======================================================================== */ + +rule User_Recovers_1_Erasure: + let + h_enc = h(< 'encrypt', password, pk >) + enc_key = hkdf('key', h_enc) + + /* Derive selectors for chunks 1, 2 and both parity blobs */ + h_1 = h(< password, pk, '1' >) + sign_pk_1 = pubkey(hkdf('signing-key', h_1)) + d_tag_1 = hkdf('d-tag', h_1) + + h_2 = h(< password, pk, '2' >) + sign_pk_2 = pubkey(hkdf('signing-key', h_2)) + d_tag_2 = hkdf('d-tag', h_2) + + h_3 = h(< password, pk, '3' >) + sign_pk_3 = pubkey(hkdf('signing-key', h_3)) + d_tag_3 = hkdf('d-tag', h_3) + + h_4 = h(< password, pk, '4' >) + sign_pk_4 = pubkey(hkdf('signing-key', h_4)) + d_tag_4 = hkdf('d-tag', h_4) + + /* Decrypt available blobs */ + ct_1 = senc(< 'chunk', '1', 'aad02', chunk_1 >, < enc_key, nonce_1 >) + ct_2 = senc(< 'chunk', '2', 'aad02', chunk_2 >, < enc_key, nonce_2 >) + ct_p0 = senc(< 'parity', '3', 'aad02', parity_0 >, < enc_key, nonce_p0 >) + ct_p1 = senc(< 'parity', '4', 'aad02', parity_1 >, < enc_key, nonce_p1 >) + + /* RS erasure decode: recover chunk_0 from chunk_1, chunk_2, parity_0, parity_1 */ + chunk_0 = 
rs_recover_0(chunk_1, chunk_2, parity_0, parity_1) + + recovered_nsec = reassemble(chunk_0, chunk_1, chunk_2) + recovered_pk = pubkey(recovered_nsec) + in + [ + !UserKnows(password, pk), + In(pk), + /* Chunk 0 is MISSING — not fetched from relay */ + In(< 'blob', sign_pk_1, d_tag_1, nonce_1, ct_1 >), + In(< 'blob', sign_pk_2, d_tag_2, nonce_2, ct_2 >), + In(< 'blob', sign_pk_3, d_tag_3, nonce_p0, ct_p0 >), + In(< 'blob', sign_pk_4, d_tag_4, nonce_p1, ct_p1 >) + ] + --[ + RecoveryWithErasure(recovered_pk, recovered_nsec, password), + Eq(recovered_pk, pk) + ]-> + [ ] + +/* ======================================================================== + * Recovery with 2 erasures — chunks 0 and 1 missing + * (models the maximum fault tolerance: any 2 of N+P) + * ======================================================================== */ + +rule User_Recovers_2_Erasures: + let + h_enc = h(< 'encrypt', password, pk >) + enc_key = hkdf('key', h_enc) + + h_2 = h(< password, pk, '2' >) + sign_pk_2 = pubkey(hkdf('signing-key', h_2)) + d_tag_2 = hkdf('d-tag', h_2) + + h_3 = h(< password, pk, '3' >) + sign_pk_3 = pubkey(hkdf('signing-key', h_3)) + d_tag_3 = hkdf('d-tag', h_3) + + h_4 = h(< password, pk, '4' >) + sign_pk_4 = pubkey(hkdf('signing-key', h_4)) + d_tag_4 = hkdf('d-tag', h_4) + + ct_2 = senc(< 'chunk', '2', 'aad02', chunk_2 >, < enc_key, nonce_2 >) + ct_p0 = senc(< 'parity', '3', 'aad02', parity_0 >, < enc_key, nonce_p0 >) + ct_p1 = senc(< 'parity', '4', 'aad02', parity_1 >, < enc_key, nonce_p1 >) + + /* RS: recover chunk_0 and chunk_1 from chunk_2 + both parities. + * With N=3, P=2, losing 2 data symbols leaves exactly 3 known + * symbols (1 data + 2 parity), which is the minimum for RS(5,3). + * Uses the double-erasure recovery functions that take only the + * available symbols — no placeholders needed. 
*/ + chunk_0 = rs_recover_01_fst(chunk_2, parity_0, parity_1) + chunk_1 = rs_recover_01_snd(chunk_2, parity_0, parity_1) + + recovered_nsec = reassemble(chunk_0, chunk_1, chunk_2) + recovered_pk = pubkey(recovered_nsec) + in + [ + !UserKnows(password, pk), + In(pk), + /* Chunks 0 and 1 are MISSING */ + In(< 'blob', sign_pk_2, d_tag_2, nonce_2, ct_2 >), + In(< 'blob', sign_pk_3, d_tag_3, nonce_p0, ct_p0 >), + In(< 'blob', sign_pk_4, d_tag_4, nonce_p1, ct_p1 >) + ] + --[ + RecoveryWith2Erasures(recovered_pk, recovered_nsec, password), + Eq(recovered_pk, pk) + ]-> + [ ] + +/* Equality restriction */ +restriction Equality: + "All x y #i. Eq(x, y) @ i ==> x = y" + +/* ======================================================================== + * Attacker: password compromise + * ======================================================================== */ + +rule Compromise_Password: + let + pk = pubkey(~nsec) + h_enc = h(< 'encrypt', ~password, pk >) + enc_key = hkdf('key', h_enc) + + chunk_0 = split_0(~nsec) + chunk_1 = split_1(~nsec) + chunk_2 = split_2(~nsec) + + parity_0 = rs_p0(chunk_0, chunk_1, chunk_2) + parity_1 = rs_p1(chunk_0, chunk_1, chunk_2) + + h_cover = h(< 'cover', ~password, pk >) + + h_0 = h(< ~password, pk, '0' >) + h_1 = h(< ~password, pk, '1' >) + h_2 = h(< ~password, pk, '2' >) + h_3 = h(< ~password, pk, '3' >) + h_4 = h(< ~password, pk, '4' >) + + ct_0 = senc(< 'chunk', '0', 'aad02', chunk_0 >, < enc_key, ~nonce_0 >) + ct_1 = senc(< 'chunk', '1', 'aad02', chunk_1 >, < enc_key, ~nonce_1 >) + ct_2 = senc(< 'chunk', '2', 'aad02', chunk_2 >, < enc_key, ~nonce_2 >) + ct_p0 = senc(< 'parity', '3', 'aad02', parity_0 >, < enc_key, ~nonce_p0 >) + ct_p1 = senc(< 'parity', '4', 'aad02', parity_1 >, < enc_key, ~nonce_p1 >) + ct_d0 = senc(< 'dummy', 'aad02', ~dummy_payload_0 >, < enc_key, ~nonce_d0 >) + ct_d1 = senc(< 'dummy', 'aad02', ~dummy_payload_1 >, < enc_key, ~nonce_d1 >) + in + [ Fr(~nsec), Fr(~password), + Fr(~nonce_0), Fr(~nonce_1), Fr(~nonce_2), + 
Fr(~nonce_p0), Fr(~nonce_p1), + Fr(~nonce_d0), Fr(~nonce_d1), + Fr(~dummy_payload_0), Fr(~dummy_payload_1) ] + --[ + BackupCreated(pk, ~nsec, ~password), + SecretIsSecret(~nsec), + PasswordIsSecret(~password), + PasswordCompromised(pk, ~password) + ]-> + [ + Out(< 'blob', pubkey(hkdf('signing-key', h_0)), hkdf('d-tag', h_0), ~nonce_0, ct_0 >), + Out(< 'blob', pubkey(hkdf('signing-key', h_1)), hkdf('d-tag', h_1), ~nonce_1, ct_1 >), + Out(< 'blob', pubkey(hkdf('signing-key', h_2)), hkdf('d-tag', h_2), ~nonce_2, ct_2 >), + Out(< 'blob', pubkey(hkdf('signing-key', h_3)), hkdf('d-tag', h_3), ~nonce_p0, ct_p0 >), + Out(< 'blob', pubkey(hkdf('signing-key', h_4)), hkdf('d-tag', h_4), ~nonce_p1, ct_p1 >), + Out(< 'blob', pubkey(hkdf('dummy-signing-key-0', h_cover)), hkdf('dummy-d-tag-0', h_cover), ~nonce_d0, ct_d0 >), + Out(< 'blob', pubkey(hkdf('dummy-signing-key-1', h_cover)), hkdf('dummy-d-tag-1', h_cover), ~nonce_d1, ct_d1 >), + Out(pk), + Out(~password) + ] + +/* ======================================================================== + * Security lemmas + * ======================================================================== */ + +// ── Correctness ────────────────────────────────────────────────────────── + +/* Happy path: all blobs present */ +lemma executable_honest_backup_and_recovery: + exists-trace + "Ex pk nsec password #i #j. + HonestBackup(pk, nsec, password) @ i + & RecoverySucceeded(pk, nsec, password) @ j + & i < j" + +/* Recovery with 1 erasure */ +lemma executable_recovery_1_erasure: + exists-trace + "Ex pk nsec password #i #j. + HonestBackup(pk, nsec, password) @ i + & RecoveryWithErasure(pk, nsec, password) @ j + & i < j" + +/* Recovery with 2 erasures */ +lemma executable_recovery_2_erasures: + exists-trace + "Ex pk nsec password #i #j. 
+ HonestBackup(pk, nsec, password) @ i + & RecoveryWith2Erasures(pk, nsec, password) @ j + & i < j" + +// ── Confidentiality ───────────────────────────────────────────────────── + +/* nsec secret if password not compromised */ +lemma nsec_secrecy_without_password_compromise: + "All pk nsec password #i. + HonestBackup(pk, nsec, password) @ i + ==> not (Ex #j. K(nsec) @ j)" + +/* Password not derivable from published blobs */ +lemma password_secrecy: + "All pk nsec password #i. + HonestBackup(pk, nsec, password) @ i + ==> not (Ex #j. K(password) @ j)" + +/* Individual chunks secret without password */ +lemma chunk_secrecy_without_password: + "All pk nsec password #i. + HonestBackup(pk, nsec, password) @ i + ==> not (Ex #j. K(split_0(nsec)) @ j) + & not (Ex #k. K(split_1(nsec)) @ k) + & not (Ex #l. K(split_2(nsec)) @ l)" + +/* Parity blobs don't leak chunks */ +lemma parity_secrecy_without_password: + "All pk nsec password #i. + HonestBackup(pk, nsec, password) @ i + ==> not (Ex #j. K(rs_p0(split_0(nsec), split_1(nsec), split_2(nsec))) @ j) + & not (Ex #k. K(rs_p1(split_0(nsec), split_1(nsec), split_2(nsec))) @ k)" + +// ── Password compromise ───────────────────────────────────────────────── + +/* Compromised password → nsec recoverable (expected) */ +lemma password_compromise_enables_nsec_recovery: + exists-trace + "Ex pk nsec password #i #j #k. + BackupCreated(pk, nsec, password) @ i + & PasswordCompromised(pk, password) @ j + & K(nsec) @ k" + +/* Compromise rule is reachable */ +lemma executable_password_compromise: + exists-trace + "Ex pk password #c. PasswordCompromised(pk, password) @ c" + +/* enc_key derivable with compromised password */ +lemma enc_key_derivable_with_compromised_password: + exists-trace + "Ex pk password #c #j. + PasswordCompromised(pk, password) @ c + & K(hkdf('key', h(< 'encrypt', password, pk >))) @ j" + +// ── Scope ──────────────────────────────────────────────────────────────── +// +// Properties NOT modeled (argued in spec): +// 1. 
UNLINKABILITY — observational equivalence (ProVerif/diff mode) +// 2. ACCUMULATION RESISTANCE — corollary of unlinkability +// 3. VARIABLE N, D — password-dependent control flow +// 4. CONSTANT-SIZE BLOBS — byte-level property +// 5. TIMING RESISTANCE — side-channel property +// 6. STEGANOGRAPHIC INDISTINGUISHABILITY — observational property +// (real, parity, and dummy blobs all published via Out() but +// Tamarin cannot express that they are indistinguishable) + +end diff --git a/crates/sprout-core/src/backup/nip_sb_demo.py b/crates/sprout-core/src/backup/nip_sb_demo.py new file mode 100755 index 000000000..41ef991ca --- /dev/null +++ b/crates/sprout-core/src/backup/nip_sb_demo.py @@ -0,0 +1,808 @@ +#!/usr/bin/env -S uv run --script +# /// script +# requires-python = ">=3.10" +# dependencies = ["PyNaCl>=1.5", "secp256k1>=0.14"] +# /// +""" +NIP-SB v3 Steganographic Key Backup — Protocol Demo + +Exercises the NIP-SB v3 backup/recovery cryptographic protocol with real crypto. +This is NOT a security-complete reference implementation — it omits Nostr event +id/sig validation, kind/d-tag verification, and relay transport. It validates +the KDF chain, encryption, RS coding, and relay-scoped derivation only. + +Crypto libraries used: + - scrypt (hashlib, stdlib — log_n reduced to 14 for demo speed) + - HKDF-SHA256 (hmac, stdlib) + - XChaCha20-Poly1305 (libsodium via PyNaCl) + - secp256k1 key derivation (secp256k1 lib) + - Reed-Solomon erasure coding over GF(2^8) (pure Python) + +v3 additions over v1: + - P=2 Reed-Solomon parity blobs (tolerates loss of any 2 blobs) + - D=4-12 variable dummy blobs (encrypted HKDF-derived filler) + - Cover key for cheap dummy derivation (1 scrypt, rest HKDF) + - Random-order publication and recovery + - d-tag-only queries with pubkey-first filtering + +The relay is simulated as an in-memory dict. Nostr event structure +(kind, id, sig) is not modeled — this demo covers the cryptographic +protocol, not the Nostr event layer. 
+ +Simplifications vs. a full implementation: + - scrypt log_n=14 (spec requires 20) for demo speed + - No Nostr event id/sig generation or validation (spec steps 1-2, 4-5 + require id/sig/kind/d-tag checks that are omitted here since the + simulated relay has no Nostr event layer) + - Simulated relay (dict) instead of real WebSocket relay + - No jittered timestamps or publication delays + - No WHATWG URL parsing (relay URL normalized by hand for demo) + - No pagination/EOSE or authors-filter fallback for d-tag squatting + defense (spec §Recovery step 6). The simulated relay returns all + events for a d-tag; real relays may truncate. + +Usage: + uv run crates/sprout-core/src/backup/nip_sb_demo.py +""" + +from __future__ import annotations + +import base64 +import hashlib +import hmac +import os +import random +import sys +import time +import unicodedata +from dataclasses import dataclass + +import nacl.bindings as sodium +import secp256k1 + +# ── NIP-SB Constants (spec §Constants) ──────────────────────────────────────── + +SCRYPT_LOG_N = 14 # Reduced from spec's 20 for demo speed (~0.1s vs ~2s). + # Real implementations MUST use 20. +SCRYPT_R = 8 +SCRYPT_P = 1 +MIN_CHUNKS = 3 +MAX_CHUNKS = 16 +CHUNK_RANGE = MAX_CHUNKS - MIN_CHUNKS + 1 # 14 +PARITY_BLOBS = 2 +MIN_DUMMIES = 4 +MAX_DUMMIES = 12 +DUMMY_RANGE = MAX_DUMMIES - MIN_DUMMIES + 1 # 9 +CHUNK_PAD_LEN = 16 +AAD = b"\x02" # key_security_byte per NIP-49 +DEMO_RELAY_URL = b"wss://relay.example.com/" # Normalized relay URL for demo + + +# ── GF(2^8) arithmetic for Reed-Solomon ─────────────────────────────────────── +# Field: GF(2^8) with irreducible polynomial x^8+x^4+x^3+x+1 (0x11B, AES). +# Primitive element α = 0x03 (order 255, generates full multiplicative group). 
+ +GF_POLY = 0x11B + +def gf_mul(a: int, b: int) -> int: + """Multiply two elements in GF(2^8).""" + p = 0 + for _ in range(8): + if b & 1: + p ^= a + hi = a & 0x80 + a = (a << 1) & 0xFF + if hi: + a ^= GF_POLY & 0xFF + b >>= 1 + return p + +def gf_pow(a: int, n: int) -> int: + """Exponentiate in GF(2^8).""" + result = 1 + base = a + while n > 0: + if n & 1: + result = gf_mul(result, base) + base = gf_mul(base, base) + n >>= 1 + return result + +def gf_inv(a: int) -> int: + """Multiplicative inverse in GF(2^8). a^254 = a^(-1) since a^255 = 1.""" + assert a != 0, "Cannot invert zero" + return gf_pow(a, 254) + +# Precompute evaluation points: α^0, α^1, ..., α^(MAX_CHUNKS+PARITY_BLOBS-1) +ALPHA = 0x03 +EVAL_POINTS = [gf_pow(ALPHA, i) for i in range(MAX_CHUNKS + PARITY_BLOBS)] + + +def rs_encode(data_symbols: list[int], n_parity: int = 2) -> list[int]: + """ + Systematic RS encode: given N data symbols, produce n_parity parity symbols. + Uses Lagrange interpolation at evaluation points α^0..α^{N-1} for data, + then evaluates at α^N..α^{N+n_parity-1} for parity. + All arithmetic in GF(2^8). + """ + n = len(data_symbols) + points = EVAL_POINTS[:n] + parity = [] + for k in range(n_parity): + x = EVAL_POINTS[n + k] + # Lagrange interpolation: P(x) = sum_i data[i] * prod_{j!=i} (x - points[j]) / (points[i] - points[j]) + val = 0 + for i in range(n): + num = data_symbols[i] + for j in range(n): + if j != i: + num = gf_mul(num, x ^ points[j]) + num = gf_mul(num, gf_inv(points[i] ^ points[j])) + val ^= num + parity.append(val) + return parity + + +def rs_decode(symbols: list[int | None], n_data: int) -> list[int]: + """ + RS erasure decode: given n_data+2 symbol slots (some None = erased), + reconstruct all n_data data symbols using any n_data available symbols. + Returns the n_data data symbols. 
+ """ + n_total = n_data + PARITY_BLOBS + assert len(symbols) == n_total + + # Collect known positions and values + known_pos = [] + known_val = [] + for i, s in enumerate(symbols): + if s is not None: + known_pos.append(EVAL_POINTS[i]) + known_val.append(s) + + assert len(known_pos) >= n_data, f"Need at least {n_data} symbols, got {len(known_pos)}" + + # Use first n_data known symbols for interpolation + pos = known_pos[:n_data] + val = known_val[:n_data] + + # Reconstruct data symbols by evaluating polynomial at data positions + result = [] + for k in range(n_data): + x = EVAL_POINTS[k] + # Check if this position is already known + found = False + for i, s in enumerate(symbols): + if i == k and s is not None: + result.append(s) + found = True + break + if found: + continue + # Lagrange interpolation at x using known points + v = 0 + for i in range(n_data): + num = val[i] + for j in range(n_data): + if j != i: + num = gf_mul(num, x ^ pos[j]) + num = gf_mul(num, gf_inv(pos[i] ^ pos[j])) + v ^= num + result.append(v) + return result + + +def rs_encode_rows(padded_chunks: list[bytes]) -> tuple[bytes, bytes]: + """ + Compute 2 parity rows across N padded chunks using 16 parallel RS codes. + Each byte position gets its own RS(N+2, N) code over GF(2^8). + Returns (parity_row_0, parity_row_1), each 16 bytes. + """ + n = len(padded_chunks) + parity_0 = bytearray(CHUNK_PAD_LEN) + parity_1 = bytearray(CHUNK_PAD_LEN) + for b in range(CHUNK_PAD_LEN): + data = [padded_chunks[i][b] for i in range(n)] + p = rs_encode(data, PARITY_BLOBS) + parity_0[b] = p[0] + parity_1[b] = p[1] + return bytes(parity_0), bytes(parity_1) + + +def rs_decode_rows( + padded_slots: list[bytes | None], + n_data: int, +) -> list[bytes]: + """ + RS erasure decode across 16 parallel byte positions. + padded_slots has n_data + 2 entries (real + parity), some may be None. + Returns the n_data reconstructed padded chunks. 
+ """ + n_total = n_data + PARITY_BLOBS + assert len(padded_slots) == n_total + result = [bytearray(CHUNK_PAD_LEN) for _ in range(n_data)] + for b in range(CHUNK_PAD_LEN): + symbols: list[int | None] = [] + for i in range(n_total): + if padded_slots[i] is None: + symbols.append(None) + else: + symbols.append(padded_slots[i][b]) + decoded = rs_decode(symbols, n_data) + for i in range(n_data): + result[i][b] = decoded[i] + return [bytes(r) for r in result] + + +# ── Simulated Relay ─────────────────────────────────────────────────────────── + +@dataclass +class RelayEvent: + pubkey: str # throwaway signing pubkey (hex, 32 bytes x-only) + d_tag: str # NIP-33 d-tag (hex, 32 bytes) + content: str # base64-encoded blob (56 bytes: 24 nonce + 32 ciphertext) + created_at: int = 0 # unix timestamp (for NIP-33 replacement semantics) + +SimulatedRelay = dict[str, list[RelayEvent]] + +def relay_publish(relay: SimulatedRelay, event: RelayEvent) -> None: + relay.setdefault(event.d_tag, []).append(event) + +def relay_query(relay: SimulatedRelay, d_tag: str) -> list[RelayEvent]: + """Query by d-tag only (v3: no authors filter).""" + return relay.get(d_tag, []) + + +# ── Crypto helpers ──────────────────────────────────────────────────────────── + +def nfkc(password: str) -> bytes: + return unicodedata.normalize("NFKC", password).encode("utf-8") + +def nip_sb_scrypt(input_bytes: bytes, salt: bytes = b"") -> bytes: + return hashlib.scrypt( + input_bytes, salt=salt, + n=2**SCRYPT_LOG_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32, + ) + +def nip_sb_hkdf(ikm: bytes, info: bytes, length: int = 32) -> bytes: + prk = hmac.new(b"\x00" * 32, ikm, "sha256").digest() + return hmac.new(prk, info + b"\x01", "sha256").digest()[:length] + +def xchacha20poly1305_encrypt(key: bytes, nonce: bytes, plaintext: bytes, aad: bytes) -> bytes: + return sodium.crypto_aead_xchacha20poly1305_ietf_encrypt(plaintext, aad, nonce, key) + +def xchacha20poly1305_decrypt(key: bytes, nonce: bytes, ciphertext: bytes, aad: bytes) 
-> bytes: + return sodium.crypto_aead_xchacha20poly1305_ietf_decrypt(ciphertext, aad, nonce, key) + +def secret_to_pubkey(secret_bytes: bytes) -> bytes: + sk = secp256k1.PrivateKey(secret_bytes) + return sk.pubkey.serialize(compressed=True)[1:] + + +def make_base(password: str, pubkey_bytes: bytes, suffix: bytes = b"") -> bytes: + """Length-prefixed password ‖ pubkey ‖ optional suffix (injective encoding).""" + pw = nfkc(password) + return len(pw).to_bytes(2, "big") + pw + pubkey_bytes + suffix + + +# ── Backup (spec §Steps 1-5) ───────────────────────────────────────────────── + +@dataclass +class BlobInfo: + index: int + role: str # "real", "parity", "dummy" + d_tag: str + sign_pk: str + +def backup( + nsec_bytes: bytes, + pubkey_bytes: bytes, + password: str, + relay: SimulatedRelay, + relay_url_bytes: bytes = b"wss://relay.example.com/", +) -> list[BlobInfo]: + base = make_base(password, pubkey_bytes) + + # Step 1: Determine N and D + h = nip_sb_scrypt(base, salt=b"") + n = (h[0] % CHUNK_RANGE) + MIN_CHUNKS + h_d = nip_sb_scrypt(base, salt=b"dummies") + d = (h_d[0] % DUMMY_RANGE) + MIN_DUMMIES + p = PARITY_BLOBS + + # Step 2: Master encryption key + h_enc = nip_sb_scrypt(base, salt=b"encrypt") + enc_key = nip_sb_hkdf(h_enc, b"key") + + # Step 3: Split nsec into N chunks + remainder = 32 % n + base_len = 32 // n + chunks: list[bytes] = [] + offset = 0 + for i in range(n): + chunk_len = base_len + (1 if i < remainder else 0) + chunks.append(nsec_bytes[offset : offset + chunk_len]) + offset += chunk_len + assert offset == 32 and b"".join(chunks) == nsec_bytes + + # Step 3b + 4 combined: Derive per-blob keys, pad deterministically, compute RS parity + h_cover = nip_sb_scrypt(base, salt=b"cover") + + # Pre-derive all per-blob H_i for real+parity blobs (needed for deterministic padding) + h_values: list[bytes] = [] + for i in range(n + PARITY_BLOBS): + base_i = make_base(password, pubkey_bytes, str(i).encode("ascii")) + h_values.append(nip_sb_scrypt(base_i, 
salt=b"")) + + # Pad chunks deterministically using per-blob key material + padded_chunks: list[bytes] = [] + for i in range(n): + pad_len = CHUNK_PAD_LEN - len(chunks[i]) + pad_bytes = nip_sb_hkdf(h_values[i], b"pad", length=pad_len) if pad_len > 0 else b"" + padded_chunks.append(chunks[i] + pad_bytes) + parity_row_0, parity_row_1 = rs_encode_rows(padded_chunks) + + # Step 4 + 5: Derive keys, encrypt, collect all blobs + all_blobs: list[tuple[BlobInfo, RelayEvent]] = [] + + # Real chunk blobs (indices 0..N-1) + for i in range(n): + h_i = h_values[i] + d_tag = nip_sb_hkdf(h_i, b"d-tag" + relay_url_bytes).hex() + sign_sk = _derive_signing_key(h_i, b"signing-key", relay_url_bytes) + sign_pk = secret_to_pubkey(sign_sk).hex() + + nonce = os.urandom(24) + ct = xchacha20poly1305_encrypt(enc_key, nonce, padded_chunks[i], AAD) + content = base64.b64encode(nonce + ct).decode("ascii") + + info = BlobInfo(i, "real", d_tag, sign_pk) + event = RelayEvent(pubkey=sign_pk, d_tag=d_tag, content=content, created_at=int(time.time())) + all_blobs.append((info, event)) + + # Parity blobs (indices N..N+1) + parity_rows = [parity_row_0, parity_row_1] + for k in range(p): + i = n + k + h_i = h_values[i] + d_tag = nip_sb_hkdf(h_i, b"d-tag" + relay_url_bytes).hex() + sign_sk = _derive_signing_key(h_i, b"signing-key", relay_url_bytes) + sign_pk = secret_to_pubkey(sign_sk).hex() + + nonce = os.urandom(24) + ct = xchacha20poly1305_encrypt(enc_key, nonce, parity_rows[k], AAD) + content = base64.b64encode(nonce + ct).decode("ascii") + + info = BlobInfo(i, "parity", d_tag, sign_pk) + event = RelayEvent(pubkey=sign_pk, d_tag=d_tag, content=content, created_at=int(time.time())) + all_blobs.append((info, event)) + + # Dummy blobs (indices 0..D-1, separate namespace) + for j in range(d): + d_tag = nip_sb_hkdf(h_cover, f"dummy-d-tag-{j}".encode() + relay_url_bytes).hex() + sign_sk = _derive_dummy_signing_key(h_cover, j, relay_url_bytes) + sign_pk = secret_to_pubkey(sign_sk).hex() + + dummy_payload = 
nip_sb_hkdf(h_cover, f"dummy-pad-{j}".encode(), length=CHUNK_PAD_LEN) + nonce = os.urandom(24) + ct = xchacha20poly1305_encrypt(enc_key, nonce, dummy_payload, AAD) + content = base64.b64encode(nonce + ct).decode("ascii") + + info = BlobInfo(n + p + j, "dummy", d_tag, sign_pk) + event = RelayEvent(pubkey=sign_pk, d_tag=d_tag, content=content, created_at=int(time.time())) + all_blobs.append((info, event)) + + # Shuffle and publish in random order (spec: MUST shuffle) + random.shuffle(all_blobs) + blob_infos = [] + for info, event in all_blobs: + relay_publish(relay, event) + blob_infos.append(info) + + # Sort for display (publication was shuffled) + blob_infos.sort(key=lambda b: ({"real": 0, "parity": 1, "dummy": 2}[b.role], b.index)) + return blob_infos + + +def _derive_signing_key(h_i: bytes, prefix: bytes, relay_url_bytes: bytes) -> bytes: + """Reject-and-retry signing key derivation (spec §Step 4).""" + for retry in range(256): + info = (prefix if retry == 0 else prefix + f"-{retry}".encode()) + relay_url_bytes + sk = nip_sb_hkdf(h_i, info) + try: + secret_to_pubkey(sk) # validates scalar + return sk + except Exception: + continue + raise RuntimeError("All 256 signing key derivations invalid") + + +def _derive_dummy_signing_key(h_cover: bytes, j: int, relay_url_bytes: bytes) -> bytes: + """Reject-and-retry for dummy signing keys (spec §Step 4, dummy section).""" + for retry in range(256): + suffix = f"-{retry}" if retry > 0 else "" + info = f"dummy-signing-key-{j}{suffix}".encode() + relay_url_bytes + sk = nip_sb_hkdf(h_cover, info) + try: + secret_to_pubkey(sk) + return sk + except Exception: + continue + raise RuntimeError(f"Dummy {j}: all 256 signing key derivations invalid") + + +# ── Recovery (spec §Recovery) ───────────────────────────────────────────────── + +def recover( + pubkey_bytes: bytes, + password: str, + relay: SimulatedRelay, + relay_url_bytes: bytes = b"wss://relay.example.com/", + delete_indices: set[int] | None = None, +) -> bytes: + """ + 
Recover nsec from password + pubkey + relay. + delete_indices: if set, simulate missing blobs by skipping these real/parity indices. + """ + base = make_base(password, pubkey_bytes) + + # Step 3: Derive N, D, enc_key, cover key + h = nip_sb_scrypt(base, salt=b"") + n = (h[0] % CHUNK_RANGE) + MIN_CHUNKS + h_d = nip_sb_scrypt(base, salt=b"dummies") + d = (h_d[0] % DUMMY_RANGE) + MIN_DUMMIES + p = PARITY_BLOBS + h_enc = nip_sb_scrypt(base, salt=b"encrypt") + enc_key = nip_sb_hkdf(h_enc, b"key") + h_cover = nip_sb_scrypt(base, salt=b"cover") + + remainder = 32 % n + base_len = 32 // n + + # Step 4: Derive all d-tags and expected signing pubkeys + all_queries: list[tuple[str, str, str, int]] = [] # (d_tag, expected_pk, role, index) + + for i in range(n + p): + base_i = make_base(password, pubkey_bytes, str(i).encode("ascii")) + h_i = nip_sb_scrypt(base_i, salt=b"") + d_tag = nip_sb_hkdf(h_i, b"d-tag" + relay_url_bytes).hex() + sign_sk = _derive_signing_key(h_i, b"signing-key", relay_url_bytes) + sign_pk = secret_to_pubkey(sign_sk).hex() + role = "real" if i < n else "parity" + all_queries.append((d_tag, sign_pk, role, i)) + + for j in range(d): + d_tag = nip_sb_hkdf(h_cover, f"dummy-d-tag-{j}".encode() + relay_url_bytes).hex() + sign_sk = _derive_dummy_signing_key(h_cover, j, relay_url_bytes) + sign_pk = secret_to_pubkey(sign_sk).hex() + all_queries.append((d_tag, sign_pk, "dummy", n + p + j)) + + # Step 5: Shuffle and query all d-tags (spec: random order, d-tag only) + random.shuffle(all_queries) + + # Collect results by role + padded_slots: list[bytes | None] = [None] * (n + p) # real + parity + for d_tag, expected_pk, role, idx in all_queries: + if delete_indices and idx < (n + p) and idx in delete_indices: + continue # simulate missing blob + + events = relay_query(relay, d_tag) + # Spec §Event Validation: filter by expected pubkey, validate, then select newest valid + matched = [e for e in events if e.pubkey == expected_pk] + + if role == "dummy": + continue # 
discard dummies + + if not matched: + continue # missing blob — will try RS recovery + + # Validate-then-select: try each candidate, keep valid ones, pick newest + valid_candidates: list[tuple[int, bytes]] = [] # (created_at, padded) + for candidate in matched: + try: + content_str = candidate.content + if len(content_str) % 4: + content_str += "=" * (4 - len(content_str) % 4) + raw = base64.b64decode(content_str, validate=True) + if len(raw) != 56: + continue # malformed content → skip this candidate + nonce = raw[:24] + ciphertext = raw[24:] + padded = xchacha20poly1305_decrypt(enc_key, nonce, ciphertext, AAD) + valid_candidates.append((candidate.created_at, padded)) + except Exception: + continue # base64 or AEAD failure → skip this candidate + + if not valid_candidates: + continue # no valid candidates → erasure + + # Select the newest valid event (spec §Event Validation step 8) + _, padded = max(valid_candidates, key=lambda x: x[0]) + padded_slots[idx] = padded + + # Step 8: Reassemble + missing = [i for i in range(n + p) if padded_slots[i] is None] + if len(missing) > p: + raise ValueError(f"Too many blobs missing ({len(missing)} missing, max tolerated: {p})") + + if missing: + # RS erasure decode + reconstructed = rs_decode_rows(padded_slots, n) + for i in range(n): + padded_slots[i] = reconstructed[i] + + # Extract chunks from padded data + nsec_parts = [] + for i in range(n): + chunk_len = base_len + (1 if i < remainder else 0) + nsec_parts.append(padded_slots[i][:chunk_len]) + + nsec_bytes = b"".join(nsec_parts) + assert len(nsec_bytes) == 32 + + # Step 9: Validate + try: + recovered_pk = secret_to_pubkey(nsec_bytes) + except Exception: + raise ValueError("Recovered key is not a valid secp256k1 scalar") + if recovered_pk != pubkey_bytes: + raise ValueError("Pubkey mismatch — wrong password") + + return nsec_bytes + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main() -> None: + 
print("╔══════════════════════════════════════════════════════════════╗") + print("║ NIP-SB v3 Protocol Demo — Real Crypto, Simulated Relay ║") + print("║ ║") + print("║ scrypt + HKDF-SHA256 + XChaCha20-Poly1305 + secp256k1 ║") + print("║ + Reed-Solomon GF(2^8) + Dummy Blobs ║") + print("╚══════════════════════════════════════════════════════════════╝") + print() + + relay: SimulatedRelay = {} + + # Generate a test identity + sk = secp256k1.PrivateKey() + nsec_bytes = sk.private_key + pubkey_bytes = secret_to_pubkey(nsec_bytes) + password = "correct-horse-battery-staple-orange-purple-mountain" + + print(f"Identity: {pubkey_bytes.hex()[:16]}…") + print(f"Password: {password}") + print() + + # ── Phase 1: Backup ─────────────────────────────────────────────────── + + print("── Phase 1: Backup ──────────────────────────────────────────") + blobs = backup(nsec_bytes, pubkey_bytes, password, relay) + n_real = sum(1 for b in blobs if b.role == "real") + n_parity = sum(1 for b in blobs if b.role == "parity") + n_dummy = sum(1 for b in blobs if b.role == "dummy") + print(f" N={n_real} real + P={n_parity} parity + D={n_dummy} dummy = {len(blobs)} total") + for b in blobs: + print(f" Blob {b.index:2d} [{b.role:6s}]: d={b.d_tag[:12]}… pk={b.sign_pk[:12]}… ✅") + + # Add decoy events (simulates other kind:30078 data) + for _ in range(5): + fake_sk = secp256k1.PrivateKey() + relay_publish(relay, RelayEvent( + pubkey=secret_to_pubkey(fake_sk.private_key).hex(), + d_tag=os.urandom(32).hex(), + content=base64.b64encode(os.urandom(56)).decode(), + )) + + total = sum(len(v) for v in relay.values()) + print(f"\n Relay: {total} total events ({len(blobs)} backup + 5 decoy)") + + # ── Phase 2: Full Recovery ──────────────────────────────────────────── + + print("\n── Phase 2: Full Recovery (all blobs present) ────────────────") + recovered = recover(pubkey_bytes, password, relay) + assert recovered == nsec_bytes + print(f" ✅ RECOVERED — secret key matches (byte-for-byte)") + + # ── Phase 
3: Recovery with 1 Missing Blob ───────────────────────────── + + print("\n── Phase 3: Recovery with 1 missing real chunk ───────────────") + recovered = recover(pubkey_bytes, password, relay, delete_indices={0}) + assert recovered == nsec_bytes + print(f" ✅ RECOVERED — RS parity reconstructed chunk 0") + + # ── Phase 4: Recovery with 2 Missing Blobs ──────────────────────────── + + print("\n── Phase 4: Recovery with 2 missing blobs (1 real + 1 parity)") + recovered = recover(pubkey_bytes, password, relay, delete_indices={1, n_real}) + assert recovered == nsec_bytes + print(f" ✅ RECOVERED — RS parity reconstructed mixed erasures") + + # ── Phase 4b: Recovery with 2 Missing Real Chunks ──────────────────── + + print("\n── Phase 4b: Recovery with 2 missing real chunks ─────────────") + recovered = recover(pubkey_bytes, password, relay, delete_indices={0, n_real - 1}) + assert recovered == nsec_bytes + print(f" ✅ RECOVERED — RS parity reconstructed 2 missing real chunks") + + # ── Phase 4c: Recovery with 2 Missing Parity Blobs ──────────────────── + + print("\n── Phase 4c: Recovery with 2 missing parity blobs ────────────") + recovered = recover(pubkey_bytes, password, relay, delete_indices={n_real, n_real + 1}) + assert recovered == nsec_bytes + print(f" ✅ RECOVERED — all real chunks present, parity not needed") + + # ── Phase 4d: Recovery with corrupted blob (AEAD failure → erasure) ── + + print("\n── Phase 4d: Recovery with 1 corrupted blob (AEAD erasure) ───") + # Corrupt a real blob's content to trigger AEAD failure + real_blobs = [b for b in blobs if b.role == "real"] + target_tag = real_blobs[0].d_tag + original_content = relay[target_tag][0].content + relay[target_tag][0].content = base64.b64encode(os.urandom(56)).decode() + recovered = recover(pubkey_bytes, password, relay) + assert recovered == nsec_bytes + relay[target_tag][0].content = original_content # restore + print(f" ✅ RECOVERED — AEAD failure treated as erasure, RS reconstructed") + + # ── Phase 5: 
Recovery with 3 Missing (should fail) ──────────────────── + + print("\n── Phase 5: Recovery with 3 missing blobs (should fail) ──────") + try: + recover(pubkey_bytes, password, relay, delete_indices={0, 1, 2}) + print(" ❌ UNEXPECTED SUCCESS") + sys.exit(1) + except ValueError as e: + print(f" ✅ Correctly rejected: {e}") + + # ── Phase 6: Wrong Password ─────────────────────────────────────────── + + print("\n── Phase 6: Wrong Password ──────────────────────────────────") + try: + recover(pubkey_bytes, "wrong-password-totally-different-words", relay) + print(" ❌ UNEXPECTED SUCCESS") + sys.exit(1) + except ValueError as e: + print(f" ✅ Correctly rejected: {e}") + + # ── Phase 7: Different User, Same Password ──────────────────────────── + + print("\n── Phase 7: Different User, Same Password ───────────────────") + other_sk = secp256k1.PrivateKey() + other_pk = secret_to_pubkey(other_sk.private_key) + try: + recover(other_pk, password, relay) + print(" ❌ UNEXPECTED SUCCESS") + sys.exit(1) + except ValueError as e: + print(f" ✅ Correctly rejected: {e}") + + # ── Phase 7b: Cross-Relay Isolation ────────────────────────────────── + + print("\n── Phase 7b: Cross-Relay Isolation ──────────────────────────") + relay_a: SimulatedRelay = {} + relay_b: SimulatedRelay = {} + url_a = b"wss://relay-a.example.com/" + url_b = b"wss://relay-b.example.com/" + blobs_a = backup(nsec_bytes, pubkey_bytes, password, relay_a, relay_url_bytes=url_a) + blobs_b = backup(nsec_bytes, pubkey_bytes, password, relay_b, relay_url_bytes=url_b) + tags_a = {b.d_tag for b in blobs_a} + tags_b = {b.d_tag for b in blobs_b} + assert tags_a.isdisjoint(tags_b), "d-tags must differ across relays" + print(f" ✅ Same password+pubkey → completely different d-tags per relay") + recovered_a = recover(pubkey_bytes, password, relay_a, relay_url_bytes=url_a) + assert recovered_a == nsec_bytes + print(f" ✅ Recovery from relay A succeeds with correct relay URL") + try: + recover(pubkey_bytes, password, relay_a, 
relay_url_bytes=url_b) + print(" ❌ UNEXPECTED SUCCESS (wrong relay URL should fail)") + sys.exit(1) + except ValueError as e: + print(f" ✅ Wrong relay URL correctly rejected: {e}") + + # ── Phase 7c: d-tag Squatting Resistance ───────────────────────────── + + print("\n── Phase 7c: d-tag Squatting Resistance ─────────────────────") + # Inject impostor events with same d-tags but different pubkeys + real_blobs_list = [b for b in blobs if b.role == "real"] + for b in real_blobs_list[:2]: + impostor_sk = secp256k1.PrivateKey() + impostor_pk = secret_to_pubkey(impostor_sk.private_key).hex() + relay_publish(relay, RelayEvent( + pubkey=impostor_pk, d_tag=b.d_tag, + content=base64.b64encode(os.urandom(56)).decode(), + created_at=int(time.time()) + 9999, # newer than legitimate + )) + recovered = recover(pubkey_bytes, password, relay) + assert recovered == nsec_bytes + print(f" ✅ RECOVERED — impostor events with same d-tags ignored (pubkey filter)") + + # ── Phase 7d: Malformed-Newer-Duplicate Resistance ──────────────────── + + print("\n── Phase 7d: Malformed-Newer-Duplicate Resistance ───────────") + # Inject a malformed event with CORRECT pubkey but garbage content, newer timestamp + target_blob = real_blobs_list[0] + relay_publish(relay, RelayEvent( + pubkey=target_blob.sign_pk, d_tag=target_blob.d_tag, + content=base64.b64encode(os.urandom(56)).decode(), # wrong ciphertext + created_at=int(time.time()) + 99999, # much newer than legitimate + )) + recovered = recover(pubkey_bytes, password, relay) + assert recovered == nsec_bytes + print(f" ✅ RECOVERED — malformed newer event skipped, older valid event used") + + # ── Phase 8: What an Attacker Sees ──────────────────────────────────── + + print("\n── Phase 8: What an Attacker Sees (relay dump) ──────────────") + backup_tags = {b.d_tag for b in blobs} + for events in relay.values(): + for evt in events: + label = "" + if evt.d_tag in backup_tags: + role = next((b.role for b in blobs if b.d_tag == evt.d_tag), "?") + 
label = f" ← {role.upper()}" if any(b.sign_pk == evt.pubkey for b in blobs) else " ← IMPOSTOR (test event)" + print(f" pk={evt.pubkey[:12]}… d={evt.d_tag[:12]}… " + f"content={evt.content[:16]}…{label}") + print(f"\n {len(blobs)} backup + {sum(map(len, relay.values())) - len(blobs)} other (decoys + injected test events) = {sum(map(len, relay.values()))} total") + print(f" Labels are only visible because this demo knows the password.") + print(f" Steganographic cover is environment-dependent (see spec §Limitations).") + + # ── Phase 9: RS Test Vectors ────────────────────────────────────────── + + print("\n── Phase 9: RS Test Vectors ─────────────────────────────────") + # GF(2^8) arithmetic verification (NORMATIVE — spec §Test Vectors) + assert gf_mul(0x03, 0x03) == 0x05, f"gf_mul(0x03,0x03)={hex(gf_mul(0x03,0x03))}" + assert gf_mul(0x03, 0x05) == 0x0F, f"gf_mul(0x03,0x05)={hex(gf_mul(0x03,0x05))}" + assert gf_mul(0x57, 0x83) == 0xC1, f"gf_mul(0x57,0x83)={hex(gf_mul(0x57,0x83))}" + assert gf_inv(0x03) == 0xF6, f"gf_inv(0x03)={hex(gf_inv(0x03))}" + assert gf_mul(0x03, 0xF6) == 0x01, "gf_mul(0x03,0xF6) should be 0x01" + print(f" ✅ GF(2^8) multiplication vectors match spec") + + # Verify α=0x03 is primitive in GF(2^8) under 0x11B + x = 1 + for i in range(1, 256): + x = gf_mul(x, ALPHA) + if x == 1: + assert i == 255, f"α=0x03 has order {i}, expected 255" + break + print(f" ✅ α=0x03 is primitive (order 255 in GF(2^8)/0x11B)") + + # Small RS test: 3 data symbols, 2 parity (NORMATIVE — spec §Test Vectors) + test_data = [0x42, 0xAB, 0x07] + test_parity = rs_encode(test_data, 2) + assert test_parity == [0x62, 0x59], f"RS parity mismatch: {[hex(p) for p in test_parity]}" + print(f" RS encode [0x42, 0xAB, 0x07] → parity {[hex(p) for p in test_parity]}") + print(f" ✅ RS parity matches normative vector [0x62, 0x59]") + + # Verify decode with no erasures + full = test_data + test_parity + decoded = rs_decode([full[0], full[1], full[2], full[3], full[4]], 3) + assert decoded == test_data + print(f" ✅ RS decode (no erasures): {[hex(d) for d in decoded]}") + + # Verify decode with 1 erasure (position 1) + erased1 = [full[0], None, full[2], full[3], 
full[4]] + decoded1 = rs_decode(erased1, 3) + assert decoded1 == test_data + print(f" ✅ RS decode (1 erasure at pos 1): {[hex(d) for d in decoded1]}") + + # Verify decode with 2 erasures (positions 0 and 2) + erased2 = [None, full[1], None, full[3], full[4]] + decoded2 = rs_decode(erased2, 3) + assert decoded2 == test_data + print(f" ✅ RS decode (2 erasures at pos 0,2): {[hex(d) for d in decoded2]}") + + # Verify decode with mixed erasure (1 data + 1 parity) + erased3 = [full[0], None, full[2], None, full[4]] + decoded3 = rs_decode(erased3, 3) + assert decoded3 == test_data + print(f" ✅ RS decode (mixed: data pos 1 + parity pos 3): {[hex(d) for d in decoded3]}") + + print() + print("╔══════════════════════════════════════════════════════════════╗") + print("║ ALL TESTS PASSED ║") + print("╚══════════════════════════════════════════════════════════════╝") + + +if __name__ == "__main__": + main() diff --git a/crates/sprout-core/src/kind.rs b/crates/sprout-core/src/kind.rs index 27507c471..65f252605 100644 --- a/crates/sprout-core/src/kind.rs +++ b/crates/sprout-core/src/kind.rs @@ -159,6 +159,11 @@ pub const KIND_MEMBER_ADDED_NOTIFICATION: u32 = 44100; /// Stored globally (channel_id = None) with p-tag = target, h-tag = channel UUID. pub const KIND_MEMBER_REMOVED_NOTIFICATION: u32 = 44101; +// NIP-78 application-specific data (30078) +/// Application-specific data — NIP-78 parameterized replaceable events. +/// Used by NIP-SB steganographic key backup, Cashu wallets, app settings, etc. +pub const KIND_APP_SPECIFIC_DATA: u32 = 30078; + // Forum / social (45000–45999) // V1 used addressable range (30001–30003) — wrong. /// A forum post (thread root). 
@@ -286,6 +291,7 @@ pub const ALL_KINDS: &[u32] = &[ KIND_MEMBER_ADDED_NOTIFICATION, KIND_MEMBER_REMOVED_NOTIFICATION, KIND_LONG_FORM, + KIND_APP_SPECIFIC_DATA, KIND_FORUM_POST, KIND_FORUM_VOTE, KIND_FORUM_COMMENT, diff --git a/crates/sprout-relay/src/handlers/ingest.rs b/crates/sprout-relay/src/handlers/ingest.rs index 60c5f8c83..de3335c23 100644 --- a/crates/sprout-relay/src/handlers/ingest.rs +++ b/crates/sprout-relay/src/handlers/ingest.rs @@ -184,6 +184,9 @@ fn required_scope_for_kind(kind: u32, event: &Event) -> Result Ok(Scope::ChannelsWrite), KIND_NIP29_JOIN_REQUEST | KIND_NIP29_LEAVE_REQUEST => Ok(Scope::ChannelsRead), + // NIP-78 application-specific data (kind:30078) — used by NIP-SB backup, + // Cashu wallets, app settings, etc. No channel scope required. + sprout_core::kind::KIND_APP_SPECIFIC_DATA => Ok(Scope::MessagesWrite), // Huddle lifecycle events + guidelines KIND_HUDDLE_STARTED | KIND_HUDDLE_PARTICIPANT_JOINED