diff --git a/README.md b/README.md index 511a25c..74a49da 100644 --- a/README.md +++ b/README.md @@ -82,6 +82,7 @@ loramesh/ │ ├── REQUIREMENTS.md # Measurable requirements │ ├── LORA_CONSTRAINTS.md # EU868 duty cycle, SF tradeoffs │ ├── PROTOCOL.md # Full protocol specification +│ ├── CRYPTOGRAPHY.md # Crypto primer for contributors │ └── OPS.md # Operations/deployment guide ├── proto/ │ └── messages.cddl # CBOR message schema (CDDL notation) diff --git a/docs/CRYPTOGRAPHY.md b/docs/CRYPTOGRAPHY.md new file mode 100644 index 0000000..443024e --- /dev/null +++ b/docs/CRYPTOGRAPHY.md @@ -0,0 +1,519 @@ +# Cryptography in LoraMesh — A Guide for Programmers + +This guide explains the cryptographic concepts and algorithms used in LoraMesh. It is written for programmers who understand code and data structures but have limited cryptography background. No math beyond high-school level is required. + +> **Golden rule:** LoraMesh does **not** invent any cryptography. Every algorithm here is a well-vetted standard, implemented via [libsodium](https://doc.libsodium.org/) (exposed to Python through [PyNaCl](https://pynacl.readthedocs.io/)). When in doubt, trust the library. + +--- + +## Table of Contents + +1. [Core Concepts](#1-core-concepts) + - [Symmetric Encryption](#11-symmetric-encryption) + - [Asymmetric (Public-Key) Cryptography](#12-asymmetric-public-key-cryptography) + - [Hash Functions](#13-hash-functions) + - [Authenticated Encryption (AEAD)](#14-authenticated-encryption-aead) + - [Key Derivation Functions (KDFs)](#15-key-derivation-functions-kdfs) + - [Digital Signatures](#16-digital-signatures) + - [Key Exchange](#17-key-exchange) + - [Forward Secrecy](#18-forward-secrecy) +2. 
[Primitives Used in LoraMesh](#2-primitives-used-in-loramesh) + - [Ed25519 — Identity Signatures](#21-ed25519--identity-signatures) + - [X25519 — Key Exchange](#22-x25519--key-exchange) + - [ChaCha20-Poly1305 — Message Encryption](#23-chacha20-poly1305--message-encryption) + - [XChaCha20-Poly1305 — USB Key Encryption](#24-xchacha20-poly1305--usb-key-encryption) + - [BLAKE2b — Fast Hashing and KDF](#25-blake2b--fast-hashing-and-kdf) + - [Argon2id — Password Hashing](#26-argon2id--password-hashing) +3. [Protocols](#3-protocols) + - [Noise Protocol — Secure Session Handshake](#31-noise-protocol--secure-session-handshake) + - [Double Ratchet — Per-Message Forward Secrecy](#32-double-ratchet--per-message-forward-secrecy) +4. [How It All Fits Together](#4-how-it-all-fits-together) +5. [Common Pitfalls](#5-common-pitfalls) +6. [Where to Read More](#6-where-to-read-more) + +--- + +## 1. Core Concepts + +### 1.1 Symmetric Encryption + +**One key, used for both encrypting and decrypting.** + +``` +plaintext + key → [encrypt] → ciphertext +ciphertext + key → [decrypt] → plaintext +``` + +Think of it like a lockbox where the same physical key locks and unlocks it. Both parties need to share the key in advance. The challenge is always: *how do you get the shared key to the other person without an eavesdropper seeing it?* Key exchange (section 1.7) solves this. + +**Used in LoraMesh for:** all actual message encryption — both end-to-end and hop-by-hop. + +--- + +### 1.2 Asymmetric (Public-Key) Cryptography + +**Two linked keys: a public key (share freely) and a private key (never share).** + +``` +alice_public_key ← share with everyone +alice_private_key ← never leaves Alice's device +``` + +- Anything encrypted with Alice's *public* key can only be decrypted by Alice's *private* key. +- Alice can *sign* data with her *private* key; anyone with her *public* key can verify the signature. 
+ +This is how two strangers can communicate securely without meeting first. Public-key operations are much more expensive than symmetric ones, however, so in practice asymmetric cryptography is used only for key exchange and authentication. The actual message data is then encrypted with a much cheaper symmetric cipher. + +**Used in LoraMesh for:** identity (Ed25519) and session key establishment (X25519). + +--- + +### 1.3 Hash Functions + +**One-way transformation: any input → fixed-size output (the "digest").** + +``` +BLAKE2b("hello world") → 64 bytes that look random +BLAKE2b("Hello world") → completely different 64 bytes +``` + +Key properties: +- **Deterministic**: same input always gives same output. +- **One-way**: you cannot reverse the hash to find the input. +- **Collision-resistant**: finding two inputs that produce the same hash is computationally infeasible. +- **Avalanche effect**: one bit of input change flips ~half the output bits. + +Hashes are used everywhere: node IDs, key derivation, handshake transcripts. + +**Used in LoraMesh for:** node IDs (BLAKE2b of public key), key derivation, handshake binding. + +--- + +### 1.4 Authenticated Encryption (AEAD) + +**Encryption + integrity check in a single operation.** + +Plain encryption hides the *content* of a message, but an attacker can still flip bits in the ciphertext and the receiver has no way to notice. AEAD (Authenticated Encryption with Associated Data) adds a **Message Authentication Code (MAC)** that detects any tampering. + +``` +encrypt(key, nonce, plaintext, associated_data) → ciphertext + tag +decrypt(key, nonce, ciphertext, tag, associated_data) → plaintext OR "tampered!" +``` + +- **key**: the symmetric key. +- **nonce** ("number used once"): a value that must be unique for every encryption under the same key. Reusing a nonce is catastrophic: with the stream-cipher AEAD used here, two messages encrypted under the same (key, nonce) pair reveal the XOR of their plaintexts and let an attacker forge authentication tags. +- **tag** (or MAC): a 16-byte authentication code appended to the ciphertext.
+- **associated_data**: data that is *authenticated but not encrypted* (e.g., a packet header). If the associated data is altered, decryption fails. + +**Used in LoraMesh for:** all encrypted payloads. `ChaCha20` provides the stream cipher; `Poly1305` provides the MAC. + +--- + +### 1.5 Key Derivation Functions (KDFs) + +**Stretch or transform key material into one or more keys.** + +KDFs take a secret (e.g., a Diffie–Hellman output or a passphrase) and produce a cryptographically strong key. There are two main flavors: + +| Type | Purpose | Example | +|------|---------|---------| +| Fast KDF | Derive session keys from a shared secret | BLAKE2b / HKDF | +| Password KDF | Slow by design — resists brute force on passphrases | Argon2id | + +**Fast KDF example** (used in session setup): + +``` +session_key = BLAKE2b(shared_secret, key=context_string) +``` + +**Password KDF example** (used for USB key encryption): + +``` +encryption_key = Argon2id(password, salt, memory=256MB, iterations=3) +``` + +The memory and iteration parameters make password cracking slow even with GPUs/ASICs. + +**Used in LoraMesh for:** deriving session keys from ECDH output (BLAKE2b) and deriving USB key encryption keys from passphrases (Argon2id). + +--- + +### 1.6 Digital Signatures + +**Prove that a message was produced by the holder of a specific private key.** + +``` +signature = sign(alice_private_key, message) +valid = verify(alice_public_key, message, signature) # True or False +``` + +Signatures provide **authenticity** (it came from Alice) and **non-repudiation** (Alice cannot deny sending it). They do not encrypt the message — anyone can read it, but tampering or forgery is detectable. + +**Used in LoraMesh for:** signing identity public keys in contact bundles; signing handshake ephemeral keys to authenticate the session initiator. 
+ +--- + +### 1.7 Key Exchange + +**Two parties derive a shared secret over a public channel without ever sending the secret itself.** + +The standard algorithm is **Diffie–Hellman** (DH). The elliptic-curve variant used here is **X25519** (ECDH on Curve25519). + +Simplified intuition: + +``` +Alice: generates (alice_private, alice_public) +Bob: generates (bob_private, bob_public) + +Alice → Bob: sends alice_public (eavesdropper sees this) +Bob → Alice: sends bob_public (eavesdropper sees this) + +Alice computes: shared_secret = X25519(alice_private, bob_public) +Bob computes: shared_secret = X25519(bob_private, alice_public) +→ Both arrive at the same shared_secret, which the eavesdropper cannot compute +``` + +An eavesdropper who records both public keys cannot compute the shared secret without solving the **elliptic curve discrete logarithm problem** — currently infeasible for Curve25519 key sizes. + +**Used in LoraMesh for:** every session handshake (Noise protocol), including both static (identity-derived) and ephemeral key pairs. + +--- + +### 1.8 Forward Secrecy + +**If today's session keys are leaked, past messages remain safe.** + +Without forward secrecy, an attacker who steals your long-term private key can decrypt all past recorded traffic. With forward secrecy, each session uses **ephemeral** (one-time-use) keys that are deleted after use, so compromising the long-term key only affects future sessions. + +**Perfect Forward Secrecy (PFS):** every message uses a unique key, deleted immediately after use. This is what the [Double Ratchet](#32-double-ratchet--per-message-forward-secrecy) provides. + +**Used in LoraMesh for:** Noise handshakes (session-level PFS), plus the Double Ratchet (per-message PFS). + +--- + +## 2. 
Primitives Used in LoraMesh + +### 2.1 Ed25519 — Identity Signatures + +| | | +|--|--| +| **Type** | Digital signature scheme | +| **Key size** | 32-byte public key, 64-byte private key (seed + public) | +| **Signature size** | 64 bytes | +| **libsodium function** | `crypto_sign_*` | + +Ed25519 is built on **Curve25519** (hence the "25519"). It produces 64-byte signatures and is very fast. Each LoraMesh node has exactly one Ed25519 key pair — this is its long-term identity. + +```python +# In sim/crypto.py +from nacl.signing import SigningKey +sk = SigningKey.generate() # private key +vk = sk.verify_key # public key (share freely) +signature = sk.sign(message).signature # 64 bytes +vk.verify(message, signature) # raises if tampered +``` + +**Why Ed25519 over RSA?** +- Keys and signatures are far smaller (32 bytes vs 256+ bytes for RSA-2048) +- Operations are ~100× faster +- No padding-related vulnerabilities +- Well-audited and widely deployed (SSH, TLS 1.3, Signal) + +**Ed25519 → X25519 conversion:** Both Ed25519 and X25519 use Curve25519, so there is a mathematical mapping between the two key types. LoraMesh uses this to derive X25519 keys (for key exchange) directly from Ed25519 identity keys, so users only manage one key pair. libsodium provides `crypto_sign_ed25519_pk_to_curve25519` for this. + +--- + +### 2.2 X25519 — Key Exchange + +| | | +|--|--| +| **Type** | Elliptic-curve Diffie–Hellman (ECDH) | +| **Key size** | 32 bytes (public and private) | +| **Output** | 32-byte shared secret | +| **libsodium function** | `crypto_scalarmult_*` | + +X25519 is the key exchange algorithm used in every Noise handshake. It generates a 32-byte shared secret from two 32-byte keys. + +```python +# In sim/crypto.py +from nacl.bindings import crypto_scalarmult +shared_secret = crypto_scalarmult(alice_private_bytes, bob_public_bytes) +# → 32 bytes that both parties can derive independently +``` + +The output is then fed into a KDF to produce actual session keys. 
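The KDF step mentioned above can be sketched with the standard library's `hashlib.blake2b`, following the `BLAKE2b(shared_secret, key=context_string, digest_size=32)` formula used elsewhere in this guide. The context labels and the `derive_key` / `derive_session_keys` helpers are illustrative assumptions, not names taken from `sim/crypto.py`, and the shared secret is a placeholder rather than a real X25519 output.

```python
import hashlib


def derive_key(shared_secret: bytes, context: bytes) -> bytes:
    """Derive one 32-byte key from a shared secret using keyed BLAKE2b."""
    # The context string is passed as the BLAKE2b key, mirroring the
    # formula in this guide; distinct contexts give unrelated outputs.
    return hashlib.blake2b(shared_secret, key=context, digest_size=32).digest()


def derive_session_keys(shared_secret: bytes):
    """Derive one key per direction (hypothetical context labels)."""
    initiator_to_responder = derive_key(shared_secret, b"loramesh-example i->r")
    responder_to_initiator = derive_key(shared_secret, b"loramesh-example r->i")
    return initiator_to_responder, responder_to_initiator


# Placeholder standing in for the 32-byte X25519 output.
placeholder_secret = bytes(range(32))
k_send, k_recv = derive_session_keys(placeholder_secret)
```

Using a separate context per direction means the two peers never encrypt under the same key, which keeps their nonce counters independent.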
+ +**Why X25519 over P-256 (NIST curves)?** +- Simpler, faster, constant-time by design +- No NIST-specific concerns (P-256 coefficients have an unexplained seed) +- Used in TLS 1.3, Signal, WireGuard + +--- + +### 2.3 ChaCha20-Poly1305 — Message Encryption + +| | | +|--|--| +| **Type** | AEAD stream cipher | +| **Key size** | 32 bytes | +| **Nonce size** | 12 bytes (IETF variant) | +| **Tag size** | 16 bytes | +| **libsodium function** | `crypto_aead_chacha20poly1305_ietf_*` | + +Every message in LoraMesh — both end-to-end and hop-by-hop — is encrypted with ChaCha20-Poly1305. + +- **ChaCha20**: the stream cipher that turns a key + nonce into a keystream XOR'd with the plaintext. +- **Poly1305**: the MAC that authenticates both the ciphertext and any associated data. + +``` +ciphertext = ChaCha20(key, nonce) XOR plaintext +tag = Poly1305(one_time_key, associated_data || ciphertext || lengths) # one_time_key is derived from (key, nonce) +``` + +A nonce must never repeat under the same key. LoraMesh builds the nonce from an **epoch counter** (4 bytes) and a **message counter** (8 bytes): + +```python +# In sim/crypto.py +nonce = epoch.to_bytes(4, 'big') + counter.to_bytes(8, 'big') # 12 bytes +``` + +**Why ChaCha20-Poly1305 over AES-GCM?** +- No hardware AES acceleration needed (important for constrained ESP32 nodes) +- Constant-time by design (no timing side-channels) +- Nonce reuse under AES-GCM exposes the long-term authentication key; under ChaCha20-Poly1305 the damage is limited to the messages that shared the nonce (still a serious failure) +- Used in TLS 1.3, WireGuard, Signal + +--- + +### 2.4 XChaCha20-Poly1305 — USB Key Encryption + +| | | +|--|--| +| **Type** | AEAD stream cipher (extended nonce variant) | +| **Nonce size** | 24 bytes | +| **libsodium function** | `crypto_secretstream_*` / `SecretBox` | + +When identity keys are stored on a USB drive (`--usb-key`), they are encrypted with the **X** (extended) variant of ChaCha20-Poly1305.
The 24-byte nonce makes random nonce generation safe: with 192 bits of randomness, generating two identical nonces for the same key is astronomically unlikely, which simplifies the implementation. + +--- + +### 2.5 BLAKE2b — Fast Hashing and KDF + +| | | +|--|--| +| **Type** | Cryptographic hash / MAC / KDF | +| **Output size** | 1–64 bytes (configurable) | +| **libsodium function** | `crypto_generichash_*` / `crypto_kdf_*` | + +BLAKE2b is the workhorse hash function in LoraMesh. It is used for: + +1. **Node IDs** — 8-byte truncated hash of the identity public key: + ```python + node_id = BLAKE2b(identity_public_key, digest_size=8) + ``` + +2. **Key derivation** — with a 32-byte key parameter, BLAKE2b becomes a secure MAC / KDF: + ```python + session_key = BLAKE2b(shared_secret, key=context_string, digest_size=32) + ``` + +3. **Handshake binding** — the Noise protocol mixes every handshake message into a running hash, ensuring that the final session keys are bound to the exact sequence of messages exchanged (preventing transcript manipulation). + +**Why BLAKE2b over SHA-256?** +- ~3× faster in software +- Natively supports keyed hashing (no HMAC wrapper needed) +- Designed after SHA-3 competition; has no known weaknesses + +--- + +### 2.6 Argon2id — Password Hashing + +| | | +|--|--| +| **Type** | Memory-hard password KDF | +| **Parameters** | `opslimit` (CPU), `memlimit` (RAM) | +| **libsodium function** | `crypto_pwhash_*` (Argon2id variant) | + +When a passphrase protects a USB key, the raw passphrase cannot be used directly as an encryption key — it is too short and too predictable. Argon2id stretches the passphrase into a 32-byte key in a way that is deliberately expensive. 
+ +```python +# In sim/usb_key.py +from nacl.pwhash import argon2id +key = argon2id.kdf( + KEY_SIZE, + passphrase.encode(), + salt, # 16 random bytes stored in the key file + opslimit=argon2id.OPSLIMIT_MODERATE, # ~3 iterations + memlimit=argon2id.MEMLIMIT_MODERATE # ~256 MB RAM +) +``` + +The memory requirement means an attacker cannot run millions of password guesses in parallel on a GPU: each guess needs 256 MB of RAM and on the order of a second of CPU time. + +**id** in Argon2**id** means it combines two variants: +- **Argon2i** (data-independent): resistant to timing side-channels. +- **Argon2d** (data-dependent): resistant to GPU cracking. + +The combined `id` variant inherits both properties. + +--- + +## 3. Protocols + +### 3.1 Noise Protocol — Secure Session Handshake + +The **Noise Protocol Framework** ([noiseprotocol.org](https://noiseprotocol.org/)) is a systematic way to build authenticated key-exchange protocols from simpler building blocks. Think of it as a standardised recipe for combining ECDH, hashing, and AEAD into a complete handshake. + +LoraMesh uses the **Noise_IK** pattern (`Noise_IK_25519_ChaChaPoly_BLAKE2b`): + +- **I** = initiator's static key is sent **immediately**, encrypted, in the first handshake message. +- **K** = responder's static key is **known** to the initiator in advance (i.e., Alice already has Bob's public key). +- This allows a **1-RTT handshake** (one round trip).
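As a rough sketch of how a Noise implementation turns the pattern name above into its initial state and then mixes in each ECDH result, the following uses only Python's standard library and follows the `InitializeSymmetric`, `MixHash`, and `MixKey` steps of the Noise specification. The function names and placeholder inputs are illustrative assumptions; `sim/noise.py` may structure this differently.

```python
import hashlib
import hmac

PROTOCOL_NAME = b"Noise_IK_25519_ChaChaPoly_BLAKE2b"
HASHLEN = 64  # BLAKE2b digest size in bytes


def initialize_symmetric(protocol_name: bytes):
    """Return the initial (chaining_key, handshake_hash) pair."""
    if len(protocol_name) <= HASHLEN:
        # Short names are zero-padded to HASHLEN instead of being hashed.
        h = protocol_name.ljust(HASHLEN, b"\x00")
    else:
        h = hashlib.blake2b(protocol_name).digest()
    return h, h


def mix_hash(h: bytes, data: bytes) -> bytes:
    """Fold transmitted data into the running handshake hash."""
    return hashlib.blake2b(h + data).digest()


def hkdf2(chaining_key: bytes, ikm: bytes):
    """Two-output HKDF built from HMAC-BLAKE2b, as Noise defines it."""
    temp_key = hmac.new(chaining_key, ikm, hashlib.blake2b).digest()
    out1 = hmac.new(temp_key, b"\x01", hashlib.blake2b).digest()
    out2 = hmac.new(temp_key, out1 + b"\x02", hashlib.blake2b).digest()
    return out1, out2


def mix_key(chaining_key: bytes, dh_output: bytes):
    """Mix one ECDH result in; return (new_chaining_key, cipher_key)."""
    new_ck, temp_k = hkdf2(chaining_key, dh_output)
    return new_ck, temp_k[:32]  # truncated to 32 bytes because HASHLEN is 64


ck, h = initialize_symmetric(PROTOCOL_NAME)
h = mix_hash(h, b"")  # empty prologue (assumption)
placeholder_dh = bytes(range(32))  # stands in for a real X25519 output
ck, k = mix_key(ck, placeholder_dh)
```

Each `es`, `ss`, `ee`, or `se` token in the handshake corresponds to one `mix_key` call, so the final chaining key depends on every ECDH result in order.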
+ +#### Handshake flow + +``` +Alice (initiator) Bob (responder) +───────────────── ────────────────── +knows: bob_static_public knows: alice_static_public + +→ e, es, s, ss (message 1) + alice_e = random ephemeral key + send: alice_e.public + mix: ECDH(alice_e, bob_static) ← es + send encrypted: alice_s.public ← s (encrypted with running key) + mix: ECDH(alice_s, bob_static) ← ss + + ← e, ee, se (message 2) + bob_e = random ephemeral key + send: bob_e.public + mix: ECDH(bob_e, alice_e) ← ee + mix: ECDH(bob_e, alice_s) ← se + +Both sides now derive: + send_key, recv_key = split(chaining_key) +``` + +Every ECDH result is mixed into a **chaining key**, and every transmitted value into a **handshake hash**, both via BLAKE2b. At the end both sides call `split()` to derive two independent symmetric keys (one per direction). The final session keys are cryptographically bound to everything that was exchanged: if an attacker tampers with any handshake message, the two sides derive different keys and the connection fails. + +**Authentication:** Because Alice encrypts her static key using a key derived from `ECDH(alice_e, bob_static)`, only someone who holds `bob_static_private` can decrypt it. This prevents man-in-the-middle attacks as long as each party has the correct public key for the other (which the in-person USB contact exchange is designed to ensure). + +See `sim/noise.py` for the implementation and [PROTOCOL.md](PROTOCOL.md) for the full specification. + +--- + +### 3.2 Double Ratchet — Per-Message Forward Secrecy + +The **Double Ratchet** algorithm (used in Signal, WhatsApp, etc.) provides **per-message forward secrecy**: each message uses a unique key that is discarded immediately after use. Even if an attacker captures the current ratchet state, they cannot decrypt past messages. + +It has two interlocking ratchets: + +#### Symmetric Ratchet (send / receive chains) + +``` +chain_key[0] → message_key[0] → message_key[1] → message_key[2] ... + chain_key[1] → chain_key[2] → chain_key[3] ...
+``` + +Each step of the chain produces: +1. A **message key** used to encrypt/decrypt that one message — then deleted. +2. A **new chain key** for the next message. + +The chain is a one-way ratchet: you can advance it but not reverse it. + +#### DH Ratchet (root chain) + +Every `N` messages, the two peers exchange new ephemeral DH public keys. Each exchange updates the **root key**, which seeds fresh send/receive chain keys. This breaks the cryptographic chain so that even if a chain key is later compromised, it cannot be used to re-derive the keys for messages that already used the previous DH epoch. + +``` +root_key[0] + ECDH(alice_e[0], bob_e[0]) → root_key[1] → send_chain_key[1] +root_key[1] + ECDH(bob_e[1], alice_e[1]) → root_key[2] → recv_chain_key[2] +... +``` + +#### Out-of-order delivery + +LoRa's store-and-forward model means messages can arrive out of order. The implementation (see `sim/double_ratchet.py`) keeps a small buffer of **skipped keys** so that a message arriving late can still be decrypted. + +--- + +## 4. How It All Fits Together + +Here is the lifecycle of a single chat message from Alice to Bob over LoRa: + +``` +1. IDENTITY SETUP (once) + Alice generates Ed25519 keypair → alice_identity + Derives X25519 keypair from Ed25519 seed (birational map) + Node ID = BLAKE2b(alice_identity.public, size=8) + +2. CONTACT EXCHANGE (once, in person) + Alice exports ContactBundle { identity_pubkey, onion_addr, … } + Signs bundle with alice_identity (Ed25519) + Bob imports bundle, verifies signature + +3. SESSION HANDSHAKE (once per session) + Noise_IK handshake using X25519 keys + → Both sides derive (send_key, recv_key) via BLAKE2b key schedule + Double Ratchet state initialised from Noise shared secret + +4. MESSAGE SEND + Double Ratchet advances: derive message_key, advance chain_key + Encrypt plaintext: ChaCha20-Poly1305(message_key, nonce, plaintext) + Nonce = epoch_counter || message_counter + Result: ciphertext + 16-byte MAC + +5. 
HOP ENCRYPTION (relay transport) + Each LoRa relay re-encrypts the routing envelope with a per-hop session key + (same ChaCha20-Poly1305; the payload is opaque to the relay) + +6. RECEIPT & DECRYPTION + Bob's relay strips hop encryption, delivers ciphertext to Bob's node + Double Ratchet advances on receive side → same message_key + Decrypt: MAC verified, plaintext recovered; message_key discarded +``` + +**Key storage at rest:** +``` +passphrase → Argon2id (slow, memory-hard) → wrap_key +wrap_key + random_nonce → XChaCha20-Poly1305 encrypt → { identity_seed } +Stored in loramesh_identity.enc on USB drive +``` + +--- + +## 5. Common Pitfalls + +These are mistakes that break security even when using good algorithms: + +| Pitfall | Why it is dangerous | LoraMesh mitigation | +|---------|---------------------|---------------------| +| **Nonce reuse** | With ChaCha20-Poly1305, reusing a (key, nonce) pair reveals the XOR of the two plaintexts and lets an attacker forge tags | Epoch + monotonic counter, so a (key, nonce) pair is never repeated | +| **Timing side-channels** | Comparing MACs with `==` leaks timing information | libsodium uses constant-time comparison everywhere | +| **Rolling your own crypto** | Easy to make subtle, catastrophic mistakes | LoraMesh delegates 100% of crypto to libsodium | +| **Weak passphrase** | Argon2id slows guessing but a 4-character password is still guessable | Minimum 6 diceware words (~77 bits); wordlist provided | +| **Keys left in memory** | GC may not zero freed memory, leaving keys in RAM | `secure_memory.py` calls `sodium_memzero` on exit; `--usb-key` limits exposure window | +| **Skipping the handshake hash** | Allows transcript substitution attacks | Noise mixes every message into the running hash before deriving session keys | +| **Re-using long-term keys for encryption** | Breaks non-repudiation and complicates key rotation | LoraMesh derives X25519 keys from Ed25519 but keeps them logically separate | + +--- + +## 6.
Where to Read More + +**Beginner-friendly:** +- [Crypto 101](https://www.crypto101.io/) — free book covering hashes, AEAD, public-key crypto with minimal math +- [Serious Cryptography by Jean-Philippe Aumasson](https://nostarch.com/seriouscrypto) — the standard programmer's reference + +**Algorithm documentation:** +- [libsodium documentation](https://doc.libsodium.org/) — the library used in LoraMesh +- [PyNaCl documentation](https://pynacl.readthedocs.io/) — Python bindings for libsodium +- [Curve25519/Ed25519/X25519](https://cr.yp.to/ecdh.html) — original papers by D.J. Bernstein + +**Protocols:** +- [Noise Protocol Framework specification](https://noiseprotocol.org/noise.html) — read sections 1–5 for the concepts, then look at `sim/noise.py` +- [Signal Double Ratchet specification](https://signal.org/docs/specifications/doubleratchet/) — the reference for `sim/double_ratchet.py` +- [PROTOCOL.md](PROTOCOL.md) — LoraMesh protocol specification (assumes you've read this document) + +**Background and history:** +- [A Graduate Course in Applied Cryptography](https://toc.cryptobook.us/) — free textbook, very thorough +- [THREAT_MODEL.md](THREAT_MODEL.md) — how LoraMesh's crypto choices map to specific threats