Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 64 additions & 2 deletions ARCHITECTURE.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,34 @@
# Architecture

## APIs
## APIs & Conventions

### Dependencies

Constantine has no external dependencies in `src` to avoid supply chain attack risks, and to ensure auditing only involves code written with cryptographic security in mind.

This also includes Nim standard library except:

- std/atomics
- Anything used at compile-time only
- std/macros
- std/os and std/strutils to create compile-time paths relative to the currentSourcePath
- Anything used for tests or debugging
- json and yaml libraries are used in tests
- sequences and strings can be used in tests
- `toHex` is provided for debugging and testing
- Anything related to code generation, for example:
- the Nim -> Cuda/WebGPU runtime compiler uses tables but the actual GPU cryptographic code doesn't.
- the LLVM JIT uses string for identifier

To be clear, we do not use the following:
- std/tables, std/sequtils, std/strutils, std/algorithm, std/math, std/streams ...
- bit manipulations, endianness manipulation, ...
- Nim's `seq` and Nim's `string`
- Nim's locks, condition variables
- external threadpools
- external bigint libraries

For file IO, we reimplement our primitives on top of the C standard library in `constantine/platforms/fileio.nim`

### Argument orders

Expand Down Expand Up @@ -44,7 +72,7 @@ subroutines.
- Extremely inefficient codegen in Constantine itself https://github.com/mratsim/constantine/issues/145
with useless moves instead of in-place construction.
- In other languages like Rust, users have seen a dramatic 20% increase in performance by moving from out-of-place to in-place mutation: https://www.reddit.com/r/rust/comments/kfs0oe/comment/ggc0dui/
- And they are struggling with GCE (Guarenteed Copy Elision) and NRVO/RVO(Named) Return Value Optimization
- And they are struggling with GCE (Guaranteed Copy Elision) and NRVO/RVO(Named) Return Value Optimization
- https://github.com/rust-lang/rust/pull/76986
- https://github.com/rust-lang/rfcs/pull/2884

Expand All @@ -65,6 +93,40 @@ the failure is communicated through a boolean.
Similarly in Go, errors are used to communicate breaking protocol rules
and the result of verification is communicated through a bool.

We can use Nim's effect tracking `{.raises:[].}` to ensure no exceptions are raised in a function call stack.

### Memory allocation

We actively avoid memory allocation for any protocol that:
- handle secrets
- could be expected to run in a trusted enclave or embedded devices

This includes encryption, hashing and signatures protocols.

When memory allocation is necessary, for example for multithreading, GPU computing or succinct or zero-knowledge proof protocol, we use custom allocators from `constantine/platforms/allocs.nim`. Those are thin wrappers around the OS `malloc`/`free` with effect tracking `{.tags:[HeapAlloc].}` or `{.tags:[Alloca].}`. Then we can use Nim's effect tracking `{.tags:[].}` to ensure no *heap allocation* or *alloca* is used in the function call stack.

### Constant-time and side-channel resistance

Low-level operations that can handle secrets SHOULD be constant-time by default.
Constantine assumes that addition, subtraction (including carries/borrows), multiplication, bit operations, shifts are constant-time in hardware.

Constantine has `SecretWord` and `SecretBool` types that reimplements basic primitives
with side-channel resistance in mind, though smart compilers may unravel them and reintroduce branches. Constantine provides primitives like `cadd`, `csub`, `cneg`, `ccopy` (conditional add, sub, neg, copy) and `isZero`, `isMsbSet` that allow simulating conditional execution without branches. Check `constantine/platforms/constant_time/ct_routines.nim`

Hence a significant portion of low-level routines has an assembly path (x86-64 and ARM64) to avoid compiler-introduced side-channels like in https://www.cl.cam.ac.uk/~rja14/Papers/whatyouc.pdf

More information in the wiki: https://github.com/mratsim/constantine/wiki/Constant-time-arithmetics

Some primitives may have a variable time optimization (for example field inversion or elliptic curve scalar multiplication) or may be variable-time only (for example multi-scalar multiplication), in that case they are suffixed `_vartime`.

We use Nim's effect tracking `{.tags:[VarTime].}` so we can ensure that in protocols that handle secrets it is a compile-time error to have VarTime anywhere in the call stack via `{.tags:[].}`.

High-level protocols or subroutines for high-level protocols do not need this `_vartime` suffix hence this mostly concerns bigint, fields, elliptic curves and polynomial arithmetic.

Another source of side-channel attacks is table access. A non-uniform table access can be exploited by cache attacks like Colin Percival attack on Hyperthreading.
A lookup table access can be simulated via `secretLookup` to ensure the whole table is uniformly accessed, or `ccopy` for non-trivial indexing patterns like for exponentiation.


## Code organization

TBD
Loading