⚡ FerrumDB

A high-performance, embedded document database written from scratch in Rust.
No server. No config files. No migrations. Open a file and go.

What is FerrumDB?

FerrumDB is an embedded key-value database engine for JSON documents, built in Rust and designed for applications that need fast local persistence without the overhead of a server process. It is inspired by Bitcask and implements a custom binary log format, in-memory indexing, AES-256-GCM encryption at rest, atomic transactions, and a live web dashboard, all in ~1,000 lines of safe, async Rust.

It ships Python bindings via PyO3 (pip install ferrumdb) and Node.js bindings via NAPI-RS (npm install ferrumdb).

🌟 Features

| Feature | Detail |
| --- | --- |
| ⚡ O(1) reads & writes | Append-only log + in-memory `HashMap` index rebuilt on startup |
| 📄 Native JSON documents | Store any structured data; values are `serde_json::Value` |
| 🔍 Secondary indexing | O(1) field lookups via `create_index()`, maintained live on writes |
| 🔐 AES-256-GCM encryption | Per-block encryption with random nonces; data is protected at rest |
| ⚛️ Atomic transactions | All-or-nothing batches written as a single log entry |
| ⏱️ Configurable fsync policy | `Always` / `Periodic(ms)` / `Never`: tune durability vs. throughput |
| 🖥️ Ferrum Studio | Built-in web dashboard (Axum) at `localhost:7474` |
| 🐍 Python bindings | `pip install ferrumdb`, no Rust toolchain required |
| 🛡️ Crash resilience | Log compaction via atomic `rename()`; incomplete records are skipped |
| 📊 Observability | Lock-free atomic metrics: ops/sec, uptime, GET/SET/DELETE counts |
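
The observability row can be pictured with a few `AtomicU64` counters. The following is a minimal sketch, not FerrumDB's actual metrics code: the `Metrics` struct, its field names, and its methods are assumptions made for illustration. It shows why counting operations needs no lock: `fetch_add` works through a shared reference.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Instant;

/// Hypothetical counter set; names are illustrative, not FerrumDB's.
struct Metrics {
    gets: AtomicU64,
    sets: AtomicU64,
    deletes: AtomicU64,
    started: Instant,
}

impl Metrics {
    fn new() -> Self {
        Metrics {
            gets: AtomicU64::new(0),
            sets: AtomicU64::new(0),
            deletes: AtomicU64::new(0),
            started: Instant::now(),
        }
    }

    // Each recorder takes `&self`: many threads can count without a lock.
    fn record_get(&self) {
        self.gets.fetch_add(1, Ordering::Relaxed);
    }

    fn record_set(&self) {
        self.sets.fetch_add(1, Ordering::Relaxed);
    }

    fn record_delete(&self) {
        self.deletes.fetch_add(1, Ordering::Relaxed);
    }

    fn total_ops(&self) -> u64 {
        self.gets.load(Ordering::Relaxed)
            + self.sets.load(Ordering::Relaxed)
            + self.deletes.load(Ordering::Relaxed)
    }

    /// Average operations per second since `new()` was called.
    fn ops_per_sec(&self) -> f64 {
        let secs = self.started.elapsed().as_secs_f64();
        if secs > 0.0 {
            self.total_ops() as f64 / secs
        } else {
            0.0
        }
    }
}
```

`Ordering::Relaxed` is enough here because each counter is independent and only its own total matters.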

🏗️ Architecture

FerrumDB was built ground-up without using an existing storage library. Every layer is custom:

┌─────────────────────────────────────────┐
│                FerrumDB API              │  ← High-level Rust & Python interface
├─────────────────────────────────────────┤
│             StorageEngine               │  ← Core engine: index + log management
│  ┌─────────────────┐  ┌──────────────┐  │
│  │  In-Memory Index │  │ Secondary    │  │
│  │  HashMap<K,V>   │  │ Indexes      │  │
│  │  RwLock async   │  │ HashMap<F,V> │  │
│  └────────┬────────┘  └──────────────┘  │
│           │ append / reads              │
│  ┌────────▼────────────────────────┐    │
│  │   Append-Only Log (AOF)         │    │  ← Bitcask-inspired binary format
│  │   [len: u64][JSON bytes]...     │    │     length-prefixed, sequential
│  └────────┬────────────────────────┘    │
├───────────┼─────────────────────────────┤
│  ┌────────▼────────────────────────┐    │
│  │  AsyncFileSystem trait          │    │  ← Pluggable I/O abstraction
│  │  ┌──────────┐  ┌─────────────┐  │    │
│  │  │   Disk   │  │  Encrypted  │  │    │  ← Decorator pattern
│  │  │  (tokio) │  │  (AES-GCM)  │  │    │     random nonce per block
│  │  └──────────┘  └─────────────┘  │    │
│  └─────────────────────────────────┘    │
└─────────────────────────────────────────┘
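
The bottom layer of the diagram can be sketched as a trait plus a wrapper type. This is a simplified, synchronous illustration only: the real trait is described as async, and the names `FileSystem`, `MemoryFileSystem`, `XorFileSystem`, and both method signatures are hypothetical. The XOR step is a placeholder marking where AES-256-GCM would sit; it provides no security.

```rust
use std::collections::HashMap;
use std::io;

/// Synchronous stand-in for the `AsyncFileSystem` trait in the diagram.
/// Method names are assumptions made for this sketch.
trait FileSystem {
    fn append(&mut self, path: &str, data: &[u8]) -> io::Result<()>;
    fn read_all(&self, path: &str) -> io::Result<Vec<u8>>;
}

/// In-memory backend: lets engine logic run in tests without touching disk.
#[derive(Default)]
struct MemoryFileSystem {
    files: HashMap<String, Vec<u8>>,
}

impl FileSystem for MemoryFileSystem {
    fn append(&mut self, path: &str, data: &[u8]) -> io::Result<()> {
        self.files
            .entry(path.to_string())
            .or_default()
            .extend_from_slice(data);
        Ok(())
    }

    fn read_all(&self, path: &str) -> io::Result<Vec<u8>> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

/// Decorator: wraps any backend and transforms bytes on the way in and out.
/// XOR is NOT encryption; it only marks where a real cipher would go.
struct XorFileSystem<F: FileSystem> {
    inner: F,
    key: u8,
}

impl<F: FileSystem> FileSystem for XorFileSystem<F> {
    fn append(&mut self, path: &str, data: &[u8]) -> io::Result<()> {
        let masked: Vec<u8> = data.iter().map(|b| b ^ self.key).collect();
        self.inner.append(path, &masked)
    }

    fn read_all(&self, path: &str) -> io::Result<Vec<u8>> {
        let raw = self.inner.read_all(path)?;
        Ok(raw.iter().map(|b| b ^ self.key).collect())
    }
}
```

Code written against the trait cannot tell whether it is talking to the plain backend or the wrapped one, which is the point of the decorator arrangement.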

Key design decisions:

Bitcask AOF: Writes are append-only (fast, sequential I/O). The in-memory index is the source of truth for reads. On startup, the engine replays the log to rebuild state, making recovery deterministic and crash-safe.
Pluggable AsyncFileSystem trait: The I/O layer is fully abstracted. DiskFileSystem and EncryptedFileSystem implement the same trait, with encryption layered on via the decorator pattern. Because the storage engine depends only on the trait, it can be tested without touching disk.
AES-256-GCM per block: Each binary record is individually encrypted with a cryptographically random 12-byte nonce. The nonce is stored alongside the ciphertext. GCM authentication tags detect tampering with the encrypted records.
Tokio async throughout: Reads take the RwLock in shared mode (many concurrent readers); writes are serialized through the write lock. Metrics use AtomicU64, so counters add no lock contention on the hot path.
Log compaction: A background compact() rewrites only live (non-expired, non-deleted) records to a temp file, then swaps it in via rename(). The rename is atomic on POSIX file systems, so a crash during compaction leaves either the old log or the new one in place.
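
The replay step described under "Bitcask AOF" can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not FerrumDB's code: the length prefix is taken to be a little-endian `u64` as in the diagram, but the payload layout (key, a NUL byte, then the value, with an empty value meaning "delete") and the function names are invented for this example.

```rust
use std::collections::HashMap;

/// Append one record as `[len: u64 little-endian][payload]`.
/// Hypothetical payload layout: key, NUL byte, value. `None` marks a delete.
fn append_record(log: &mut Vec<u8>, key: &str, value: Option<&str>) {
    let mut payload = Vec::new();
    payload.extend_from_slice(key.as_bytes());
    payload.push(0);
    if let Some(v) = value {
        payload.extend_from_slice(v.as_bytes());
    }
    log.extend_from_slice(&(payload.len() as u64).to_le_bytes());
    log.extend_from_slice(&payload);
}

/// Replay the log from the start and rebuild the key -> value map.
/// A record cut short (for example by a crash mid-write) ends the replay.
fn replay(log: &[u8]) -> HashMap<String, String> {
    let mut index = HashMap::new();
    let mut pos = 0usize;
    while pos + 8 <= log.len() {
        let mut len_bytes = [0u8; 8];
        len_bytes.copy_from_slice(&log[pos..pos + 8]);
        let len = u64::from_le_bytes(len_bytes) as usize;
        let start = pos + 8;
        let end = match start.checked_add(len) {
            Some(e) if e <= log.len() => e,
            _ => break, // incomplete trailing record: stop here
        };
        let payload = &log[start..end];
        if let Some(sep) = payload.iter().position(|b| *b == 0) {
            let key = String::from_utf8_lossy(&payload[..sep]).into_owned();
            let value = &payload[sep + 1..];
            if value.is_empty() {
                index.remove(&key); // tombstone
            } else {
                index.insert(key, String::from_utf8_lossy(value).into_owned());
            }
        }
        pos = end;
    }
    index
}
```

Later records win over earlier ones for the same key, which is why replaying from the start always reproduces the latest state.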

⚙️ Technical Stack

| Component | Technology |
| --- | --- |
| Language | Rust (2021 edition) |
| Async runtime | Tokio |
| Serialization | serde + serde_json |
| Encryption | aes-gcm (AES-256-GCM) |
| Web dashboard | Axum |
| Python bindings | PyO3 (via maturin) |
| Benchmarking | Criterion |
| Testing | tokio::test + tempfile |

📊 Performance

Benchmarked with Criterion on an append-only log with FsyncPolicy::Never (max throughput):

| Operation | Performance |
| --- | --- |
| Single `SET` | ~1–3 µs |
| Single `GET` (in-memory) | < 1 µs |
| 1,000 sequential `SET`s | ~2–5 ms |
| 100 concurrent `SET`s (Tokio tasks) | ~3–8 ms |
| Secondary index query (100 docs) | < 1 µs |

Run benchmarks yourself: cargo bench

🐍 Python Installation & Usage

FerrumDB is available on PyPI. Install it using pip:

pip install ferrumdb

from ferrumdb import FerrumDB

# Zero-setup: creates myapp.db if it doesn't exist
db = FerrumDB.open("myapp.db")

# Store documents; values are passed as JSON-encoded strings
db.set("user:1", '{"name": "alice", "role": "admin", "score": 99}')
db.set("user:2", '{"name": "bob",   "role": "user",  "score": 45}')

# Read back
print(db.get("user:1"))       # {"name": "alice", "role": "admin", "score": 99}
print(db.count())             # 2
print(db.keys())              # ["user:1", "user:2"]

# Secondary indexing — O(1) field lookups
db.create_index("role")
admins = db.find("role", '"admin"')   # => ["user:1"]

# Delete
db.delete("user:2")
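
As a rough picture of what `create_index` and `find` do, here is a minimal sketch in Rust, the language FerrumDB is written in. It is not the library's implementation: `Store`, `Doc`, and the helper `doc` are hypothetical, and documents are flattened to string fields so the example needs no JSON crate. It shows the idea of a map from field value to the set of keys holding that value, updated on every write.

```rust
use std::collections::{HashMap, HashSet};

/// Flat document: top-level field name -> value as a string (an assumption;
/// FerrumDB stores `serde_json::Value`).
type Doc = HashMap<String, String>;

/// Small helper to build a `Doc` from literal pairs.
fn doc(pairs: &[(&str, &str)]) -> Doc {
    pairs.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect()
}

#[derive(Default)]
struct Store {
    docs: HashMap<String, Doc>,
    /// field -> (field value -> keys of documents holding that value)
    indexes: HashMap<String, HashMap<String, HashSet<String>>>,
}

impl Store {
    /// Build an index for `field` from the documents already stored.
    fn create_index(&mut self, field: &str) {
        let mut by_value: HashMap<String, HashSet<String>> = HashMap::new();
        for (key, d) in &self.docs {
            if let Some(v) = d.get(field) {
                by_value.entry(v.clone()).or_default().insert(key.clone());
            }
        }
        self.indexes.insert(field.to_string(), by_value);
    }

    /// Insert or overwrite a document and keep every existing index in step.
    fn set(&mut self, key: &str, d: Doc) {
        let old = self.docs.insert(key.to_string(), d.clone());
        for (field, by_value) in self.indexes.iter_mut() {
            // Drop the stale entry left by the previous version, if any.
            if let Some(old_v) = old.as_ref().and_then(|o| o.get(field)) {
                if let Some(keys) = by_value.get_mut(old_v) {
                    keys.remove(key);
                }
            }
            if let Some(new_v) = d.get(field) {
                by_value.entry(new_v.clone()).or_default().insert(key.to_string());
            }
        }
    }

    /// Exact-match lookup via two hash probes; results sorted for stable output.
    fn find(&self, field: &str, value: &str) -> Vec<String> {
        let mut keys = self
            .indexes
            .get(field)
            .and_then(|m| m.get(value))
            .map(|s| s.iter().cloned().collect::<Vec<String>>())
            .unwrap_or_default();
        keys.sort();
        keys
    }
}
```

Because the index is keyed by exact value, this structure answers equality queries only, which matches the "no range queries" limitation listed later.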

🦀 Rust Installation & Usage

FerrumDB is available on crates.io. Add it to your project:

cargo add ferrumdb
cargo add tokio -F full
cargo add serde_json

Or manually add to your Cargo.toml:

[dependencies]
ferrumdb = "0.1.1"
tokio = { version = "1", features = ["full"] }
serde_json = "1"

use ferrumdb::{FerrumDB, Config, Transaction, FsyncPolicy};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Standard open (zero-setup, uses ferrum.db)
    let db = FerrumDB::open_default().await?;

    // Store documents
    db.set("user:1".into(), json!({"name": "alice", "role": "admin"})).await?;

    // Secondary index query
    db.create_index("role").await?;
    let admins = db.find("role", &json!("admin")).await;

    // Atomic transaction
    let tx = Transaction::new()
        .set("k1".into(), json!({"tag": "blue"}))
        .set("k2".into(), json!({"tag": "red"}))
        .delete("k1".into());
    db.commit(tx).await?;

    // Encrypted database (AES-256-GCM, random nonce per block)
    let key: [u8; 32] = *b"my_super_secret_key_32_bytes_!!?";
    let db_enc = FerrumDB::open(
        Config::new()
            .with_encryption(key)
            .with_fsync_policy(FsyncPolicy::Periodic(std::time::Duration::from_millis(100)))
    ).await?;

    Ok(())
}
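
The "all-or-nothing" behaviour of a transaction follows from writing the whole batch as one length-prefixed record. The sketch below illustrates that idea under stated assumptions: `Op`, `encode_batch`, and `replay_batches` are hypothetical names, and the tab-separated payload is a stand-in for the real record body (it requires keys and values without tabs or newlines).

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Op {
    Set(String, String),
    Delete(String),
}

/// Encode a whole batch as ONE record: `[len: u64][op lines]`.
/// Hypothetical payload: "S\tkey\tvalue\n" for a set, "D\tkey\n" for a delete.
fn encode_batch(ops: &[Op]) -> Vec<u8> {
    let mut payload = String::new();
    for op in ops {
        match op {
            Op::Set(k, v) => payload.push_str(&format!("S\t{}\t{}\n", k, v)),
            Op::Delete(k) => payload.push_str(&format!("D\t{}\n", k)),
        }
    }
    let mut record = (payload.len() as u64).to_le_bytes().to_vec();
    record.extend_from_slice(payload.as_bytes());
    record
}

/// Apply every complete record in `log`. A record that was cut short is
/// ignored as a whole, so a batch is never half-applied.
fn replay_batches(log: &[u8]) -> HashMap<String, String> {
    let mut state = HashMap::new();
    let mut pos = 0usize;
    while pos + 8 <= log.len() {
        let mut len_bytes = [0u8; 8];
        len_bytes.copy_from_slice(&log[pos..pos + 8]);
        let len = u64::from_le_bytes(len_bytes) as usize;
        let end = match (pos + 8).checked_add(len) {
            Some(e) if e <= log.len() => e,
            _ => break, // torn record: stop without applying any of it
        };
        let payload = String::from_utf8_lossy(&log[pos + 8..end]);
        for line in payload.lines() {
            let parts: Vec<&str> = line.splitn(3, '\t').collect();
            match parts.as_slice() {
                ["S", k, v] => {
                    state.insert(k.to_string(), v.to_string());
                }
                ["D", k] => {
                    state.remove(*k);
                }
                _ => {}
            }
        }
        pos = end;
    }
    state
}
```

A production format would normally also carry a checksum or authentication tag per record so that corrupted (not just truncated) records are detected; this sketch checks length only.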

🖥️ Ferrum Studio

Ferrum Studio is a built-in web dashboard to browse, query, and inspect your database with real-time metrics.

Option 1 — Via the REPL (auto-launches when you cargo run):

cargo run --release
# 🔥 Ferrum Studio → http://localhost:7474

Option 2 — Standalone CLI (works with any .db file, any language):

cargo install ferrumdb-cli
ferrumdb web myapp.db              # opens http://localhost:7474
ferrumdb web myapp.db --port 8080  # custom port
ferrumdb info myapp.db             # show key count & file size
ferrumdb compact myapp.db          # remove deleted/expired entries

The CLI works regardless of whether you use the Rust, Python, or Node.js bindings — just point it at your .db file.

🖥️ CLI REPL

cargo run
cargo run -- --fsync=always   # strongest durability

| Command | Description |
| --- | --- |
| `SET <key> <json>` | Store a document |
| `GET <key>` | Retrieve and pretty-print |
| `DELETE <key>` | Remove a key |
| `KEYS` | List all keys |
| `COUNT` | Total number of entries |
| `INDEX <field>` | Create secondary index on JSON field |
| `FIND <field> <value>` | Query by indexed field |
| `HELP` | Show commands + live session metrics |
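
For readers curious how a command table like this maps onto code, the following is a small parsing sketch. It is not the REPL's actual parser: the `Command` enum and `parse` function are hypothetical, and accepting lowercase verbs is an assumption. The first word is the verb, and for `SET` and `FIND` everything after the second word is kept intact so JSON containing spaces survives.

```rust
#[derive(Debug, PartialEq)]
enum Command {
    Set { key: String, json: String },
    Get(String),
    Delete(String),
    Keys,
    Count,
    Index(String),
    Find { field: String, value: String },
    Help,
}

/// Parse one REPL line into a `Command`, or return a usage/error message.
fn parse(line: &str) -> Result<Command, String> {
    let line = line.trim();
    let (verb, rest) = match line.split_once(char::is_whitespace) {
        Some((v, r)) => (v, r.trim()),
        None => (line, ""),
    };

    // Two arguments: first word, then the remainder (may contain spaces).
    let two = |usage: &str| -> Result<(String, String), String> {
        match rest.split_once(char::is_whitespace) {
            Some((a, b)) if !b.trim().is_empty() => {
                Ok((a.to_string(), b.trim().to_string()))
            }
            _ => Err(format!("usage: {}", usage)),
        }
    };
    // Exactly one argument with no embedded whitespace.
    let one = |usage: &str| -> Result<String, String> {
        if rest.is_empty() || rest.contains(char::is_whitespace) {
            Err(format!("usage: {}", usage))
        } else {
            Ok(rest.to_string())
        }
    };

    let verb = verb.to_ascii_uppercase();
    let parsed = match verb.as_str() {
        "SET" => two("SET <key> <json>").map(|(key, json)| Command::Set { key, json }),
        "GET" => one("GET <key>").map(Command::Get),
        "DELETE" => one("DELETE <key>").map(Command::Delete),
        "KEYS" => Ok(Command::Keys),
        "COUNT" => Ok(Command::Count),
        "INDEX" => one("INDEX <field>").map(Command::Index),
        "FIND" => two("FIND <field> <value>")
            .map(|(field, value)| Command::Find { field, value }),
        "HELP" => Ok(Command::Help),
        other => Err(format!("unknown command: {}", other)),
    };
    parsed
}
```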

📂 Examples

Full working examples for each language are in the examples/ directory:

| Example | Language | Description | Run |
| --- | --- | --- | --- |
| rust-example | Rust | Task Manager: CRUD, secondary indexes, transactions, TTL | `cd examples/rust-example && cargo run` |
| python-example | Python | Contact Book: CRUD, secondary indexes, transactions | `cd examples/python-example && python main.py` |
| node-example | Node.js | Note Taker: CRUD, secondary indexes, transactions | `cd examples/node-example && node main.mjs` |

Each example is self-contained and demonstrates the core FerrumDB API in its respective language.

⚠️ Known Limitations

FerrumDB optimizes for simplicity and embedded use cases. Understand the trade-offs:

| Limitation | Reason | Workaround |
| --- | --- | --- |
| Entire index in RAM | O(1) reads require the full `HashMap` in memory | Keep databases under about 1 GB |
| Single-writer only | The append-only log has no cross-process lock protocol | Use one process per DB file |
| No range queries | Secondary indexes store exact value matches | Use Tantivy for range scans |
| No nested field indexes | Only top-level JSON keys are indexed | Flatten documents before storing |
| Blocking compaction | Rewrites the entire log while holding the write lock | Schedule during low-traffic periods |
| No WAL / MVCC | Simpler append-only design | Accept occasional contention |
| No replication | Single-file, embedded design | Handle replication at the application level |
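
To make the "Blocking compaction" row concrete, here is a minimal sketch of the rewrite-then-rename technique using only the standard library. It is an illustration under assumptions, not FerrumDB's `compact()`: the function signature, the `.compact.tmp` suffix, and the record payload (key, NUL byte, value) are invented for this example, and the write lock the real engine holds during the rewrite is not modelled.

```rust
use std::collections::HashMap;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Rewrite `log_path` so it holds one record per live key, then swap it in.
/// `live` stands in for the in-memory index of current values.
/// Hypothetical record layout: `[len: u64 LE][key, NUL, value]`.
fn compact(log_path: &Path, live: &HashMap<String, String>) -> io::Result<()> {
    let tmp_path = log_path.with_extension("compact.tmp");
    {
        let mut tmp = File::create(&tmp_path)?;
        for (key, value) in live {
            let mut payload = Vec::with_capacity(key.len() + 1 + value.len());
            payload.extend_from_slice(key.as_bytes());
            payload.push(0);
            payload.extend_from_slice(value.as_bytes());
            tmp.write_all(&(payload.len() as u64).to_le_bytes())?;
            tmp.write_all(&payload)?;
        }
        // Push the new file's contents to disk before it becomes visible.
        tmp.sync_all()?;
    }
    // Replace the old log in one step; readers see the old or the new file.
    fs::rename(&tmp_path, log_path)
}
```

The cost is proportional to the number of live records, which is why the table suggests scheduling compaction for quiet periods.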

Best for: local-first apps, desktop tools, embedded caching, session/config stores, write-heavy workloads.

Not for: large datasets (> 1 GB), complex queries (JOINs, aggregations), multi-writer or distributed scenarios.

Environment Config

# POSIX shells; on Windows cmd use `set NAME=value` with no trailing comment
export FERRUMDB_FSYNC=always        # sync every write (safest)
export FERRUMDB_FSYNC=never         # never sync (fastest)
export FERRUMDB_FSYNC=periodic:200  # sync every 200ms

let db = FerrumDB::open_from_env().await?;
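
The three accepted values can be parsed along these lines. This is a sketch, not the crate's parser: the enum below is a local mirror of the policies named in this README, `parse_fsync` and `fsync_from_env` are hypothetical helpers, and the fallback used when the variable is unset is an assumption, since the README does not state the default.

```rust
use std::time::Duration;

/// Local mirror of the three policies named above; not the crate's own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FsyncPolicy {
    Always,
    Periodic(Duration),
    Never,
}

/// Parse `always`, `never`, or `periodic:<milliseconds>`.
fn parse_fsync(raw: &str) -> Result<FsyncPolicy, String> {
    let raw = raw.trim().to_ascii_lowercase();
    match raw.as_str() {
        "always" => Ok(FsyncPolicy::Always),
        "never" => Ok(FsyncPolicy::Never),
        other => {
            let ms = other
                .strip_prefix("periodic:")
                .ok_or_else(|| format!("unknown fsync policy: {}", other))?;
            let ms: u64 = ms
                .parse()
                .map_err(|e| format!("bad interval {:?}: {}", ms, e))?;
            Ok(FsyncPolicy::Periodic(Duration::from_millis(ms)))
        }
    }
}

/// Read the policy from `FERRUMDB_FSYNC`.
/// Falling back to `Always` when unset is a choice made for this sketch.
fn fsync_from_env() -> Result<FsyncPolicy, String> {
    match std::env::var("FERRUMDB_FSYNC") {
        Ok(raw) => parse_fsync(&raw),
        Err(_) => Ok(FsyncPolicy::Always),
    }
}
```

Returning an error for an unrecognized value, rather than silently picking a policy, keeps a typo from weakening durability unnoticed.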

📋 Changelog

See CHANGELOG.md for a full list of changes per version.

📝 License

MIT — see LICENSE for details.

Built with 🦀 by Muhammad Usman
