Crabka is a Rust reimplementation of Apache Kafka.
It speaks the Apache Kafka wire protocol byte-for-byte (targeting the 4.3.0
message schemas), stores data in Kafka-compatible log segments, runs its metadata
quorum on KRaft, and integrates cleanly with the standard JVM tooling —
kafka-topics.sh, kafka-configs.sh, kafka-acls.sh, kafka-consumer-groups.sh,
kafka-leader-election.sh, kafka-reassign-partitions.sh, and the official Java
client. Existing producers, consumers, and operator workflows work against a
Crabka broker without modification.
Beyond the broker, Crabka ships native Rust clients, a KIP-1071 Streams client,
a Schema Registry-compatible service, a gRPC / Connect-RPC gateway, a Kubernetes
operator (Strimzi-equivalent), and a Cruise-Control-equivalent partition
rebalancer.
Distributed under the Apache License 2.0 as a derivative work.
Why Crabka
Drop-in protocol compatibility. Crabka is validated against the JVM Kafka
client via differential byte-equality tests: every encode/decode is checked
against kafka-clients 4.3.0, and a JVM acceptance suite drives the official
cp-kafka admin tools against a live Crabka broker.
Memory-safe, fearlessly concurrent. Written in async Rust on tokio, with
no JVM and no GC pauses. unsafe_code = "forbid" across the workspace.
Single static binary. No JDK, no ZooKeeper, no separate controller process.
KRaft-native. Metadata lives in a native KRaft quorum speaking the real
KIP-595 wire protocol, interoperable with JVM controllers in a mixed quorum.
Modern crypto. TLS via rustls; SASL/SCRAM-SHA-256/512, SASL/PLAIN,
SASL/OAUTHBEARER (signed-JWT / JWKS), and SASL/GSSAPI (Kerberos) out of the box.
Batteries included. Native producer/consumer/admin/streams clients, Schema
Registry, a gateway, a Kubernetes operator, and an automated rebalancer live in
the same workspace.
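The differential byte-equality idea can be illustrated with a small std-only sketch. It is not Crabka's actual test harness: the function names `encode_api_versions_v0` and `first_divergence` are hypothetical, and the golden bytes are derived by hand from the published Kafka request-header layout (ApiVersions v0, header v1) rather than captured from kafka-clients.

```rust
/// Encode a Kafka ApiVersions v0 request, including the 4-byte length prefix.
/// Layout (request header v1, empty body): size:int32, api_key:int16,
/// api_version:int16, correlation_id:int32, client_id:nullable string.
/// All integers are big-endian. (Hypothetical helper, for illustration only.)
fn encode_api_versions_v0(correlation_id: i32, client_id: &str) -> Vec<u8> {
    let mut body = Vec::new();
    body.extend_from_slice(&18i16.to_be_bytes()); // api_key 18 = ApiVersions
    body.extend_from_slice(&0i16.to_be_bytes()); // api_version 0
    body.extend_from_slice(&correlation_id.to_be_bytes());
    body.extend_from_slice(&(client_id.len() as i16).to_be_bytes());
    body.extend_from_slice(client_id.as_bytes());

    let mut frame = Vec::with_capacity(4 + body.len());
    frame.extend_from_slice(&(body.len() as i32).to_be_bytes());
    frame.extend_from_slice(&body);
    frame
}

/// Return the offset of the first differing byte, or None if the two
/// buffers are byte-equal. A length mismatch counts as a divergence at
/// the end of the shorter buffer.
fn first_divergence(actual: &[u8], expected: &[u8]) -> Option<usize> {
    let common = actual.len().min(expected.len());
    (0..common)
        .find(|&i| actual[i] != expected[i])
        .or(if actual.len() != expected.len() { Some(common) } else { None })
}

fn main() {
    // Hand-derived expected bytes for correlation_id = 7, client_id = "crab".
    // In a real differential test these would come from the reference encoder.
    let golden: [u8; 18] = [
        0x00, 0x00, 0x00, 0x0E, // size = 14
        0x00, 0x12, // api_key = 18
        0x00, 0x00, // api_version = 0
        0x00, 0x00, 0x00, 0x07, // correlation_id = 7
        0x00, 0x04, // client_id length = 4
        0x63, 0x72, 0x61, 0x62, // "crab"
    ];
    let actual = encode_api_versions_v0(7, "crab");
    match first_divergence(&actual, &golden) {
        None => println!("byte-equal ({} bytes)", actual.len()),
        Some(i) => println!("diverges at byte {i}"),
    }
}
```

Reporting the first diverging offset, rather than a plain pass/fail, makes it easier to map a mismatch back to a specific field in the message schema.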
Performance
On a single 4-vCPU box, with the same Rust load driver exercising both stacks
over the Kafka wire protocol, Crabka matches Apache Kafka 4.3's
produce-and-consume throughput within a few percent and is ahead on the 1 KiB
workloads. It does so with a resident footprint of 24–32 MiB versus Kafka's
~1 GiB JVM heap (32–43× lighter), while sustaining 1.15–1.2× more messages per
CPU core, tighter tail latency, and a 1–2 s cold start versus 8–9 s.
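The 32–43× figure follows directly from the quoted footprints. A minimal worked check, assuming "~1 GiB" is taken as exactly 1024 MiB (the function name `times_lighter` is illustrative only):

```rust
/// Ratio of a baseline footprint to a measured footprint, both in MiB.
fn times_lighter(baseline_mib: f64, measured_mib: f64) -> f64 {
    baseline_mib / measured_mib
}

fn main() {
    // Assumption: "~1 GiB" is treated as exactly 1024 MiB.
    let kafka_heap_mib = 1024.0;
    // Crabka's quoted resident range.
    let (crabka_low_mib, crabka_high_mib) = (24.0, 32.0);

    let lower = times_lighter(kafka_heap_mib, crabka_high_mib); // 1024 / 32
    let upper = times_lighter(kafka_heap_mib, crabka_low_mib); // 1024 / 24
    println!("{lower:.1}x to {upper:.1}x lighter");
}
```

The upper bound is about 42.7, which rounds to the quoted 43×.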
Crabka is in beta (v0.3.2). The Kafka-parity surface — wire protocol,
storage, replication, KRaft metadata, security, authorization, quotas, Schema
Registry, gateway, Kubernetes operator, and rebalancer — is now broad enough,
and validated hard enough against the JVM, that the project has matured out of
its alpha phase.
It remains greenfield and pre-1.0 — undeployed, with no production users and
no on-disk compatibility guarantees yet. The Kafka wire protocol is the contract
that matters, and it is locked to byte-exactness via the differential oracle and
JVM acceptance tests. Treat Crabka as beta: ready for evaluation and non-critical
workloads, not yet hardened by production mileage.
What works today:
Single-broker and multi-broker clusters with KRaft metadata (including Raft
snapshots per KIP-630, dynamic quorum reconfiguration per KIP-853, and separate
process.roles controller-only / broker-only nodes with observer metadata
fetch), replication, ISR maintenance, leader election (including offset-aware
unclean recovery, KIP-966 / KIP-841), fetch-from-follower / rack-aware reads
(KIP-392), partition reassignment, and JBOD (multi-log-dir) with intra-broker
log-dir reassignment (AlterReplicaLogDirs, KIP-113).
Idempotent and transactional produce/consume (exactly-once), consumer groups
with both the classic (eager) and cooperative-incremental (KIP-429) rebalance
protocols, static membership (KIP-345), incremental fetch sessions (KIP-227),
and log compaction.
Share groups / queues (KIP-932): the ShareGroupHeartbeat / ShareFetch /
ShareAcknowledge RPCs, the share-state coordinator and __share_group_state
topic, the share-group admin offset APIs, and a native share consumer —
validated against the JVM share-group client.
Streams rebalance protocol (KIP-1071): broker-side task assignment for Kafka
Streams groups — the StreamsGroupHeartbeat / StreamsGroupDescribe RPCs,
topology ingestion with internal repartition/changelog topic auto-creation,
active/standby/warmup task assignment via the highly-available assignor with
changelog catch-up, __consumer_offsets persistence, and the streams.version
feature gate. Live classic↔streams group migration is not yet wired.
TLS / mTLS, SASL (PLAIN, SCRAM-SHA-256/512, OAUTHBEARER with JWKS / signed-JWT
and opaque-token introspection, GSSAPI/Kerberos), SASL re-authentication
(KIP-368), delegation tokens (KIP-48 / KIP-373), ACL authorization, the OPA
cluster-authorizer bridge, and the full client-quota surface.
Native Rust producer / consumer / admin clients plus a KIP-1071 Streams client
with a DSL/runtime for common stream-processing workloads.
A Schema Registry-compatible REST service, a gRPC / Connect-RPC + HTTP gateway,
a Kubernetes operator (Strimzi-equivalent CRDs, including tiered-storage and
Schema Registry surfaces), and a Cruise-Control-equivalent rebalancer.
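Of the features listed above, log compaction is easy to show in miniature. The sketch below illustrates the general Kafka compaction rule (keep the newest record per key) and is not Crabka's cleaner; the `Record` type and the `rec` and `compact` helpers are hypothetical. It also drops tombstones immediately, whereas a broker retains them for a configurable period before removal.

```rust
use std::collections::HashMap;

/// A simplified log record. `value: None` models a tombstone (delete marker).
#[derive(Debug, Clone, PartialEq)]
struct Record {
    offset: u64,
    key: String,
    value: Option<String>,
}

/// Convenience constructor used by the example below.
fn rec(offset: u64, key: &str, value: Option<&str>) -> Record {
    Record {
        offset,
        key: key.to_string(),
        value: value.map(str::to_string),
    }
}

/// Keep only the newest record per key, preserving offset order.
/// Assumes `log` is sorted by ascending offset. Keys whose newest record
/// is a tombstone are removed entirely (a simplification).
fn compact(log: &[Record]) -> Vec<Record> {
    // Pass 1: remember the offset of the last record seen for each key.
    let mut latest: HashMap<&str, u64> = HashMap::new();
    for r in log {
        latest.insert(r.key.as_str(), r.offset);
    }
    // Pass 2: retain a record only if it is the newest for its key
    // and is not a tombstone.
    log.iter()
        .filter(|r| latest.get(r.key.as_str()) == Some(&r.offset) && r.value.is_some())
        .cloned()
        .collect()
}

fn main() {
    let log = vec![
        rec(0, "k1", Some("a")),
        rec(1, "k2", Some("b")),
        rec(2, "k1", Some("c")), // supersedes offset 0
        rec(3, "k2", None),      // tombstone: deletes k2
    ];
    for r in compact(&log) {
        println!("{} {} {:?}", r.offset, r.key, r.value);
    }
}
```

Surviving records keep their original offsets, which is why a compacted log can have gaps in its offset sequence.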
Feature-area status and notable gaps (see the KIP matrix for detail):
Next-gen consumer group protocol (KIP-848): fully implemented.
__consumer_offsets persistence, a rack-aware UniformAssignor, the pluggable
server-side assignor surface, and subscribed_topic_regex are all in tree. Live
bidirectional classic↔next-gen group migration is wired: in-place upgrade
(classic→consumer on ConsumerGroupHeartbeat), downgrade (consumer→classic when
the last native consumer member leaves), and hosted classic members served
through the unified coordinator, with the transition governed by
group.consumer.migration.policy (default: bidirectional). This is JVM-validated
with a classic cp-kafka consumer and an apache/kafka:4.0.0 consumer-protocol
consumer co-existing in the same group with a coherent cross-protocol
assignment.
Tiered storage (KIP-405): fully wired. The topic-backed
RemoteLogMetadataManager (durable __remote_log_metadata internal topic) is the
default RLMM whenever tiered storage is enabled; in-memory metadata is an
explicit opt-out for in-process tests only. Copy/read/retention, metadata-topic
(RLMM) snapshots, dynamic per-broker metadata-partition assignment, and
TLS/SASL on the metadata client are all in tree. JVM interoperability is
validated via a single-broker restart-durability test (MinIO/S3) and an
in-process multi-broker failover test that proves a survivor broker can serve
remote reads from metadata it consumed off __remote_log_metadata after leader
failover. The __remote_log_metadata record format is byte-exact with the JVM's
RemoteLogMetadataSerde (AbstractApiMessageSerde envelope + flexible message
bodies, verified against apache/kafka:4.0.0 golden vectors), so a mixed
JVM+Crabka cluster can share the internal metadata topic. Full segment-data
interop additionally requires a shared RemoteStorageManager layout and
producer-snapshot conventions, which are not yet validated against the JVM, so
segment-level mixing is not claimed.
Streams (KIP-1071): the broker-side Streams rebalance protocol is implemented
and serves real JVM Streams-group admin clients, and crabka-client-streams
provides a Rust Streams client/runtime, but it is still not a full JVM Kafka
Streams replacement.
Kafka Connect and MirrorMaker equivalents are not yet implemented.
ZooKeeper mode and ZK→KRaft migration are deliberately out of scope: Crabka is
KRaft-only.
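The role of group.consumer.migration.policy in gating classic↔next-gen conversion can be sketched as a small decision table. This is an illustration, not Crabka's coordinator code: the `MigrationPolicy` type and its methods are hypothetical, and the four value names follow Apache Kafka's documented setting. Whether Crabka accepts exactly this set, or parses it case-insensitively, is an assumption.

```rust
/// Hypothetical model of the group.consumer.migration.policy setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MigrationPolicy {
    Bidirectional,
    Upgrade,
    Downgrade,
    Disabled,
}

impl MigrationPolicy {
    /// Parse a config value; case-insensitive matching is an assumption.
    fn parse(s: &str) -> Option<Self> {
        match s.to_ascii_lowercase().as_str() {
            "bidirectional" => Some(Self::Bidirectional),
            "upgrade" => Some(Self::Upgrade),
            "downgrade" => Some(Self::Downgrade),
            "disabled" => Some(Self::Disabled),
            _ => None,
        }
    }

    /// May a classic group be converted in place to a consumer group?
    fn allows_upgrade(self) -> bool {
        matches!(self, Self::Bidirectional | Self::Upgrade)
    }

    /// May a consumer group fall back to classic once the last
    /// consumer-protocol member has left?
    fn allows_downgrade(self) -> bool {
        matches!(self, Self::Bidirectional | Self::Downgrade)
    }
}

fn main() {
    for value in ["bidirectional", "upgrade", "downgrade", "disabled"] {
        let policy = MigrationPolicy::parse(value).expect("known value");
        println!(
            "{value}: upgrade={} downgrade={}",
            policy.allows_upgrade(),
            policy.allows_downgrade()
        );
    }
}
```

With the default of bidirectional, both conversions are permitted, which matches the upgrade and downgrade paths described above.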
Feature compatibility
The following tables list Apache Kafka functional surface area and whether Crabka
implements it today. Legend: ✅ implemented · ⚠️ partial · ❌ not yet · ⛔ out of scope.
This matrix tracks the significant Kafka Improvement Proposals that define
Apache Kafka's protocol and feature surface, with Crabka's status for each. It
is not exhaustive of every accepted KIP (Kafka has well over a thousand) — it
covers the user-visible protocol, storage, replication, security, and operations
KIPs. Legend: ✅ implemented · ⚠️ partial · ❌ not yet · ⛔ out of scope.