An in-memory key-value database that speaks the RESP protocol. It persists data by snapshotting the dataset and storing RDB files on disk. It supports replication for durability: each replica connects to a master instance, which forwards all write commands to it. Finally, failover recovery is supported through Sentinels, which monitor master and replica instances and select a replica to become the new master when a failure is detected.
At its core, an instance is a TCP server that handles commands from its clients. Clients communicate using RESP (Redis Serialization Protocol).
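
To make the wire format concrete, here is a minimal sketch of how a command could be encoded and decoded as a RESP array of bulk strings. The function names `encode_command` and `decode_command` are hypothetical and not taken from this project. The decoder assumes one complete, well-formed command per payload.

```python
from typing import List


def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*" + str(len(args)).encode() + b"\r\n"]
    for arg in args:
        data = arg.encode("utf-8")
        # Each bulk string is "$<length>\r\n<data>\r\n".
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


def decode_command(payload: bytes) -> List[str]:
    """Decode one RESP array of bulk strings (assumes a complete payload)."""
    if not payload.startswith(b"*"):
        raise ValueError("expected a RESP array")
    end = payload.index(b"\r\n")
    count = int(payload[1:end])
    pos = end + 2
    args = []
    for _ in range(count):
        if payload[pos:pos + 1] != b"$":
            raise ValueError("expected a bulk string")
        end = payload.index(b"\r\n", pos)
        length = int(payload[pos + 1:end])
        start = end + 2
        args.append(payload[start:start + length].decode("utf-8"))
        pos = start + length + 2  # skip the CRLF that ends the bulk string
    return args


wire = encode_command("SET", "name", "Youssef")
print(wire)
print(decode_command(wire))
```

Reading the declared length instead of splitting on `\r\n` keeps the decoder correct for values that themselves contain line breaks.
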
- Master: Handles all write commands and data changes. It's the main source of truth.
- Replica: An exact copy of the master, kept for redundancy. It handles read commands only, not write commands.
- Replication ID: Identifies a given history of the dataset. An instance that starts as a Master gets a randomly generated ID, and all of its replicas inherit it.
- Offset: Tracks how many bytes of commands an instance has processed. If a Replica falls behind while the Master continues, the Replica replays the missed commands when it reconnects, until its offset is back in sync with the Master's.
How Masters and Replicas interact to ensure consistency
When a replica connects, the master expects three things from it:
- listening port
- replication ID
- offset
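
Based on these values, the handshake falls into one of two scenarios:

- Scenario 1 (different replication ID): the master performs a full resync by taking a snapshot of its current data and sending it, along with the current offset, to the replica.
- Scenario 2 (same replication ID but different offsets): the difference in offsets determines the number of missing bytes. The master sends these missing bytes to the replica to bring it back in sync.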
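
The decision the master makes during the handshake can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the `Handshake` type and `choose_sync` function are hypothetical names, a full resync is assumed whenever the replication IDs differ, and the sketch does not check whether the missing bytes are still available on the master.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Handshake:
    """The three values the replica reports (hypothetical container type)."""
    listening_port: int
    replication_id: str
    offset: int


def choose_sync(master_replid: str, master_offset: int,
                hs: Handshake) -> Tuple[str, Optional[int]]:
    """Return ("full", None) or ("partial", number_of_missing_bytes)."""
    if hs.replication_id != master_replid:
        # Different history: the replica needs a snapshot (assumed rule).
        return ("full", None)
    if hs.offset > master_offset:
        # Assumption: a replica that claims to be ahead is resynced from scratch.
        return ("full", None)
    # Same history: the offset difference is the number of missing bytes.
    return ("partial", master_offset - hs.offset)


# Offsets taken from the walkthrough: replica at 37, master at 104.
print(choose_sync("7f3ac9de", 104, Handshake(6380, "7f3ac9de", 37)))
```
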
Master | Replica
|
127.0.0.1:6377> set name Youssef | # Received 37 bytes: *3\r\n$3\r\nSET\r\n$4\r\nname\r\n$7\r\nYoussef\r\n
OK |
| 127.0.0.1:6380> get name
| "Youssef"
127.0.0.1:6377> get name |
"Youssef" |
| 127.0.0.1:6380> info
| role:slave
| replid:7f3ac9de
| offset:37 # Number of bytes processed
127.0.0.1:6377> info |
role:master |
replid:7f3ac9de |
offset:37 |
|
| **Replica goes down**
|
127.0.0.1:6377> set age 50 | (replica down)
OK |
127.0.0.1:6377> set city Cairo | (replica down)
OK |
127.0.0.1:6377> info |
role:master |
replid:7f3ac9de |
offset:104 |
| **Replica comes back online**
| Reconnects and issues a PSYNC with its last known offset '37'.
| The master replies with +CONTINUE and sends the missing data
| from offset ['37' → '104']
|
| 127.0.0.1:6380> PSYNC 7f3ac9de 37
| +CONTINUE
| (applied missing commands: set age 50, set city Cairo)
|
| 127.0.0.1:6380> info
| role:slave
| replid:7f3ac9de
| offset:104
|
| 127.0.0.1:6380> get age
| "50"
| 127.0.0.1:6380> get city
| "Cairo"
The backlog is a buffer storing the latest stream of commands.
If the handshake falls into scenario 2 (same replication ID but different offsets), the master tries to fetch the missing commands from the backlog.
- If available → the master forwards them to the replica (partial resync).
- If not → the master performs a full resync, sending a snapshot of the entire dataset.
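
One way to picture the backlog is as a fixed-capacity byte buffer that remembers which replication offset its first byte corresponds to. The sketch below is an assumption about how such a buffer could work, with a hypothetical `Backlog` class. Returning `None` from `read_from` stands for the "not available" case, where the master would fall back to a full resync.

```python
from typing import Optional


class Backlog:
    """Keeps only the most recent `capacity` bytes of the command stream."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.buffer = bytearray()
        self.start_offset = 0  # replication offset of buffer[0]

    @property
    def end_offset(self) -> int:
        return self.start_offset + len(self.buffer)

    def append(self, data: bytes) -> None:
        """Add propagated command bytes, discarding the oldest on overflow."""
        self.buffer += data
        overflow = len(self.buffer) - self.capacity
        if overflow > 0:
            del self.buffer[:overflow]
            self.start_offset += overflow

    def read_from(self, offset: int) -> Optional[bytes]:
        """Return every byte after `offset`, or None if it is out of range."""
        if offset < self.start_offset or offset > self.end_offset:
            return None  # caller falls back to a full resync
        return bytes(self.buffer[offset - self.start_offset:])


backlog = Backlog(capacity=10)
backlog.append(b"abcdef")
print(backlog.read_from(2))   # partial resync is possible
backlog.append(b"ghijkl")     # the oldest bytes are dropped
print(backlog.read_from(0))   # too old, so a full resync is required
```

The capacity is the trade-off: a larger backlog lets replicas stay disconnected longer before a full resync is needed, at the cost of more memory on the master.
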
Data is persisted by creating RDB (Redis Database) files, which are point-in-time snapshots of the dataset stored on disk.
- The master periodically saves its in-memory state into an RDB file.
- On restart, Redis loads the RDB file back into memory to restore the dataset.
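
The save and load cycle can be illustrated with a deliberately simplified snapshot format. This is not the real RDB file layout: the `SNAP1` header, the length-prefixed encoding, and the function names `save_snapshot` and `load_snapshot` are all assumptions made for this sketch.

```python
import os
import struct
from typing import Dict

MAGIC = b"SNAP1"  # hypothetical header, not the real RDB magic string


def save_snapshot(data: Dict[str, str], path: str) -> None:
    """Write a point-in-time copy of the dataset to disk."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack(">I", len(data)))
        for key, value in data.items():
            for field in (key.encode("utf-8"), value.encode("utf-8")):
                # Each field is stored as a 4-byte length followed by its bytes.
                f.write(struct.pack(">I", len(field)))
                f.write(field)
    # Swap the file into place so a crash never leaves a half-written snapshot.
    os.replace(tmp_path, path)


def load_snapshot(path: str) -> Dict[str, str]:
    """Rebuild the in-memory dataset from a snapshot file."""
    with open(path, "rb") as f:
        if f.read(len(MAGIC)) != MAGIC:
            raise ValueError("not a snapshot file")
        (count,) = struct.unpack(">I", f.read(4))
        data = {}
        for _ in range(count):
            fields = []
            for _ in range(2):
                (length,) = struct.unpack(">I", f.read(4))
                fields.append(f.read(length).decode("utf-8"))
            data[fields[0]] = fields[1]
        return data
```

On startup, an instance would call `load_snapshot` if the file exists and otherwise begin with an empty dataset.
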
Sentinels monitor masters and replicas to handle automatic failover.
- A sentinel continuously checks if the master is reachable.
- If the master does not respond for a configured period of time, the sentinel marks it as down.
- The sentinel then promotes one of the replicas to become the new master, choosing the replica with the latest offset (most up-to-date data).
- Other replicas are reconfigured to follow the new master, ensuring the system stays available automatically.
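
The two decisions described above, detecting a down master and choosing a replacement, can be sketched as below. The names `ReplicaInfo`, `is_down`, and `pick_new_master` are hypothetical, and the sketch assumes a single sentinel that already knows each replica's offset and reachability.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReplicaInfo:
    """What the sentinel knows about one replica (hypothetical type)."""
    address: str
    offset: int
    reachable: bool = True


def is_down(last_reply_at: float, now: float, down_after_seconds: float) -> bool:
    """The master counts as down once it has been silent for too long."""
    return now - last_reply_at > down_after_seconds


def pick_new_master(replicas: List[ReplicaInfo]) -> Optional[ReplicaInfo]:
    """Choose the reachable replica with the highest offset, if any."""
    candidates = [r for r in replicas if r.reachable]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.offset)


replicas = [
    ReplicaInfo("127.0.0.1:6380", offset=104),
    ReplicaInfo("127.0.0.1:6381", offset=37),
]
if is_down(last_reply_at=100.0, now=131.0, down_after_seconds=30.0):
    print(pick_new_master(replicas))
```

Picking the highest offset minimizes the number of writes lost in the failover, since that replica has processed the most of the old master's command stream.
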
