6 July, 2020
Salvatore Sanfilippo described an algorithm using Redis to implement distributed locking called Redlock. In this article, we will talk about this algorithm.
Local and Distributed Mutex
The mutex or mutual exclusion is used to serialise access to a given resource in programs. It is useful to prevent inconsistent read/write operation on the shared memory in a multithreaded software. It is required to acquire a lock explicitly before accessing the memory. Until the lock is released, other threads will block to prevent inconsistencies.
The atomic operations are primitives implemented in the hardware to compute two or more things in a single instruction. In Golang, the
sync package implements Mutex using the
atomic operations. An integer is used to start the lock state. When a Goroutine is ready to acquire the lock, it checks and updates the value atomically. The operation has to be atomic because there is no guarantee of the order across multiple Goroutines. For instance, while one Goroutine is comparing the value, others might overtake and change the value. The previous Goroutine will not realise it, and it will change the value again.
This mutex implementation uses a variable (memory) shared across threads. The distributed programs run as separate processes potentially on different machines. So the programs won't have access to some shared memory. This makes our problem more difficult. Now, we need a system that can be used by separate processes to agree upon the lock ownership without conflicts. This system is called distributed lock.
What is Deadlock?
Before we can implement a distributed lock, we need to understand the concept of Deadlock.
Let me ask a question: what will happen if a process acquires a lock and refuse to release it? Other programs will keep waiting indefinitely. This situation is called Deadlock. Deadlock is not limited to distributed systems and, it can happen in a concurrent program as well. Deadlock may or may not halt the entire application in certain situations. So while building a distributed locking, we need to make sure our system is deadlock-free.
Distributed systems pose yet another challenge for us. There can be many causes of deadlocks in distributed systems (network partition, resource starvation, crashes, et cetera).
The Redlock algorithm defines an algorithm for distributed locking. The official document uses the Redis database, but it can theoretically use other databases. The algorithm claims to provide the following guarantees.
- Mutual Exclusion - At any given time, only one client can hold the lock.
- Deadlock Free - Eventually, it is always possible to acquire a lock.
- Fault-Tolerant - As long as the majority of Redis nodes are up, clients can acquire and release locks.
Simple Redlock - Single Instance
Let's implement the first two guarantees before making the system fault-tolerant. Redis is a key-value datastore. It exposes the
SET command to assign values to the keys. Redis conveniently accepts the
NX option, which prevents
SET to run if the key already exists. Like Golang, this is the primitive atomic operation used by Redlock to implement distributed mutual exclusion.
That was pretty straightforward, wasn't it? The algorithm is not deadlock-free yet. To solve this, Redis conveniently accepts yet another option
PX to adds expiration to the key. The one acquiring lock needs to finish the task before the lock expires. The lock will auto-expire after the set duration, which makes our system deadlock-free.
SET service random_value NX PX 30000
The overall command roughly translates to this English sentence: Set the value of "service" key to "some_value" if the key does not exist. Also, remember to auto-expire the key after 30 seconds. If the command succeeds, Redis will return
OK. The program can safely assume the lock now.
But wait, isn't 30 seconds too long? It depends on the programs, but in most cases, it is indeed long. So even though our algorithm guarantees, it is not very efficient so far. The remedy is to release the lock. The program must delete the key in Redis once it is done with the computation.
Redis exposes the
DEL command to perform the operation. But there is a potential for a race condition. If the program is too late in deleting that it crossed the expiration time, some other program might have acquired the lock. So, blindly removing the key is not safe. This is where the "some_value" part of the previous command comes. The value needs to be reasonably random, and the program needs to remember it. When it is time to release the lock, the process must atomically check the value and delete the key only if it matches. Redis does not support this operation natively, but it does let us run arbitrary Lua scripts atomically. We can use it to our advantage here. The following Lua script can be passed using the
EVAL command with the value as its argument.
if redis.call("get",KEYS) == ARGV then return redis.call("del",KEYS) else return 0 end
The algorithm so far implements the following two guarantees:
- Mutual Exclusion - Since it uses atomic primitives, at any given time, only one program can succeed and acquire the lock.
- Deadlock-Free - The key is set with an expiration time. After the expiration time, the lock will be automatically released.
If the process crashes after acquiring the lock, the algorithm imposes the penalty. All other programs need to wait until the expiration time.
The Simple Redlock algorithm only works with a single Redis node. So, the availability of the lock is as good as the availability of the Redis node. In this case, using the Redis cluster does not help either because Redis implements asynchronous replication.
Distributed Redlock - Fault Tolerant
To implement fault-tolerance, we need to overcome the reliance on the single Redis node. We need more than one independent Redis master. The document suggests that 5 nodes are reasonable.
So, how will the process acquire a distributed lock? Well, it will need to do that all the Redis masters. But there are few edge cases here, so let's discuss them. The process of acquiring a lock must take less time then expiration time itself. To ensure that we need to set much smaller timeouts on the Redis calls. If it fails on one or more nodes, we move on to the next node in the list. Finally, once all the Redis calls are done, the program needs to ensure that at least (N/2 + 1) calls succeeded, and the time spent is less than expiration time. Only then will the program have ownership on the lock. If the requirements are not fulfilled, it needs to release the lock immediately on all Redis masters. This means all the nodes, including the ones which didn't acknowledge the lock in the first place. If the requirements are fulfilled, then the program can assume the lock.
Let's discuss the guarantees now. At any given moment, only one program will be able to acquire lock on (N/2 + 1) instances making it mutually exclusive. This is not that simple.
Redis is an in-memory database with optional persistence. In the case of node failure without persistence, the keys will not be available, effectively releasing the lock on the given Redis node. The author proposes a simple solution to this by letting the keys expire before starting the Redis node again. Let's take an example to discuss this situation. Let's say there were 5 nodes, and a program was able to acquire locks on 4. Now, if one node without lock goes down, the program still holds the lock on the majority of the nodes. The same is true if the node with lock goes down. If, however, persistence is enabled, it changes our story a bit. Redis thankfully implements expiration in a way such that time elapses even if the node goes down. This ensures both fault tolerance and mutual exclusivity.
Distributed systems have yet another inherent limitation, no universal clock. That means there is no agreement on the time across systems. So, instead of time on the wall clock, we need to deal with time as duration. The program then has (Total expiry duration - Time duration for acquiring all locks) before the lock will be released automatically. As discussed previously, the key expiration of Redis itself is fault-tolerant, and elapses time even it the node is not running. So, we can be reasonably sure that the lock will be there for the duration and will expire on time, even in cases of node failure. This assures the deadlock-free guarantee.
Is Redlock Perfect?
Martin Kleppman has written an excellent post criticizing the Redlock algorithm by highlighting some of the shortcomings. He claims that the guarantees don't stand in certain situations making Redlock not safe. It is also worth noting that Fault-tolerant Redlock requires multiple (~5) Redis nodes and might be an overkill for a simple use case. So, Redlock is not perfect for all situations.
Distributed Systems. Redlock defines an algorithm to solve the problem. It may or may not be suitable for all use cases, but what's interesting is its simplicity. Redis conveniently takes care of most of the complexity, and the algorithm is very simple and straightforward to implement. With that said, there are multiple implementations of Redlock available. Try it out for yourself and maybe write a new implementation of it. As for me, I'll be exploring more Distributed Locks research and algorithms and hopefully post more about them in the future.