Consistency In Distributed Systems

Photo by Growtika on Unsplash


So what does consistency mean? In a distributed system, it is about whether a read returns the latest written data, no matter which server answers the request. For example, suppose you have a LinkedIn account and you create a post. If someone likes that post, a consistent system shows you the updated like count right away.

Consistency Levels

  1. Linearizable Consistency

  2. Eventual Consistency

  3. Causal Consistency

  4. Quorum

Linearizable Consistency Level

Imagine you're reading data from a database and want to ensure you're seeing the most up-to-date information. Every change that occurred before your read request should be visible to you at this consistency level. In other words, when you ask the database for data, it reflects all the updates that happened before you hit 'read.' This ensures you’re always seeing the most current state of the data.

Note: One simple way to achieve this type of consistency is to process all requests in a single total order, for example on a single thread.

This is useful when the system needs perfect consistency.

Example: suppose we have x=17

  • update x to 13

  • update x to 20

  • read x will return 20

  • update x to 10

  • read x will return 10
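The sequence above can be replayed with a small sketch. It assumes a toy in-memory register (the class name `LinearizableRegister` is made up for illustration) where a single worker thread applies every request in arrival order, so each read sees all earlier writes. A real database would use a more elaborate mechanism.

```python
import queue
import threading


class LinearizableRegister:
    """Toy register: one worker thread applies requests in arrival order."""

    def __init__(self, initial):
        self._value = initial
        self._requests = queue.Queue()
        # Daemon thread so the program can exit while the worker is idle.
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # The only place where the value is touched: one request at a time.
        while True:
            op, arg, reply = self._requests.get()
            if op == "write":
                self._value = arg
            reply.put(self._value)

    def _submit(self, op, arg=None):
        reply = queue.Queue(maxsize=1)
        self._requests.put((op, arg, reply))
        return reply.get()  # block until the worker has applied the request

    def write(self, value):
        return self._submit("write", value)

    def read(self):
        return self._submit("read")


x = LinearizableRegister(17)
x.write(13)
x.write(20)
first_read = x.read()
x.write(10)
second_read = x.read()
print(first_read, second_read)  # 20 10
```

Because callers block until the worker replies, a read can never overtake a write that finished before it started.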

Eventual Consistency

Let’s break down what happens in a system with eventual consistency. Imagine you have a variable, say x, which starts at 30 and is stored on several replicas. Now one client writes x = 10 while another client reads x. The write reaches one replica first and is copied to the others in the background, so there is no guarantee about what the read returns. If it hits a replica that has not received the update yet, you’ll see the old value, 30. If it hits a replica that has, you’ll get the new value, 10. What the system does guarantee is that, if no new writes arrive, all replicas will eventually converge on 10. Temporary staleness followed by convergence is what defines eventual consistency.

While this might sound risky, it’s perfectly fine for certain applications, like email systems. Sending an email takes time, but you trust that once it’s in your outbox, it’ll soon appear in your sent folder. The operation may not be instantaneous, but you know it’ll get there eventually, which is exactly how eventual consistency works.
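Going back to the x example, here is a minimal sketch of the idea. It assumes a hypothetical store (`EventuallyConsistentStore` is an illustrative name, not a real library) where a write is acknowledged after reaching one replica and is delivered to the others only when `sync()` runs. Real systems propagate updates in the background instead.

```python
class Replica:
    def __init__(self, value):
        self.value = value


class EventuallyConsistentStore:
    """Hypothetical store: writes hit one replica and propagate later."""

    def __init__(self, initial, replica_count=3):
        self.replicas = [Replica(initial) for _ in range(replica_count)]
        self.pending = []  # (replica_index, value) updates not yet delivered

    def write(self, value):
        # Acknowledge after updating replica 0; queue the update for the rest.
        self.replicas[0].value = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, value))

    def read(self, replica_index):
        return self.replicas[replica_index].value

    def sync(self):
        # Deliver queued updates in the order they were written.
        for i, value in self.pending:
            self.replicas[i].value = value
        self.pending.clear()


store = EventuallyConsistentStore(30)
store.write(10)
stale = store.read(2)   # replica 2 has not seen the write yet
store.sync()            # stand-in for background replication
converged = [store.read(i) for i in range(3)]
print(stale, converged)  # 30 [10, 10, 10]
```

The read before `sync()` returns the old value, and once replication catches up every replica agrees.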

Causal Consistency

At this consistency level, operations that are causally related must be applied in the same order everywhere: if an earlier operation affects the current one, the earlier operation must be executed first. Unrelated operations can be executed in any order.

For example:

  1. Write (Y,10)

  2. Write (X,30)

  3. Read (Y)

  4. Write (Y,40)

  5. Read (X)

The value returned by Read(Y) at step 3 depends on Write(Y,10) at step 1, so step 1 must be executed first. However, Write(X,30) does not affect the value of Y, so it does not matter whether it executes before or after Read(Y).

So steps 1, 3, and 4 can be executed on one server or thread (server1 or thread1), and steps 2 and 5 on another (server2 or thread2).
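The grouping above can be sketched in a few lines. This is a simplification: it assumes two operations are "related" only when they touch the same key, which matches the example but not every real system (real implementations also track dependencies across keys, for instance with vector clocks). The helper names are made up for illustration.

```python
from collections import defaultdict

# The five operations from the example: (kind, key, value).
ops = [
    ("write", "Y", 10),
    ("write", "X", 30),
    ("read", "Y", None),
    ("write", "Y", 40),
    ("read", "X", None),
]


def partition_by_key(operations):
    """Group operations into per-key lanes, keeping their original order."""
    lanes = defaultdict(list)
    for step, (kind, key, value) in enumerate(operations, start=1):
        lanes[key].append((step, kind, value))
    return dict(lanes)


def run_lane(lane):
    """Apply one lane in order and return {step: value seen by each read}."""
    current = None
    reads = {}
    for step, kind, value in lane:
        if kind == "write":
            current = value
        else:
            reads[step] = current
    return reads


lanes = partition_by_key(ops)
lane_steps = {key: [step for step, _, _ in lane] for key, lane in lanes.items()}

# Lanes are independent, so they could run on different servers or threads.
results = {}
for lane in lanes.values():
    results.update(run_lane(lane))

print(lane_steps)  # {'Y': [1, 3, 4], 'X': [2, 5]}
print(results)     # {3: 10, 5: 30}
```

Each lane preserves the order of its own operations, which is enough to give Read(Y) the value 10 and Read(X) the value 30 regardless of how the two lanes interleave.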

Quorum

In a distributed system, we often have multiple copies of our database (replicas). These replicas might not always be in sync. When we read data, we query several replicas and choose the best value (such as the majority value or the most recent one). This process relies on agreement among a minimum number of replicas, called a quorum.

For example: Suppose we have three replicas, all with x = 20. We update the second replica to x = 40, but it crashes right after, before the change reaches the others. If we now read the value, we only get data from the other two replicas, which still show x = 20, so we get the old value. However, once the second replica is back and the update has propagated, the system will return the correct value of x = 40. This is eventual consistency.
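The three-replica scenario can be simulated as follows. The sketch assumes each replica stores a version number next to its value and that a read returns the value with the highest version among reachable replicas. The version field and the `read_latest` helper are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    value: int
    version: int = 0   # assumed: incremented on every write
    up: bool = True    # False while the replica is crashed


def read_latest(replicas):
    """Return the newest value among the replicas that are reachable."""
    reachable = [r for r in replicas if r.up]
    newest = max(reachable, key=lambda r: r.version)
    return newest.value


replicas = [Replica(20), Replica(20), Replica(20)]

# Update only the second replica, which then crashes.
replicas[1].value, replicas[1].version = 40, 1
replicas[1].up = False
stale_read = read_latest(replicas)   # only the first and third replica answer

# The second replica recovers, and its newer version wins.
replicas[1].up = True
fresh_read = read_latest(replicas)

print(stale_read, fresh_read)  # 20 40
```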

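Strong Consistency: To make the system strongly consistent (where every read sees the latest successful write), we use a rule: R + W > N. The rule guarantees that every set of replicas we read from overlaps with every set we wrote to, so at least one replica in each read holds the newest value. Here: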

  • R = Number of replicas to read from.

  • W = Number of replicas to write to.

  • N = Total number of replicas.

For example, if N = 5 and W = 2, R should be at least 4 to ensure consistency. If we can’t read from 4 replicas, we return an error.
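To see why the rule works, the sketch below computes the smallest R for a given N and W, then checks by brute force that every possible read set shares at least one replica with every possible write set. The function names are illustrative only.

```python
from itertools import combinations


def min_read_quorum(n: int, w: int) -> int:
    """Smallest R that satisfies R + W > N."""
    return n - w + 1


def always_overlap(n: int, r: int, w: int) -> bool:
    """True if every read set of size r intersects every write set of size w."""
    nodes = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


r_needed = min_read_quorum(5, 2)
with_quorum = always_overlap(5, r_needed, 2)         # R + W = 6 > 5
without_quorum = always_overlap(5, r_needed - 1, 2)  # R + W = 5, not > 5

print(r_needed, with_quorum, without_quorum)  # 4 True False
```

With R = 3 a read could land entirely on the three replicas that missed the write, which is exactly the case the rule excludes.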

Quorums also provide fault tolerance, because reads and writes can still succeed while some replicas are down. By tuning the values of R, W, and N, you can choose between eventual consistency (R + W ≤ N) and strong consistency (R + W > N).

Disadvantages:

  • Costly: More replicas mean higher costs.

  • Split-Brain Problem: If there’s an even number of replicas, a network partition can split the system into two equal halves with no majority on either side, so the two halves may disagree about the current value.

Data Consistency Levels Tradeoffs

| Level | Consistency | Efficiency |
| --- | --- | --- |
| Linearizable | High | Low |
| Eventual Consistency | Low | High |
| Causal Consistency | Higher than eventual but lower than linearizable | Higher than linearizable but lower than eventual |
| Quorum | Configurable | Configurable |