CAP Theorem

In 2000, Eric brewer propose an idea on distributed system. When you design a distributed system, you need consistency, availability and partition tolerance, but it is impossible to have all three, it is also called CAP Conjecture.

In 2002, CAP conjecture was proved by Seth Gilbert and Nancy Lynch from MIT, it became CAP Theorem. In the proof, it is impossible achieve all the three, but it is possible to achieve two of them, upon choosing the two will define characteristics of your system.

CAP Theorem
Consistency

In a consistent system the view of the data is atomic at the all time. At any given point of time, if there are series of operation happened and state of the data is changed, any query being served post the change should have modified data.

Every read receives the most recent write or an error
Availability

In any highly available system should respond to the request with response, even if there is only one node available. So the data can be stale, but goal to serve with non failing response.

Every request receives a (non-error) response, without the guarantee that it contains the most recent write
Partition Tolerence

In case of network failure, if nodes split in two groups, the system should support the processing in both sub-groups and system should be tolerant to network partition or drop in message between nodes.

Partition Tolerance
The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
What to choose?

CAP Conjecture states that it is impossible to have all three property for a distributed system. So we have left with combination two of the properties, as per proof, no distributed system is safe from network failure and hence the network partition tolerance should be there, so we have left with option to choose either Consistency or Availability

CP System

In CP systems we guarantee the consistency over availability, which means we are fine with sending error to user rather than sending stale data.

Usecase

Pricing service in e-commerce need consistency over availability, any stale price will give wrong perception to user and lead to confusion. So we need our system to be Consistent and Partition tolerant.

Systems Supports CP

HBase, Redis, MongoDB etc.,

AP System

AP systems always guarantee the availability over consistency, if there is at least one node available it should be able to respond to request with response, even if the data is stale.

Usecase

In e-commerce, cart is one of the service where you need high availability, even if you are not able to give current of the view cart, but you have to support actions such add to cart, change item in cart etc., In this case we are fine with stale data, but we need to able to available for the user.

Systems Supports AP

Elasticsearch, Amazon Dynamo DB, Cassandra and RIAK etc.,

CA System

As the partition tolerance can’t be removed from distributed systems. Any system which is CA should be centralised system. So we can have both Consistent and Available characteristics.

Systems Supports CA

MySQL, PostgreSQL and Oracle etc.,

Reference

https://users.ece.cmu.edu/~adrian/731-sp04/readings/GL-cap.pdf

https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing