In 2000, Eric brewer propose an idea on distributed system. When you design a distributed system, you need consistency, availability and partition tolerance, but it is impossible to have all three, it is also called CAP Conjecture.
In 2002, CAP conjecture was proved by Seth Gilbert and Nancy Lynch from MIT, it became CAP Theorem. In the proof, it is impossible achieve all the three, but it is possible to achieve two of them, upon choosing the two will define characteristics of your system.
In a consistent system the view of the data is atomic at the all time. At any given point of time, if there are series of operation happened and state of the data is changed, any query being served post the change should have modified data.
Every read receives the most recent write or an error
In any highly available system should respond to the request with response, even if there is only one node available. So the data can be stale, but goal to serve with non failing response.
Every request receives a (non-error) response, without the guarantee that it contains the most recent write
In case of network failure, if nodes split in two groups, the system should support the processing in both sub-groups and system should be tolerant to network partition or drop in message between nodes.
The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
What to choose?
CAP Conjecture states that it is impossible to have all three property for a distributed system. So we have left with combination two of the properties, as per proof, no distributed system is safe from network failure and hence the network partition tolerance should be there, so we have left with option to choose either Consistency or Availability
In CP systems we guarantee the consistency over availability, which means we are fine with sending error to user rather than sending stale data.
Pricing service in e-commerce need consistency over availability, any stale price will give wrong perception to user and lead to confusion. So we need our system to be Consistent and Partition tolerant.
Systems Supports CP
HBase, Redis, MongoDB etc.,
AP systems always guarantee the availability over consistency, if there is at least one node available it should be able to respond to request with response, even if the data is stale.
In e-commerce, cart is one of the service where you need high availability, even if you are not able to give current of the view cart, but you have to support actions such add to cart, change item in cart etc., In this case we are fine with stale data, but we need to able to available for the user.
Systems Supports AP
Elasticsearch, Amazon Dynamo DB, Cassandra and RIAK etc.,
As the partition tolerance can’t be removed from distributed systems. Any system which is CA should be centralised system. So we can have both Consistent and Available characteristics.
Systems Supports CA
MySQL, PostgreSQL and Oracle etc.,