Performance
Performance refers to the efficiency and speed of a system under a specific workload. If a system takes too long to respond or uses excessive resources even under minimal load, it has poor performance. A high-performance system consistently delivers responses with low response times.
Scalability
Scalability refers to a system's ability to maintain performance as demand increases. If a system performs well under minimal load but slows down significantly under high load, it is not scalable. A scalable system can efficiently handle a growing number of requests by dynamically adapting to the load and allocating additional resource capacity. Resources can be scaled in two ways:
- Vertical scaling (scaling up): This type of scaling increases the compute resources of a single server, such as the number of cores, memory, or disk space, so that it can serve more requests.
- Horizontal scaling (scaling out): This type of scaling adds multiple redundant servers that perform the same job, with a load balancer placed in front of them to distribute incoming traffic.
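As a rough illustration of horizontal scaling, the sketch below shows one simple way a load balancer might spread requests across redundant servers: round-robin rotation. The class name `RoundRobinBalancer` and the server names are hypothetical, and real load balancers typically also account for server health and current load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list endlessly, in order.
        self._rotation = cycle(list(servers))

    def route(self, request):
        # Pick the next server in the rotation and pair it with the request.
        server = next(self._rotation)
        return server, request


# Hypothetical server names, used only for illustration.
balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.route(f"req-{i}")[0] for i in range(6)]
print(assignments)
# ['server-a', 'server-b', 'server-c', 'server-a', 'server-b', 'server-c']
```

Scaling out further in this sketch means passing a longer server list; the routing logic does not change.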
Latency
Latency refers to the time it takes for a system to process and respond to a specific request or perform a task. It is measured in units of time such as milliseconds or seconds. In the ping output below, the time reported for each request is the latency.
|> ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
64 bytes from 1.1.1.1: icmp_seq=1 ttl=59 time=3.02 ms
64 bytes from 1.1.1.1: icmp_seq=2 ttl=59 time=4.83 ms
Throughput
Throughput refers to the number of successful requests or data packets a system can process per unit of time, typically measured in requests per second.
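To make the difference between latency and throughput concrete, here is a minimal sketch that times a batch of simulated requests. The function `handle_request` is a hypothetical stand-in for real work, and the numbers it produces depend entirely on the machine running it.

```python
import time


def handle_request(n):
    # Hypothetical stand-in for real work: sum the first n integers.
    return sum(range(n))


def measure(num_requests, work_size=10_000):
    latencies = []
    start = time.perf_counter()
    for _ in range(num_requests):
        t0 = time.perf_counter()
        handle_request(work_size)
        # Latency: time taken by this single request.
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "avg_latency_ms": 1000 * sum(latencies) / len(latencies),
        "max_latency_ms": 1000 * max(latencies),
        # Throughput: completed requests per second over the whole run.
        "throughput_rps": num_requests / elapsed,
    }


stats = measure(200)
print(stats)
```

Latency is recorded per request, while throughput is computed over the whole batch, so a system can have good average latency and still have limited throughput, or the reverse.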
CAP Theorem
CAP stands for Consistency, Availability, and Partition Tolerance. The theorem states that a distributed system cannot provide all three guarantees at the same time.
Consistency
Consistency ensures that every read request receives the most recent write or an error response. This means that at any given time, all working nodes respond with the exact same data.
Availability
Availability ensures that every request receives a non-error response, without guaranteeing that it contains the most recent data. This means that the system must stay operational and return a non-error response even if some nodes hold outdated data.
Partition Tolerance
Partition Tolerance ensures that the system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.
The CAP tradeoff
The theorem is often summarized as choosing 2 out of 3 guarantees. In a distributed system, however, network partitions are inevitable, so the real choice during a partition is between consistency and availability.
Consistency and Partition Tolerance
These systems keep data consistent and tolerate network partitions, but at the cost of availability. During a partition, the system may reject some requests to maintain consistency. This choice suits critical software that needs consistent data and can afford some downtime, such as banking systems.
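The behaviour of rejecting requests during a partition can be sketched with a toy replicated store that only serves requests while a majority of replicas is reachable. This is a simplified illustration, not how any particular database works: `CPStore`, `Unavailable`, and the majority rule are assumptions chosen for the example, and the partition never heals in this sketch.

```python
class Unavailable(Exception):
    """Raised when a request is rejected to protect consistency."""


class CPStore:
    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self.reachable = set(range(replica_count))

    def partition(self, lost):
        # Simulate a network partition that cuts off the given replicas.
        self.reachable -= set(lost)

    def _require_majority(self):
        # Refuse to serve unless more than half of the replicas are reachable.
        if len(self.reachable) <= len(self.replicas) // 2:
            raise Unavailable("majority of replicas unreachable")

    def write(self, key, value):
        self._require_majority()
        for i in self.reachable:
            self.replicas[i][key] = value

    def read(self, key):
        self._require_majority()
        # Every replica still reachable has seen every accepted write.
        return self.replicas[min(self.reachable)][key]


store = CPStore(3)
store.write("balance", 100)
store.partition([1, 2])  # only replica 0 remains reachable

try:
    store.write("balance", 50)
    outcome = "accepted"
except Unavailable:
    outcome = "rejected"

print(outcome)  # rejected
```

The rejected write is the availability cost: the store returns an error rather than risk replicas disagreeing about the balance.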
Consistency and Availability
These systems provide both consistency and full availability, but only as long as no network partition occurs. This is impractical for distributed systems, where network partitions are inevitable. A single-node database system can provide consistency and availability as long as that single node does not fail.
Availability and Partition Tolerance
These systems ensure full availability of data and can tolerate network partitions, but at the cost of consistency. During a partition, different nodes may return different data. This choice suits less critical software that prefers full availability over consistent data, such as social media platforms.
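The sketch below illustrates how nodes in such a system might diverge during a partition and converge again afterwards. It uses last-write-wins reconciliation, which is one common strategy assumed here for illustration. The `APNode` class and the caller-supplied version numbers are hypothetical simplifications; real systems usually derive versions from timestamps or vector clocks.

```python
class APNode:
    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        # Always accept the write, even if other nodes are unreachable.
        self.data[key] = (version, value)

    def read(self, key):
        return self.data[key][1]

    def merge(self, other):
        # Last-write-wins: keep the entry with the higher version per key.
        for key, entry in other.data.items():
            if key not in self.data or entry[0] > self.data[key][0]:
                self.data[key] = entry


node_a, node_b = APNode("a"), APNode("b")
node_a.write("status", "hello", version=1)
node_b.merge(node_a)  # replicated before the partition

# During the partition, each node accepts writes on its own.
node_a.write("status", "at lunch", version=2)
node_b.write("status", "back at work", version=3)
during = (node_a.read("status"), node_b.read("status"))

# After the partition heals, the nodes exchange state and converge.
node_a.merge(node_b)
node_b.merge(node_a)
after = (node_a.read("status"), node_b.read("status"))

print(during)  # ('at lunch', 'back at work')
print(after)   # ('back at work', 'back at work')
```

Both nodes stay available throughout, but readers see different values until the merge runs, and the older write is silently discarded.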
