Caching

Caching is the process of storing frequently accessed data in a temporary storage location, called a cache, so that it can be retrieved quickly without querying the original data source. This improves application performance by reducing the latency of fetching data. Caching also reduces the load on web servers and other resources, which can lower costs and improve scalability.

Cache Strategies

There are several caching strategies:

Cache-Aside

The application sits between the cache and the database. The cache does not interact with the database directly.

Read: The application checks the cache. If it is a cache hit, the data is returned. If it is a cache miss, the application queries the database, writes the retrieved data to the cache, and then returns it to the client.
Write: The application writes updates directly to the database and then invalidates (deletes) the corresponding key in the cache.
Pros: Highly resilient to cache failures and memory-efficient, because data is cached only when it is explicitly requested.
Cons: High latency on a cache miss, because three round trips are required: a cache read, a database read, and a cache write.
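
The cache-aside read and write paths can be sketched as follows. This is a minimal illustration, not a production implementation: two plain dictionaries stand in for the cache and the database, and the names `cache`, `database`, `read`, and `write` are hypothetical.

```python
from typing import Optional

# Hypothetical stand-ins: plain dicts play the roles of the cache and the database.
cache: dict[str, str] = {}
database: dict[str, str] = {"user:1": "Ada"}


def read(key: str) -> Optional[str]:
    """Cache-aside read: check the cache, fall back to the database on a miss."""
    if key in cache:                  # cache hit
        return cache[key]
    value = database.get(key)         # cache miss: the application queries the database
    if value is not None:
        cache[key] = value            # the application populates the cache itself
    return value


def write(key: str, value: str) -> None:
    """Cache-aside write: update the database, then invalidate the cached key."""
    database[key] = value
    cache.pop(key, None)              # delete the cached copy instead of updating it
```

After `write("user:1", "Grace")`, the key is gone from `cache`, so the next `read("user:1")` misses, reloads the new value from `database`, and caches it again.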

Read Through

The application treats the cache as the primary data store. The cache engine itself handles fetching data from the database on a miss.

Read: The application queries the cache. On a cache miss, the cache library or software automatically queries the database, updates its internal memory, and returns the data to the application.
Write: Typically paired with a direct database write or a write-through strategy.
Pros: Simplifies application code, since the data-fetching logic is abstracted into the caching layer. Well suited to read-heavy workloads.
Cons: Requires custom plugins or configuration within the caching layer so that it can query the specific database and schema.
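
The difference from cache-aside is where the loading logic lives. In the sketch below, the cache object owns a loader callback and the caller only ever calls `get`. The class name `ReadThroughCache` and the `loader` parameter are assumptions made for illustration; real caching products expose this hook in their own ways.

```python
from typing import Callable, Optional


class ReadThroughCache:
    """A cache that owns the loader, so callers never query the database directly."""

    def __init__(self, loader: Callable[[str], Optional[str]]) -> None:
        self._loader = loader             # hypothetical hook into the database
        self._store: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        if key not in self._store:
            value = self._loader(key)     # the cache, not the caller, loads on a miss
            if value is None:
                return None               # missing keys are not cached in this sketch
            self._store[key] = value
        return self._store[key]
```

A caller would construct it once, for example `ReadThroughCache(loader=database.get)` with a dictionary standing in for the database, and then use only `get`.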

Write Through

Every write operation passes through the cache before being committed to the database.

Write: The application writes data to the cache. The cache synchronously updates the database. The write completes only when both stores are updated.
Read: Standard lookup from the cache.
Pros: Keeps the cache and the database consistent, since both are updated in the same write operation. Freshly written data is immediately available for reads without a cache miss.
Cons: High write latency, because every write synchronously updates both the cache (memory) and the database (disk).
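
A minimal sketch of the write-through path, assuming a dictionary stands in for the database and using the hypothetical class name `WriteThroughCache`. Inside `put`, this sketch commits to the database first so that a failed database write never leaves an unpersisted value in the cache; that ordering is a choice made here, not something the description above prescribes.

```python
from typing import Optional


class WriteThroughCache:
    """Every write goes through the cache object, which updates both stores."""

    def __init__(self, database: dict[str, str]) -> None:
        self._database = database         # hypothetical stand-in for the database
        self._store: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._database[key] = value       # synchronous database write
        self._store[key] = value          # cache updated in the same call
        # put() returns only after both stores hold the new value.

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)       # standard lookup from the cache
```

Because `put` does not return until both stores are updated, a `get` issued right after it always sees the new value.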

Write Behind

The application writes data directly to the cache, which acknowledges the write immediately. The cache then asynchronously flushes the updates to the database in batches.

Write: The application writes to the cache. The cache confirms success immediately. A background daemon or queue later flushes the accumulated changes to the database.
Pros: Very low write latency and high write throughput. It cushions the database from write spikes by coalescing multiple updates to the same row into a single database write.
Cons: Risk of data loss. If the cache node crashes or loses power before the asynchronous queue is flushed, the unflushed writes are permanently lost.
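
The coalescing behaviour can be sketched as below. To keep the example deterministic, an explicit `flush()` call takes the place of the background daemon or queue; the class name `WriteBehindCache` and the dictionary standing in for the database are assumptions for illustration.

```python
from typing import Optional


class WriteBehindCache:
    """Writes are acknowledged from memory and persisted later in a batch."""

    def __init__(self, database: dict[str, str]) -> None:
        self._database = database         # hypothetical stand-in for the database
        self._store: dict[str, str] = {}
        self._dirty: dict[str, str] = {}  # pending writes; keyed so updates coalesce

    def put(self, key: str, value: str) -> None:
        self._store[key] = value
        self._dirty[key] = value          # acknowledged without touching the database

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)

    def flush(self) -> int:
        """Persist all pending writes and return how many database writes were made."""
        pending, self._dirty = self._dirty, {}
        self._database.update(pending)
        return len(pending)
```

Three `put` calls on the same key followed by one `flush()` result in a single database write. Anything still in `_dirty` when the process dies is lost, which is the data-loss risk described above.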

Refresh Ahead

Refresh-Ahead is a proactive caching strategy in which the cache engine automatically reloads a cached item before it expires, based on its access patterns. It avoids the latency penalty of a cache miss by keeping frequently accessed data hot and up to date in memory.
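
One way to picture this is an entry with a TTL plus an earlier refresh threshold: an access that arrives after the threshold but before expiry triggers a reload. The sketch below is a simplified, single-threaded illustration. The names `RefreshAheadCache`, `refresh_factor`, and `early_refreshes` are hypothetical, and the reload happens inline, whereas a real engine would normally perform it in the background so the caller does not wait.

```python
import time
from typing import Callable


class RefreshAheadCache:
    """Reloads an entry when it is accessed close to, but before, its expiry."""

    def __init__(self, loader: Callable[[str], str], ttl: float,
                 refresh_factor: float = 0.8,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader                       # hypothetical database hook
        self._ttl = ttl
        self._refresh_after = ttl * refresh_factor  # start refreshing at 80% of the TTL
        self._clock = clock
        self._store: dict[str, tuple[str, float]] = {}  # key -> (value, loaded_at)
        self.early_refreshes = 0

    def _load(self, key: str, now: float) -> str:
        value = self._loader(key)
        self._store[key] = (value, now)
        return value

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None:
            value, loaded_at = entry
            age = now - loaded_at
            if age < self._refresh_after:
                return value                        # fresh: plain cache hit
            if age < self._ttl:
                self.early_refreshes += 1           # accessed near expiry: refresh ahead
        # Miss, expired entry, or early refresh: reload (inline in this sketch).
        return self._load(key, now)
```

With `ttl=10` and the default `refresh_factor`, an access at age 9 reloads the entry even though it would remain valid for one more second, so an access shortly afterwards finds a fresh value instead of an expired one.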

Types of Caching

Client Caching: Data is stored on the client's device rather than on the server. For example, web browsers cache frequently accessed web pages and resources.
CDN Caching: Data is cached on a distributed network of proxy servers deployed globally, geographically closer to the end-users. When a user requests content, the request is routed to the closest CDN Edge server. If the edge server has the asset cached, it serves it directly, bypassing the origin server.
Web Server Caching: Data is cached at the entry point of the server infrastructure, right before requests hit the application logic. A reverse proxy server intercepts incoming HTTP requests. It caches entire web page outputs or API responses generated by the backend application.
Application Caching: Data is cached within the application runtime environment, storing specific application objects or execution results. It uses local memory (RAM) within the application process to save the results of expensive work, such as parsed configuration files, session tokens, or the output of complex algorithms.
Distributed In-Memory Caching: An independent, dedicated caching tier shared across multiple application server nodes, sitting between the application layer and the database.
Database Caching: The database reserves a portion of system memory to store execution plans, indexes, and frequently accessed disk blocks.
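
As a small illustration of application caching from the list above, Python's standard library offers `functools.lru_cache`, which memoizes a function's results in process memory. The function `parse_setting` is a hypothetical example of work worth caching.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def parse_setting(raw: str) -> tuple[str, str]:
    """Parse a 'key = value' line; repeated inputs are served from process memory."""
    key, _, value = raw.partition("=")
    return key.strip(), value.strip()
```

Calling `parse_setting("timeout = 30")` twice runs the parsing once; the second call is answered from the in-process cache. `parse_setting.cache_info()` reports the hit and miss counts.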

Measuring Cache Effectiveness

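Calculating cache hit rate: The share of lookups served from the cache, that is, hits / (hits + misses).
Calculating cache eviction rate: How often entries are removed to make room for new ones.
Monitoring data consistency: Checking that cached values still match the original data source.
Determining the right TTL: Choosing how long an entry may live in the cache before it expires.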
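
The first two measurements can be derived from three counters. The sketch below assumes the cache increments `hits`, `misses`, and `evictions` itself. The class name `CacheStats` is hypothetical, and the eviction rate is defined here as evictions per second over an observation window, which is one convention among several.

```python
from dataclasses import dataclass


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0
    evictions: int = 0

    @property
    def hit_rate(self) -> float:
        """Fraction of lookups served from the cache: hits / (hits + misses)."""
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

    def eviction_rate(self, window_seconds: float) -> float:
        """Evictions per second over the observation window (assumed definition)."""
        return self.evictions / window_seconds
```

For example, 90 hits and 10 misses give a hit rate of 0.9, and 30 evictions observed over 60 seconds give an eviction rate of 0.5 per second.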
