Caching: Enhancing Performance and Scalability

Caching is a pivotal component in system design that significantly influences the performance and scalability of applications. In this blog, we'll delve into the details of caching: its types, eviction strategies, and cache consistency, with real-world examples and Java Spring code. Let's understand how caching optimizes your system.

Table of Contents

  1. How It Works: An Overview of Caching

    • Caching Fundamentals

    • Benefits of Caching

    • When to Use Caching

    • Disadvantages of Caching

    • When Not to Use Caching

  2. Types of Caching

    • Application Caching

    • Browser Caching

    • Distributed Caching

    • Global Cache

    • Content Delivery Network (CDN)

    • Database Query Caching

    • In-Memory Caching (e.g., Redis, Memcached)

    • Page Caching

    • Object Caching

  3. Cache Invalidation: Managing Stale Data

    • Write Through Cache

    • Write Around Cache

    • Write Back Cache

  4. Cache Eviction Strategies: Making Space for New Data

    • Refresh-Ahead

    • Least Recently Used (LRU)

    • Least Frequently Used (LFU)

    • Most Recently Used (MRU)

    • Random Replacement

How It Works: An Overview of Caching

Caching Fundamentals

Caching is the process of storing frequently accessed data in a high-speed data storage layer, reducing the need to fetch the data from the original source. It can be a database, a web service, or any other data repository. Caching works on the principle of fetching data once and serving it from the cache for subsequent requests.
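
As a minimal sketch of this fetch-once principle in plain Java, here is a simple cache-aside example (loadFromSource stands in for any slow database or web-service call):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SimpleCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Cache-aside: return the cached value if present; otherwise
    // fetch it from the source once and remember it for next time.
    public String get(String key) {
        return cache.computeIfAbsent(key, this::loadFromSource);
    }

    private String loadFromSource(String key) {
        return "value-for-" + key; // placeholder for the slow fetch
    }
}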

Benefits of Caching

Caching offers several advantages:

  • Faster Response Times: Cached data retrieval is much quicker than fetching from the source, resulting in faster response times for users.

  • Reduced Load on Resources: It eases the load on the data source, ensuring it doesn't get overwhelmed with frequent requests.

  • Improved Scalability: Caching can be distributed, making it easier to scale your application horizontally.

  • Enhanced Reliability: Cached data can act as a backup, ensuring system availability even if the original source experiences downtime.

When to Use Caching

Caching is invaluable in scenarios where data access is a bottleneck. Use it when:

  • Data doesn't change frequently.

  • Data is costly to compute or retrieve.

  • Frequent data requests impact system performance.

Before we explore the different types of caching and cache invalidation strategies, it's worth looking at where caching falls short.

Disadvantages of Caching

While caching brings substantial benefits to a system, it's important to be aware of its limitations:

  • Potential Stale Data: Caching introduces the risk of serving outdated or stale data, especially if the cache is not properly managed or invalidated.

  • Increased Complexity: Implementing caching adds complexity to the system, requiring careful consideration of cache synchronization and invalidation strategies.

  • Memory Overhead: In-memory caching solutions like Redis or Memcached may consume a significant amount of RAM, potentially limiting the amount of data that can be cached.

  • Cache Coherency: In distributed environments, maintaining cache coherency across multiple nodes can be challenging and may require additional synchronization mechanisms.

When Not to Use Caching

While caching is a powerful tool, there are scenarios where it may not be the best solution:

  • Highly Dynamic Data: If the data changes rapidly and frequently, caching may lead to a high rate of stale data, rendering it less effective.

  • Limited Available Memory: In situations where memory resources are constrained, caching may not be feasible, as it could lead to performance degradation or even system crashes.

  • Security Concerns: Caching sensitive or confidential information may pose security risks if the cache is not adequately protected or encrypted.

  • Small-scale Applications: For small-scale applications with minimal traffic and low data access latency, the overhead of implementing caching may outweigh the benefits.

Types of Caching

Application Caching

Application caching involves storing data within an application's memory to optimize data retrieval. Java Spring provides a robust framework for implementing application caching.

Java Spring Code Example

// Cache results under "myCache"; Spring returns the cached value
// for repeated calls with the same key instead of re-running the method
@Cacheable("myCache")
public MyData retrieveData(String key) {
    return loadFromSource(key); // placeholder for the expensive fetch
}
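
For @Cacheable to take effect, the method must live in a Spring-managed bean and annotation-driven caching must be switched on. A minimal configuration sketch (the class name is illustrative):

import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableCaching // activates Spring's annotation-driven cache management
public class CacheConfig {
}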

Browser Caching

Browser caching stores web page assets like images, scripts, and styles locally on the user's device. This reduces the need for repeated downloads and speeds up page loading.
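
In a Spring application, browser caching is usually controlled through HTTP response headers. A small sketch (the endpoint, path, and loader are illustrative):

import java.util.concurrent.TimeUnit;
import org.springframework.http.CacheControl;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class AssetController {

    // Cache-Control: max-age=3600 tells the browser it may reuse
    // this response for up to an hour without asking the server again.
    @GetMapping("/logo")
    public ResponseEntity<byte[]> logo() {
        return ResponseEntity.ok()
                .cacheControl(CacheControl.maxAge(1, TimeUnit.HOURS))
                .body(loadLogoBytes());
    }

    private byte[] loadLogoBytes() {
        return new byte[0]; // placeholder for reading the real asset
    }
}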

Distributed Caching

Distributed caching spans multiple servers, enabling data sharing and load balancing across applications.

Global Cache

Global caching involves a centralized cache accessible across all application instances.

Content Delivery Network (CDN)

CDNs cache content geographically closer to users, ensuring fast content delivery.

Database Query Caching

Database query caching stores frequently used SQL query results to optimize database access.
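
With JPA and Hibernate, for instance, an individual query can opt into the query cache via a hint. This is a sketch only: it assumes a Customer entity exists and that the second-level and query caches are enabled in the Hibernate configuration:

import jakarta.persistence.EntityManager;
import jakarta.persistence.TypedQuery;

public class CustomerQueries {

    // Marks the query as cacheable so Hibernate may serve repeated
    // executions from its query cache instead of hitting the database.
    public Customer findByEmail(EntityManager em, String email) {
        TypedQuery<Customer> query = em.createQuery(
                "select c from Customer c where c.email = :email", Customer.class);
        query.setParameter("email", email);
        query.setHint("org.hibernate.cacheable", Boolean.TRUE);
        return query.getSingleResult();
    }
}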

In-Memory Caching

In-memory caching systems like Redis and Memcached store data in RAM, ensuring blazing-fast data retrieval.
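
A small sketch using Spring Data Redis (it assumes a StringRedisTemplate bean is already configured to point at a running Redis instance):

import java.time.Duration;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class SessionCache {
    private final StringRedisTemplate redis;

    public SessionCache(StringRedisTemplate redis) {
        this.redis = redis;
    }

    // Store a value in Redis with a 10-minute time-to-live
    public void put(String key, String value) {
        redis.opsForValue().set(key, value, Duration.ofMinutes(10));
    }

    // Returns null on a cache miss (or after the TTL has expired)
    public String get(String key) {
        return redis.opsForValue().get(key);
    }
}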

Page Caching

Page caching saves a webpage's final rendered output so it can be delivered quickly to subsequent visitors without reprocessing or regenerating the page.

Object Caching

Object caching stores the results of expensive computations, database queries, or other operations in memory or a cache store. This technique is commonly used to improve application performance by reducing the time it takes to access frequently requested data.

In the upcoming sections, we'll discuss cache invalidation and eviction policies.

Cache Invalidation: Managing Stale Data

Cache invalidation is the process of removing or updating cached data to maintain data consistency.

Write Through Cache

A write-through cache writes data both to the cache and the underlying data source, ensuring data consistency.

Write Around Cache

A write-around cache writes data only to the underlying data source, skipping the cache. Useful for infrequently accessed data.

Write Back Cache

A write-back cache initially writes data to the cache and later synchronizes it with the underlying data source.
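
The three policies differ only in when the underlying store is written. A minimal sketch contrasting them in plain Java (the two maps stand in for a real cache and a real database):

import java.util.HashMap;
import java.util.Map;

public class WritePolicies {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> store = new HashMap<>(); // stand-in database
    private final Map<String, String> dirty = new HashMap<>(); // pending write-backs

    // Write-through: cache and store are updated together, so they never diverge
    public void writeThrough(String key, String value) {
        cache.put(key, value);
        store.put(key, value);
    }

    // Write-around: the store is updated and the cache is bypassed entirely
    public void writeAround(String key, String value) {
        store.put(key, value);
    }

    // Write-back: only the cache is updated now; the store is synced later
    public void writeBack(String key, String value) {
        cache.put(key, value);
        dirty.put(key, value);
    }

    // Flush pending write-backs to the store (e.g. on a timer or at eviction)
    public void flush() {
        store.putAll(dirty);
        dirty.clear();
    }
}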

Cache Eviction Strategies: Making Space for New Data

Cache eviction strategies decide which data to remove from the cache when it reaches its capacity.

Refresh-Ahead

Refresh-ahead involves pre-fetching data before it expires, reducing the chances of serving stale data.
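
A rough refresh-ahead sketch: a background task reloads each tracked key on a fixed schedule, so readers rarely see an expired entry (the interval and loader are illustrative):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class RefreshAheadCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Reload the key ahead of its expiry so reads rarely miss
    public void trackKey(String key, long refreshSeconds) {
        scheduler.scheduleAtFixedRate(
                () -> cache.put(key, loadFromSource(key)),
                0, refreshSeconds, TimeUnit.SECONDS);
    }

    public String get(String key) {
        return cache.get(key);
    }

    private String loadFromSource(String key) {
        return "fresh-value-for-" + key; // placeholder for the real fetch
    }
}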

Least Recently Used (LRU)

LRU removes the least recently accessed data from the cache when it reaches capacity.
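
In Java, a small LRU cache falls out of LinkedHashMap's access-order mode; a common sketch:

import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least- to most-recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called after every insertion; returning true evicts the eldest
    // (least recently accessed) entry once capacity is exceeded.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}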

Least Frequently Used (LFU)

LFU removes the least frequently accessed data from the cache.
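
A naive LFU sketch that keeps a hit counter per key and evicts the minimum; real implementations use more efficient structures, but the idea is the same:

import java.util.HashMap;
import java.util.Map;

public class NaiveLfuCache<K, V> {
    private final int capacity;
    private final Map<K, V> values = new HashMap<>();
    private final Map<K, Integer> counts = new HashMap<>();

    public NaiveLfuCache(int capacity) {
        this.capacity = capacity;
    }

    public V get(K key) {
        if (values.containsKey(key)) {
            counts.merge(key, 1, Integer::sum); // count every access
        }
        return values.get(key);
    }

    public void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= capacity) {
            // Evict the key with the fewest recorded accesses
            K coldest = counts.entrySet().stream()
                    .min(Map.Entry.comparingByValue())
                    .get().getKey();
            values.remove(coldest);
            counts.remove(coldest);
        }
        values.put(key, value);
        counts.merge(key, 1, Integer::sum);
    }
}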

Most Recently Used (MRU)

MRU evicts the most recently accessed data first. This is useful in workloads where the newest items are the least likely to be requested again, such as cyclic scans over a dataset.

Random Replacement

Random replacement evicts data randomly from the cache.

Conclusion

Caching enhances the performance and scalability of systems. By implementing the right caching strategy and eviction policy, you can significantly improve your application's responsiveness. Understanding caching is a fundamental skill for any technical professional.

In your system design journey, remember that caching is your ally in achieving fast, responsive, and reliable applications.


Sharding

Sharding is a database optimization technique for achieving horizontal scalability. Instead of increasing the size of one database, we split the existing database into smaller ones; this is the concept of sharding. Sharding can also be described as a specific type of partitioning; this discussion is about horizontal sharding (not vertical partitioning), since only horizontal sharding is applicable for scaling large databases. In short, horizontal sharding splits one large database into multiple smaller databases, so that instead of one big database we have smaller individual ones, each of which can have the same hardware setup, yielding more performance and more data storage.

Optimization techniques before Sharding:

  1. Scaling up the hardware - there is a limit to how far RAM/CPU/other hardware components can be increased, the cost grows steeply, and performance does not increase linearly with added components; a single system can only handle so much load.

  2. Adding replicas of the same database - say we have three instances of the same database: one master and two slaves. Every write operation is performed on the master, and all read operations are served by the slaves, so the load is balanced. But this brings a consistency problem: every time a write is performed on the master, it has to update all of its slaves, so there may be delays and the data may temporarily be inconsistent across the databases.

  3. Finally, SHARDING - splitting one full database into smaller databases using the most unique key possible. For example, suppose we have a customer table with attributes cust_id, cust_name, and cust_points. cust_id is the most unique key, so instead of keeping all the data in a single database, we split it: cust_id 1-3 on one database, 4-6 on a second instance, and 7-9 on a third. Now that the same table lives in different databases, it becomes mandatory to have a hashing function (which maps each customer request to the correct database) or a routing interface/layer that does the same. THIS IS CALLED AN INTERMEDIARY layer (it sits between the application layer and the database layer), and sharding forces us to introduce it, as sketched below.
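
A minimal sketch of such a routing layer in Java, assuming a simple modulo hash on cust_id and a hypothetical list of shard URLs (production systems often use consistent hashing instead, which the next section touches on):

import java.util.List;

// Hypothetical intermediary layer that maps a customer id to one of N shards.
public class ShardRouter {
    private final List<String> shardUrls; // e.g. one JDBC URL per shard

    public ShardRouter(List<String> shardUrls) {
        this.shardUrls = shardUrls;
    }

    // Simple modulo hash: a deterministic mapping from cust_id to a shard
    public String shardFor(long custId) {
        int index = (int) (custId % shardUrls.size());
        return shardUrls.get(index);
    }
}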

Pros:

  1. Horizontal scalability that is, in principle, unlimited - more shards can be added as the data grows.

  2. Availability & fault tolerance - even if one shard (say, the one holding cust_id 1-3) goes down, the other databases still exist, so the application keeps running.

Cons:

  1. Complexity - pattern mapping: deciding how data maps onto shards.

  2. An interaction layer / intermediary layer has to be built and maintained.

  3. Non-uniformity - say the data is sharded by customer, but one client is so big that it doesn't fit in a single shard; for that customer, resharding is required.

  4. Queries become complicated - say we need all customer data: the query has to go through every database and its tables, so the intermediary/routing layer needs to know where everything is located, and the query logic must (a) perform the query on each shard, (b) summarize, and (c) aggregate the results before sending them back, as sketched below.
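
A rough sketch of such a scatter-gather query, reusing the hypothetical ShardRouter idea from above (queryShard is an illustrative placeholder for a real JDBC call):

import java.util.ArrayList;
import java.util.List;

public class ScatterGather {

    // Fan the same query out to every shard, then aggregate the results
    public static List<String> queryAllShards(List<String> shardUrls, String sql) {
        List<String> combined = new ArrayList<>();
        for (String shard : shardUrls) {
            combined.addAll(queryShard(shard, sql)); // (a) run the query per shard
        }
        // (b) summarize and (c) aggregate would happen here before returning
        return combined;
    }

    // Hypothetical helper: runs sql against one shard and returns rows
    private static List<String> queryShard(String shardUrl, String sql) {
        return new ArrayList<>(); // placeholder; a real version would use JDBC
    }
}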


Consistent hashing