Three Problems with Caching Systems

As Internet systems have matured and their QPS has grown, most systems now add a caching layer to keep excessive requests from hitting the database directly and turning it into a bottleneck. Caching greatly improves both user experience and system stability.

While caching brings performance improvements, it also has some problems.

Cache avalanche

Definition

Cache avalanche refers to the situation where the cache server restarts, or a large number of cached keys expire within a short window, so that a flood of requests hits the backend database at once, sharply increasing its load and potentially bringing it down.

Solution

Thread mutual exclusion

Only one thread is allowed to rebuild the cache; the other threads wait for that thread to finish and then read the data from the cache. Since only one request executes at a time, this relieves the pressure on the database, but the drawback is equally obvious: it reduces the system's QPS.
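A minimal sketch of this in Java, using an in-memory map as a stand-in for the cache and a caller-supplied loader in place of the real database query:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class MutexCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Object rebuildLock = new Object();

    // loadFromDb is a stand-in for the real database query.
    public String get(String key, Function<String, String> loadFromDb) {
        String value = cache.get(key);
        if (value != null) {
            return value;                      // cache hit: fast path
        }
        // Only one thread rebuilds; the others block here, then
        // re-check the cache instead of hitting the database again.
        synchronized (rebuildLock) {
            value = cache.get(key);            // double-check after acquiring the lock
            if (value == null) {
                value = loadFromDb.apply(key); // single database round trip
                cache.put(key, value);
            }
            return value;
        }
    }
}
```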

Staggered failure time

This method is simple and crude: since expiring many keys at the same moment causes an avalanche of requests, staggering the expiration times largely avoids the problem. When setting a cache entry's expiration time, pick a random value from an appropriate range.
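A sketch of the jittered expiration time; the base TTL and jitter range are arbitrary, and the cache client call at the end is hypothetical:

```java
import java.util.concurrent.ThreadLocalRandom;

public class JitteredTtl {
    // Base TTL plus up to maxJitterSeconds of random jitter, so keys
    // written together do not all expire at the same instant.
    static long jitteredTtlSeconds(long baseSeconds, long maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        long ttl = jitteredTtlSeconds(600, 300); // 10 minutes + up to 5 minutes
        System.out.println("expire key after " + ttl + " seconds");
        // cache.setWithTtl("user:42", json, ttl);  // hypothetical cache client
    }
}
```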

Downgrade

For the service as a whole, when the load exceeds the overall capacity, delay or suspend non-essential services, such as log collection, to keep the important services running normally.

Circuit breaking

For a single service, when it is overloaded, a protective measure is adopted to keep the failure from taking down the entire system: either shut the service down outright (the blunt approach) or let part of the requests succeed while the rest fail fast without consuming service resources. For example, if 20 consecutive requests fail within 5 seconds, trigger the circuit breaker and filter out 60% of incoming requests.
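A rough sketch of the breaker just described: 20 consecutive failures within 5 seconds open the circuit, after which 60% of requests are rejected. The thresholds come from the example above; the single-window bookkeeping is a simplification:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

public class SimpleCircuitBreaker {
    private static final int FAILURE_THRESHOLD = 20;   // consecutive failures
    private static final long WINDOW_MILLIS = 5_000;   // within 5 seconds
    private static final double REJECT_RATIO = 0.6;    // filter 60% of requests

    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private volatile long firstFailureAt = 0;
    private volatile boolean open = false;

    public boolean allowRequest() {
        // When open, probabilistically reject 60% of requests and let the
        // remaining 40% through to probe whether the service has recovered.
        return !(open && ThreadLocalRandom.current().nextDouble() < REJECT_RATIO);
    }

    public void recordSuccess() {
        consecutiveFailures.set(0);
        open = false;                          // close the breaker on success
    }

    public void recordFailure() {
        long now = System.currentTimeMillis();
        if (consecutiveFailures.get() == 0) {
            firstFailureAt = now;              // start of a failure streak
        }
        int failures = consecutiveFailures.incrementAndGet();
        if (failures >= FAILURE_THRESHOLD && now - firstFailureAt <= WINDOW_MILLIS) {
            open = true;                       // trip the breaker
        }
    }
}
```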

Rate limiting

Rate limiting can be considered a kind of service degradation: it protects the system by capping the volume of requests allowed in.

Common rate-limiting algorithms are the counter, the leaky bucket, and the token bucket.

**Counter:**

A typical application scenario stipulates that only a fixed number of requests may pass within a fixed time window, for example at most 100 requests per minute.

  • **Basic idea**: Keep a counter. When a request arrives, first determine whether a new time window has begun. If it has, reset the counter to 0 and count the current request. If it has not, check whether the counter already exceeds the limit, say 100: if so, discard the request; otherwise increment the counter and let the request pass. (See the sketch after this list.)

  • **Problem**: Accuracy is poor because requests are not evenly distributed, and trouble arises at window boundaries. For example, with a limit of 100 requests per minute, all 100 may arrive in the first second, stressing the system while it sits idle for the remaining 59 seconds; worse, 100 requests in the last second of one window plus 100 in the first second of the next means 200 requests in about two seconds.
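A minimal sketch of the fixed-window counter in Java; the one-minute window and 100-request limit come from the example above, and everything else is illustrative:

```java
public class FixedWindowCounter {
    private static final int LIMIT = 100;             // max requests per window
    private static final long WINDOW_MILLIS = 60_000; // one-minute window

    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    // Returns true if the request is allowed, false if it should be dropped.
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= WINDOW_MILLIS) {
            windowStart = now;   // new time window: reset the counter
            count = 0;
        }
        if (count >= LIMIT) {
            return false;        // over the limit: discard the request
        }
        count++;
        return true;
    }
}
```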

**Leaky bucket algorithm:**

  • A leaky bucket of fixed capacity leaks droplets at a constant, fixed rate;
  • if the bucket is empty, no droplets flow out;
  • droplets may flow into the bucket at any rate;
  • if the incoming droplets exceed the bucket's capacity, they overflow (are discarded) while the bucket's capacity stays unchanged (see the sketch below).
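A sketch of the leaky bucket: track the water level, drain it at the constant rate, and reject droplets that would overflow. The capacity and leak rate here are arbitrary:

```java
public class LeakyBucket {
    private static final double CAPACITY = 100;        // bucket size
    private static final double LEAK_PER_MILLI = 0.01; // constant outflow rate

    private double water = 0;
    private long lastLeakAt = System.currentTimeMillis();

    // Returns true if the incoming "droplet" fits, false if it overflows.
    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Drain whatever leaked out since the last call.
        water = Math.max(0, water - (now - lastLeakAt) * LEAK_PER_MILLI);
        lastLeakAt = now;
        if (water + 1 > CAPACITY) {
            return false;   // bucket full: the droplet overflows and is discarded
        }
        water += 1;
        return true;
    }
}
```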

**Token bucket algorithm:**

A bucket holds a fixed number of tokens, and tokens are added to it at a fixed rate.

  • Tokens are placed in the bucket at a fixed rate, for example 10 per second.
  • The bucket stores at most b tokens; when it is full, newly added tokens are discarded or rejected.
  • When a data packet of n bytes arrives, n tokens are removed from the bucket and the packet is sent to the network.
  • If fewer than n tokens remain in the bucket, no tokens are removed and the packet is rate limited (either discarded or buffered to wait), as sketched below.
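A sketch of the token bucket along the same lines; the capacity and the refill rate (10 tokens per second, as in the example) are otherwise arbitrary:

```java
public class TokenBucket {
    private static final double MAX_TOKENS = 100;        // bucket capacity b
    private static final double TOKENS_PER_MILLI = 0.01; // refill rate: 10 per second

    private double tokens = MAX_TOKENS;
    private long lastRefillAt = System.currentTimeMillis();

    // Take n tokens for a packet of n bytes; false means the packet is limited.
    public synchronized boolean tryAcquire(int n) {
        long now = System.currentTimeMillis();
        // Add tokens accumulated since the last call, capped at the bucket size.
        tokens = Math.min(MAX_TOKENS, tokens + (now - lastRefillAt) * TOKENS_PER_MILLI);
        lastRefillAt = now;
        if (tokens < n) {
            return false;    // not enough tokens: discard or buffer the packet
        }
        tokens -= n;
        return true;
    }
}
```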

Cache penetration

Definition

Cache penetration refers to a large volume of highly concurrent queries for keys that do not exist. Every such request misses the cache and penetrates through to the backend database, putting excessive pressure on it and potentially bringing the database service down.

Solution

The usual approach is to cache the null value as well. When the same query arrives again and hits the cached empty value, it returns immediately instead of passing through to the database, avoiding the penetration. Of course, an attacker may guess that we use this scheme and query with different parameters each time, so we also need to validate the input. For example, if we query by ID, we can check the ID's format and reject anything that does not match the generation rules, or embed time information in the ID and use it to judge whether the ID is legal, that is, one we actually generated. This intercepts a good share of invalid requests.
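A minimal sketch of null-value caching, using an in-memory map as a stand-in for the real cache and a sentinel value to mark "known missing" (a real cache would give the sentinel a short TTL):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NullValueCache {
    private static final String NULL_SENTINEL = "__NULL__"; // marks "known missing"
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            // Hit: return null for the sentinel without touching the database.
            return NULL_SENTINEL.equals(cached) ? null : cached;
        }
        String value = queryDb(key);
        // Cache the miss too, so repeated queries for the same
        // nonexistent key never reach the database.
        cache.put(key, value == null ? NULL_SENTINEL : value);
        return value;
    }

    private String queryDb(String key) {
        return null; // stand-in for the real database lookup
    }
}
```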

Of course, a more common solution is to use a Bloom filter.

Bloom filter

Principle

When an element is added to the set, k hash functions map it to k positions in a bit array and set those bits to 1. When querying, we just check whether all those positions are 1: if any of them is 0, the element is definitely not in the set; if all are 1, the element is probably in the set. That is the basic idea of the Bloom filter.

Suppose we have a Bloom filter with size = 10000, which really just gives us 10000 binary bits, and k = 8. When the new element 'huyanshi' is added, 8 different hash functions produce 8 index values, say 1, 2300, 222, 11, …, and we set the bits at those 8 indices to 1.

When we want to check whether 'huyanshi' exists, we apply the same 8 hash functions to obtain 8 indices. If the bits at all 8 indices are 1, the element very likely exists; if any of them is 0, the element definitely does not exist.

The principle of the Bloom filter is fairly simple; the main difficulty is designing an array of reasonable length and choosing multiple hash functions that spread the data across the whole array.
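Putting the idea together, here is a minimal sketch using a Java `BitSet` with size = 10000 and k = 8, as in the example above. Deriving the k indices from two base hashes (double hashing) is a shortcut of this sketch, not something the scheme prescribes:

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;   // m: number of bits
    private final int k;      // number of hash functions

    public SimpleBloomFilter(int size, int k) {
        this.bits = new BitSet(size);
        this.size = size;
        this.k = k;
    }

    // Derive the i-th index from two base hashes (double hashing).
    private int index(String element, int i) {
        int h1 = element.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String element) {
        for (int i = 0; i < k; i++) {
            bits.set(index(element, i));  // set all k positions to 1
        }
    }

    public boolean mightContain(String element) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(element, i))) {
                return false;  // any 0 bit: the element is definitely absent
            }
        }
        return true;           // all 1s: probably present (false positives possible)
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(10_000, 8);
        filter.add("huyanshi");
        System.out.println(filter.mightContain("huyanshi")); // true
        System.out.println(filter.mightContain("nobody"));   // almost certainly false
    }
}
```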

How to choose the number of hash functions and the Bloom filter length

Obviously, if the Bloom filter is too small, all of its bits will soon be 1, and every query will return "may exist", defeating the purpose of filtering. The filter's length directly affects the false positive rate: the longer the filter, the lower the false positive rate.

The number of hash functions also needs to be balanced: the more there are, the faster the filter's bits fill with 1s and the lower its efficiency; but with too few, the false positive rate rises.

How to choose values of k and m suitable for the business? The standard formulas are:

$$m = -\frac{n \ln p}{(\ln 2)^2}$$

$$k = \frac{m}{n} \ln 2$$

where k is the number of hash functions, m is the length of the Bloom filter in bits, n is the number of elements to insert, and p is the target false positive rate.
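As a worked example, assuming one million insertions and a 1% target false positive rate, the formulas give m ≈ 9.59 million bits (about 1.2 MB) and k ≈ 7:

```java
public class BloomSizing {
    public static void main(String[] args) {
        long n = 1_000_000;   // expected insertions
        double p = 0.01;      // target false positive rate

        // m = -(n * ln p) / (ln 2)^2
        long m = (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
        // k = (m / n) * ln 2
        long k = Math.round((double) m / n * Math.log(2));

        System.out.println("bits m = " + m);   // ~9585059 bits, about 1.2 MB
        System.out.println("hashes k = " + k); // 7
    }
}
```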

Cache breakdown

Definition

Cache breakdown refers to a large number of highly concurrent queries for a single hot key at the moment its cache entry expires: every request misses the cache and breaks through to the backend database, putting excessive pressure on it and potentially bringing the database service down. **The difference from cache avalanche is that breakdown involves the cache entry for one particular key, while an avalanche involves many keys.**

Solution

Level 2 cache

Use a second-level cache for hotspot data, and set different expiration times for the different cache levels, so that requests do not penetrate straight through the cache layer to the database.
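A sketch of the two-level lookup, with an in-process map as the first level and a hypothetical `RemoteCache` interface standing in for Redis or similar as the second. The point is simply that the two layers get different expiration times, so they never expire together:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TwoLevelCache {
    // Hypothetical remote cache (e.g. Redis); TTL handling lives server-side.
    interface RemoteCache {
        String get(String key);
        void set(String key, String value, long ttlSeconds);
    }

    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    private final RemoteCache remoteCache;

    TwoLevelCache(RemoteCache remoteCache) {
        this.remoteCache = remoteCache;
    }

    public String get(String key) {
        String value = localCache.get(key);   // level 1: in-process, short TTL
        if (value != null) {
            return value;
        }
        value = remoteCache.get(key);         // level 2: remote, longer TTL
        if (value == null) {
            value = queryDb(key);             // last resort: the database
            remoteCache.set(key, value, 600); // e.g. 10 minutes in level 2
        }
        localCache.put(key, value);           // refill level 1 (a real impl would expire it)
        return value;
    }

    private String queryDb(String key) {
        return "value-for-" + key;            // stand-in for the real query
    }
}
```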

There are already relatively mature third-party libraries implementing these solutions that you can use directly.
