Building my own Rate Limiter

September 10, 2024

A rate limiter is a very useful component when building applications: it throttles unwanted or excess requests that the server receives. Common rate limiting algorithms are Token Bucket, Leaky Bucket, Fixed Window Counter, Sliding Window Log, and Sliding Window Counter. Which one to use depends on the use case and requirements.

So here are a few of the algorithms that I implemented to understand how they work. I performance tested each of them in Postman with virtual users and studied the resulting patterns in success rates.

Token Bucket algorithm

It seemed the simplest of all the algorithms. A bucket of size N is used to store tokens, and the bucket is refilled with tokens at a rate of R per time period. When a request arrives, a token is removed; if there are no tokens left, the request is rejected. Usually a bucket is assigned per user or IP address.
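Here is a minimal sketch of the idea in Python (not my exact implementation); the class and parameter names are my own, with capacity and refill_rate standing in for N and R, and one bucket kept per client.

```python
import time

class TokenBucket:
    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity          # N: maximum tokens the bucket can hold
        self.refill_rate = refill_rate    # R: tokens added per second
        self.tokens = float(capacity)     # start with a full bucket
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        # Refill based on how much time has passed since the last check.
        now = time.monotonic()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        # Accept the request only if a token is available.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per client (user ID or IP address).
buckets: dict[str, TokenBucket] = {}

def handle_request(client_id: str) -> bool:
    bucket = buckets.setdefault(client_id, TokenBucket(capacity=10, refill_rate=1))
    return bucket.allow()
```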

I tested with a fixed 10 virtual users for over a minute, with a bucket capacity of 10 tokens and a refill rate of 1 token per second.

Token bucket performance chart

As you can see, there are initially no errors. The bucket had a capacity of 10 tokens, which is exhausted within a second; after that, roughly 90 percent of requests fail, since 10 users are hitting the API but only 1 token is added per second.

Fixed Window Counter algorithm

Time is divided into fixed windows of interval N, and requests are counted per window. Each request increments the counter; if the counter exceeds the threshold, the request is rejected. The counter resets at the start of a new window.
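A rough sketch of this in Python, under the assumption that windows are aligned to the wall clock (the window index is just the timestamp divided by the window size):

```python
import time

class FixedWindowCounter:
    def __init__(self, window_size: float, threshold: int):
        self.window_size = window_size   # window length in seconds
        self.threshold = threshold       # max requests allowed per window
        self.current_window = None       # index of the window we are counting in
        self.count = 0

    def allow(self) -> bool:
        # Fixed windows: integer-divide the clock by the window size.
        window = int(time.time() // self.window_size)
        if window != self.current_window:
            self.current_window = window  # a new window has started,
            self.count = 0                # so the counter resets
        if self.count >= self.threshold:
            return False                  # threshold reached, reject
        self.count += 1
        return True
```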

I tested this with 10 virtual users, a 60-second window, and a 60-request threshold.

Fixed window counter performance chart

It clearly shows how the success rate is high at the start of each window, followed by a burst of rejections. At the start of the window the counter is zero, so all requests are allowed; but as soon as the counter exceeds the threshold, requests are rejected until the window resets.

Sliding Window Log algorithm

A log of each request's timestamp is stored. Whenever a new request comes in, logs older than the window size are removed. If the number of remaining requests is within the threshold, the request is accepted and its timestamp is added to the log; otherwise it is rejected. Note that the window is not fixed here, since each incoming request is compared against the requests that fall within the window size before it.
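A minimal sketch of this in Python, again with names of my own choosing; a deque keeps the accepted timestamps ordered so expired entries can be dropped from the front:

```python
import time
from collections import deque

class SlidingWindowLog:
    def __init__(self, window_size: float, threshold: int):
        self.window_size = window_size    # sliding window length in seconds
        self.threshold = threshold        # max requests allowed inside the window
        self.log: deque[float] = deque()  # timestamps of accepted requests

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have slid out of the window.
        while self.log and now - self.log[0] >= self.window_size:
            self.log.popleft()
        # Accept only if the remaining log is still under the threshold.
        if len(self.log) < self.threshold:
            self.log.append(now)
            return True
        return False
```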

Sliding Window Log performance chart

I tested this as well with 10 virtual users, a 60-second window, and a 60-request threshold. One observation is that it avoids the sudden spike seen with the fixed window counter and exhibits a smoother distribution of rejected requests.

Sliding Window Counter algorithm

It is a combination of the fixed window counter and sliding window log algorithms. Here, we store the number of requests made in the previous and current windows. Whenever a request arrives, we estimate how many requests fall within the sliding window ending at the current timestamp: the previous window's count is weighted by how much of it the sliding window still covers, and the current window's count is added. If this weighted number of requests is within the threshold, the request is accepted and the count is increased; otherwise, it is rejected.
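A sketch of one common way to do this in Python (the rollover handling and names are my own assumptions, not necessarily how my tested version looked):

```python
import time

class SlidingWindowCounter:
    def __init__(self, window_size: float, threshold: int):
        self.window_size = window_size   # window length in seconds
        self.threshold = threshold       # max weighted requests per window
        self.current_window = None       # index of the current fixed window
        self.current_count = 0
        self.previous_count = 0

    def allow(self) -> bool:
        now = time.time()
        window = int(now // self.window_size)
        if window != self.current_window:
            if self.current_window is not None and window - self.current_window == 1:
                self.previous_count = self.current_count  # last window becomes "previous"
            else:
                self.previous_count = 0  # a gap of more than one window: nothing to carry over
            self.current_count = 0
            self.current_window = window
        # Fraction of the current window that has elapsed; the sliding window
        # still covers (1 - elapsed) of the previous window.
        elapsed = (now % self.window_size) / self.window_size
        weighted = self.previous_count * (1 - elapsed) + self.current_count
        if weighted < self.threshold:
            self.current_count += 1
            return True
        return False
```

For example, 30 seconds into a 60-second window with 40 requests in the previous window and 20 in the current one, the weighted count is 40 * 0.5 + 20 = 40.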

Sliding Window Counter performance chart

For this too, I tested with 10 virtual users, a 60-second window, and a 60-request threshold. The graph shows a smooth distribution of error and success rates, and it also avoids the spike in requests observed with the fixed window counter.

Conclusion

These implementations are just to understand how things work. The optimal window size or threshold value depends heavily on the individual system and its behaviour; the values above were only for demonstration.