TTL in Redis

Redis is the most enjoyable and powerful technology I've used, and its TTL feature has been incredibly useful in many situations. Curious about how it works, I decided to learn more and share what I discovered.

What is TTL in Redis?

TTL stands for "time to live" in Redis. It's a mechanism used to set an expiration time for keys in Redis. Once the TTL expires, the key is automatically deleted from the database.

When to use TTL?

It's commonly used for managing resources with time constraints. I have mainly used it for 2 use cases:

For caching the results of an expensive database query that doesn’t change for a certain amount of time.
Blocking certain troublesome user IDs for a certain period.

Where is the TTL information stored?

The TTL information is not stored alongside the keys in the main hash table, but instead, only the info of volatile keys is stored in a much smaller secondary hash table.

How does Redis expire keys pre-v6.0?

Passive eviction: When a key is requested, it is checked if it has expired or not. If yes then return nil else return the key.
Active eviction: Every x seconds, Redis runs a process that will randomly select a sample of keys from the secondary hashtable (because we need to run this process only on the volatile keys) to check for expiration and delete them. If the number of keys expired > 25% of the sample set then run the function again. (Note: Even though Redis is a single-threaded application, the eviction process runs in a separate thread from the read/write thread). We can reduce this threshold to less than 25% but then more CPU cycles will be uitilised.

How does Redis expire keys post-v6.0?

The problem with the original approach, in the worst case 25% of memory is wasted on expired keys. If we could store keys that are currently not expired but are about to expire very shortly efficiently then this would solve the problem.

So Redis team came up with a brilliant approach to do this. While running active eviction, if a key is expired then it is immediately deleted. But if it is about to expire then it is stored in a radix tree sorted by their expiry timestamp. In the next cycle instead of first running the active eviction on the secondary hash table, it runs it first on the radix tree and then proceeds to the secondary hash table.

But why use the Radix tree to store this data? In a radix tree, each node represents a common prefix of some keys or part of the timestamps, and edges connect nodes with their sub-prefixes. For keys with expiration times, the tree would organize these keys in such a way that keys expiring sooner are found more quickly than those expiring later, facilitating an efficient expiration process.