Implementation Details - How we index SST - 《RocksDB v6.14 Document》

On level 0, files are sorted based on the time they are flushed. Their key range (as defined by FileMetaData.smallest and FileMetaData.largest) are mostly overlapped with each other. So it needs to look up every L0 file.

One observation to this problem is that: after the LSM tree is built, an SST file’s position in its level is fixed. Furthermore, its order relative to files from the next level is also fixed. Based on this idea, we can perform kind of optimization to narrow down the binary search range. Here is an example:

Let’s look at another example. We want to get key 230. A binary search on level 1 locates to file 2 (this also implies key 230 is larger than file 1’s FileMetaData.largest 200). A comparison with file 2’s range shows the target key is smaller than file 2’s FileMetaData.smallest 300. Even though we couldn’t find key on level 1, we have derived hints that target key is in range between 200 and 300. Any files on level 2 that cannot overlap with [200, 300] can be safely excluded. As a result, we only need to look at file 5 and file 6 on level 2.

Our benchmark shows that this optimization improves lookup QPS by ~5% for the setup mentioned here.