hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ben Manes (JIRA)" <>
Subject [jira] [Created] (HIVE-14983) Evaluate TinyLFU cache
Date Mon, 17 Oct 2016 05:33:58 GMT
Ben Manes created HIVE-14983:

             Summary: Evaluate TinyLFU cache
                 Key: HIVE-14983
             Project: Hive
          Issue Type: Improvement
          Components: llap
            Reporter: Ben Manes

The ORC low-level cache is either FIFO or LRFU, with the latter being the default. Least-Recently-Freq-Used
is an O(lg n) policy that tries to recency with frequency for the working set. It uses a heap
for frequency ordering and a linked list for recency ordering.

[TinyLFU|] is an O(1) policy that uses a sketch to probabilistically
estimate an entry's frequency. Instead of focusing on eviction, the policy filters out low-value
entries from entering the cache. It retains a larger history than the working set by retaining
the frequency in a compact counter array (e.g. [4-bit CountMin Sketch|]).
Simulations show that a small LRU window + TinyLFU + SLRU main cache has the [best performance|].

If the project is ready to adopt Java 8 then it could use [Caffeine|],
the successor to Guava's cache. It provides improved [concurrency|]
in addition to the higher hit rate.

But I think extracting the project's CountMin Sketch for a custom implementation would be
a win. The result would be simpler code and an improved hit rates.

This message was sent by Atlassian JIRA

View raw message