accumulo-notifications mailing list archives

From "Adam Fuchs (JIRA)" <>
Subject [jira] [Created] (ACCUMULO-4626) improve cache hit rate via weak reference map
Date Wed, 19 Apr 2017 18:24:41 GMT
Adam Fuchs created ACCUMULO-4626:

             Summary: improve cache hit rate via weak reference map
                 Key: ACCUMULO-4626
             Project: Accumulo
          Issue Type: Improvement
          Components: tserver
            Reporter: Adam Fuchs

When a single iterator tree references the same RFile blocks in different branches, one
iterator can get a cache miss even though the requested block is still held in memory by
another iterator. This matters most when using something like the IntersectingIterator
to intersect many deep copies. Instead of discarding evicted blocks completely, moving them
into a map with WeakReference values can avoid re-reading blocks that are still referenced
by another deep-copied source iterator.
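
A minimal sketch of the idea, not the actual tserver block cache: a small LRU map holds blocks strongly, and an evicted block is moved into a second map that holds it only through a WeakReference. The class name WeakFallbackCache and its methods are hypothetical, chosen for illustration. A lookup that misses the strong map can still hit as long as some other holder (for example another deep-copied iterator) keeps the block reachable.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: an LRU cache whose evicted values remain reachable through weak references. */
class WeakFallbackCache<K, V> {
  // Strongly held entries, kept in access order so the eldest entry is the least recently used.
  private final Map<K, V> strong;
  // Evicted entries; the garbage collector may clear a value once nothing else references it.
  private final Map<K, WeakReference<V>> evicted = new HashMap<>();

  WeakFallbackCache(final int maxEntries) {
    this.strong = new LinkedHashMap<K, V>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxEntries) {
          // Instead of dropping the block, demote it to the weak map.
          evicted.put(eldest.getKey(), new WeakReference<>(eldest.getValue()));
          return true;
        }
        return false;
      }
    };
  }

  synchronized void put(K key, V value) {
    evicted.remove(key);
    strong.put(key, value);
  }

  /** Returns the cached value, or null if it is in neither map or has been collected. */
  synchronized V get(K key) {
    V value = strong.get(key);
    if (value != null) {
      return value;
    }
    WeakReference<V> ref = evicted.remove(key);
    if (ref == null) {
      return null; // never cached, or already purged
    }
    value = ref.get();
    if (value != null) {
      // Another holder kept the block alive: promote it back instead of re-reading it.
      strong.put(key, value);
    }
    return value;
  }
}
```

This sketch only drops a cleared reference when its key is looked up again; a real implementation would also need to purge cleared entries, for example with a ReferenceQueue.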

We've seen this in the field for some of Sqrrl's queries against very large tablets. When
cache miss rates are high, the total memory usage for these queries can reach the size of
each reader's block reads times the number of readahead threads times the number of files
times the number of IntersectingIterator children. This might work out to something like:

16 readahead threads * 200 deep copied children * 99% cache miss rate * 20 files * 252KB per
reader = ~16GB of memory
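
The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the class and method names are hypothetical, and the factors are the illustrative numbers from this issue, not measurements.

```java
/** Hypothetical helper that multiplies out the factors named in the issue. */
class MemoryEstimate {
  /** Worst-case block memory in KB held by all readers that missed the cache. */
  static double estimateKb(int readaheadThreads, int deepCopies, double missRate,
      int files, double kbPerReader) {
    return readaheadThreads * deepCopies * missRate * files * kbPerReader;
  }

  public static void main(String[] args) {
    double kb = estimateKb(16, 200, 0.99, 20, 252);
    // Convert KB to GB using decimal units (1 GB = 1e6 KB).
    System.out.printf("%.0f KB (about %.1f GB)%n", kb, kb / 1e6);
  }
}
```

With these inputs, 16 * 200 * 20 gives 64,000 readers, 99% of which is 63,360, and 63,360 * 252KB is 15,966,720KB, which matches the ~16GB figure.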

In most cases, evicting to a weak reference value map changes the cache miss rate from very
high to very low and has a dramatic effect on total memory usage.

