hbase-issues mailing list archives

From "Varun Sharma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8370) Report data block cache hit rates apart from aggregate cache hit rates
Date Thu, 27 Jun 2013 00:00:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694361#comment-13694361 ]

Varun Sharma commented on HBASE-8370:

Having a data block cache hit ratio of 80% means that at least 80% of my requests are fast
(assuming GC is out of the picture). In the current scenario, that may map to an overall number
like 99.9%, and if tomorrow I had 0% cache hits for data blocks, the overall number would only
come down to 99.5%. I am able to calculate this from the numbers I pasted above. The calculation
assumes a certain distribution between the number of accesses to Index blocks and Data blocks.
If that distribution changes tomorrow, it may well be that a 99.5% overall cache hit ratio
corresponds to a 90% hit rate on data blocks. So, I don't think that "Overall cache hit ratio"
is a good proxy for "Data block cache hit ratio".
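To make the arithmetic concrete, here is a minimal sketch of the weighted average behind these percentages. The access mix is an assumption chosen only so that the results match the figures quoted above (the actual numbers are not repeated in this comment), and it further assumes that Index and Bloom blocks always hit the cache. The function and parameter names are hypothetical.

```python
def overall_hit_ratio(data_share, data_hit_ratio, other_hit_ratio=1.0):
    """Overall hit ratio as an access-weighted average of per-type hit ratios.

    data_share      -- fraction of block cache accesses that are Data blocks (assumed)
    data_hit_ratio  -- hit ratio for Data blocks only
    other_hit_ratio -- hit ratio for Index/Bloom blocks (assumed to always hit)
    """
    return (1.0 - data_share) * other_hit_ratio + data_share * data_hit_ratio


# Assumed mix 1: Data blocks are 0.5% of all cache accesses.
print(round(overall_hit_ratio(0.005, 0.80), 4))  # 80% data hit rate
print(round(overall_hit_ratio(0.005, 0.00), 4))  # 0% data hit rate

# Assumed mix 2: Data blocks are 5% of all cache accesses.
print(round(overall_hit_ratio(0.05, 0.90), 4))   # 90% data hit rate
```

Under these assumptions, the same overall figure of 99.5% comes from a 0% data block hit rate with the first mix and a 90% data block hit rate with the second, which is why the aggregate alone cannot be inverted without knowing the access mix.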

As far as derivatives go, the miss count derivative can go up along with other things like the
read request count, so we would also need to take a derivative of that counter and compare the
two. On 0.94, that number has been overflowing for us all the time and is negative. Is that
being fixed in trunk?
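As an illustration of the comparison described above, here is a rough sketch of turning two samples of the miss counter and the read request counter into a miss fraction, while tolerating a counter that has wrapped negative. It assumes the counters behave like signed 32-bit integers that wrap at most once between samples; that is my assumption for the sketch, not something confirmed for the 0.94 metrics, and the function names are hypothetical.

```python
def wrapped_delta(prev, curr, bits=32):
    """Increase between two samples of a counter that wraps at 2**bits.

    Works for signed samples too, as long as the counter wrapped at most
    once between prev and curr (an assumption of this sketch).
    """
    return (curr - prev) % (1 << bits)


def miss_fraction(prev_miss, curr_miss, prev_req, curr_req, bits=32):
    """Misses per read request over the sampling interval, or None if idle."""
    d_req = wrapped_delta(prev_req, curr_req, bits)
    if d_req == 0:
        return None
    return wrapped_delta(prev_miss, curr_miss, bits) / d_req


# A counter that stepped from the largest positive value to the most negative one.
print(wrapped_delta(2147483647, -2147483648))
# Miss count derivative normalized by the read request derivative.
print(miss_fraction(100, 150, 1000, 1500))
```

Normalizing by the request delta keeps a rise in traffic from looking like a rise in misses, but it still says nothing about which block type missed.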

I don't think this is about counters vs gauges. I am fine with exposing counters per block
type. Right now, I just don't have any insight into the block cache, which plays an important
role in serving reads. When a compaction happens and new files are written, I don't know the
number of cache misses for Index blocks vs Data blocks vs Bloom blocks. I also would not know
how many Data blocks are being accessed, how many Index blocks, etc.
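What I have in mind with per-block-type counters could look roughly like the sketch below. It is not the HBase metrics API; the class name, method names, and block type labels are hypothetical and only illustrate keeping hit and miss counters keyed by block type so that both the aggregate and the per-type ratios can be derived.

```python
from collections import Counter


class BlockCacheStats:
    """Hit and miss counters keyed by block type (illustrative only)."""

    def __init__(self):
        self.hits = Counter()
        self.misses = Counter()

    def record(self, block_type, hit):
        """Count one cache access for the given block type."""
        if hit:
            self.hits[block_type] += 1
        else:
            self.misses[block_type] += 1

    def hit_ratio(self, block_type=None):
        """Hit ratio for one block type, or the aggregate when block_type is None."""
        if block_type is None:
            hits = sum(self.hits.values())
            misses = sum(self.misses.values())
        else:
            hits = self.hits[block_type]
            misses = self.misses[block_type]
        total = hits + misses
        return hits / total if total else None


stats = BlockCacheStats()
# One lookup per access: index and bloom blocks hit, the data block misses.
for block_type, hit in [("INDEX", True), ("INDEX", True), ("INDEX", True),
                        ("BLOOM", True), ("DATA", False)]:
    stats.record(block_type, hit)

print(stats.hit_ratio())        # aggregate over all block types
print(stats.hit_ratio("DATA"))  # data blocks only
```

With these five accesses the aggregate looks healthy while the data block ratio is zero, which is the gap I would like the metrics to expose.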

> Report data block cache hit rates apart from aggregate cache hit rates
> ----------------------------------------------------------------------
>                 Key: HBASE-8370
>                 URL: https://issues.apache.org/jira/browse/HBASE-8370
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Varun Sharma
>            Assignee: Varun Sharma
>            Priority: Minor
> Attaching from mail to dev@hbase.apache.org
> I am wondering whether the HBase cachingHitRatio metrics that the region server UI shows
can give me a breakdown by data blocks. I always see this number to be very high, and that
could be exaggerated by the fact that each lookup hits the index blocks and bloom filter blocks
in the block cache before retrieving the data block. This could be artificially bloating
the cache hit ratio.
> Assuming the above is correct, do we already have a cache hit ratio for data blocks alone
that is more obscure? If not, my sense is that it would be pretty valuable to add one.
