Date: Wed, 26 Jun 2013 23:18:20 +0000 (UTC)
From: "Elliott Clark (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-8370) Report data block cache hit rates apart from aggregate cache hit rates

    [ https://issues.apache.org/jira/browse/HBASE-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694343#comment-13694343 ]

Elliott Clark commented on HBASE-8370:
--------------------------------------

bq. this number is always 99 % for us on all clusters

That's why I said we need more decimal places for it.

bq. Also, the different b/w 82 % cache hit ratio to 99 % cache hit ratio is enormous.

But that 82% doesn't tell you anything all by itself. For a given workload, is 80% good or bad? You don't know.
That percentage is really only useful if you have a baseline, so it's equally informative if the cache hit percentage falls from 99 to 98 or from 84 to 83.

Additionally, gauges are bad. They just don't tell a great story. There's a lot of lossy data there, and sampling times can skew your picture of what's actually happening. See [~phobos182]'s slides (https://speakerdeck.com/phobos182/metrics-at-pinterest) on why you should prefer counters over gauges.

That's why I said that the derivative of the cache miss count is the best way to look at cache efficacy. It gives you an accurate count of the number of times you have to go to HDFS (not really disk, since there can be OS cache there). It also provides a good way to compare today to yesterday.

> Report data block cache hit rates apart from aggregate cache hit rates
> ----------------------------------------------------------------------
>
>                 Key: HBASE-8370
>                 URL: https://issues.apache.org/jira/browse/HBASE-8370
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Varun Sharma
>            Assignee: Varun Sharma
>            Priority: Minor
>
> Attaching from mail to dev@hbase.apache.org
> I am wondering whether the HBase cachingHitRatio metrics that the region server UI shows can give me a breakdown by data blocks. I always see this number to be very high, and that could be exaggerated by the fact that each lookup hits the index blocks and bloom filter blocks in the block cache before retrieving the data block. This could be artificially bloating the cache hit ratio.
> Assuming the above is correct, do we already have a cache hit ratio for data blocks alone which is more obscure? If not, my sense is that it would be pretty valuable to add one.
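
The counter-versus-gauge point in the comment above can be illustrated with a minimal sketch. It assumes cumulative hit and miss counters sampled once a minute; the sample values, the `Sample` layout, and the function names are hypothetical and are not taken from HBase. The derivative of the miss counter shows a sixfold jump in misses per second, while the lifetime hit ratio barely moves away from 99%.

```python
from typing import List, Tuple

# Hypothetical sample layout: (unix_seconds, cumulative_hits, cumulative_misses).
Sample = Tuple[int, int, int]


def miss_rates(samples: List[Sample]) -> List[float]:
    """Misses per second between consecutive samples (derivative of the miss counter)."""
    rates = []
    for (t0, _, m0), (t1, _, m1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0 or m1 < m0:
            # Skip out-of-order samples and counter resets (e.g. a process restart).
            continue
        rates.append((m1 - m0) / dt)
    return rates


def lifetime_hit_ratio(sample: Sample) -> float:
    """Gauge-style ratio computed over everything since the counters started."""
    _, hits, misses = sample
    total = hits + misses
    return hits / total if total else 0.0


# Made-up numbers: 60,000 requests per minute, with misses rising in the second minute.
samples = [
    (0, 990_000, 10_000),
    (60, 1_049_400, 10_600),
    (120, 1_105_800, 14_200),
]

print(miss_rates(samples))                             # [10.0, 60.0]
print([round(lifetime_hit_ratio(s), 4) for s in samples])  # [0.99, 0.99, 0.9873]
```

Because the rate is derived from raw counters, the same calculation can be run over any two timestamps, which is what makes comparing today to yesterday straightforward.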