hbase-issues mailing list archives

From "Marc Limotte (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3551) Loaded hfile indexes occupy a good chunk of heap; look into shrinking the amount used and/or evicting unused indices
Date Wed, 23 Feb 2011 01:49:38 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12998125#comment-12998125 ]

Marc Limotte commented on HBASE-3551:

Here's some more detail about the situation that Stack and I saw:

From the region server UI (via lynx):

HBase Version (HBase version and svn revision):
    0.90.0, r0b7903c50eef589c632582f7d9d6364eb3912c38
HBase Compiled (when HBase version was compiled and by whom):
    Mon Jan 24 20:44:24 UTC 2011, root
Metrics (RegionServer metrics; file and heap sizes are in megabytes):
    request=0.0, regions=107, stores=214, storefiles=381, storefileIndexSize=2983,
    memstoreSize=0, compactionQueueSize=29, usedHeap=3774, maxHeap=7141,
    blockCacheSize=509777848, blockCacheFree=987798472, blockCacheCount=7557,
    blockCacheHitCount=60151, blockCacheMissCount=38698247, blockCacheEvictedCount=0,
    blockCacheHitRatio=0, blockCacheHitCachingRatio=88
Zookeeper Quorum (addresses of all registered ZK servers):
    ip-xxxxxxxxx.ec2.internal:2181

So, almost 3 GB for the index (storefileIndexSize=2983 MB).

1-2 stores per region, storefile-size = 1gb, hbase block size = 64k
num-of-entries-per-storefile = storefile-size / hbase-block-size  
estimated index size = num-of-entries-per-storefile * num-store-files * key-and-entry-size

key-and-entry-size = 20 to 200 bytes => 150 (guess)
estimated index size = (1G / 64K) * 381 * 150 = ~900M (much less than 2983M)
This doesn't account for any overhead in the index, but it's hard to imagine that the overhead
would account for a 3X size difference.
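
For anyone who wants to redo the arithmetic with different guesses, here is a minimal sketch of the same estimate in Python. The function and variable names are made up for illustration and are not HBase APIs. It assumes every storefile is a full 1 GB, that each 64K block contributes exactly one index entry, and that the reported metric is in binary megabytes.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def estimate_index_bytes(storefile_bytes, block_bytes, num_storefiles, entry_bytes):
    """Rough index size: one index entry per block, per storefile."""
    entries_per_storefile = storefile_bytes // block_bytes
    return entries_per_storefile * num_storefiles * entry_bytes


# Numbers taken from the metrics and guesses above.
num_storefiles = 381
reported_index_mb = 2983      # storefileIndexSize from the region server UI
guessed_entry_bytes = 150     # key-and-entry-size (a guess)

estimated_bytes = estimate_index_bytes(1 * GIB, 64 * KIB, num_storefiles, guessed_entry_bytes)
estimated_mb = estimated_bytes / MIB

# How far off the estimate is from what the region server reports.
ratio = reported_index_mb / estimated_mb

# Per-entry size that would be needed for the estimate to match the metric.
entries_total = (1 * GIB // (64 * KIB)) * num_storefiles
implied_entry_bytes = reported_index_mb * MIB / entries_total

print(f"estimated index size: {estimated_mb:.0f} MB")
print(f"reported / estimated: {ratio:.2f}x")
print(f"implied bytes per index entry: {implied_entry_bytes:.0f}")
```

Under these assumptions, the entry size implied by the reported metric is well above the 20 to 200 byte range guessed above, which is another way of stating the gap.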

Also, our compaction queue is fairly deep (due to forced major compactions). What impact could
that have on storefileIndexSize?

> Loaded hfile indexes occupy a good chunk of heap; look into shrinking the amount used and/or evicting unused indices
> --------------------------------------------------------------------------------------------------------------------
>                 Key: HBASE-3551
>                 URL: https://issues.apache.org/jira/browse/HBASE-3551
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: stack
>
> I hung with a user Marc and we were looking over configs and his cluster profile up on
> ec2.  One thing we noticed was that his 100+ 1G regions of two families had ~2.5G of heap
> resident.  We did a bit of math and couldn't get to 2.5G so that needs looking into.  Even
> still, 2.5G is a bunch of heap to give over to indices (He actually OOME'd when he had his
> RS heap set to just 3G; we shouldn't OOME, we should just run slower).  It sounds like he
> needs the indices loaded but still, for some cases we should drop indices for unaccessed files.
