hbase-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7391) Review/improve HLog compression's memory consumption
Date Tue, 23 Apr 2013 21:45:18 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13639669#comment-13639669 ]

Sergey Shelukhin commented on HBASE-7391:
-----------------------------------------

Hmm. I seem to be missing something here... what is the point of all the LRU logic in this
dictionary and the whole indexToNode thing?
We cannot really drop entries and still ensure correctness, can we?
I can see how an array is faster than a hash map for the purpose of this class, but there's a
memory concern.
Should this all be nuked?
                
> Review/improve HLog compression's memory consumption
> ----------------------------------------------------
>
>                 Key: HBASE-7391
>                 URL: https://issues.apache.org/jira/browse/HBASE-7391
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.95.1
>
>
> From Ram in http://mail-archives.apache.org/mod_mbox/hbase-dev/201205.mbox/%3C00bc01cd31e6$7caf1320$760d3960$%25vasudevan@huawei.com%3E:
> {quote}
> One small observation after giving +1 on the RC.
> The WAL compression feature causes OOME and causes Full GC.
> The problem is, if we have 1500 regions and I need to create recovered.edits
> for each of the regions (I don’t have much data in the regions (~300MB)).
> Now when I try to build the dictionary there is a Node object getting
> created.
> Each node object occupies 32 bytes.
> We have 5 such dictionaries.
> Initially we create indexToNodes array and its size is 32767.
> So now we have 32*5*32767 = ~5MB.
> Now I have 1500 regions.
> So 5MB*1500 = ~7GB (excluding actual data). This seems to be a very high
> initial memory footprint; it never allows me to split the logs, and I
> am not able to bring the cluster up at all.
> Our configured heap size was 8GB, tested in a 3-node cluster with 5000
> regions and very little data (1GB in the HDFS cluster including replication),
> with some small data spread evenly across all regions.
> The formula is 32 (Node object size) * 5 (number of dictionaries) * 32767
> (number of Node objects) * number of regions.
> {quote}
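
The arithmetic in the quote above can be checked with a short sketch. This is only a back-of-the-envelope estimate, not HBase code: the constant and function names are hypothetical, the 32-byte Node size, 5 dictionaries, and 32767 entries are taken from the quote, and it assumes (as the quote implies) that every Node is allocated up front for each region.

```python
# Rough estimate of the WAL compression dictionary overhead described in the quote.
# All figures come from the quoted report; the names here are illustrative only.
NODE_SIZE_BYTES = 32          # reported size of one Node object
DICTIONARIES = 5              # "We have 5 such dictionaries"
NODES_PER_DICTIONARY = 32767  # reported initial size of the indexToNodes array


def dictionary_overhead_bytes(regions: int) -> int:
    """Bytes used by dictionary Nodes, assuming all Nodes are allocated eagerly."""
    return NODE_SIZE_BYTES * DICTIONARIES * NODES_PER_DICTIONARY * regions


per_region = dictionary_overhead_bytes(1)
total = dictionary_overhead_bytes(1500)

print(f"per region:   {per_region / 2**20:.2f} MiB")
print(f"1500 regions: {total / 2**30:.2f} GiB")
```

Under these assumptions the result is roughly 5 MiB per region and a little over 7 GiB for 1500 regions, which matches the figures in the quote and leaves very little of an 8GB heap for anything else.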

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
