hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7391) Review/improve HLog compression's memory consumption
Date Wed, 07 Aug 2013 18:06:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732235#comment-13732235 ]

ramkrishna.s.vasudevan commented on HBASE-7391:
-----------------------------------------------

{code}
if (recoveredEdits) {
  // Recovered edits are per-region, so the region and table
  // dictionaries will never hold more than one entry.
  regionDict.init(1);
  tableDict.init(1);
  rowDict.init(Short.MAX_VALUE);
} else {
  regionDict.init(Short.MAX_VALUE);
  tableDict.init(Short.MAX_VALUE);
  rowDict.init(Short.MAX_VALUE);
}
// Families and qualifiers rarely exceed 127 distinct values;
// the LRU dictionary recycles slots once the cap is reached.
familyDict.init(Byte.MAX_VALUE);
qualifierDict.init(Byte.MAX_VALUE);
{code}
If we make a change like the above for the recovered.edits case, as described in the issue
description, we would cut the memory footprint from ~5 MB to ~1 MB for every writer instantiated.
For the region name and table name we know for sure there would never be more than one entry.
For the family and qualifier dictionaries (in both the normal and recovered.edits cases) we could
cap the size at Byte.MAX_VALUE; the dictionary is LRU-style anyway, so any use case with more
than 127 qualifiers per CF would simply recycle slots in round-robin fashion.
Let the rowDict stay at Short.MAX_VALUE.

So, with the per-writer size down from ~5 MB to ~1 MB, the overall footprint for 1500 regions
would come down from ~7 GB to ~1.5 GB. What do you think of this change?
I can submit a patch based on this.
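
For reference, a quick back-of-the-envelope check of the numbers above, as a standalone sketch
(the class name is hypothetical, and the 32-byte Node size comes from the original report rather
than measurement):
{code}
// Sketch: recompute the before/after WAL-compression dictionary footprint.
public class WalDictFootprintCheck {
  private static final long NODE_BYTES = 32; // approximate Node object size

  public static void main(String[] args) {
    long regions = 1500;
    // Current: 5 dictionaries, each pre-sized to Short.MAX_VALUE (32767)
    long before = NODE_BYTES * 5 * Short.MAX_VALUE * regions;
    // Proposed for recovered.edits: region/table capped at 1 entry,
    // family/qualifier at Byte.MAX_VALUE (127), row stays at Short.MAX_VALUE
    long perWriter = NODE_BYTES
        * (1 + 1 + Short.MAX_VALUE + Byte.MAX_VALUE + Byte.MAX_VALUE);
    long after = perWriter * regions;
    // Prints roughly: before ~7.9 GB, after ~1.6 GB
    System.out.printf("before: ~%.1f GB, after: ~%.1f GB%n",
        before / 1e9, after / 1e9);
  }
}
{code}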
                
> Review/improve HLog compression's memory consumption
> ----------------------------------------------------
>
>                 Key: HBASE-7391
>                 URL: https://issues.apache.org/jira/browse/HBASE-7391
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.95.2
>
>
> From Ram in http://mail-archives.apache.org/mod_mbox/hbase-dev/201205.mbox/%3C00bc01cd31e6$7caf1320$760d3960$%25vasudevan@huawei.com%3E:
> {quote}
> One small observation after giving +1 on the RC.
> The WAL compression feature causes OOME and causes Full GC.
> The problem is: if we have 1500 regions, I need to create recovered.edits
> for each of the regions (I don’t have much data in the regions, ~300MB).
> Now when I try to build the dictionary, Node objects get created.
> Each Node object occupies 32 bytes.
> We have 5 such dictionaries.
> Initially we create the indexToNodes array, and its size is 32767.
> So now we have 32*5*32767 = ~5MB per writer.
> Now I have 1500 regions.
> So 5MB*1500 = ~7GB (excluding actual data). This seems to be a very high
> initial memory footprint, and it never allows me to split the logs; I am
> not able to bring the cluster up at all.
> Our configured heap size was 8GB, tested on a 3-node cluster with 5000
> regions and very little data (1GB in the HDFS cluster including
> replication), spread evenly across all regions.
> The formula is: 32 (Node object size) * 5 (number of dictionaries) *
> 32767 (number of Node objects) * number of regions.
> {quote}
