hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-900) Regionserver memory leak causing OOME during relatively modest bulk importing
Date Wed, 10 Dec 2008 18:14:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12655297#action_12655297 ]

stack commented on HBASE-900:

Studying my replica of Tim Sell's job -- i.e. using TOF and seeing 100k+ BatchUpdates in an
array held in the HBaseRPC#Invocation#parameters field -- I now conclude that TOF is operating
"as advertised".  The default is that the client marshals 10MB of data. In the PE case, this is
12k edits (we measure the BU to be 1039 bytes, which is probably a low-ball going by the BU up
in jhat, but near enough).  If the server is running 10 handlers, then a common case is 10x10MB
of edits just sitting around while the batches of edits are being processed server-side.  We
should set the client-side default down from 10MB to maybe 2MB, but this is not the root cause
of the Tim Sell OOME (avoiding TOF, he ran longer but still OOME'd).  In his case, the 10MB
holds even more edits -- 70k for 10MB seems plausible after he described his data format -- and,
allowing that our accounting of object sizes is coarse, the 'deep size' of 318MB reported in
the profiler is probably about right.
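
A rough sketch of the arithmetic above, in case it helps when picking a new default. The class
and method names are made up for illustration and are not HBase API; the inputs (10 handlers,
10MB vs 2MB batches, 1039 bytes per BU, 70k edits per 10MB) are the figures from this comment.
It only counts marshalled bytes, so it says nothing about the deserialized 'deep size', and
plain division lands in the same ballpark as the 12k edits quoted above rather than on it.

```java
public class InFlightEditEstimate {
    public static final long MB = 1024L * 1024L;

    /** Approximate number of edits that fit in one client-side batch. */
    public static long editsPerBatch(long batchBytes, long bytesPerEdit) {
        return batchBytes / bytesPerEdit;
    }

    /** Approximate marshalled size of one edit, given how many fit in a batch. */
    public static long bytesPerEdit(long batchBytes, long editsInBatch) {
        return batchBytes / editsInBatch;
    }

    /** Marshalled bytes held server-side if every handler is busy with one full batch. */
    public static long inFlightBytes(int handlers, long batchBytes) {
        return handlers * batchBytes;
    }

    public static void main(String[] args) {
        int handlers = 10;             // handler count used in the comment
        long measuredEditBytes = 1039; // measured BU size in the PE case

        // Compare the current 10MB default with the proposed 2MB.
        for (long batchMb : new long[] {10, 2}) {
            long batch = batchMb * MB;
            System.out.printf(
                "%dMB batch: ~%d edits at %d bytes each, up to %dMB held across %d handlers%n",
                batchMb, editsPerBatch(batch, measuredEditBytes), measuredEditBytes,
                inFlightBytes(handlers, batch) / MB, handlers);
        }

        // Tim Sell's case: roughly 70k edits in a 10MB batch.
        System.out.printf("70k edits in 10MB is about %d bytes per edit%n",
            bytesPerEdit(10 * MB, 70_000));
    }
}
```

Dropping the batch from 10MB to 2MB cuts the worst-case marshalled data across 10 handlers
from 100MB to 20MB, which is the motivation for the TODO below.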

So, TODO, set the client-side batch of edits flush size down from 10MB to 2MB.

Now to look at the latest Tim Sell heap dump.

> Regionserver memory leak causing OOME during relatively modest bulk importing
> -----------------------------------------------------------------------------
>                 Key: HBASE-900
>                 URL: https://issues.apache.org/jira/browse/HBASE-900
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.18.1, 0.19.0
>            Reporter: Jonathan Gray
>            Assignee: stack
>            Priority: Blocker
>         Attachments: 900.patch, memoryOn13.png
> I have recreated this issue several times and it appears to have been introduced in 0.2.
> During an import to a single table, memory usage of individual region servers grows w/o
> bounds and when set to the default 1GB it will eventually die with OOME.  This has happened
> to me as well as Daniel Ploeg on the mailing list.  In my case, I have 10 RS nodes and OOME
> happens w/ 1GB heap at only about 30-35 regions per RS.  In previous versions, I have imported
> to several hundred regions per RS with default heap size.
> I am able to get past this by increasing the max heap to 2GB.  However, the appearance
> of this in newer versions leads me to believe there is now some kind of memory leak happening
> in the region servers during import.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.