hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-900) Regionserver memory leak causing OOME during relatively modest bulk importing
Date Sun, 23 Nov 2008 22:53:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650069#action_12650069 ]

Andrew Purtell commented on HBASE-900:
--------------------------------------

Yes, RECORD compression is enabled on the 'content' family, which will have up to two cells
per row: 'content:raw' contains the response body written by a custom Heritrix hbase writer
and, if the mimetype is text/*, a second cell, 'content:document', contains a serialized
Document object produced by MozillaHtmlParser (http://sourceforge.net/projects/mozillaparser/).
Some binary content can be very large, e.g. a 100MB zip or tgz. The row key is the SHA1 hash
of the content object. There is also an 'info' family, not compressed, that stores attributes.
Finally, there is a 'urls' family, not compressed, that will have a cell for each unique URL
corresponding to the content object.
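
To make the row layout described above concrete, here is a rough sketch in Python. It is not
the Heritrix writer itself and uses no HBase client calls. The function name build_row, the
hex encoding of the SHA1 row key, the 'info:<name>' and 'urls:<url>' qualifier naming, and the
empty value stored in each 'urls' cell are all assumptions made for illustration only.

```python
import hashlib


def build_row(body, mimetype, attributes, urls, parsed_document=None):
    """Return (row_key, cells) for one content object.

    Hypothetical helper: the name, the hex row key, and the qualifier
    naming are illustrative assumptions, not the real writer's code.
    """
    # Row key: SHA1 hash of the content object (hex encoding is assumed).
    row_key = hashlib.sha1(body).hexdigest()

    # 'content' family: always the raw response body.
    cells = {"content:raw": body}

    # Second 'content' cell only for text/* mimetypes.
    if mimetype.startswith("text/") and parsed_document is not None:
        cells["content:document"] = parsed_document

    # 'info' family: one cell per attribute.
    for name, value in attributes.items():
        cells["info:" + name] = value

    # 'urls' family: one cell per unique URL (empty value is assumed).
    for url in set(urls):
        cells["urls:" + url] = b""

    return row_key, cells
```

Because the row key is derived from the content alone, the same body fetched from several URLs
maps to a single row, with one 'urls' cell per distinct URL.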

> Regionserver memory leak causing OOME during relatively modest bulk importing
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-900
>                 URL: https://issues.apache.org/jira/browse/HBASE-900
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.18.1, 0.19.0
>            Reporter: Jonathan Gray
>            Assignee: stack
>            Priority: Blocker
>         Attachments: memoryOn13.png
>
>
> I have recreated this issue several times and it appears to have been introduced in 0.2.
> During an import to a single table, memory usage of individual region servers grows w/o
> bounds and when set to the default 1GB it will eventually die with OOME.  This has happened
> to me as well as Daniel Ploeg on the mailing list.  In my case, I have 10 RS nodes and OOME
> happens w/ 1GB heap at only about 30-35 regions per RS.  In previous versions, I have imported
> to several hundred regions per RS with default heap size.
> I am able to get past this by increasing the max heap to 2GB.  However, the appearance
> of this in newer versions leads me to believe there is now some kind of memory leak happening
> in the region servers during import.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

