hbase-issues mailing list archives

From "Nicolas Spiegelberg (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3421) Very wide rows -- 30M plus -- cause us OOME
Date Thu, 06 Jan 2011 00:10:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978054#action_12978054 ]

Nicolas Spiegelberg commented on HBASE-3421:

For interested parties...

From: Ted Yu
I used the command you suggested in HBASE-3421 on a table and got:

K: 0012F2157E58883070B9814047048E8B/v:_/1283909035492/Put/vlen=1308 
K: 0041A80A545C4CBF412865412065BF5E/v:_/1283909035492/Put/vlen=1311 
K: 00546F4AA313020E551E049E848949C6/v:_/1283909035492/Put/vlen=1866 
K: 0068CC263C81CE65B65FC5425EFEBBCD/v:_/1283909035492/Put/vlen=1191 
K: 006DB8745D6D1B624F77E0F06C177C0B/v:_/1283909035492/Put/vlen=1021 
K: 006F9037BD7A8F081B54C5B03756C143/v:_/1283909035492/Put/vlen=1382 

Can you briefly describe what conclusion can be drawn here ?

From: Nicolas Spiegelberg

You're seeing all the KeyValues in that HFile.  Each line has the format:

K: <KeyValue.toString()>

If you look at KeyValue.toString(), you'll see that the format is roughly:

K: <row>/<family>:<qualifier>/<timestamp>/<type>/vlen=<value length>

So, it looks like you have only one qualifier per row and each row holds roughly 1,500 bytes
of data.  For the user with the 30K columns per row, you should see output containing
a ton of lines with the same row key.  If you grep for that row, cut the number after vlen=, and sum
the values, you can see the size of your rows on a per-HFile basis.
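The grep/cut/sum step above can be sketched as a one-liner.  This is a minimal sketch, assuming you saved the HFile dump to a file named hfile.txt (the filename and the sample row key here are placeholders; substitute the row key you care about):

```shell
# Stand-in for the dump you'd get from the HFile pretty-printer,
# e.g. `hbase org.apache.hadoop.hbase.io.hfile.HFile -p -f <hfile path>`
# (verify the exact invocation against your HBase version).
cat > hfile.txt <<'EOF'
K: 0012F2157E58883070B9814047048E8B/v:_/1283909035492/Put/vlen=1308
K: 0012F2157E58883070B9814047048E8B/v:x/1283909035492/Put/vlen=100
K: 0041A80A545C4CBF412865412065BF5E/v:_/1283909035492/Put/vlen=1311
EOF

# Keep the lines for one row key, take the number after "vlen=",
# and sum to get that row's value bytes in this HFile.
grep '^K: 0012F2157E58883070B9814047048E8B' hfile.txt \
  | cut -d= -f2 \
  | awk '{sum += $1} END {print sum}'
# prints 1408
```

Note this sums value lengths only; key bytes (row, family, qualifier, timestamp) add some overhead on top of that total.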

> Very wide rows -- 30M plus -- cause us OOME
> -------------------------------------------
>                 Key: HBASE-3421
>                 URL: https://issues.apache.org/jira/browse/HBASE-3421
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: stack
> From the list, see 'jvm oom' in http://mail-archives.apache.org/mod_mbox/hbase-user/201101.mbox/browser,
> it looks like wide rows -- 30M or so -- cause OOME during compaction.  We should check it
> out. Can the scanner used during compactions use the 'limit' when nexting?  If so, this should
> save our OOME'ing (or, we need to add to the next a max size rather than count of KVs).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
