hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14339) HBase Bulk Load and super wide rows
Date Mon, 31 Aug 2015 16:56:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723684#comment-14723684 ]

Nick Dimiduk commented on HBASE-14339:

See also HBASE-7743.

> HBase Bulk Load and super wide rows
> -----------------------------------
>                 Key: HBASE-14339
>                 URL: https://issues.apache.org/jira/browse/HBASE-14339
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Malaska
>            Priority: Minor
> This may not be a huge issue, but it does come up.  If a row has too many columns,
KeyValueSortReducer will fail with an out-of-memory error, because it sorts the row's
columns in a TreeMap held in the reducer's memory.
> A solution would be to add the column family and qualifier to the key so that the shuffle
handles the sort.
> The partitioner would partition only on the row key, but ordering would apply to the row key,
column family, and column qualifier.
> See the Spark bulk load for an example: HBASE-14150.
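
The proposal in the description (partition on the row key only, sort on row key plus column family plus qualifier) can be sketched in plain Java without any Hadoop or HBase dependencies. This is only an illustration under stated assumptions, not the actual patch: CellKey, FULL_ORDER, and partitionFor are hypothetical names, the hash-based partitioner stands in for whatever region-boundary partitioner a real bulk load job would use, and timestamps and cell types are left out of the ordering.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class WideRowSortSketch {

    // Hypothetical composite shuffle key: row key plus column family and qualifier.
    static final class CellKey {
        final byte[] row;
        final byte[] family;
        final byte[] qualifier;

        CellKey(String row, String family, String qualifier) {
            this.row = row.getBytes(StandardCharsets.UTF_8);
            this.family = family.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String toString() {
            return new String(row, StandardCharsets.UTF_8) + "/"
                + new String(family, StandardCharsets.UTF_8) + ":"
                + new String(qualifier, StandardCharsets.UTF_8);
        }
    }

    // Unsigned lexicographic byte comparison (requires Java 9+).
    static final Comparator<byte[]> BYTES = Arrays::compareUnsigned;

    // Sort order for the shuffle: row key, then family, then qualifier.
    static final Comparator<CellKey> FULL_ORDER =
        Comparator.comparing((CellKey k) -> k.row, BYTES)
                  .thenComparing(k -> k.family, BYTES)
                  .thenComparing(k -> k.qualifier, BYTES);

    // Partition on the row key ONLY, so every cell of a row reaches the same reducer.
    // Assumption: a simple hash stands in for a region-boundary partitioner.
    static int partitionFor(CellKey key, int numPartitions) {
        return (Arrays.hashCode(key.row) & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        List<CellKey> cells = new ArrayList<>();
        cells.add(new CellKey("row1", "cf", "q3"));
        cells.add(new CellKey("row0", "cf", "q9"));
        cells.add(new CellKey("row1", "cf", "q1"));

        // In a real job the shuffle would perform this sort, spilling to disk as needed.
        cells.sort(FULL_ORDER);
        for (CellKey c : cells) {
            System.out.println("partition " + partitionFor(c, 4) + "  " + c);
        }
    }
}
```

With keys ordered this way, cells would arrive at the reducer already sorted within each row, so the reducer could write them out as a stream instead of buffering a whole row in a TreeMap.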

