Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Mon, 31 Aug 2015 16:56:46 +0000 (UTC)
From: "Nick Dimiduk (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12860584.1440971487000.212180.1441040206207@Atlassian.JIRA>
In-Reply-To: <JIRA.12860584.1440971487000@Atlassian.JIRA>
References: <JIRA.12860584.1440971487000@Atlassian.JIRA>
 <JIRA.12860584.1440971487198@arcas>
Subject: [jira] [Commented] (HBASE-14339) HBase Bulk Load and super wide
 rows
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-14339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14723684#comment-14723684 ] 

Nick Dimiduk commented on HBASE-14339:
--------------------------------------

See also HBASE-7743.

> HBase Bulk Load and super wide rows
> -----------------------------------
>
>                 Key: HBASE-14339
>                 URL: https://issues.apache.org/jira/browse/HBASE-14339
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Malaska
>            Priority: Minor
>
> This may not be a huge issue, but it does come up. If a row has too many columns, KeyValueSortReducer will fail with an out-of-memory error, because it uses a TreeMap to sort the columns within the reducer's memory.
> A solution would be to add the column family and qualifier to the key so that the shuffle handles the sort.
> The partitioner would partition only on the row key, but ordering would apply to the row key, column family, and column qualifier.
> See the Spark bulk load (HBASE-14150) for an example.
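
The proposal in the description is a secondary sort. As a rough illustration only (this is not HBase or Hadoop code), the sketch below simulates the idea in plain Python: the shuffle key is the composite (row, family, qualifier), the partitioner looks at the row key alone, and each partition is ordered by the full composite key. The names partition and shuffle, the CRC32 hash, and the sample cells are all assumptions made for the example. In a real job the framework's external merge sort would do the ordering, so the reducer would never hold a whole row in memory; the in-memory sort() here only stands in for that step.

```python
from collections import defaultdict
from zlib import crc32


def partition(row_key: bytes, num_reducers: int) -> int:
    """Hypothetical partitioner: hashes the row key ONLY, so every cell
    of a given row is routed to the same reducer."""
    return crc32(row_key) % num_reducers


def shuffle(cells, num_reducers):
    """Simulate a shuffle keyed on (row, family, qualifier).

    Partitioning uses the row key alone; ordering within a partition
    uses the full composite key. A real MapReduce shuffle would sort
    on disk, so this in-memory sort is only a stand-in.
    """
    partitions = defaultdict(list)
    for row, family, qualifier, value in cells:
        composite_key = (row, family, qualifier)
        partitions[partition(row, num_reducers)].append((composite_key, value))
    for part in partitions.values():
        # Tuples of bytes compare element by element, lexicographically.
        part.sort(key=lambda kv: kv[0])
    return dict(partitions)


# Made-up sample cells, deliberately out of order.
cells = [
    (b"row2", b"cf", b"q1", b"d"),
    (b"row1", b"cf", b"q9", b"c"),
    (b"row1", b"cf", b"q1", b"a"),
    (b"row1", b"cf", b"q5", b"b"),
]
result = shuffle(cells, num_reducers=2)

for reducer_id, part in sorted(result.items()):
    for key, value in part:
        # The reducer can emit cells one at a time, already in order.
        print(reducer_id, key, value)
```

With this layout the reducer becomes a pass-through that writes cells as they arrive, which is why the row width no longer bounds its memory use. The sketch ignores timestamps and cell types, which a real KeyValue ordering would also have to take into account.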

