Date: Sun, 30 Aug 2015 21:59:45 +0000 (UTC)
From: "Ted Malaska (JIRA)"
To: dev@hbase.apache.org
Reply-To: dev@hbase.apache.org
Subject: [jira] [Created] (HBASE-14340) Add second bulk load option to Spark Bulk Load to send puts as the value

Ted Malaska created HBASE-14340:
-----------------------------------

             Summary: Add second bulk load option to Spark Bulk Load to send puts as the value
                 Key: HBASE-14340
                 URL: https://issues.apache.org/jira/browse/HBASE-14340
             Project: HBase
          Issue Type: New Feature
          Components: spark
            Reporter: Ted Malaska
            Assignee: Ted Malaska
            Priority: Minor

The initial bulk load option for Spark bulk load sends values one by one through the shuffle. This is similar to how the original MR bulk load worked. However, the MR bulk loader has more than one bulk load option. There is a second option that allows all the column families, qualifiers, and values of a row to be combined on the map side.
This only works if the row is not very wide. When that holds, sending one combined record per row through the shuffle reduces both the amount of data and the work the shuffle has to deal with.
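
To make the difference between the two options concrete, here is a minimal sketch in plain Python rather than Spark. It is not the proposed implementation: the sample cells and the function names cell_per_record and row_per_record are hypothetical, and it assumes all cells of a row are available together on the map side. It only compares how many records each approach would hand to the shuffle.

```python
from collections import defaultdict

# Hypothetical input: one (row_key, family, qualifier, value) tuple per cell.
cells = [
    ("row1", "cf1", "a", "1"),
    ("row1", "cf1", "b", "2"),
    ("row1", "cf2", "c", "3"),
    ("row2", "cf1", "a", "4"),
]


def cell_per_record(cells):
    """First option: every cell becomes its own shuffle record."""
    return [((row, family, qualifier), value)
            for row, family, qualifier, value in cells]


def row_per_record(cells):
    """Second option: combine all cells of a row into one shuffle record."""
    rows = defaultdict(list)
    for row, family, qualifier, value in cells:
        rows[row].append((family, qualifier, value))
    # Rows and their cells are sorted only to keep the output deterministic.
    return [(row, sorted(row_cells)) for row, row_cells in sorted(rows.items())]


per_cell = cell_per_record(cells)
per_row = row_per_record(cells)
print(len(per_cell), "shuffle records when sending cells one by one")
print(len(per_row), "shuffle records when sending one combined record per row")
```

The combined record grows with the number of cells in the row, which is why this approach is only attractive when rows are not very wide.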