Date: Wed, 2 Mar 2016 18:39:18 +0000 (UTC)
From: "James Taylor (JIRA)"
To: dev@phoenix.incubator.apache.org
Reply-To: dev@phoenix.apache.org
Subject: [jira] [Commented] (PHOENIX-1973) Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase

    [ https://issues.apache.org/jira/browse/PHOENIX-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15176209#comment-15176209 ]

James Taylor commented on PHOENIX-1973:
---------------------------------------

I filed PHOENIX-2733 for this cleanup. Would be great if [~sergey.soldatov] could get to it for the RC.

> Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase
> --------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1973
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1973
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Sergey Soldatov
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-1973-1.patch, PHOENIX-1973-2.patch, PHOENIX-1973-3.patch, PHOENIX-1973-4.patch, PHOENIX-1973-5.patch, PHOENIX-1973-6.patch, PHOENIX-1973-7.patch
>
>
> This is similar to HBASE-8768. The only difference is that we need to write a custom mapper and reducer in Phoenix. In the map phase we just need to get the row key from the primary key columns and write the full text of the line as usual (to ensure sorting). In the reducer we need to get the actual key values by running an upsert query.
> This reduces both the amount of map output that has to be written to disk and the amount of data that has to be transferred over the network.
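
The split described in the issue can be illustrated with a minimal plain-Java sketch that uses no Hadoop or Phoenix classes. Everything here is an assumption made for illustration: the class name `BulkLoadSketch`, the three-column layout with the first column as primary key, and the `rowKey/COLUMN=value` strings that stand in for real key values. The actual patch runs an upsert in the reducer instead of splitting the line by hand, and this sketch sorts keys as strings rather than as bytes.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BulkLoadSketch {
    // Hypothetical table layout: column 0 is the primary key.
    static final String[] COLUMNS = {"ID", "NAME", "CITY"};

    /** Map phase: compute only the row key and pass the raw line through unchanged. */
    static Map.Entry<String, String> mapLine(String csvLine) {
        String rowKey = csvLine.split(",", 2)[0];
        return new AbstractMap.SimpleImmutableEntry<>(rowKey, csvLine);
    }

    /** Reduce phase: only here is each line expanded into one cell per non-key column. */
    static List<String> reduce(String rowKey, List<String> lines) {
        List<String> cells = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(",", -1);
            for (int i = 1; i < COLUMNS.length && i < fields.length; i++) {
                // Stand-in for a key value; the real reducer would obtain it via an upsert.
                cells.add(rowKey + "/" + COLUMNS[i] + "=" + fields[i]);
            }
        }
        return cells;
    }

    /** Simulates map, shuffle (grouping and sorting by row key), and reduce in one process. */
    static List<String> run(List<String> csvLines) {
        TreeMap<String, List<String>> shuffled = new TreeMap<>();
        for (String line : csvLines) {
            Map.Entry<String, String> kv = mapLine(line);
            shuffled.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> group : shuffled.entrySet()) {
            out.addAll(reduce(group.getKey(), group.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String cell : run(Arrays.asList("2,bob,Oslo", "1,alice,Paris"))) {
            System.out.println(cell);
        }
    }
}
```

In this sketch the map step emits one (row key, line) pair per input row, while the per-column expansion happens only after the shuffle. That is the property the issue relies on to shrink the intermediate data.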