phoenix-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1973) Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase
Date Tue, 16 Feb 2016 08:48:18 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148279#comment-15148279 ]

Hudson commented on PHOENIX-1973:
---------------------------------

SUCCESS: Integrated in Phoenix-master #1138 (See [https://builds.apache.org/job/Phoenix-master/1138/])
PHOENIX-1973 Improve CsvBulkLoadTool performance by moving keyvalue (rajeshbabu: rev e797b36c2ce42e9b9fd6b37fd8b9f79f79d6f18f)
* phoenix-core/src/main/java/org/apache/phoenix/mapreduce/FormatToKeyValueMapper.java
* phoenix-core/src/main/java/org/apache/phoenix/mapreduce/AbstractBulkLoadTool.java
* phoenix-core/src/main/java/org/apache/phoenix/mapreduce/FormatToKeyValueReducer.java
* phoenix-core/src/main/java/org/apache/phoenix/mapreduce/bulkload/TargetTableRefFunctions.java


> Improve CsvBulkLoadTool performance by moving keyvalue construction from map phase to reduce phase
> --------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1973
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1973
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Rajeshbabu Chintaguntla
>            Assignee: Sergey Soldatov
>             Fix For: 4.7.0
>
>         Attachments: PHOENIX-1973-1.patch, PHOENIX-1973-2.patch, PHOENIX-1973-3.patch, PHOENIX-1973-4.patch, PHOENIX-1973-5.patch, PHOENIX-1973-6.patch
>
>
> It's similar to HBASE-8768. The only difference is that we need to write a custom mapper and reducer in Phoenix. In the map phase we just need to get the row key from the primary key columns and write the full text of the line as usual (to ensure sorting). In the reducer we need to get the actual key values by running the upsert query.
> This reduces the amount of map output that has to be written to disk and the amount of data that has to be transferred over the network.
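
The idea in the description above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the code from the patch: the class `DeferredKeyValueSketch` and its methods are hypothetical, the CSV parsing is a naive comma split with no quoting support, a `TreeMap` stands in for the MapReduce shuffle and sort, and the "cells" are plain strings where the real reducer would obtain KeyValues by running the upsert.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Phoenix.
class DeferredKeyValueSketch {

    // "Map" side: compute only the row key from the primary key columns.
    // The raw line itself is passed through unchanged as the map output value.
    static String mapRowKey(String csvLine, int[] pkColumns) {
        String[] fields = csvLine.split(",", -1); // naive split, no quoting
        StringBuilder rowKey = new StringBuilder();
        for (int i = 0; i < pkColumns.length; i++) {
            if (i > 0) {
                rowKey.append('|'); // assumed separator for composite keys
            }
            rowKey.append(fields[pkColumns[i]]);
        }
        return rowKey.toString();
    }

    static boolean isPkColumn(int index, int[] pkColumns) {
        for (int pk : pkColumns) {
            if (pk == index) {
                return true;
            }
        }
        return false;
    }

    // "Reduce" side: only here is the line expanded into one cell per
    // non-key column (a stand-in for running the upsert to get KeyValues).
    static List<String> reduceToCells(String rowKey, String csvLine,
                                      String[] columnNames, int[] pkColumns) {
        String[] fields = csvLine.split(",", -1);
        List<String> cells = new ArrayList<>();
        for (int i = 0; i < columnNames.length && i < fields.length; i++) {
            if (!isPkColumn(i, pkColumns)) {
                cells.add(rowKey + "/" + columnNames[i] + "=" + fields[i]);
            }
        }
        return cells;
    }

    // Drives both phases; the TreeMap plays the role of shuffle and sort.
    static List<String> run(List<String> lines, String[] columnNames, int[] pkColumns) {
        TreeMap<String, List<String>> shuffled = new TreeMap<>();
        for (String line : lines) {
            shuffled.computeIfAbsent(mapRowKey(line, pkColumns), k -> new ArrayList<>())
                    .add(line);
        }
        List<String> cells = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : shuffled.entrySet()) {
            for (String line : entry.getValue()) {
                cells.addAll(reduceToCells(entry.getKey(), line, columnNames, pkColumns));
            }
        }
        return cells;
    }
}
```

In this sketch each input line crosses the simulated shuffle once, as a (row key, raw line) pair, instead of as one record per column, which is the source of the reduction in map output that the description mentions.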



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
