hive-issues mailing list archives

From "Sankar Hariappan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask
Date Tue, 04 Jul 2017 17:26:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16073937#comment-16073937 ]

Sankar Hariappan edited comment on HIVE-16901 at 7/4/17 5:25 PM:
-----------------------------------------------------------------

Added 04.patch after replacing (0 != srcMap.size()) with (!srcMap.isEmpty())

Thanks [~anishek] for the review!
Request [~thejas]/[~daijy]/[~sushanth] to please review/commit the patch!


was (Author: sankarh):
Added 04.patch after replacing (0 != srcMap.size()) with (!srcMap.isEmpty())

Thanks [~anishek] for the review!
Request [~thejas]/[~daijy] to please review/commit the patch!

> Distcp optimization - One distcp per ReplCopyTask 
> --------------------------------------------------
>
>                 Key: HIVE-16901
>                 URL: https://issues.apache.org/jira/browse/HIVE-16901
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive, repl
>    Affects Versions: 2.1.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>              Labels: DR, replication
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16901.01.patch, HIVE-16901.02.patch, HIVE-16901.03.patch, HIVE-16901.04.patch
>
>
> Currently, if a ReplCopyTask is created to copy a list of files, distcp is invoked
> once for each file. Instead, the full list of source files should be passed to a single
> distcp invocation, which copies the files in parallel and should give a large performance gain.
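
To make the difference concrete, here is a minimal sketch of per-file versus batched invocation. It is not the code from the attached patches: `BulkCopier`, `CountingCopier`, `copyOneByOne`, and `copyBatched` are hypothetical names invented for illustration, and the copier only counts calls instead of launching distcp. The empty-list guard mirrors the `isEmpty()` style mentioned in the comment above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; all names here are hypothetical, not Hive's ReplCopyTask API. */
final class DistcpBatchingSketch {

    /** Stand-in for a distcp launcher: each call models one distcp job. */
    interface BulkCopier {
        void copy(List<String> sources, String destination);
    }

    /** Fake copier that records how often it was invoked and what it was asked to copy. */
    static final class CountingCopier implements BulkCopier {
        int invocations = 0;
        final List<String> copied = new ArrayList<>();

        @Override
        public void copy(List<String> sources, String destination) {
            invocations++;
            copied.addAll(sources);
        }
    }

    /** Behaviour described as current in the issue: one invocation per source file. */
    static void copyOneByOne(List<String> sources, String destination, BulkCopier copier) {
        for (String src : sources) {
            copier.copy(Collections.singletonList(src), destination);
        }
    }

    /** Proposed behaviour: a single invocation that receives the whole source list. */
    static void copyBatched(List<String> sources, String destination, BulkCopier copier) {
        if (!sources.isEmpty()) { // skip the job entirely when there is nothing to copy
            copier.copy(sources, destination);
        }
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("/src/a", "/src/b", "/src/c");

        CountingCopier perFile = new CountingCopier();
        copyOneByOne(files, "/dst", perFile);

        CountingCopier batched = new CountingCopier();
        copyBatched(files, "/dst", batched);

        System.out.println("per-file invocations: " + perFile.invocations);
        System.out.println("batched invocations: " + batched.invocations);
    }
}
```

Both strategies hand the same files to the copier; only the number of invocations differs, which is where the expected saving in job start-up overhead would come from.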



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
