hadoop-hdfs-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8828) Utilize Snapshot diff report to build copy list in distcp
Date Sat, 08 Aug 2015 18:14:46 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14663093#comment-14663093 ]

Hadoop QA commented on HDFS-8828:
---------------------------------

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  15m 28s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac |   7m 41s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 40s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 26s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  3s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 22s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m 47s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | tools/hadoop tests |   6m 23s | Tests passed in hadoop-distcp. |
| | |  42m 51s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12749426/HDFS-8828.005.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8f73bdd |
| hadoop-distcp test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11946/artifact/patchprocess/testrun_hadoop-distcp.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11946/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11946/console |


This message was automatically generated.

> Utilize Snapshot diff report to build copy list in distcp
> ---------------------------------------------------------
>
>                 Key: HDFS-8828
>                 URL: https://issues.apache.org/jira/browse/HDFS-8828
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: distcp, snapshots
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: HDFS-8828.001.patch, HDFS-8828.002.patch, HDFS-8828.003.patch, HDFS-8828.004.patch, HDFS-8828.005.patch
>
>
> Some users have reported that distcp takes a very long time to build the file copy list (30 hours for 1.6M files). We can leverage the snapshot diff report to build a copy list that includes only the files and directories changed between two snapshots (or between a snapshot and a normal directory). This speeds up the process in two ways: 1. less time spent building the copy list; 2. fewer file copy MR jobs.
> The HDFS snapshot diff report provides information about file/directory creation, deletion, rename, and modification between two snapshots, or between a snapshot and a normal directory. HDFS-7535 synchronizes deletions and renames, then falls back to the default distcp, so it still relies on the default distcp to build the complete list of files under the source directory. This patch puts only created and modified files into the copy list, based on the snapshot diff report, which minimizes the number of files to copy.
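
The selection rule described in the issue can be illustrated with a minimal sketch. This is not the code from the attached patch: `DiffEntry` and `split_diff` are hypothetical names introduced only for illustration, and the entries are assumed to carry the `+`, `-`, `M` and `R` labels that the snapshot diff report uses for created, deleted, modified and renamed paths. Deletions and renames are set aside for the synchronization step that HDFS-7535 handles, and only created and modified paths go into the copy list.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DiffEntry:
    """Hypothetical model of one snapshot diff report line."""
    kind: str                     # "+" created, "-" deleted, "M" modified, "R" renamed
    path: str                     # path relative to the snapshot root
    target: Optional[str] = None  # new path, only meaningful for renames


def split_diff(entries: List[DiffEntry]) -> Tuple[List[DiffEntry], List[str]]:
    """Split a diff report into sync operations and a copy list.

    Deletions and renames are returned as-is so they can be replayed on the
    target; only created and modified paths are added to the copy list.
    A real implementation would also have to handle directories and paths
    that are both renamed and modified, which this sketch ignores.
    """
    sync_ops = [e for e in entries if e.kind in ("-", "R")]
    copy_list = [e.path for e in entries if e.kind in ("+", "M")]
    return sync_ops, copy_list


if __name__ == "__main__":
    report = [
        DiffEntry("+", "/src/new.txt"),
        DiffEntry("-", "/src/old.txt"),
        DiffEntry("M", "/src/changed.txt"),
        DiffEntry("R", "/src/a.txt", "/src/b.txt"),
    ]
    ops, to_copy = split_diff(report)
    print("sync:", [(e.kind, e.path, e.target) for e in ops])
    print("copy:", to_copy)
```

With the sample report above, the copy list holds two paths instead of every file under the source directory, which is where the savings in list-building time and copy work would come from.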



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
