hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5685) DistCp will fail to copy with -delete switch
Date Fri, 03 Jan 2014 01:13:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861061#comment-13861061 ]

Aaron T. Myers commented on HDFS-5685:

Hi Yongjun,

Patch looks pretty good to me. A few small comments:

# Do we really need doneFilteringJobDir and doneFilteringJobDirDstLsr? Seems like we could
just always do the comparison and simplify the code a bit.
# Variable names should use camelCase, not underscores (e.g. "cmp_job_dir").
# I think this comment is incorrect: "// lsrpath does not exist, delete it if it's not jobDir
or jobDir's ancestor". Really you should delete it only if it's "not jobDir or jobDir is its
ancestor." I think you have the function call in the code correct, though.
# Can you refactor the test code into a single parameterized method that the two test cases
call? Seems like the two test methods are identical except for the source and destination
file systems.
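
To illustrate the last point, here is a minimal sketch of the shape such a refactoring could take. Every name in it is hypothetical and none comes from the patch. It assumes plain java.nio temp directories in place of the two Hadoop file systems, and mirrorWithDelete is only a stand-in for the DistCp run, so treat it as an outline rather than the actual test code.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DeleteSwitchTestSketch {

    // Snapshot a directory listing so entries can be deleted safely afterwards.
    static List<Path> list(Path dir) throws IOException {
        try (Stream<Path> s = Files.list(dir)) {
            return s.collect(Collectors.toList());
        }
    }

    // Stand-in for the copy under test (NOT DistCp): removes top-level dst
    // entries that are missing from src, then copies src entries into dst.
    static void mirrorWithDelete(Path src, Path dst) throws IOException {
        for (Path p : list(dst)) {
            if (!Files.exists(src.resolve(p.getFileName()))) {
                Files.delete(p);
            }
        }
        for (Path p : list(src)) {
            Files.copy(p, dst.resolve(p.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Shared body: only the source and destination roots vary between cases.
    static void checkDelete(Path srcRoot, Path dstRoot) throws IOException {
        Files.write(srcRoot.resolve("keep.txt"), "data".getBytes());
        Files.write(dstRoot.resolve("stale.txt"), "old".getBytes());

        mirrorWithDelete(srcRoot, dstRoot);

        if (!Files.exists(dstRoot.resolve("keep.txt"))) {
            throw new AssertionError("keep.txt was not copied");
        }
        if (Files.exists(dstRoot.resolve("stale.txt"))) {
            throw new AssertionError("stale.txt was not deleted");
        }
    }

    // Each test case shrinks to one call that picks its pair of roots.
    static void testDeleteFirstPair() throws IOException {
        checkDelete(Files.createTempDirectory("src1"),
                Files.createTempDirectory("dst1"));
    }

    static void testDeleteSecondPair() throws IOException {
        checkDelete(Files.createTempDirectory("src2"),
                Files.createTempDirectory("dst2"));
    }

    public static void main(String[] args) throws IOException {
        testDeleteFirstPair();
        testDeleteSecondPair();
    }
}
```

In the real tests the two callers would pass their respective source and destination file systems instead of temp directories, and the shared method would run DistCp with -delete where the stand-in is called.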

> DistCp will fail to copy with -delete switch
> --------------------------------------------
>                 Key: HDFS-5685
>                 URL: https://issues.apache.org/jira/browse/HDFS-5685
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 1.2.1
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>             Fix For: 1.3.0
>         Attachments: HDFS-5685.001.patch, HDFS-5685.002.patch
> When using the distcp command to copy files with the -delete switch, running as user <xyz>:
> hadoop distcp -p -i -update -delete hdfs://srchost:<port>/user hdfs://dsthost:<port>/user
> it fails with the following exception:
> Copy failed: java.io.FileNotFoundException: File does not exist: hdfs://dsthost:<port>/user/xyz/.stagingdistcp_urjb0g/_distcp_src_files
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:557)
>         at org.apache.hadoop.tools.DistCp$CopyInputFormat.getSplits(DistCp.java:266)
>         at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
>         at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
>         at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1353)
>         at org.apache.hadoop.tools.DistCp.copy(DistCp.java:667)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
