hadoop-common-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13024) Distcp with -delete feature on raw data not implemented
Date Thu, 13 Oct 2016 20:30:20 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HADOOP-13024:
-------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 3.0.0-alpha2
                   2.8.0
           Status: Resolved  (was: Patch Available)

I've committed the patch to trunk, branch-2 and 2.8. Thanks for the contribution, [~mavinmartin@gmail.com]!

> Distcp with -delete feature on raw data not implemented
> -------------------------------------------------------
>
>                 Key: HADOOP-13024
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13024
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Mavin Martin
>            Assignee: Mavin Martin
>             Fix For: 2.8.0, 3.0.0-alpha2
>
>         Attachments: HADOOP-13024.patch, HADOOP-13024.patch, HADOOP-13024.patch.10, HADOOP-13024.patch.3, HADOOP-13024.patch.4, HADOOP-13024.patch.5, HADOOP-13024.patch.6, HADOOP-13024.patch.7, HADOOP-13024.patch.8, HADOOP-13024.patch.9
>
>
> When running distcp on raw data with the -delete option, the following error appears.
> {code}
> [root@xxx bin]# hadoop distcp -delete -update /.reserved/raw/tmp/a /.reserved/raw/tmp/b
> 16/04/14 02:54:01 ERROR tools.DistCp: Exception encountered
> java.io.IOException: DistCp failure: Job job_xxx has failed: Job commit failed: org.apache.hadoop.tools.CopyListing$InvalidInputException: The source path 'hdfs://nn/.reserved/raw/tmp/b' starts with /.reserved/raw but the target path 'hdfs://nn/NONE' does not. Either all or none of the paths must have this prefix.
>         at org.apache.hadoop.tools.SimpleCopyListing.validatePaths(SimpleCopyListing.java:141)
>         at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:85)
>         at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
>         at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
>         at org.apache.hadoop.tools.mapred.CopyCommitter.deleteMissing(CopyCommitter.java:244)
>         at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:94)
>         at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:274)
>         at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>         at org.apache.hadoop.tools.DistCp.execute(DistCp.java:187)
>         at org.apache.hadoop.tools.DistCp.run(DistCp.java:122)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.tools.DistCp.main(DistCp.java:429)
> {code}
> The distributed copy itself is not the problem. The failure occurs in the delete step: when DistCp tries to delete entries in the target that no longer exist in the source, it re-runs path validation against the target path NONE, which is not in the /.reserved/raw domain.
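
For illustration, the "all or none" prefix rule quoted in the error message can be sketched as below. This is a minimal, self-contained approximation and not the actual SimpleCopyListing.validatePaths code; the class name RawPrefixCheck and its methods are hypothetical, and the "/NONE" target is taken from the path shown in the stack trace above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "all or none" /.reserved/raw rule.
// It is NOT the Hadoop implementation; names are made up for illustration.
public class RawPrefixCheck {
    static final String RAW_PREFIX = "/.reserved/raw";

    // A path counts as raw if it is the prefix itself or lies beneath it.
    static boolean isRaw(String path) {
        return path.equals(RAW_PREFIX) || path.startsWith(RAW_PREFIX + "/");
    }

    // True when every source agrees with the target on being raw or non-raw.
    static boolean allOrNoneRaw(List<String> sources, String target) {
        boolean targetRaw = isRaw(target);
        for (String source : sources) {
            if (isRaw(source) != targetRaw) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Copy phase: raw source, raw target, so the check passes.
        System.out.println(allOrNoneRaw(
            Arrays.asList("/.reserved/raw/tmp/a"), "/.reserved/raw/tmp/b")); // true

        // Delete phase as seen in the error: the raw target directory is
        // listed as the source, paired with the non-raw "/NONE" target.
        System.out.println(allOrNoneRaw(
            Arrays.asList("/.reserved/raw/tmp/b"), "/NONE")); // false
    }
}
```

Under these assumptions the second call fails for the same reason the job commit does: the listing built for the delete step pairs a raw source with a target that lacks the prefix.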



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
