hadoop-mapreduce-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-648) Two distcp bugs
Date Tue, 15 Sep 2009 21:03:58 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated MAPREDUCE-648:

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I have committed this.  Thanks, Ravi!

> Two distcp bugs
> ---------------
>                 Key: MAPREDUCE-648
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-648
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: distcp
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>            Priority: Minor
>             Fix For: 0.21.0
>         Attachments: d_648_644.patch, d_dirCount648.patch, d_dirCount648.v1.patch, d_dirCount_648.patch
> h4. 1. distcp -update launches job when there is at least one dir in source paths to
> be copied, even though there is nothing to copy.
> HADOOP-5675 added a check of fileCount > 0 to decide whether to launch the job, and
> HADOOP-5762 changed this to fileCount + dirCount > 0 to solve the issue of empty
> directories not getting copied to the destination. With -update, dirCount is incremented
> without checking whether that dir already exists at the destination, so the distcp job
> is launched because dirCount > 0 even though there is nothing to copy. With -update,
> incrementing dirCount can be skipped if that dir already exists at the destination.
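
The launch condition described in item 1 can be sketched as follows. This is a minimal illustration, not the DistCp source or the committed patch: the class name, the method signature, and the use of a plain set of directory names in place of a file-system existence check are all assumptions made for the example.

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the counting step; not the actual DistCp code.
final class LaunchDecisionSketch {

    /**
     * @param fileCount        number of files that still need to be copied
     * @param sourceDirs       directories found under the source paths
     * @param existingDestDirs directories already present at the destination
     *                         (assumption: stands in for a file-system lookup)
     * @param update           true when -update was given
     * @return true if a copy job should be launched
     */
    static boolean shouldLaunch(int fileCount, List<String> sourceDirs,
                                Set<String> existingDestDirs, boolean update) {
        int dirCount = 0;
        for (String dir : sourceDirs) {
            // Proposed fix: with -update, a dir that already exists at the
            // destination is not work to be done, so it is not counted.
            if (update && existingDestDirs.contains(dir)) {
                continue;
            }
            dirCount++;
        }
        // Condition from HADOOP-5762: launch if there is any file or dir to copy.
        return fileCount + dirCount > 0;
    }
}
```

With -update, no files to copy, and a single source dir that already exists at the destination, shouldLaunch returns false. Without -update the same input still returns true, which keeps the empty-directory behavior from HADOOP-5762.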
> h4. 2. distcp doesn't skip copying file when we do -update on single file if the destfile
> already exists.
> When we do
>     hadoop distcp -update srcfilename destfilename
> it seems to compare the checksums of srcfilename and destfilename/srcfilename, so the
> copy is not skipped. It should compare the checksums of srcfilename and destfilename.
> See also MAPREDUCE-644.
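
The path selection described in item 2 can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, paths are plain strings rather than Hadoop path objects, and the fallback to destfilename/srcfilename when the destination is not an existing file is an assumption, since the report only specifies the existing-file case.

```java
// Hypothetical helper; not the actual DistCp code.
final class CompareTargetSketch {

    /**
     * Picks the destination path whose checksum should be compared with the
     * source file when deciding whether -update can skip the copy.
     *
     * @param srcFile            path of the single source file
     * @param dest               destination path given on the command line
     * @param destIsExistingFile true if dest already exists and is a file
     */
    static String compareTarget(String srcFile, String dest, boolean destIsExistingFile) {
        if (destIsExistingFile) {
            // Behavior requested in the report: compare against destfilename itself.
            return dest;
        }
        // Assumption: in every other case the file lands under dest, which is
        // the destfilename/srcfilename path the report observed.
        String srcName = srcFile.substring(srcFile.lastIndexOf('/') + 1);
        return dest.endsWith("/") ? dest + srcName : dest + "/" + srcName;
    }
}
```

For a source /a/b/data.txt and an existing destination file /out/copy.txt, the comparison target is /out/copy.txt rather than /out/copy.txt/data.txt, so equal checksums would let the copy be skipped.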

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
