hadoop-common-dev mailing list archives

From "Ravi Gummadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6051) distcp does not skip copying file if we are updating single file
Date Thu, 18 Jun 2009 04:45:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721060#action_12721060 ]

Ravi Gummadi commented on HADOOP-6051:

Earlier, only file sizes were checked. In trunk, checksums are now also checked after checking
file sizes.
In any case, if I run the following command multiple times

hadoop distcp -update srcfile destfile

and if destfile doesn't exist, -update should copy the file only once. From the 2nd run
onwards it should not copy, as the file sizes (and checksums) are the same.
But the problem seems to be that distcp is not comparing the file sizes and checksums of
srcfile and destfile. Instead, distcp seems to be comparing srcfile with the path
destfile/srcfile (i.e. srcfile inside a destfile directory), which is wrong.
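
To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is not distcp code: the filesystem is a plain dict, and FileInfo, should_skip, buggy_target, expected_target and update_once are hypothetical names made up for illustration. It also assumes that the copy itself always lands at destfile and that only the comparison target differs, which is how the behaviour appears from the outside.

```python
import posixpath
from typing import Callable, Dict, NamedTuple, Optional


class FileInfo(NamedTuple):
    """Hypothetical stand-in for a file's size and checksum."""
    size: int
    checksum: str


def should_skip(src: FileInfo, dst: Optional[FileInfo]) -> bool:
    """Skip only if the target exists and matches on size, then checksum."""
    if dst is None:
        return False
    if src.size != dst.size:
        return False
    return src.checksum == dst.checksum


def buggy_target(src_path: str, dst_path: str) -> str:
    """Observed behaviour: compare against destfile/srcfile."""
    return posixpath.join(dst_path, posixpath.basename(src_path))


def expected_target(src_path: str, dst_path: str) -> str:
    """Expected behaviour for a single source file: compare against destfile."""
    return dst_path


def update_once(fs: Dict[str, FileInfo], src_path: str, dst_path: str,
                resolve: Callable[[str, str], str]) -> bool:
    """Simulate one -update run; return True if the file was copied."""
    target = resolve(src_path, dst_path)
    if should_skip(fs[src_path], fs.get(target)):
        return False
    # Assumption: the copy is written to destfile in both cases.
    fs[dst_path] = fs[src_path]
    return True


if __name__ == "__main__":
    for resolve in (expected_target, buggy_target):
        fs = {"/data/srcfile": FileInfo(10, "abc")}
        runs = [update_once(fs, "/data/srcfile", "/out/destfile", resolve)
                for _ in range(3)]
        print(resolve.__name__, runs)
```

With expected_target only the first run copies. With buggy_target every run copies, because /out/destfile/srcfile never exists in the simulated filesystem.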

> distcp does not skip copying file if we are updating single file
> ----------------------------------------------------------------
>                 Key: HADOOP-6051
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6051
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>             Fix For: 0.21.0
> distcp doesn't skip copying the file when we do -update on a single file if the destfile already
> exists.
> When we do
> hadoop distcp -update srcfilename destfilename
> it seems to be comparing checksums of srcfilename and destfilename/srcfilename, and so the
> skip is not done. It should compare checksums of srcfilename and destfilename.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
