hadoop-common-dev mailing list archives

From "eric baldeschwieler (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-738) dfs get or copyToLocal should not copy crc file
Date Sat, 25 Nov 2006 23:57:03 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-738?page=comments#action_12452613 ] 
            
eric baldeschwieler commented on HADOOP-738:
--------------------------------------------

Well, the discussion does seem to be on topic. As discussed in the thread, there are several
reasons to consider the change. The description could be enhanced to capture them.

I agree that we probably should fix the interaction of the CRC files in the current copy too,
although renaming them so they are visible would help a lot no matter what.

I also agree that we will need sub-block CRC info even when we move the CRC data to be a block
attribute.

Moving multi-terabyte objects to local disk is not the prototypical use of Hadoop in our
environment.  It is certainly not the prototypical reason we invoke -get or -copyToLocal.

We don't seem to be converging here.  Maybe we should create two commands: one, typically
used for lightweight copies, that does not write CRCs, and one that does, with an inverse
command that validates the CRCs on import.  (Although what happens when a CRC does not match
on import?)  The CRC exporter would create visible CRCs and would have well-defined semantics
for overwriting files.  (Failing if the target directory already exists would avoid the problem
in the description ...)

> dfs get or copyToLocal should not copy crc file
> -----------------------------------------------
>
>                 Key: HADOOP-738
>                 URL: http://issues.apache.org/jira/browse/HADOOP-738
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.8.0
>         Environment: all
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.9.0
>
>         Attachments: hadoop-crc.patch
>
>
> Currently, when we -get or -copyToLocal a directory from DFS, all the files, including
> crc files, are copied. When we -put or -copyFromLocal again, since the crc files already
> exist on DFS, the put fails. The solution is not to copy checksum files when copying to
> local. Patch is forthcoming.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
