hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-112) copyFromLocal should exclude .crc files
Date Wed, 05 Apr 2006 19:17:45 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-112?page=comments#action_12373413 ] 

Doug Cutting commented on HADOOP-112:

The changes made are listed at:


The described problem was fixed: a -copyToLocal (a.k.a. -get) followed by a -copyFromLocal
(a.k.a. -put) no longer fails complaining about a .crc file.  If this is failing again then
this bug should be re-opened.  Otherwise I think it should remain closed.

If there is a problem with 'dfs -cp' then I think that is a separate bug, no?

> copyFromLocal should exclude .crc files
> ---------------------------------------
>          Key: HADOOP-112
>          URL: http://issues.apache.org/jira/browse/HADOOP-112
>      Project: Hadoop
>         Type: Bug

>   Components: dfs
>  Environment: DFS cluster of 6 3GHz Xeons with 2GB RAM running CentOS 4.2 and Sun's JDK 1.5,
>               but probably applies in any environment
>     Reporter: Monu Ogbe
>     Assignee: Doug Cutting
>     Priority: Minor
>      Fix For: 0.1.0

> Doug Cutting says: "The problem is that when copyFromLocal 
> enumerates local files it should exclude .crc files, but it does not. 
> This is the listFiles() call on DistributedFileSystem:160.  It should 
> filter this, excluding files that are FileSystem.isChecksumFile().
> BTW, as a workaround, it is safe to first remove all of the .crc files, 
> but your files will no longer be checksummed as they are read.  On 
> systems without ECC memory file corruption is not uncommon, but I have 
> seen very little on clusters that have ECC."
> Original observations:
> Hello Team,
> I created a backup of my DFS database:
> # bin/hadoop dfs -copyToLocal /user/root/crawl /mylocaldir
> I now want to restore from the backup using:
> # bin/hadoop dfs -copyFromLocal /mylocaldir/crawl /user/root
> However I'm getting the following error:
> copyFromLocal: Target /user/root/crawl/crawldb/current/part-00000/.data.crc already exists
> I get this message with every permutation of the command that I've tried, and
> even after totally deleting all content in the DFS directories.
> I'd be grateful for any pointers.
> Many thanks,
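
For illustration only, the filtering described in the issue (skip checksum files when enumerating local files) might look roughly like the sketch below. It is not the committed fix and does not use the Hadoop API: the class CrcFilterSketch and both of its helper methods are hypothetical, and the naming rule (a leading "." and a trailing ".crc", as in the ".data.crc" from the error above) is an assumption standing in for FileSystem.isChecksumFile().

```java
import java.io.File;
import java.io.FileFilter;
import java.io.IOException;
import java.nio.file.Files;

public class CrcFilterSketch {

    // Hypothetical stand-in for FileSystem.isChecksumFile(): assumes checksum
    // files are named ".<name>.crc", as ".data.crc" is in the reported error.
    static boolean isChecksumFile(File file) {
        String name = file.getName();
        return name.startsWith(".") && name.endsWith(".crc");
    }

    // Lists the entries of a local directory, leaving out checksum files.
    // Returns an empty array if dir is not a readable directory.
    static File[] listNonChecksumFiles(File dir) {
        FileFilter notChecksum = f -> !isChecksumFile(f);
        File[] files = dir.listFiles(notChecksum);
        return files == null ? new File[0] : files;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory holding one data file and its .crc file.
        File dir = Files.createTempDirectory("crc-sketch").toFile();
        new File(dir, "data").createNewFile();
        new File(dir, ".data.crc").createNewFile();

        // Only the data file should be enumerated for copying.
        for (File f : listNonChecksumFiles(dir)) {
            System.out.println(f.getName());
        }
    }
}
```

With a filter of this kind applied during enumeration, the copy would never try to upload .data.crc itself, so it could not collide with the checksum file that DFS creates alongside the data file.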

This message is automatically generated by JIRA.