hadoop-common-dev mailing list archives

From Eric Baldeschwieler <eri...@yahoo-inc.com>
Subject Re: Help: -copyFromLocal
Date Thu, 30 Mar 2006 04:42:24 GMT
Interesting.  It would actually be nice to include the CRCs in an  
export, so that you can validate your data when you reload it.  CRCs  
are best if they are kept end to end.
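
A minimal sketch of what "end to end" could mean in practice: compute a CRC when the data is exported, carry it along with the data, and recompute it on reload. This assumes a plain CRC-32 from java.util.zip; it is not Hadoop's own checksum code, and the class and method names (EndToEndCrcSketch, crcOf, matches) are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; not part of Hadoop.
class EndToEndCrcSketch {

    // Compute a CRC-32 over the whole byte array.
    static long crcOf(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Validate reloaded data against the CRC recorded at export time.
    static boolean matches(byte[] data, long expectedCrc) {
        return crcOf(data) == expectedCrc;
    }

    public static void main(String[] args) {
        byte[] exported = "crawl record".getBytes(StandardCharsets.UTF_8);
        long storedCrc = crcOf(exported);   // travels with the export

        byte[] reloaded = exported.clone(); // intact copy
        System.out.println(matches(reloaded, storedCrc)); // prints true

        reloaded[0] ^= 0x01;                // simulate a single flipped bit
        System.out.println(matches(reloaded, storedCrc)); // prints false
    }
}
```

The point of keeping the stored CRC with the export is that corruption introduced at any stage (disk, memory, network) shows up at the final check, not only corruption introduced inside the file system.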

On Mar 29, 2006, at 9:59 AM, Doug Cutting wrote:

> monu.ogbe@richmondinformatics.com wrote:
>> However I'm getting the following error:
>> copyFromLocal: Target /user/root/crawl/crawldb/current/part-00000/.data.crc already exists
>
> Please file a bug report.  The problem is that when copyFromLocal
> enumerates local files it should exclude .crc files, but it does
> not.  The culprit is the listFiles() call at
> DistributedFileSystem:160, which should filter its results to
> exclude any file for which FileSystem.isChecksumFile() returns true.
>
> BTW, as a workaround, it is safe to remove all of the .crc files
> first, but your files will then no longer be checksummed as they
> are read.  On systems without ECC memory, file corruption is not
> uncommon, but I have seen very little on clusters that have ECC.
>
> Doug
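
The filtering Doug describes could be sketched roughly as follows. This is not the actual Hadoop patch: the naming rule in isChecksumFile (a leading "." and a trailing ".crc", as in the .data.crc file from the error above) is an assumption about what FileSystem.isChecksumFile() checks, and CrcFilterSketch and its methods are hypothetical names. It uses only java.io.File from the standard library.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the DistributedFileSystem code.
class CrcFilterSketch {

    // Assumed naming rule for checksum files, e.g. ".data.crc".
    static boolean isChecksumFile(String name) {
        return name.startsWith(".") && name.endsWith(".crc");
    }

    // Keep only the names that are not checksum files, preserving order.
    static List<String> withoutChecksumFiles(List<String> names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (!isChecksumFile(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    // Same idea applied while enumerating a local directory.
    static File[] listDataFiles(File dir) {
        File[] files = dir.listFiles((d, name) -> !isChecksumFile(name));
        // listFiles returns null if dir is not a directory or cannot be read.
        return files == null ? new File[0] : files;
    }
}
```

With a filter like this in place, the copy would enumerate data and index but skip .data.crc and .index.crc, so the target file system can create its own checksum files without a name collision.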

