hadoop-common-user mailing list archives

From Raymond Jennings III <raymondj...@yahoo.com>
Subject Is hdfs reliable? Very odd error
Date Sat, 14 Aug 2010 03:12:22 GMT
I copied a 230 GB file into my Hadoop cluster. After my MR job kept failing, I 
tracked the error down to one line of formatted text.

I copied the file back out of HDFS, and when I compared it to the original file, 
about 20 bytes on one line (out of 230 GB) were different.
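
For anyone wanting to reproduce this kind of check, the comparison can be sketched roughly as below. This is only an illustration, not the exact procedure used here: it assumes both copies sit on a local filesystem, and the function name `first_differences` and its parameters are made up for the example. It reads both files in fixed-size chunks, so a 230 GB file never has to fit in memory.

```python
def first_differences(path_a, path_b, chunk_size=1 << 20, limit=100):
    """Return up to `limit` byte offsets at which the two files differ.

    Hypothetical helper for illustration; the paths are assumed to be
    ordinary local files (for example the original and the copy pulled
    back out of HDFS).
    """
    offsets = []
    position = 0  # absolute offset of the start of the current chunk
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while len(offsets) < limit:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break  # both files are exhausted
            if a != b:
                # Only scan byte by byte inside chunks that actually differ.
                for i in range(max(len(a), len(b))):
                    # Slicing yields b"" past the end, so a length
                    # mismatch is also reported as a difference.
                    if a[i:i + 1] != b[i:i + 1]:
                        offsets.append(position + i)
                        if len(offsets) >= limit:
                            break
            position += max(len(a), len(b))
    return offsets
```

Calling `first_differences("original.txt", "from_hdfs.txt")` (both file names are placeholders) would return a list of differing byte offsets, which can then be mapped back to a line in the text file.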

Is no CRC or checksum verification done when copying files into HDFS?

(Just to be clear, I copied the original file back out of HDFS, not the output of 
my MR job.)

