hadoop-common-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Many Checksum Errors
Date Wed, 16 May 2007 19:15:19 GMT
Raghu Angadi wrote:
> But this will not fix the same problem with block-level checksums. 
> Pretty soon, HDFS will not use ChecksumFileSystem at all.

I'd hope that block-level checksums do not replicate logic from 
ChecksumFileSystem.  Rather they should probably share substantial 
portions of their checksumming input and output stream implementations, 
no?  So it could fix the same problem for block-level checksums, and 
should if possible.

> Ideally we should let the implementations decide how to buffer.

I'm not sure what you mean by this.  The buffer size is a parameter to 
FileSystem's open() and create() methods.  Whether checksums require 
another level of buffering is a separate issue.  Is it efficient to 
invoke the CRC32 code as each byte is written, or is it faster to run it 
in 512-byte or larger batches?
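
One rough way to explore that question is a small sketch like the one below. It feeds the same data to java.util.zip.CRC32 once byte by byte and once in 512-byte batches, checks that both give the same value, and prints crude timings. The class name Crc32Batching, the method names, and the 16 MB input are made up for illustration and are not taken from ChecksumFileSystem; the timing is a single unwarmed run, so treat it as a hint rather than a benchmark.

```java
import java.util.Random;
import java.util.zip.CRC32;

// Hypothetical illustration only; not code from Hadoop.
class Crc32Batching {

    // Update the checksum once per byte, as an unbuffered write(int) path would.
    public static long crcPerByte(byte[] data) {
        CRC32 crc = new CRC32();
        for (byte b : data) {
            crc.update(b); // update(int) uses only the low eight bits
        }
        return crc.getValue();
    }

    // Update the checksum in batches of up to batchSize bytes.
    public static long crcBatched(byte[] data, int batchSize) {
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += batchSize) {
            int len = Math.min(batchSize, data.length - off);
            crc.update(data, off, len);
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = new byte[16 * 1024 * 1024]; // assumed test size
        new Random(42).nextBytes(data);

        long t0 = System.nanoTime();
        long perByte = crcPerByte(data);
        long t1 = System.nanoTime();
        long batched = crcBatched(data, 512);
        long t2 = System.nanoTime();

        // Both call patterns cover the same bytes, so the values must match.
        System.out.println("same checksum: " + (perByte == batched));
        System.out.println("per-byte ms:   " + (t1 - t0) / 1_000_000);
        System.out.println("512-batch ms:  " + (t2 - t1) / 1_000_000);
    }
}
```

Whatever the timings show on a given JVM, the checksum value itself does not depend on how the updates are batched, so the batch size can be chosen purely on performance grounds.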

