hadoop-common-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <cutt...@apache.org>
Subject Re: [jira] Updated: (HADOOP-1180) NNbench test should be able to test the checksumfilesystem as well as the raw filesystem
Date Thu, 29 Mar 2007 19:07:01 GMT
Nigel Daley wrote:
> So shouldn't fixing this test to conform to the new model in HADOOP-1134 
> be the concern of the patch for HADOOP-1134? 

Yes, but, as it stands, this patch would silently stop working correctly 
once HADOOP-1134 is committed.  It should instead be written in a more 
robust way, that can survive expected changes.  Relying on HDFS using 
ChecksumFileSystem isn't as reliable as an explicit constructor that 
says "I want an unchecksummed FileSystem."

> As it stand, I can't run 
> NNBench at scale without using a raw file system, which is what this 
> patch is intended to allow.

It seems strange to disable things in an undocumented and unsupported 
way in order to get a benchmark to complete.  How does that prove 
scalability?  Rather, leaving NNBench alone seems like a strong argument 
for implementing HADOOP-1134 sooner.

Still, if you want to be able to disable checksums, for benchmarks or 
whatever, we can permit that, but should do so explicitly.

> HADOOP-928 caused this test to use a 
> ChecksumFileSystem and subsequently we saw our "read" TPS metric plummet 
> from 20,000 to a couple hundred.

Ah, NNBench used the 'raw' methods before, which was kind of sneaky on 
its part, since it didn't benchmark the typical user experience. 
Although the namenode performance should only halve at worst with 
checksums as currently implemented, no?

> Let's get our current benchmark back on track before we commit 
> HADOOP-1134 (which will likely take a while before it is "Patch 
> Available").

I'd argue that we should fix the benchmark to accurately reflect what 
users see, so that we see real improvement when HADOOP-1134 is 
committed.  That would make it a more useful and realistic benchmark. 
However if you believe that a checksum-free benchmark is still useful, I 
think it should be more future-proof.


View raw message