hadoop-common-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Question on Critical Region size for SequenceFile next/write - 0.15.1
Date Wed, 12 Dec 2007 21:57:29 GMT
Jason Venner wrote:
> On investigating, we discovered that the entirety of next(key, value)
> and the entirety of write(key, value) are synchronized on the file
> object.
> This causes all threads to back up on the serialization/deserialization.

I'm not sure what you want to happen here.  If you've got a bunch of 
threads writing to a single file, and that's your performance 
bottleneck, I don't see how to improve the situation except to write to 
multiple files on different drives, or to spread your load across a 
larger cluster (another way to get more drives).
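
As a rough illustration of the "multiple files" suggestion above, here is a minimal sketch in which each thread owns its own output file, so no writer lock is shared between threads. It uses plain java.nio as a stand-in for SequenceFile; the class name PerThreadFiles, the method writeInParallel, and the "part-N.txt" naming are all hypothetical and not part of Hadoop. In a real job each thread would presumably open its own SequenceFile.Writer instead, and the files would only help throughput if they end up on different drives.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class PerThreadFiles {
    // Hypothetical helper: each of `threads` workers writes `recordsPerThread`
    // records to its own file under `dir`, so writers never contend on one lock.
    public static List<Path> writeInParallel(Path dir, int threads, int recordsPerThread)
            throws InterruptedException {
        List<Path> outputs = new ArrayList<>();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            final Path out = dir.resolve("part-" + id + ".txt");
            outputs.add(out);
            Thread worker = new Thread(() -> {
                // The writer is local to this thread; nothing else touches it.
                try (BufferedWriter w = Files.newBufferedWriter(out)) {
                    for (int i = 0; i < recordsPerThread; i++) {
                        w.write("key-" + id + "-" + i + "\tvalue-" + i);
                        w.newLine();
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            workers.add(worker);
            worker.start();
        }
        // Wait for every writer to finish and close its file.
        for (Thread worker : workers) {
            worker.join();
        }
        return outputs;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("per-thread-files");
        List<Path> outputs = writeInParallel(dir, 4, 1000);
        for (Path p : outputs) {
            System.out.println(p + ": " + Files.readAllLines(p).size() + " records");
        }
    }
}
```

The trade-off is that readers then have to consume several files instead of one, which is usually acceptable for MapReduce-style input.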

