hadoop-common-user mailing list archives

From Enis Soztutar <enis.soz.nu...@gmail.com>
Subject Re: Sort benchmark on 2000 nodes
Date Wed, 05 Sep 2007 13:01:32 GMT
I am wondering how Hadoop scores on sorting 1 TB with, say, 1000 nodes. Is
it possible for you to try the Terasort benchmark?

Devaraj Das wrote:
> This is FYI. We at Yahoo! could successfully run Hadoop (up-to-date trunk
> version) on a cluster of 2000 nodes. The programs we ran were RandomWriter
> and Sort. Sort performance was pretty good - we could sort 20TB of data in
> 2.5 hours! Not many task failures - most of those that failed encountered
> file checksum errors during merge and map output serving, some got killed
> due to lack of progress reporting. Overall, a pretty successful run.
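
For a rough sense of scale, the reported figures (20 TB sorted in 2.5 hours on 2000 nodes) work out to about 2.2 GB/s of sorted data across the cluster, or about 1.1 MB/s per node. The sketch below redoes that arithmetic and extrapolates it to the 1 TB / 1000 node case. It assumes decimal terabytes and a per-node rate that stays the same at the smaller scale, neither of which the thread confirms, and it counts only bytes sorted, not the underlying read, shuffle, and write traffic. The function names are made up for illustration.

```python
# Back-of-the-envelope throughput from the figures quoted above.
# Assumption: 1 TB = 10**12 bytes (the thread does not say which unit is meant).
TB = 10**12


def per_node_rate(total_bytes, seconds, nodes):
    """Bytes of sorted data per second per node."""
    return total_bytes / seconds / nodes


def projected_seconds(total_bytes, nodes, rate_per_node):
    """Naive estimate: assumes the per-node rate carries over unchanged."""
    return total_bytes / (nodes * rate_per_node)


# Reported run: 20 TB in 2.5 hours on 2000 nodes.
reported_rate = per_node_rate(20 * TB, 2.5 * 3600, 2000)

# Hypothetical run asked about in the reply: 1 TB on 1000 nodes.
estimate = projected_seconds(1 * TB, 1000, reported_rate)

print(f"aggregate: {reported_rate * 2000 / 1e9:.2f} GB/s")
print(f"per node:  {reported_rate / 1e6:.2f} MB/s")
print(f"naive 1 TB on 1000 nodes: {estimate / 60:.1f} min")
```

This is only a linear extrapolation. A real 1 TB run could behave quite differently, since job startup cost, shuffle behavior, and task failures do not scale linearly, so an actual Terasort run would still be needed to answer the question.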
