hadoop-common-user mailing list archives

From "Joydeep Sen Sarma" <jssa...@facebook.com>
Subject RE: Sort benchmark on 2000 nodes
Date Wed, 05 Sep 2007 17:19:31 GMT
It would be very useful to see the Hadoop/job config settings and get
some sense of the underlying hardware config.

-----Original Message-----
From: Devaraj Das [mailto:ddas@yahoo-inc.com] 
Sent: Wednesday, September 05, 2007 2:29 AM
To: hadoop-user@lucene.apache.org
Subject: Sort benchmark on 2000 nodes

This is FYI. We at Yahoo! successfully ran Hadoop (up-to-date trunk
version) on a cluster of 2000 nodes. The programs we ran were
RandomWriter and Sort. Sort performance was pretty good - we could sort
20TB of data in 2.5 hours! There were not many task failures - most of
the tasks that failed encountered file checksum errors during merge and
map output serving, and some were killed due to lack of progress
reporting. Overall, a pretty successful run.

