hadoop-common-user mailing list archives

From Eric Baldeschwieler <eri...@yahoo-inc.com>
Subject Re: Sort benchmark on 2000 nodes
Date Thu, 06 Sep 2007 08:07:13 GMT
Hardware is similar to that discussed here:

http://wiki.apache.org/lucene-hadoop-data/attachments/HadoopPresentations/attachments/oscon-part-2.pdf

- 10:1 oversubscribed network (so ~100 Mbit/s bandwidth, all nodes to all nodes)
- 40 nodes per leaf switch
- Machines are beefy:
   - 4 SATA drives, 500 or 750 GB each, 7200 RPM
   - 4+ cores (modern Intels or AMDs)
   - 4+ GB RAM
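A quick sanity check of the network figure above: the quoted ~100 Mbit/s all-to-all bandwidth follows from the 10:1 oversubscription if each host has a 1 Gbit/s link. The 1 Gbit/s host link is an assumption (typical for clusters of that era), not stated in the mail.

```python
# Back-of-envelope check of the network numbers above.
# Assumption (not in the mail): each node has a 1 Gbit/s NIC.
HOST_LINK_MBIT = 1000    # assumed 1 Gbit/s host link
OVERSUBSCRIPTION = 10    # 10:1, from the mail

all_to_all_mbit = HOST_LINK_MBIT / OVERSUBSCRIPTION
print(f"{all_to_all_mbit:.0f} Mbit/s per node, all-to-all")  # 100 Mbit/s
```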

On Sep 5, 2007, at 10:19 AM, Joydeep Sen Sarma wrote:

> It will be very useful to see the hadoop/job config settings and get
> some sense of the underlying hardware config.
>
> -----Original Message-----
> From: Devaraj Das [mailto:ddas@yahoo-inc.com]
> Sent: Wednesday, September 05, 2007 2:29 AM
> To: hadoop-user@lucene.apache.org
> Subject: Sort benchmark on 2000 nodes
>
> This is FYI. We at Yahoo! successfully ran Hadoop (up-to-date trunk
> version) on a cluster of 2000 nodes. The programs we ran were
> RandomWriter and Sort. Sort performance was pretty good - we sorted
> 20 TB of data in 2.5 hours! There were not many task failures - most of
> those that failed encountered file checksum errors during merge and
> map-output serving, and some were killed for lack of progress
> reporting. Overall, a pretty successful run.
>
>
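The quoted numbers imply a rough aggregate throughput. This is a sketch using only figures from the mail (20 TB, 2.5 hours, 2000 nodes); treating TB as 10**12 bytes is an assumption, and note that a sort actually reads and writes each record several times, so real per-node I/O was higher than the end-to-end figure computed here.

```python
# Rough end-to-end throughput implied by the quoted run:
# 20 TB sorted in 2.5 hours on 2000 nodes.
data_bytes = 20 * 10**12   # 20 TB, taken as decimal terabytes (assumption)
seconds = 2.5 * 3600
nodes = 2000

cluster_bps = data_bytes / seconds
per_node_bps = cluster_bps / nodes
print(f"{cluster_bps / 1e9:.2f} GB/s cluster-wide")  # ~2.22 GB/s
print(f"{per_node_bps / 1e6:.2f} MB/s per node")     # ~1.11 MB/s
```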

