hadoop-common-user mailing list archives

From "Devaraj Das" <d...@yahoo-inc.com>
Subject RE: Sort benchmark on 2000 nodes
Date Thu, 06 Sep 2007 12:53:38 GMT
Thanks Eric for pointing out the hardware spec. I have updated the Hadoop
config info on the Hadoop FAQ - http://wiki.apache.org/lucene-hadoop/FAQ
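
For anyone setting up a similar run, the settings in question go in
conf/hadoop-site.xml. The entries below are only an illustration - the values
are placeholders, not the actual settings from this run:

  <configuration>
    <property>
      <name>dfs.block.size</name>
      <value>134217728</value>   <!-- HDFS block size in bytes (128 MB here) -->
    </property>
    <property>
      <name>io.sort.mb</name>
      <value>200</value>         <!-- buffer (in MB) used to sort map output -->
    </property>
    <property>
      <name>io.sort.factor</name>
      <value>100</value>         <!-- number of streams merged at once -->
    </property>
    <property>
      <name>mapred.reduce.tasks</name>
      <value>4000</value>        <!-- placeholder; scale to the cluster size -->
    </property>
  </configuration>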

> -----Original Message-----
> From: Eric Baldeschwieler [mailto:eric14@yahoo-inc.com] 
> Sent: Thursday, September 06, 2007 1:37 PM
> To: hadoop-user@lucene.apache.org
> Subject: Re: Sort benchmark on 2000 nodes
> 
> The hardware is similar to that discussed here:
> 
> http://wiki.apache.org/lucene-hadoop-data/attachments/HadoopPresentations/attachments/oscon-part-2.pdf
> 
> - 10:1 oversubscribed network (so 100 Mbit of bandwidth from all nodes to all nodes)
> - 40 nodes / leaf switch
> - Machines are beefy
>    - 4 SATA drives, 500 or 750 GB each, 7200 RPM
>    - 4+ cores (modern Intels or AMDs)
>    - 4+ GB RAM
> 
> On Sep 5, 2007, at 10:19 AM, Joydeep Sen Sarma wrote:
> 
> > It would be very useful to see the Hadoop/job config settings and to get
> > some sense of the underlying hardware config.
> >
> > -----Original Message-----
> > From: Devaraj Das [mailto:ddas@yahoo-inc.com]
> > Sent: Wednesday, September 05, 2007 2:29 AM
> > To: hadoop-user@lucene.apache.org
> > Subject: Sort benchmark on 2000 nodes
> >
> > This is FYI. We at Yahoo! successfully ran Hadoop (up-to-date trunk
> > version) on a cluster of 2000 nodes. The programs we ran were
> > RandomWriter and Sort. Sort performance was pretty good - we could
> > sort 20 TB of data in 2.5 hours! There were not many task failures -
> > most of the tasks that failed encountered file checksum errors during
> > merge and map-output serving, and some were killed due to lack of
> > progress reporting. Overall, a pretty successful run.
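
For reference, RandomWriter and Sort ship with the Hadoop examples jar, so a
run along these lines should reproduce the setup (the jar name depends on the
release, and the HDFS paths here are just examples):

  bin/hadoop jar hadoop-*-examples.jar randomwriter /benchmarks/unsorted
  bin/hadoop jar hadoop-*-examples.jar sort /benchmarks/unsorted /benchmarks/sorted

RandomWriter writes random binary data into HDFS (by default on the order of
10 GB per node, so a 2000-node cluster gives roughly the 20 TB sorted above),
and Sort is an identity map/reduce job that relies on the framework's shuffle
and merge to produce the sorted output.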

