hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: Hardware Setup
Date Thu, 15 Oct 2009 16:06:12 GMT
Hey Alex,

In order to lower cost, you'll probably want to order the worker nodes
without hard drives and then buy the drives separately.  HDFS replicates
blocks across nodes, which effectively gives you a software-level RAID,
so most of the reasons for buying hard drives from Dell/HP are
irrelevant - you are just paying an extra $400 per hard drive.  I know
Dell sells the R410, which has 4 SATA bays; I'm sure Steve knows an HP
model that has something similar.

However, BE VERY CAREFUL when you do this.  From experience, a certain  
large manufacturer (I don't know about Dell/HP) will refuse to ship  
(or sell separately) hard drive trays if you order their machine  
without hard drives.  When this happened to us, we were not able to  
return the machines because they were custom orders.  Eventually, we  
had to get someone to go to the machine shop and build 72 hard drive  
trays for us.

Worst. Experience. Ever.

So, ALWAYS ASK and make sure that you can buy empty hard drive trays  
for that specific model (or at least that it ships with them).
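
To put rough numbers on this, here is a back-of-the-envelope sketch in
Python.  The node and drive counts come from Alex's message quoted
below, and the $400 premium is my estimate from above.  The replication
factor of 3 is the HDFS default and is assumed unchanged; the class and
property names are made up for illustration, and the OS drives are left
out.

```python
# Rough comparison of the two layouts in Alex's message (a sketch, not a benchmark).
from dataclasses import dataclass

PREMIUM_PER_DRIVE_USD = 400  # estimated vendor markup per drive (from this thread)
REPLICATION = 3              # HDFS default replication factor; assumed unchanged


@dataclass
class Layout:
    """Hypothetical helper describing one cluster layout (data drives only)."""
    nodes: int
    drives_per_node: int
    tb_per_drive: float = 1.0

    @property
    def data_drives(self) -> int:
        return self.nodes * self.drives_per_node

    @property
    def raw_tb(self) -> float:
        return self.data_drives * self.tb_per_drive

    @property
    def usable_tb(self) -> float:
        # Every block is stored REPLICATION times, so usable space shrinks accordingly.
        return self.raw_tb / REPLICATION

    @property
    def vendor_premium_usd(self) -> int:
        return self.data_drives * PREMIUM_PER_DRIVE_USD

    @property
    def share_offline_per_node_failure(self) -> float:
        # Fraction of raw storage that goes offline when a single node dies.
        return 1 / self.nodes


many_small = Layout(nodes=12, drives_per_node=4)
few_large = Layout(nodes=6, drives_per_node=8)

for name, layout in (("12 x 4 drives", many_small), ("6 x 8 drives", few_large)):
    print(
        f"{name}: raw={layout.raw_tb:.0f} TB, "
        f"usable={layout.usable_tb:.0f} TB, "
        f"vendor premium=${layout.vendor_premium_usd}, "
        f"offline per node failure={layout.share_offline_per_node_failure:.1%}"
    )
```

Both layouts hold the same number of data drives, so the drive premium
and the capacity are identical; what changes is how much of the cluster
goes offline (and has to be re-replicated) when one node fails.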


On Oct 15, 2009, at 10:48 AM, Alex Newman wrote:

> So my company is looking at only using Dell or HP for our
> Hadoop cluster and a Sun Thumper to back up the data. The prices are
> OK after a 40% discount, but realistically I am paying twice as much
> as if I went to Silicon Mechanics, and for a much slower machine. It
> seems as though the big expense is the disks. Even with a 40%
> discount, $550 per 1 TB disk seems crazy expensive. Also, they are
> pushing me to build a smaller cluster (6 nodes), and I am pushing back
> for nodes half the size but twice as many of them. So how much of a
> performance difference can I expect between 12 nodes, each with 1 Xeon
> 5 series at 2.26 GHz, 8 GB of RAM, and 4 1 TB disks, and a 6-node
> cluster where each node has 2 Xeon 5 series at 2.26 GHz, 16 GB of RAM,
> and 8 1 TB disks? Both setups will also have 2 very small SATA drives
> in RAID 1 for the OS. I will be doing some stuff with Hadoop and a lot
> of stuff with HBase. What are the considerations for HDFS performance
> with a low number of nodes, etc.?
