hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Big machines or (relatively) small machines?
Date Mon, 07 Jun 2010 17:46:39 GMT
It really depends on your usage pattern, but there's a balance between
cost and hardware that you must strike. At StumbleUpon we run with 2x i7,
24GB, 4x 1TB and it works like a charm. The only thing I would change
is maybe more disks/node but that's pretty much it. Some relevant questions:

 - Do you have any mem-intensive jobs? If so, figure how many tasks
you'll run per node and make the RAM fit the load.
 - Do you plan to serve data out of HBase or will you just use it for
MapReduce? Or will it be a mix (not recommended)?
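
For the first question, a rough sizing sketch may help. Only the 24GB
node size comes from the setup described above; the slot counts,
per-task heap, daemon memory and OS reserve are hypothetical
placeholders that you would replace with your own numbers.

```python
# Rough per-node RAM budget. Every input below is an assumption
# to be replaced with figures from your own jobs and daemons.
def ram_needed_gb(map_slots, reduce_slots, task_heap_gb, daemons_gb, os_reserve_gb):
    """Return the RAM (GB) a node needs to run all task slots at once."""
    task_ram = (map_slots + reduce_slots) * task_heap_gb
    return task_ram + daemons_gb + os_reserve_gb


node_ram_gb = 24  # node size mentioned in this thread
need = ram_needed_gb(
    map_slots=8,        # hypothetical
    reduce_slots=4,     # hypothetical
    task_heap_gb=1.0,   # hypothetical heap per task JVM
    daemons_gb=6.0,     # hypothetical: DataNode + TaskTracker + RegionServer
    os_reserve_gb=2.0,  # hypothetical
)
print(f"need {need} GB, fits in {node_ram_gb} GB: {need <= node_ram_gb}")
```

If the total exceeds the node's RAM, either cut the number of slots or
buy more memory per node.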

Also, keep in mind that losing 1 machine out of 8 compared to 1 out of 16
drastically changes the performance of your system at the time of the
failure.

About virtualization, it doesn't make sense. Also your disks should be in JBOD.
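
To put rough numbers on the 1-of-8 versus 1-of-16 point, here is a
back-of-the-envelope sketch using the two configurations from your
mail. It assumes the failed node's disks were full (so the
re-replication figure is an upper bound) and that the surviving nodes
share the re-replication work evenly; both are simplifications.

```python
# Impact of losing one node, for options A and B from the question.
# Assumes full disks and an even spread of re-replication work.
def failure_impact(nodes, disks_per_node, disk_tb=1.0):
    """Return (fraction of cluster lost, TB to re-replicate, TB per survivor)."""
    lost_fraction = 1 / nodes
    rereplicate_tb = disks_per_node * disk_tb
    per_survivor_tb = rereplicate_tb / (nodes - 1)
    return lost_fraction, rereplicate_tb, per_survivor_tb


for name, nodes, disks in (("A", 8, 16), ("B", 16, 8)):
    lost, total, each = failure_impact(nodes, disks)
    print(f"{name}: lose {lost:.2%} of the cluster, "
          f"re-replicate up to {total:.0f} TB, "
          f"about {each:.2f} TB per surviving node")
```

Under these assumptions, option A loses twice the share of capacity
and puts roughly four times the re-replication load on each remaining
node.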


On Wed, Jun 2, 2010 at 11:12 PM, Sean Bigdatafun
<sean.bigdatafun@gmail.com> wrote:
> I am thinking of the following problem lately. I started thinking of this
> problem in the following context.
> I have a predefined budget and I can either
>  -- A) purchase 8 more powerful servers (4cpu x 4 cores/cpu +  128GB mem +
> 16 x 1TB disk) or
>  -- B) purchase 16 less powerful servers(2cpu x 4 cores/cpu +  64GB mem + 8
> x 1TB disk)
>          NOTE: I am basically making up a half horsepower scenario
>  -- Let's say I am going to use 10Gbps network switch and each machine has
> a 10Gbps network card
> In the above scenario, does A or B perform better or relatively same? -- I
> guess this really depends on Hadoop's map/reduce's scheduler.
> And then I have a following question: does it make sense to virtualize a
> Hadoop datanode at all?  (if the answer to above question is "relatively
> same", I'd say it does not make sense)
> Thanks,
> Sean
