hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Big machines or (relatively) small machines?
Date Mon, 07 Jun 2010 17:46:39 GMT
It really depends on your usage pattern, but you have to strike a
balance between cost and hardware. At StumbleUpon we run with 2x i7,
24GB, 4x 1TB and it works like a charm. The only thing I would change
is maybe more disks per node, but that's pretty much it. Some relevant
questions:

 - Do you have any mem-intensive jobs? If so, figure out how many tasks
you'll run per node and make the RAM fit the load.
 - Do you plan to serve data out of HBase or will you just use it for
MapReduce? Or will it be a mix (not recommended)?
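
A rough sketch of the RAM sizing in the first question, in Python. The
slot counts, heap sizes and OS reserve below are made-up assumptions for
illustration, not StumbleUpon's actual settings:

```python
def ram_needed_gb(map_slots, reduce_slots, task_heap_gb,
                  daemon_heap_gb, os_reserve_gb):
    """Rough per-node RAM estimate, assuming every task slot is busy at once."""
    # Worst case: all map and reduce slots run a task with a full heap.
    task_gb = (map_slots + reduce_slots) * task_heap_gb
    # Add the long-running daemons on the node and some room for the OS.
    return task_gb + daemon_heap_gb + os_reserve_gb


# Hypothetical node: 8 map slots, 4 reduce slots, 1GB heap per task,
# 6GB total for the daemons, 2GB left for the OS.
print(ram_needed_gb(map_slots=8, reduce_slots=4, task_heap_gb=1.0,
                    daemon_heap_gb=6.0, os_reserve_gb=2.0))  # 20.0
```

With those assumed numbers the load fits in a 24GB node with a little
headroom; heavier tasks mean fewer slots or more RAM.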

Also, keep in mind that losing 1 machine out of 8 compared to 1 out of
16 drastically changes the performance of your system at the time of
the failure: you lose 12.5% of the cluster instead of 6.25%.
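
To put rough numbers on that for the two options in the original
question, here is a small Python sketch. It assumes data is spread
evenly across nodes and, as a worst case, that the disks are full; both
are simplifying assumptions, and the function name is hypothetical:

```python
def failure_impact(nodes, disk_tb_per_node, fill_fraction=1.0):
    """Return (share of cluster lost, TB each survivor must absorb)
    when a single node dies. Assumes evenly spread data."""
    lost_share = 1.0 / nodes
    # Data that lived on the dead node and has to be re-replicated.
    data_lost_tb = disk_tb_per_node * fill_fraction
    # That data is spread over the remaining nodes.
    per_survivor_tb = data_lost_tb / (nodes - 1)
    return lost_share, per_survivor_tb


# Option A: 8 nodes with 16 x 1TB; option B: 16 nodes with 8 x 1TB.
for name, nodes, disk_tb in [("A", 8, 16), ("B", 16, 8)]:
    share, per_survivor = failure_impact(nodes, disk_tb)
    print(f"{name}: lose {share:.2%} of the cluster, "
          f"about {per_survivor:.2f} TB to re-replicate per surviving node")
```

Under those assumptions, option A loses twice the share of the cluster
and each surviving node has roughly 2.3 TB to absorb, versus roughly
0.5 TB for option B.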

As for virtualization, it doesn't make sense here. Also, your disks should be configured as JBOD.

J-D

On Wed, Jun 2, 2010 at 11:12 PM, Sean Bigdatafun
<sean.bigdatafun@gmail.com> wrote:
> I am thinking of the following problem lately. I started thinking of this
> problem in the following context.
>
> I have a predefined budget and I can either
>  -- A) purchase 8 more powerful servers (4cpu x 4 cores/cpu +  128GB mem +
> 16 x 1TB disk) or
>  -- B) purchase 16 less powerful servers(2cpu x 4 cores/cpu +  64GB mem + 8
> x 1TB disk)
>          NOTE: I am basically making up a half horsepower scenario
>  -- Let's say I am going to use 10Gbps network switch and each machine has
> a 10Gbps network card
>
> In the above scenario, does A or B perform better or relatively same? -- I
> guess this really depends on Hadoop's map/reduce's scheduler.
>
> And then I have a following question: does it make sense to virtualize a
> Hadoop datanode at all?  (if the answer to above question is "relatively
> same", I'd say it does not make sense)
>
> Thanks,
> Sean
>
