hbase-user mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: Big machines or (relatively) small machines?
Date Mon, 07 Jun 2010 20:13:08 GMT
If those are your actual specs, I would definitely go with 16 of the smaller
ones. 128G heaps are not going to work well in a JVM; you're better off
running with more nodes with a more common configuration.

-Todd

On Mon, Jun 7, 2010 at 1:46 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> It really depends on your usage pattern, but there's a balance wrt
> cost VS hardware you must achieve. At StumbleUpon we run with 2xi7,
> 24GB, 4x 1TB and it works like a charm. The only thing I would change
> is maybe more disks/node but that's pretty much it. Some relevant
> questions:
>
>  - Do you have any mem-intensive jobs? If so, figure how many tasks
> you'll run per node and make the RAM fit the load.
>  - Do you plan to serve data out of HBase or will you just use it for
> MapReduce? Or will it be a mix (not recommended)?
>
> Also, keep in mind that losing 1 machine out of 8 compared to 1 out of 16
> drastically changes the performance of your system at the time of the
> failure.
>
> About virtualization, it doesn't make sense. Also your disks should be in
> JBOD.
>
> J-D
>
> On Wed, Jun 2, 2010 at 11:12 PM, Sean Bigdatafun
> <sean.bigdatafun@gmail.com> wrote:
> > I have been thinking about the following problem lately, in this
> > context.
> >
> > I have a predefined budget and I can either
> >  -- A) purchase 8 more powerful servers (4 CPUs x 4 cores/CPU + 128GB mem +
> >     16 x 1TB disk) or
> >  -- B) purchase 16 less powerful servers (2 CPUs x 4 cores/CPU + 64GB mem +
> >     8 x 1TB disk)
> >          NOTE: I am basically making up a half-horsepower scenario
> >  -- Let's say I am going to use a 10Gbps network switch and each machine has
> >     a 10Gbps network card
> >
> > In the above scenario, does A or B perform better, or relatively the same?
> > I guess this really depends on Hadoop's MapReduce scheduler.
> >
> > And then I have a follow-up question: does it make sense to virtualize a
> > Hadoop datanode at all? (If the answer to the above question is "relatively
> > the same", I'd say it does not make sense.)
> >
> > Thanks,
> > Sean
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera
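
The arithmetic behind the thread can be sketched as follows. This is only a
rough illustration, not anything posted by the participants: the class and
method names (ClusterOption, capacity_lost, survivor_load_factor) are made up
for this example, the per-node figures are the ones Sean listed, and the
load factor assumes work is spread evenly across nodes and redistributed
evenly after a failure, which a real cluster will only approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterOption:
    """One purchasing option; per-node figures are taken from the thread."""
    name: str
    nodes: int
    cores_per_node: int
    ram_gb_per_node: int
    disk_tb_per_node: int

    def totals(self) -> dict:
        """Aggregate cores, RAM (GB), and raw disk (TB) across all nodes."""
        return {
            "cores": self.nodes * self.cores_per_node,
            "ram_gb": self.nodes * self.ram_gb_per_node,
            "disk_tb": self.nodes * self.disk_tb_per_node,
        }

    def capacity_lost(self, failed: int = 1) -> float:
        """Fraction of the cluster that disappears when `failed` nodes die."""
        return failed / self.nodes

    def survivor_load_factor(self, failed: int = 1) -> float:
        """Relative load on each surviving node.

        Assumption: load was even before the failure and is redistributed
        evenly afterwards.
        """
        return self.nodes / (self.nodes - failed)


# Option A: 8 servers, 4 CPUs x 4 cores, 128GB RAM, 16 x 1TB disks each.
option_a = ClusterOption("A", nodes=8, cores_per_node=16,
                         ram_gb_per_node=128, disk_tb_per_node=16)
# Option B: 16 servers, 2 CPUs x 4 cores, 64GB RAM, 8 x 1TB disks each.
option_b = ClusterOption("B", nodes=16, cores_per_node=8,
                         ram_gb_per_node=64, disk_tb_per_node=8)

for option in (option_a, option_b):
    print(option.name, option.totals(),
          f"lost on 1 failure: {option.capacity_lost():.2%}",
          f"survivor load: x{option.survivor_load_factor():.3f}")
```

Both options add up to the same aggregate hardware (128 cores, 1024 GB RAM,
128 TB raw disk), so the difference lies in per-node size: a single failure
removes 1/8 of option A but only 1/16 of option B.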
