hbase-user mailing list archives

From Ramu M S <ramu.ma...@gmail.com>
Subject Re: Regarding Hardware configuration for HBase cluster
Date Sat, 08 Feb 2014 08:10:55 GMT
Lars,

What about high-density storage servers that have a capacity of up to 24
drives? There were also some recommendations in a few blogs about having 1
core per disk.

1TB disks have only a slight price difference compared to 600GB ones. With
negotiation it'll be as low as $50. Also, the price difference between 8-core
and 12-core processors is very small, $200-300.

Do you think having 20-24 cores and 24 1TB disks will also be an option?

Regards,
Ramu
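
As a rough capacity check on the 24 x 1TB proposal, here is a minimal sketch of how many such nodes 800TB would need. The 800TB figure and the 24-disk layout come from this thread; the replication factor of 3 (the HDFS default for dfs.replication), the usable_fraction parameter, and the function name are assumptions made for illustration. It ignores compression and says nothing about whether the nodes can absorb the write load.

```python
import math

DATA_TB = 800         # total data to maintain, from the thread
REPLICATION = 3       # assumption: HDFS default replication factor
DISKS_PER_NODE = 24   # from the proposal above
DISK_TB = 1.0         # from the proposal above


def nodes_for_capacity(data_tb, replication, disks, disk_tb, usable_fraction=1.0):
    """Return the node count needed to hold data_tb after replication.

    usable_fraction is a hypothetical knob for reserving headroom
    (compactions, temp space); 1.0 means every raw byte is usable.
    """
    raw_needed_tb = data_tb * replication
    usable_per_node_tb = disks * disk_tb * usable_fraction
    return math.ceil(raw_needed_tb / usable_per_node_tb)


# No headroom reserved: 2400TB raw over 24TB per node.
full = nodes_for_capacity(DATA_TB, REPLICATION, DISKS_PER_NODE, DISK_TB)
# Keeping a quarter of each node free (an arbitrary illustrative value).
with_headroom = nodes_for_capacity(DATA_TB, REPLICATION, DISKS_PER_NODE, DISK_TB, 0.75)
print(full, with_headroom)
```

Under these assumptions the capacity-only answer lands in the same range as the server count suggested later in the thread, but it is only one constraint among several.
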
On Feb 8, 2014 11:19 AM, "lars hofhansl" <larsh@apache.org> wrote:

> Let's not refer to our users in the third person. It's not polite :)
>
> Suresh,
>
> I wrote something up about RegionServer sizing here:
> http://hadoop-hbase.blogspot.com/2013/01/hbase-region-server-memory-sizing.html
>
> For your load I would guess that you'd need about 100 servers.
>
> That would:
> 1. have 8TB/server
> 2. 30m rows/day/server
> 3. 30GB/day/server
>
> You should not expect a single server to be able to absorb more than
> 10000 rows/s or 40MB/s, whichever is less.
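
The per-server figures above follow from dividing the stated workload across 100 servers. A minimal sketch of that arithmetic, assuming perfectly even key distribution, a KV size of exactly 1000 bytes for "around 1 KB", and the 40 figure read as megabytes per second; the function and constant names are made up for illustration:

```python
TOTAL_DATA_TB = 800            # from the original request
ROWS_PER_DAY = 3_000_000_000   # "3 billion puts per day"
KV_SIZE_BYTES = 1_000          # assumption: "around 1 KB" taken as 1000 bytes
SECONDS_PER_DAY = 86_400

# Per-server ceilings quoted above (40 interpreted as MB/s, an assumption).
MAX_ROWS_PER_S = 10_000
MAX_MB_PER_S = 40


def per_server_load(servers):
    """Average per-server load, assuming the load spreads evenly."""
    rows_day = ROWS_PER_DAY / servers
    bytes_day = rows_day * KV_SIZE_BYTES
    return {
        "stored_tb": TOTAL_DATA_TB / servers,
        "rows_per_day": rows_day,
        "gb_per_day": bytes_day / 1e9,
        "avg_rows_per_s": rows_day / SECONDS_PER_DAY,
        "avg_mb_per_s": bytes_day / 1e6 / SECONDS_PER_DAY,
    }


load = per_server_load(100)
within_ceiling = (load["avg_rows_per_s"] <= MAX_ROWS_PER_S
                  and load["avg_mb_per_s"] <= MAX_MB_PER_S)
print(load, within_ceiling)
```

These are daily averages only. Peak load, read traffic, compactions, and skewed keys can push a single server far above the average, which is why the ceiling leaves so much room.
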
>
> The machines I'd size as follows:
> 12-16 cores, HT, 1.8GHz-2.4GHz (more is better)
> 32-96GB ram
> 6-12 drives (more spindles are better to absorb the write load)
> 10ge NICs and TopOfRack switches
>
> Now, this is only a *rough guideline*, and obviously you'd have to perform
> your own tests. This would only scale across the machines if your
> keys are sufficiently distributed.
> The details also depend on how compressible your data is and your exact
> access patterns (read patterns, spiky write load, etc.).
> Start with 10 data nodes and appropriately scaled down load and see how it
> works.
>
> Vladimir is right here, you probably want to seek professional help.
>
> -- Lars
>
>
>
>
> ________________________________
>  From: Vladimir Rodionov <vrodionov@carrieriq.com>
> To: "user@hbase.apache.org" <user@hbase.apache.org>
> Sent: Friday, February 7, 2014 10:29 AM
> Subject: RE: Regarding Hardware configuration for HBase cluster
>
>
> This guy is building a system at the scale of Yahoo and asking a user group
> how to size the cluster.
> Few people here can give him advice based on their experience, and I am not
> one of them. I can
> only speculate on "how many nodes will we need to consume 3TB/3B records
> daily".
>
> For a system of this scale it's better to go to Cloudera/IBM/HW than to
> try to build it yourself,
> especially when you ask questions on the user group (not answer them).
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
>
> From: Ted Yu [yuzhihong@gmail.com]
> Sent: Friday, February 07, 2014 6:27 AM
> To: user@hbase.apache.org
> Cc: user@hbase.apache.org
> Subject: Re: Regarding Hardware configuration for HBase cluster
>
> Have you read http://www.slideshare.net/larsgeorge/hbase-sizing-notes ?
>
> Cheers
>
> On Feb 6, 2014, at 8:47 PM, suresh babu <bigdatacslt@gmail.com> wrote:
>
> > Hi Stana,
> >
> > We are trying to find out how many data nodes (including hardware
> > configuration details) should be configured or set up for this requirement.
> >
> > -suresh
> >
> > On Friday, February 7, 2014, stana <stana@is-land.com.tw> wrote:
> >
> >> Hi suresh babu,
> >>
> >> How many data nodes do you have?
> >>
> >>
> >> 2014-02-07 suresh babu <bigdatacslt@gmail.com>:
> >>
> >>> refreshing the thread,
> >>>
> >>> Can you please suggest any inputs for the hardware configuration (for the
> >>> below mentioned use case)?
> >>>
> >>>
> >>>
> >>>
> >>> On Wed, Feb 5, 2014 at 10:31 AM, suresh babu <bigdatacslt@gmail.com>
> >>> wrote:
> >>>
> >>>> Please find the data requirements for our use case below :
> >>>>
> >>>> Raw data processing
> >>>> ----------------------------------
> >>>> 1. Data is populated into HDFS; after ETL, around 3 billion puts per day
> >>>> into HBase
> >>>>
> >>>> 2. Oldest data to be deleted from HBase after X days
> >>>>
> >>>> Aggregates processing
> >>>> ----------------------------------
> >>>> 3 billion reads per day ... Large scan or reads
> >>>>
> >>>> KV size around 1 KB
> >>>> Daily processing, raw and aggregates, via M/R jobs
> >>>> Hive queries in future, but not of immediate focus
> >>>> On Feb 5, 2014 12:48 AM, "Vladimir Rodionov" <vrodionov@carrieriq.com>
> >>>> wrote:
> >>>>
> >>>>> Yes,
> >>>>>
> >>>>> 1. What is the expected avg and peak load in writes/updates/deletes/reads?
> >>>>> 2. What is the average size of a KV?
> >>>>> 3. Reads/small scans/medium/large scan %%
> >>>>> 4. Do you plan M/R jobs, Hive query?
> >>>>>
> >>>>>
> >>>>> Best regards,
> >>>>> Vladimir Rodionov
> >>>>> Principal Platform Engineer
> >>>>> Carrier IQ, www.carrieriq.com
> >>>>> e-mail: vrodionov@carrieriq.com
> >>>>>
> >>>>> ________________________________________
> >>>>> From: Nick Xie [nick.xie.hadoop@gmail.com]
> >>>>> Sent: Tuesday, February 04, 2014 10:02 AM
> >>>>> To: user@hbase.apache.org
> >>>>> Subject: Re: Regarding Hardware configuration for HBase cluster
> >>>>>
> >>>>> I guess you'd better describe a little bit more about your applications.
> >>>>> Does the data increase over the time at all?
> >>>>>
> >>>>> Nick
> >>>>>
> >>>>>
> >>>>> On Tue, Feb 4, 2014 at 5:22 AM, suresh babu <bigdatacslt@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi folks,
> >>>>>>
> >>>>>> We are trying to set up an HBase cluster for the following requirement:
> >>>>>>
> >>>>>> We have to maintain data of size around 800TB.
> >>>>>>
> >>>>>> For the above requirement, please suggest the best hardware
> >>>>>> configuration details, like:
> >>>>>>
> >>>>>> 1) How many disks to consider per machine and the capacity of the disks,
> >>>>>> for example, 16/24 disks per node with 1/2TB capacity per disk
> >>>>>>
> >>>>>> 2) Which compression method is suited for a production environment;
> >>>>>> space is not a major limitation, but speed is of prime concern for my
> >>>>>> use case
> >>>>>>
> >>>>>> 3) How many CPU cores should be configured for each node/machine? Or the
> >>>>>> ideal ratio of number of cores to number of disks, for example
> >>>>>> 1 core/1 disk?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Kaushik
> >>>>>
> >> Best Regards
> >>
> >> 亦思科技  is-land Systems Inc.
> >> Tel:03-5630345 Ext.14
> >> Fax:03-5631345
> >> e-MAIL:stana@is-land.com.tw
> >>
> >> 何永安 Yung An He
> >>
>
> Confidentiality Notice:  The information contained in this message,
> including any attachments hereto, may be confidential and is intended to be
> read only by the individual or entity to whom this message is addressed. If
> the reader of this message is not the intended recipient or an agent or
> designee of the intended recipient, please note that any review, use,
> disclosure or distribution of this message or its attachments, in any form,
> is strictly prohibited.  If you have received this message in error, please
> immediately notify the sender and/or Notifications@carrieriq.com and
> delete or destroy any copy of this message and its attachments.
