hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: recommended nodes
Date Thu, 29 Nov 2012 01:53:14 GMT
Ok, just a caveat.

I am discussing MapR as part of a complete response. As Mohit posted, MapR takes the raw device
for their MapR File System.
They do their own striping within what they call a volume.

But going back to Apache... 
You can stripe drives, however I wouldn't recommend it. I don't think the performance gains
would really matter.
You're going to end up getting blocked first by disk i/o, then your controller card, then
your network... assuming 10GbE.

With only 2 disks on an 8-core system, you will hit disk i/o first, and then you'll watch your
CPU i/o wait climb.
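For a rough sense of why, here's a back-of-envelope sketch in Python. The throughput
numbers are assumptions for 2012-era commodity kit, not measurements:

    # Back-of-envelope: where does a 2-disk, 8-core node bottleneck first?
    DISK_MB_S = 100          # assumed sustained rate of one 7200 RPM SATA drive
    NUM_DISKS = 2
    CONTROLLER_MB_S = 600    # assumed ~6 Gbit/s controller lane
    NET_10GBE_MB_S = 1250    # 10 GbE line rate: 10 Gbit/s / 8

    aggregate_disk = DISK_MB_S * NUM_DISKS   # 200 MB/s across both spindles
    print("disks:      %d MB/s" % aggregate_disk)
    print("controller: %d MB/s" % CONTROLLER_MB_S)
    print("network:    %d MB/s" % NET_10GBE_MB_S)
    # The 2 spindles (~200 MB/s) saturate long before the controller or the
    # 10 GbE link, so i/o wait climbs while the cores sit idle.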

HTH

-Mike

On Nov 28, 2012, at 7:28 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi Mike,
> 
> Why not use LVM with MapR? Since LVM reads from 2 drives almost
> at the same time, it should be better than RAID 0 or a single drive,
> no?
> 
> 2012/11/28, Michael Segel <michael_segel@hotmail.com>:
>> Just a couple of things.
>> 
>> I'm neutral on the use of LVM. Some would point out that there's some
>> overhead, but on the flip side, it can make managing the machines easier.
>> If you're using MapR, you don't want LVM but raw devices.
>> 
>> In terms of GC, it's going to depend on the heap size and not the total
>> memory. With respect to HBase... MSLAB is the way to go.
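For reference, MSLAB is controlled from hbase-site.xml. A minimal sketch; the
property names are from the HBase MSLAB feature, and the chunk size shown is
just the usual default, so check the values for your version:

    <!-- hbase-site.xml: MemStore-Local Allocation Buffers reduce old-gen
         fragmentation from memstore churn, which tames long GC pauses. -->
    <property>
      <name>hbase.hregion.memstore.mslab.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>hbase.hregion.memstore.mslab.chunksize</name>
      <value>2097152</value> <!-- 2 MB chunks; illustrative default -->
    </property>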
>> 
>> 
>> On Nov 28, 2012, at 12:05 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org>
>> wrote:
>> 
>>> Hi Gregory,
>>> 
>>> I found this about LVM:
>>> -> http://blog.andrew.net.au/2006/08/09
>>> -> http://www.phoronix.com/scan.php?page=article&item=fedora_15_lvm&num=2
>>> 
>>> Seems that performance is still decent with it. I will most
>>> probably give it a try and benchmark that too... I have one new hard drive
>>> which should arrive tomorrow. Perfect timing ;)
>>> 
>>> 
>>> 
>>> JM
>>> 
>>> 2012/11/28, Mohit Anchlia <mohitanchlia@gmail.com>:
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Nov 28, 2012, at 9:07 AM, Adrien Mogenet <adrien.mogenet@gmail.com>
>>>> wrote:
>>>> 
>>>>> Does HBase really benefit from 64 GB of RAM, since allocating too large
>>>>> a heap might increase GC time?
>>>>> 
>>>> The benefit you get is from the OS cache.
>>>>> Another question: why not RAID 0, in order to aggregate disk bandwidth?
>>>>> (and thus keep the 3x replication factor)
>>>>> 
>>>>> 
>>>>> On Wed, Nov 28, 2012 at 5:58 PM, Michael Segel
>>>>> <michael_segel@hotmail.com> wrote:
>>>>> 
>>>>>> Sorry,
>>>>>> 
>>>>>> I need to clarify.
>>>>>> 
>>>>>> 4GB per physical core is a good starting point.
>>>>>> So with 2 quad core chips, that is going to be 32GB.
>>>>>> 
>>>>>> IMHO that's a minimum. If you go with HBase, you will want more.
>>>>>> (Actually
>>>>>> you will need more.) The next logical jump would be to 48 or 64GB.
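As a quick sizing sketch (Python; purely illustrative of the rule of thumb above):

    # Rule-of-thumb RAM sizing: ~4GB per physical core as a floor.
    def min_ram_gb(physical_cores, gb_per_core=4):
        """Starting-point RAM for a worker node."""
        return physical_cores * gb_per_core

    print(min_ram_gb(8))   # 2 quad-core chips -> 32 (GB) minimum
    # With HBase in the mix, plan on the next jump up: 48 or 64GB.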
>>>>>> 
>>>>>> If we start to price out memory, depending on the vendor and your
>>>>>> company's procurement, there really isn't much of a price difference
>>>>>> between 32, 48, or 64 GB.
>>>>>> Note that it also depends on the chips themselves. Also you need to
>>>>>> see how many memory channels exist in the motherboard. You may need
>>>>>> to buy in pairs or triplets. Your hardware vendor can help you. (Also
>>>>>> you need to keep an eye on your hardware vendor. Sometimes they will
>>>>>> give you higher density chips that are going to be more expensive...) ;-)
>>>>>> 
>>>>>> I tend to like having extra memory from the start.
>>>>>> It gives you a bit more freedom and also protects you from 'fat' code.
>>>>>> 
>>>>>> Looking at YARN... you will need more memory too.
>>>>>> 
>>>>>> 
>>>>>> With respect to the hard drives...
>>>>>> 
>>>>>> The best recommendation is to keep the drives as JBOD and then use
>>>>>> 3x replication.
>>>>>> In this case, make sure that the disk controller cards can handle
>>>>>> JBOD. (Some don't support JBOD out of the box.)
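On the Apache side, JBOD just means listing each mount point in dfs.data.dir
(the Hadoop 1.x property); a sketch, with hypothetical /data/N mount paths:

    <!-- hdfs-site.xml: point the DataNode at each JBOD mount directly.
         HDFS round-robins block writes across the listed directories. -->
    <property>
      <name>dfs.data.dir</name>
      <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn,/data/4/dfs/dn</value>
    </property>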
>>>>>> 
>>>>>> With respect to RAID...
>>>>>> 
>>>>>> If you are running MapR, no need for RAID.
>>>>>> If you are running an Apache derivative, you could use RAID 1. Then
>>>>>> cut your replication to 2x. This makes it easier to manage drive
>>>>>> failures. (It's not the norm, but it works...) In some clusters, they
>>>>>> are using appliances like NetApp's E-Series, where the machines see
>>>>>> the drives as locally attached storage, and I think the appliances
>>>>>> themselves are using RAID. I haven't played with this configuration,
>>>>>> however it could make sense and it's a valid design.
>>>>>> 
>>>>>> HTH
>>>>>> 
>>>>>> -Mike
>>>>>> 
>>>>>> On Nov 28, 2012, at 10:33 AM, Jean-Marc Spaggiari
>>>>>> <jean-marc@spaggiari.org> wrote:
>>>>>> 
>>>>>>> Hi Mike,
>>>>>>> 
>>>>>>> Thanks for all those details!
>>>>>>> 
>>>>>>> So to simplify the equation, for 16 virtual cores we need 48 to 64GB.
>>>>>>> Which means 3 to 4GB per core. So with quad cores, 12GB to 16GB is a
>>>>>>> good start? Or did I simplify it too much?
>>>>>>> 
>>>>>>> Regarding the hard drives: if you add more than one drive, do you
>>>>>>> need to build them into RAID or similar systems? Or can Hadoop/HBase
>>>>>>> be configured to use more than one drive?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> JM
>>>>>>> 
>>>>>>> 2012/11/27, Michael Segel <michael_segel@hotmail.com>:
>>>>>>>> 
>>>>>>>> OK... I don't know why Cloudera is so hung up on 32GB. ;-) [It's an
>>>>>>>> inside joke...]
>>>>>>>> 
>>>>>>>> So here's the problem...
>>>>>>>> 
>>>>>>>> By default, your child processes in a map/reduce job get 512MB.
>>>>>>>> The majority of the time, this gets raised to 1GB.
>>>>>>>> 
>>>>>>>> 8 cores (dual quad cores) show up as 16 virtual processors in
>>>>>>>> Linux. (Note: this is why, when people talk about the number of
>>>>>>>> cores, you have to specify physical cores or logical cores....)
>>>>>>>> 
>>>>>>>> So if you were to oversubscribe and have, let's say, 12 mappers and
>>>>>>>> 12 reducers, that's 24 slots. Which means that you would need 24GB
>>>>>>>> of memory reserved just for the child processes. This would leave
>>>>>>>> 8GB for the DN, TT and the rest of the Linux OS processes.
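Spelled out as arithmetic (a quick Python sketch of the worked numbers above):

    # Slot-memory math for a 32GB node, per the example above.
    total_ram_gb  = 32
    mappers       = 12
    reducers      = 12
    child_heap_gb = 1                     # 512MB default, bumped to 1GB

    slots     = mappers + reducers        # 24 slots
    child_ram = slots * child_heap_gb     # 24GB just for the task JVMs
    leftover  = total_ram_gb - child_ram  # 8GB for the DN, TT and the OS
    print(slots, child_ram, leftover)     # 24 24 8

    # Bump child heaps to 2GB (e.g. for R) and the same 24 slots want 48GB,
    # which means swapping on a 32GB box.
    print(slots * 2)                      # 48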
>>>>>>>> 
>>>>>>>> Can you live with that? Sure.
>>>>>>>> Now add in R, HBase, Impala, or some other set of tools on top of
>>>>>>>> the cluster.
>>>>>>>> 
>>>>>>>> Ooops! Now you are in trouble because you will swap.
>>>>>>>> Also, adding in R, you may want to bump up those child procs from
>>>>>>>> 1GB to 2GB. That means the 24 slots would now require 48GB. Now you
>>>>>>>> have swap, and if that happens you will see HBase in a cascading
>>>>>>>> failure.
>>>>>>>> 
>>>>>>>> So while you can do a rolling restart with the changed configuration
>>>>>>>> (reducing the number of mappers and reducers), you end up with fewer
>>>>>>>> slots, which will mean longer run times for your jobs. (Fewer slots
>>>>>>>> == less parallelism)
>>>>>>>> 
>>>>>>>> Looking at the price of memory... you can get 48GB or even 64GB for
>>>>>>>> around the same price point. (8GB chips)
>>>>>>>> 
>>>>>>>> And I didn't even talk about adding SOLR either; again, a memory
>>>>>>>> hog... ;-)
>>>>>>>> 
>>>>>>>> Note that I matched the number of mappers with reducers. You could
>>>>>>>> go with fewer reducers if you want. I tend to recommend a ratio of
>>>>>>>> 2:1 mappers to reducers, depending on the workflow....
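In MRv1 terms, the slot counts and child heap live in mapred-site.xml. A sketch
of the 2:1 ratio; the property names are standard MRv1, the values illustrative:

    <!-- mapred-site.xml: per-TaskTracker slots at a 2:1 mapper:reducer
         ratio, plus the child task JVM heap. Tune to your nodes. -->
    <property>
      <name>mapred.tasktracker.map.tasks.maximum</name>
      <value>8</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>4</value>
    </property>
    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1024m</value>
    </property>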
>>>>>>>> 
>>>>>>>> As to the disks... no, 7200 RPM SATA III drives are fine. The SATA
>>>>>>>> III interface is pretty much available in the new kit being shipped.
>>>>>>>> It's just that you don't have enough drives. 8 cores should be 8
>>>>>>>> spindles if available.
>>>>>>>> Otherwise you end up seeing your CPU load climb on wait states as
>>>>>>>> the processes wait for the disk i/o to catch up.
>>>>>>>> 
>>>>>>>> I mean you could build out a cluster with 4 x 3.5" 2TB drives in a
>>>>>>>> 1U chassis based on price. You're making a trade-off and you should
>>>>>>>> be aware of the performance hit you will take.
>>>>>>>> 
>>>>>>>> HTH
>>>>>>>> 
>>>>>>>> -Mike
>>>>>>>> 
>>>>>>>> On Nov 27, 2012, at 1:52 PM, Jean-Marc Spaggiari
>>>>>>>> <jean-marc@spaggiari.org> wrote:
>>>>>>>> 
>>>>>>>>> Hi Michael,
>>>>>>>>> 
>>>>>>>>> so are you recommending 32GB per node?
>>>>>>>>> 
>>>>>>>>> What about the disks? Are SATA drives too slow?
>>>>>>>>> 
>>>>>>>>> JM
>>>>>>>>> 
>>>>>>>>> 2012/11/26, Michael Segel <michael_segel@hotmail.com>:
>>>>>>>>>> Uhm, those specs are actually now out of date.
>>>>>>>>>> 
>>>>>>>>>> If you're running HBase, or want to also run R on top of Hadoop,
>>>>>>>>>> you will need to add more memory.
>>>>>>>>>> Also, forget 1GbE, go 10GbE, and with 2 SATA drives you will be
>>>>>>>>>> disk i/o bound way too quickly.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Nov 26, 2012, at 8:05 AM, Marcos Ortiz <mlortiz@uci.cu> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Are you asking about hardware recommendations?
>>>>>>>>>>> Eric Sammer, in his "Hadoop Operations" book, did a great job
>>>>>>>>>>> on this:
>>>>>>>>>>> For mid-size clusters (up to 300 nodes):
>>>>>>>>>>> Processor: a dual quad-core 2.6 GHz
>>>>>>>>>>> RAM: 24 GB DDR3
>>>>>>>>>>> Dual 1 Gb Ethernet NICs
>>>>>>>>>>> A SAS drive controller
>>>>>>>>>>> At least two SATA II drives in a JBOD configuration
>>>>>>>>>>> 
>>>>>>>>>>> The replication factor depends heavily on the primary use of
>>>>>>>>>>> your cluster.
>>>>>>>>>>> 
>>>>>>>>>>> On 11/26/2012 08:53 AM, David Charle wrote:
>>>>>>>>>>>> hi
>>>>>>>>>>>> 
>>>>>>>>>>>> what's the recommended number of nodes for NN, HMaster and ZK
>>>>>>>>>>>> for a larger cluster, let's say 50-100+?
>>>>>>>>>>>> 
>>>>>>>>>>>> also, what would be the ideal replication factor for larger
>>>>>>>>>>>> clusters when you have 3-4 racks?
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> David
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> 
>>>>>>>>>>> Marcos Luis Ortíz Valmaseda
>>>>>>>>>>> about.me/marcosortiz <http://about.me/marcosortiz>
>>>>>>>>>>> @marcosluis2186 <http://twitter.com/marcosluis2186>
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Adrien Mogenet
>>>>> 06.59.16.64.22
>>>>> http://www.mogenet.me
>>>> 
>>> 
>> 
>> 
> 

