From: Michael Segel <michael_segel@hotmail.com>
Subject: Re: recommended nodes
Date: Wed, 28 Nov 2012 19:14:02 -0600
To: user@hbase.apache.org
Just a couple of things.

I'm neutral on the use of LVMs. Some would point out that there's some overhead, but on the flip side, it can make managing the machines easier.

If you're using MapR, you don't want to use LVMs but raw devices.

In terms of GC, it's going to depend on the heap size, not the total memory.

With respect to HBase... MSLAB is the way to go.

On Nov 28, 2012, at 12:05 PM, Jean-Marc Spaggiari <jean-marc@spaggiari.org> wrote:

> Hi Gregory,
>
> I found this about LVM:
> -> http://blog.andrew.net.au/2006/08/09
> -> http://www.phoronix.com/scan.php?page=article&item=fedora_15_lvm&num=2
>
> It seems that performance is still acceptable with it. I will most
> probably give it a try and benchmark that too... I have one new hard
> drive which should arrive tomorrow. Perfect timing ;)
>
> JM
>
> 2012/11/28, Mohit Anchlia:
>>
>> On Nov 28, 2012, at 9:07 AM, Adrien Mogenet wrote:
>>
>>> Does HBase really benefit from 64 GB of RAM, since allocating too
>>> large a heap might increase GC time?
>>>
>> The benefit you get is from the OS cache.
>>
>>> Another question: why not RAID 0, in order to aggregate disk
>>> bandwidth? (and thus keep a 3x replication factor)
>>>
>>> On Wed, Nov 28, 2012 at 5:58 PM, Michael Segel wrote:
>>>
>>>> Sorry,
>>>>
>>>> I need to clarify.
>>>>
>>>> 4GB per physical core is a good starting point.
>>>> So with 2 quad-core chips, that is going to be 32GB.
>>>>
>>>> IMHO that's a minimum. If you go with HBase, you will want more.
>>>> (Actually you will need more.) The next logical jump would be to
>>>> 48 or 64GB.
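[Editor's note: the MSLAB feature Michael recommends at the top of this message is toggled in hbase-site.xml. A minimal sketch, assuming HBase 0.90.x or later; the values shown are the usual defaults, not thread-specific tuning advice:]

```xml
<!-- hbase-site.xml: enable the MemStore-Local Allocation Buffers
     (MSLAB) referred to above; MSLAB reduces old-gen heap
     fragmentation and thus long stop-the-world GC pauses. -->
<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.hregion.memstore.mslab.chunksize</name>
  <value>2097152</value> <!-- 2 MB allocation chunks (default) -->
</property>
```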
>>>>
>>>> If we start to price out memory, depending on the vendor and your
>>>> company's procurement, there really isn't much of a price
>>>> difference between 32, 48, or 64 GB.
>>>> Note that it also depends on the chips themselves. You also need
>>>> to check how many memory channels exist on the motherboard; you
>>>> may need to buy in pairs or triplets. Your hardware vendor can
>>>> help you. (Also keep an eye on your hardware vendor. Sometimes
>>>> they will give you higher-density chips that are going to be more
>>>> expensive...) ;-)
>>>>
>>>> I tend to like having extra memory from the start.
>>>> It gives you a bit more freedom and also protects you from 'fat' code.
>>>>
>>>> Looking at YARN... you will need more memory too.
>>>>
>>>> With respect to the hard drives...
>>>>
>>>> The best recommendation is to keep the drives as JBOD and then
>>>> use 3x replication.
>>>> In this case, make sure that the disk controller cards can handle
>>>> JBOD. (Some don't support JBOD out of the box.)
>>>>
>>>> With respect to RAID...
>>>>
>>>> If you are running MapR, there is no need for RAID.
>>>> If you are running an Apache derivative, you could use RAID 1 and
>>>> then cut your replication to 2x. This makes it easier to manage
>>>> drive failures. (It's not the norm, but it works...) In some
>>>> clusters, they are using appliances like NetApp's E-Series, where
>>>> the machines see the drives as local attached storage, and I
>>>> think the appliances themselves are using RAID. I haven't played
>>>> with this configuration, however it could make sense and it's a
>>>> valid design.
>>>>
>>>> HTH
>>>>
>>>> -Mike
>>>>
>>>> On Nov 28, 2012, at 10:33 AM, Jean-Marc Spaggiari wrote:
>>>>
>>>>> Hi Mike,
>>>>>
>>>>> Thanks for all those details!
>>>>>
>>>>> So to simplify the equation, for 16 virtual cores we need 48 to
>>>>> 64GB, which means 3 to 4GB per core.
>>>>> So with quad cores, 12GB to 16GB would be a good start? Or did I
>>>>> simplify it too much?
>>>>>
>>>>> Regarding the hard drives: if you add more than one drive, do
>>>>> you need to build them into RAID or similar systems? Or can
>>>>> Hadoop/HBase be configured to use more than one drive?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> JM
>>>>>
>>>>> 2012/11/27, Michael Segel:
>>>>>>
>>>>>> OK... I don't know why Cloudera is so hung up on 32GB. ;-)
>>>>>> [It's an inside joke...]
>>>>>>
>>>>>> So here's the problem...
>>>>>>
>>>>>> By default, your child processes in a map/reduce job get 512MB.
>>>>>> The majority of the time, this gets raised to 1GB.
>>>>>>
>>>>>> 8 cores (dual quad cores) show up as 16 virtual processors in
>>>>>> Linux. (Note: this is why when people talk about the number of
>>>>>> cores, you have to specify physical cores or logical cores...)
>>>>>>
>>>>>> So if you were to oversubscribe and have, let's say, 12 mappers
>>>>>> and 12 reducers, that's 24 slots, which means you would need
>>>>>> 24GB of memory reserved just for the child processes. This
>>>>>> would leave 8GB for the DN, TT, and the rest of the Linux OS
>>>>>> processes.
>>>>>>
>>>>>> Can you live with that? Sure.
>>>>>> Now add in R, HBase, Impala, or some other set of tools on top
>>>>>> of the cluster.
>>>>>>
>>>>>> Oops! Now you are in trouble because you will swap.
>>>>>> Also, adding in R, you may want to bump up those child procs
>>>>>> from 1GB to 2GB. That means the 24 slots would now require
>>>>>> 48GB. Now you will swap, and if that happens you will see HBase
>>>>>> in a cascading failure.
>>>>>>
>>>>>> So while you can do a rolling restart with the changed
>>>>>> configuration (reducing the number of mappers and reducers),
>>>>>> you end up with fewer slots, which will mean longer run times
>>>>>> for your jobs.
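[Editor's note: the slot arithmetic above can be sketched in a few lines. The numbers are the ones used in this thread; the helper name is made up for the example:]

```python
# Sketch of the map/reduce slot-memory arithmetic discussed above.
# Numbers are illustrative, taken from this thread; the function
# name is hypothetical.

def child_heap_total_gb(mappers, reducers, heap_per_slot_gb):
    """Memory reserved for the map/reduce child JVMs alone."""
    return (mappers + reducers) * heap_per_slot_gb

node_ram_gb = 32

# 12 mappers + 12 reducers at 1 GB each -> 24 GB for child processes,
# leaving 8 GB for the DataNode, TaskTracker, and the OS.
slots_gb = child_heap_total_gb(12, 12, 1)
print(slots_gb)                # 24
print(node_ram_gb - slots_gb)  # 8

# Bump the child heaps to 2 GB (e.g. for R) and the same 24 slots
# now need 48 GB -- more than the whole node, so it swaps.
print(child_heap_total_gb(12, 12, 2))  # 48
```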
>>>>>> (Fewer slots == less parallelism.)
>>>>>>
>>>>>> Looking at the price of memory... you can get 48GB or even
>>>>>> 64GB for around the same price point. (8GB chips)
>>>>>>
>>>>>> And I didn't even talk about adding Solr, which is again a
>>>>>> memory hog... ;-)
>>>>>>
>>>>>> Note that I matched the number of mappers with reducers. You
>>>>>> could go with fewer reducers if you want. I tend to recommend a
>>>>>> 2:1 ratio of mappers to reducers, depending on the workflow...
>>>>>>
>>>>>> As to the disks... no, 7200 RPM SATA III drives are fine. The
>>>>>> SATA III interface is pretty much standard in the new kit being
>>>>>> shipped.
>>>>>> It's just that you don't have enough drives. 8 cores should
>>>>>> mean 8 spindles if available.
>>>>>> Otherwise you end up seeing your CPU load climb on wait states
>>>>>> as the processes wait for the disk I/O to catch up.
>>>>>>
>>>>>> I mean, you could build out a cluster with 4 x 3.5" 2TB drives
>>>>>> in a 1U chassis based on price. You're making a trade-off, and
>>>>>> you should be aware of the performance hit you will take.
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>> -Mike
>>>>>>
>>>>>> On Nov 27, 2012, at 1:52 PM, Jean-Marc Spaggiari
>>>>>> <jean-marc@spaggiari.org> wrote:
>>>>>>
>>>>>>> Hi Michael,
>>>>>>>
>>>>>>> So are you recommending 32GB per node?
>>>>>>>
>>>>>>> What about the disks? Are SATA drives too slow?
>>>>>>>
>>>>>>> JM
>>>>>>>
>>>>>>> 2012/11/26, Michael Segel:
>>>>>>>> Uhm, those specs are actually out of date now.
>>>>>>>>
>>>>>>>> If you're running HBase, or want to also run R on top of
>>>>>>>> Hadoop, you will need to add more memory.
>>>>>>>> Also, forget 1GbE; get 10GbE. And with 2 SATA drives, you
>>>>>>>> will be disk-I/O-bound way too quickly.
>>>>>>>>
>>>>>>>> On Nov 26, 2012, at 8:05 AM, Marcos Ortiz wrote:
>>>>>>>>
>>>>>>>>> Are you asking about hardware recommendations?
>>>>>>>>> Eric Sammer, in his "Hadoop Operations" book, did a great
>>>>>>>>> job on this. For mid-size clusters (up to 300 nodes):
>>>>>>>>> Processor: dual quad-core 2.6 GHz
>>>>>>>>> RAM: 24 GB DDR3
>>>>>>>>> Dual 1 Gb Ethernet NICs
>>>>>>>>> A SAS drive controller
>>>>>>>>> At least two SATA II drives in a JBOD configuration
>>>>>>>>>
>>>>>>>>> The replication factor depends heavily on the primary use of
>>>>>>>>> your cluster.
>>>>>>>>>
>>>>>>>>> On 11/26/2012 08:53 AM, David Charle wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> What are the recommended nodes for the NN, HMaster, and ZK
>>>>>>>>>> for a larger cluster, let's say 50-100+ nodes?
>>>>>>>>>>
>>>>>>>>>> Also, what would be the ideal replication factor for larger
>>>>>>>>>> clusters when you have 3-4 racks?
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> David
>>>>>>>>>> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS
>>>>>>>>>> CIENCIAS INFORMATICAS...
>>>>>>>>>> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
>>>>>>>>>>
>>>>>>>>>> http://www.uci.cu
>>>>>>>>>> http://www.facebook.com/universidad.uci
>>>>>>>>>> http://www.flickr.com/photos/universidad_uci
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Marcos Luis Ortíz Valmaseda
>>>>>>>>> about.me/marcosortiz
>>>>>>>>> @marcosluis2186
>>>
>>> --
>>> Adrien Mogenet
>>> 06.59.16.64.22
>>> http://www.mogenet.me