From: Ted Dunning
Date: Fri, 12 Oct 2012 13:52:59 -0700
Subject: Re: Spindle per Cores
To: user@hadoop.apache.org

I think this rule of thumb is there to stop people from configuring 2-disk clusters with 16 cores, or 48-disk machines with 4 cores. Both configurations could make sense in narrow applications, but both would most probably be sub-optimal.

Within narrow bands, I doubt you will see huge changes. I like to be able to:

a) saturate disk I/O, which requires some CPU and a good controller. Different distros vary a lot here.

b) have enough memory per slot. Lots of people go cheap on this and wind up hamstringing performance.

c) leave enough CPU over for the application itself. This is hugely app dependent, obviously.
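Roughly, the sizing arithmetic I have in mind for (b) and (c) looks like the sketch below. The reserved-core, reserved-RAM and heap-per-slot numbers are placeholders you would tune for your own workload, and the 48 GB RAM figure is only an assumed example; the 8-core/12-disk shape is the one discussed in this thread.

    # Back-of-the-envelope sizing for one worker node (illustrative only).
    def slots_per_node(cores, disks, ram_gb,
                       reserved_cores=2,        # assumed headroom for OS and Hadoop daemons
                       reserved_ram_gb=4,       # assumed headroom for daemons and page cache
                       min_heap_per_slot_gb=2): # assumed minimum heap per task slot
        """Return (task_slots, heap_gb_per_slot) for a hypothetical node."""
        # Don't create more slots than you have spindles or spare cores;
        # otherwise tasks queue behind each other on CPU or disk.
        slots = max(1, min(cores - reserved_cores, disks))
        heap_per_slot = (ram_gb - reserved_ram_gb) / slots
        if heap_per_slot < min_heap_per_slot_gb:
            # Memory, not cores or spindles, is the real limit on this box.
            slots = max(1, int((ram_gb - reserved_ram_gb) // min_heap_per_slot_gb))
            heap_per_slot = (ram_gb - reserved_ram_gb) / slots
        return slots, heap_per_slot

    # A 2U box like the one in the thread: 8 physical cores, 12 disks, 48 GB RAM assumed.
    print(slots_per_node(cores=8, disks=12, ram_gb=48))   # -> (6, ~7.3 GB per slot)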
On Fri, Oct 12, 2012 at 1:45 PM, Hank Cohen <hank.cohen@altior.com> wrote:

> What empirical evidence is there for this rule of thumb?
> In other words, what tests or metrics would indicate an optimal
> spindle/core ratio, and how dependent is this on the nature of the data
> and of the map/reduce computation?
>
> My understanding is that there are lots of clusters with more spindles
> than cores. Specifically, typical 2U servers can hold 12 3.5" disk drives,
> so lots of Hadoop clusters have dual 4-core processors and 12 spindles.
> Would it be better to have 6-core processors if you are loading up the
> boxes with 12 disks? And most importantly, how would one know that the
> mix was optimal?
>
> Hank Cohen
> Altior Inc.
>
> -----Original Message-----
> From: Patai Sangbutsarakum [mailto:silvianhadoop@gmail.com]
> Sent: Friday, October 12, 2012 10:46 AM
> To: user@hadoop.apache.org
> Subject: Spindle per Cores
>
> I have read around about hardware recommendations for a Hadoop cluster.
> One of them is to aim for a 1:1 ratio of spindles to cores.
>
> Intel CPUs come with Hyper-Threading, which doubles the number of logical
> cores on one physical CPU, e.g. 8 cores become 16 with Hyper-Threading,
> and that is where we start when calculating the number of task slots per
> node.
>
> When it comes to spindles, though, I strongly believe I should go by the
> 8 physical cores and pick 8 disks in order to get the 1:1 ratio.
>
> Please suggest.
> Patai
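Spelling out the ratio arithmetic in the quoted question (a quick sketch; the core and disk counts are the ones discussed in the thread, the helper itself is made up for illustration): counting spindles against hyperthreaded logical cores instead of physical cores halves the apparent ratio on the same box.

    # The ratio arithmetic behind the question (illustrative sketch only).
    def spindle_ratios(physical_cores, disks, hyperthreading=True):
        """Return spindles per core counted against physical and logical cores."""
        logical_cores = physical_cores * 2 if hyperthreading else physical_cores
        return {
            "spindles_per_physical_core": disks / physical_cores,
            "spindles_per_logical_core": disks / logical_cores,
        }

    # The configurations discussed in the thread:
    print(spindle_ratios(physical_cores=8, disks=8))    # 1.0 physical, 0.5 logical
    print(spindle_ratios(physical_cores=8, disks=12))   # 1.5 physical, 0.75 logical
    print(spindle_ratios(physical_cores=12, disks=12))  # 1.0 physical, 0.5 logical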
