From: Ted Dunning
Date: Thu, 11 Oct 2012 12:56:56 -0700
Subject: Re: Why they recommend this (CPU) ?
To: user@hadoop.apache.org
Reply-To: user@hadoop.apache.org
Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=f46d043be10e6bee8704cbcdfac6

--f46d043be10e6bee8704cbcdfac6
Content-Type: text/plain; charset=ISO-8859-1

Like I said, measure twice, cut once.

On Thu, Oct 11, 2012 at 12:47 PM, Goldstone, Robin J. wrote:

> Be sure you are comparing apples to apples. The E5-2650 has a larger
> cache than the E5-2640, a faster system bus, and can support faster
> (1600 MHz vs 1333 MHz) DRAM, resulting in greater potential memory
> bandwidth.
>
> http://ark.intel.com/compare/64590,64591
>
> From: Patrick Angeles
> Reply-To: "user@hadoop.apache.org"
> Date: Thursday, October 11, 2012 12:36 PM
> To: "user@hadoop.apache.org"
> Subject: Re: Why they recommend this (CPU) ?
>
> If you look at comparable Intel parts:
>
> Intel E5-2640
> 6 cores @ 2.5 GHz
> 95W - $885
>
> Intel E5-2650
> 8 cores @ 2.0 GHz
> 95W - $1107
>
> So, for $400 more on a dual proc system -- which really isn't much --
> you get 2 more cores per processor for a 20% drop in clock speed. I can
> believe that for some scenarios, the faster cores would fare better.
> Gzip compression is one that comes to mind, where you are aggressively
> trading CPU for lower storage volume and IO. An HBase cluster is
> another example.
>
> On Thu, Oct 11, 2012 at 3:03 PM, Russell Jurney wrote:
>
>> My own clusters are too temporary and virtual for me to notice. I
>> haven't thought of clock speed as having mattered in a long time, so
>> I'm curious what kind of use cases might benefit from faster cores.
>> Is there some category of workload where this sweet spot for faster
>> cores occurs?
>>
>> Russell Jurney http://datasyndrome.com
>>
>> On Oct 11, 2012, at 11:39 AM, Ted Dunning wrote:
>>
>> You should measure your workload. Your experience will vary
>> dramatically with different computations.
>>
>> On Thu, Oct 11, 2012 at 10:56 AM, Russell Jurney <russell.jurney@gmail.com> wrote:
>>
>>> Anyone got data on this? This is interesting, and somewhat
>>> counter-intuitive.
>>>
>>> Russell Jurney http://datasyndrome.com
>>>
>>> On Oct 11, 2012, at 10:47 AM, Jay Vyas wrote:
>>>
>>>> Presumably, if you have a reasonable number of cores - speeding the
>>>> cores up will be better than forking a task into smaller and smaller
>>>> chunks - because at some point the overhead of multiple processes
>>>> would be a bottleneck - maybe due to streaming reads and writes? I'm
>>>> sure each and every problem has a different sweet spot.

--f46d043be10e6bee8704cbcdfac6
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Like I said, measure twice, cut once.

On Thu, Oct 11, 2012 at 12:47 PM, Goldstone, Robin J. <goldstone1@llnl.gov> wrote:
Be sure you are comparing apples to apples. The E5-2650 has a larger cache than the E5-2640, a faster system bus, and can support faster (1600 MHz vs 1333 MHz) DRAM, resulting in greater potential memory bandwidth.

http://ark.intel.com/compare/64590,64591

From: Patrick Angeles <patrick@cloudera.com>
Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Date: Thursday, October 11, 2012 12:36 PM
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Subject: Re: Why they recommend this (CPU) ?

If you look at comparable Intel parts:

Intel E5-2640
6 cores @ 2.5 GHz
95W - $885

Intel E5-2650
8 cores @ 2.0 GHz
95W - $1107

So, for $400 more on a dual proc system -- which really isn't much -- you get 2 more cores per processor for a 20% drop in clock speed. I can believe that for some scenarios, the faster cores would fare better. Gzip compression is one that comes to mind, where you are aggressively trading CPU for lower storage volume and IO. An HBase cluster is another example.

On Thu, Oct 11, 2012 at 3:03 PM, Russell Jurney <russell.jurney@gmail.com> wrote:
My own clusters are too temporary and virtual for me to notice. I haven't thought of clock speed as having mattered in a long time, so I'm curious what kind of use cases might benefit from faster cores. Is there some category of workload where this sweet spot for faster cores occurs?

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 11:39 AM, Ted Dunning <tdunning@maprtech.com> wrote:

You should measure your workload. Your experience will vary dramatically with different computations.

On Thu, Oct 11, 2012 at 10:56 AM, Russell Jurney <russell.jurney@gmail.com> wrote:
Anyone got data on this? This is interesting, and somewhat counter-intuitive.

Russell Jurney http://datasyndrome.com

On Oct 11, 2012, at 10:47 AM, Jay Vyas <jayunit100@gmail.com> wrote:

> Presumably, if you have a reasonable number of cores - speeding the cores up will be better than forking a task into smaller and smaller chunks - because at some point the overhead of multiple processes would be a bottleneck - maybe due to streaming reads and writes? I'm sure each and every problem has a different sweet spot.



--f46d043be10e6bee8704cbcdfac6--
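
The E5-2640 versus E5-2650 comparison quoted in this thread can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the core counts, clock speeds and prices given in the thread. The `Cpu` class, the `compare` helper and the "core-GHz" aggregate (cores times clock) are assumptions introduced here for illustration; core-GHz is a crude proxy that ignores cache size, memory bandwidth and workload behaviour, so it is no substitute for measuring the real workload as the thread advises.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """One processor as described in the thread (hypothetical helper type)."""
    name: str
    cores: int
    ghz: float
    price_usd: int

    @property
    def core_ghz(self) -> float:
        # Crude aggregate: cores x clock. Illustrative assumption only;
        # it ignores cache, memory bandwidth and per-workload scaling.
        return self.cores * self.ghz


# Figures exactly as quoted in the thread.
E5_2640 = Cpu("E5-2640", cores=6, ghz=2.5, price_usd=885)
E5_2650 = Cpu("E5-2650", cores=8, ghz=2.0, price_usd=1107)


def compare(a: Cpu, b: Cpu, sockets: int = 2) -> dict:
    """Compare moving from CPU a to CPU b on a system with `sockets` processors."""
    return {
        "extra_cost_usd": sockets * (b.price_usd - a.price_usd),
        "extra_cores": sockets * (b.cores - a.cores),
        "clock_drop_pct": round(100 * (a.ghz - b.ghz) / a.ghz, 1),
        "core_ghz_a": sockets * a.core_ghz,
        "core_ghz_b": sockets * b.core_ghz,
    }


result = compare(E5_2640, E5_2650)
for key, value in result.items():
    print(f"{key}: {value}")
```

With the quoted prices, a dual-socket system costs $444 more (the thread rounds this to $400), gains 4 cores in total (2 per processor) and gives up 20% of clock speed. Under the crude core-GHz measure the two configurations come out at 30 and 32, which is close enough that the outcome depends on whether the workload scales with core count or with per-core speed.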