From: Jitendra Kumar Singh
Date: Thu, 18 Oct 2012 19:18:44 +0530
Subject: Re: HDFS using SAN
To: user@hadoop.apache.org
In-Reply-To: <507FF6F9.30708@crs4.it>
References: <2E362ACC9493D747B488241C66B3B66520B716@RHV-EXRDA-S11.corp.ebay.com> <507FF6F9.30708@crs4.it>

Hi,

The NetApp whitepaper on the SAN solution (link given by Kevin) makes the following statement. Can someone please elaborate (or give a link that explains) how 12 disks in a SAN can give 2,000 IOPS while, if used as JBOD, they would give 600 IOPS?

"The E2660 can deliver up to 2,000 IOPS from a 12-disk stripe (the bottleneck being the 12 disks). This headroom translates into better read times for those 64KB blocks. Twelve copies of 12 MapReduce jobs reading from 12 SATA disks can at best never exceed 12 x 50 IOPS, or 600 IOPS. The E2660 volume has five times the IOPS headroom, which translates into faster read times and high MapReduce throughput"

Thanks and Regards,
--
Jitendra Kumar Singh

On Thu, Oct 18, 2012 at 6:02 PM, Luca Pireddu <pireddu@crs4.it> wrote:

> On 10/18/2012 02:21 AM, Pamecha, Abhishek wrote:
>
>> Tom
>>
>> Do you mean you are using GPFS instead of HDFS? Also, if you can share,
>> are you deploying it as DAS set up or a SAN?
>>
>> Thanks,
>>
>> Abhishek
>
> Though I don't think I'd buy a SAN for a new Hadoop cluster, we have a SAN
> and are using it *instead of HDFS* with a small/medium Hadoop MapReduce
> cluster (up to 100 nodes or so, depending on our need). We still use the
> local node disks for intermediate data (mapred local storage). Although
> this set-up does limit our possibility to scale to a large number of nodes,
> that's not a concern for us. On the plus, we gain the flexibility to be
> able to share our cluster with non-Hadoop users at our centre.
>
> --
> Luca Pireddu
> CRS4 - Distributed Computing Group
> Loc. Pixina Manna Edificio 1
> 09010 Pula (CA), Italy
> Tel: +39 0709250452
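
For anyone following the numbers in the quoted whitepaper passage, here is a minimal sketch of the arithmetic only. The three constants are taken from the quote; the function names are hypothetical, and nothing here models how the E2660 actually reaches its figure. On these figures alone the ratio works out to roughly 3.3 rather than five, so the "five times" claim presumably rests on assumptions that are not in the quoted text.

```python
# Figures copied from the whitepaper passage quoted in this message.
DISKS = 12
JBOD_IOPS_PER_DISK = 50   # from "12 x 50 IOPS"
STRIPE_IOPS = 2000        # from "up to 2,000 IOPS from a 12-disk stripe"


def jbod_ceiling(disks: int, iops_per_disk: int) -> int:
    """Aggregate IOPS ceiling when every disk is read independently (JBOD)."""
    return disks * iops_per_disk


def headroom_ratio(stripe_iops: int, jbod_iops: int) -> float:
    """How many times larger the striped figure is than the JBOD ceiling."""
    return stripe_iops / jbod_iops


jbod = jbod_ceiling(DISKS, JBOD_IOPS_PER_DISK)
ratio = headroom_ratio(STRIPE_IOPS, jbod)
implied_per_disk = STRIPE_IOPS / DISKS  # per-disk rate the stripe figure implies

print(f"JBOD ceiling:          {jbod} IOPS")
print(f"Stripe / JBOD ratio:   {ratio:.2f}x")
print(f"Implied IOPS per disk: {implied_per_disk:.1f}")
```

The last line shows the crux of the question: the stripe figure implies about 167 IOPS per spindle against the 50 assumed for JBOD, and the quote does not say where that difference comes from.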