From: "David Parks"
Subject: What's the best disk configuration for hadoop? SSD's Raid levels, etc?
Date: Sat, 11 May 2013 13:30:11 +0700

We've got a cluster of 10x 8-core/24GB nodes, each currently with one 4TB disk (3 disk slots max). They chug away OK at the moment, only slightly IO bound on average.

I'm going to upgrade the disk configuration at some point (we do need more space on HDFS) and I'm thinking about what's best hardware-wise:

- Would it be wise to use one of the three disk slots for a 1TB SSD? I wouldn't use it for HDFS, but for map output and sorting it might make a big difference, no?

- If I put in either 1 or 2 more 4TB disks for HDFS, should I RAID-0 them for speed, or will HDFS balance well across multiple partitions on its own?

- Would anyone suggest 3 4TB disks and a RAID-5 configuration to guard against disk replacements over the above options?

Dave
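For reference, the capacity side of these three options can be sketched with some rough arithmetic. This is only a back-of-the-envelope comparison, not a recommendation: the replication factor of 3 (the HDFS default), the usual RAID-5 rule of one disk's worth of parity, and the helper names `usable_tb` and `cluster_logical_tb` are all assumptions added for illustration, and no space is reserved for non-DFS use.

```python
# Rough per-node and cluster-wide capacity for the layouts in question.
# Assumptions (not from the original post): HDFS replication factor 3,
# RAID-5 usable space of (n - 1) disks, and no reserve for non-DFS use.

NODES = 10          # from the post: 10 nodes
DISK_TB = 4         # from the post: 4TB disks
REPLICATION = 3     # assumption: HDFS default replication factor


def usable_tb(disks: int, layout: str) -> int:
    """Usable HDFS space per node, in TB, for a given disk layout."""
    if layout in ("jbod", "raid0"):
        return disks * DISK_TB          # no parity overhead
    if layout == "raid5":
        return (disks - 1) * DISK_TB    # one disk's worth goes to parity
    raise ValueError(f"unknown layout: {layout}")


def cluster_logical_tb(per_node_tb: int) -> float:
    """Data the whole cluster can hold once replication is accounted for."""
    return NODES * per_node_tb / REPLICATION


# Hypothetical labels for the three options raised in the post.
OPTIONS = {
    "2x 4TB for HDFS + 1 SSD for map output": usable_tb(2, "jbod"),
    "3x 4TB as separate disks or RAID-0": usable_tb(3, "jbod"),
    "3x 4TB in RAID-5": usable_tb(3, "raid5"),
}

for name, per_node in OPTIONS.items():
    print(f"{name}: {per_node} TB/node, "
          f"~{cluster_logical_tb(per_node):.1f} TB of data cluster-wide")
```

Under these assumptions the SSD option and the RAID-5 option leave the same HDFS space per node, so the choice between them comes down to IO behaviour and failure handling rather than capacity.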
