From: Aaron Tokhy
Date: Mon, 30 Jan 2012 16:57:42 -0500
CC: Pavel Frolov, Neil Yalowitz
Subject: Hardware/Software JBOD vs *.data.dir "JBOD"

Given a HDFS slave node setup of 3 disks per node, should I have 3 filesystems (one file system per disk) in my dfs.data.dir listing, or should I have a single filesystem
on a JBOD setup of 3 disks? Searching for this problem turns up advice to use "JBOD" instead of RAID 0, but I'm talking about two different kinds of JBOD: one managed by the OS (mdadm) or firmware with a single filesystem, and the other managed by the DataNode (with multiple filesystems).

I already prefer listing multiple filesystems in dfs.data.dir, since in theory the DataNode should properly handle where it places its blocks (instead of delegating this to the OS or firmware). When a drive dies, I could also in theory swap in a new drive without worrying about crashing an entire JBOD array (technically I only lose the blocks on the failing disk, and do not risk filesystem-level corruption).

In some ways I may already know the answer to my question. I'm just looking for anyone's experience with this datacenter-wide decision, or whether they prefer one method over the other. I'm trying to follow the approach described in this post: http://old.nabble.com/forum/ViewPost.jtp?post=21423861&framed=y
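
To make the per-disk option concrete, here is a minimal sketch, for illustration only. The mount points are hypothetical, and the round-robin placement is an assumption about how a DataNode might spread blocks across its dfs.data.dir entries, not a copy of its actual code. It builds the comma-separated dfs.data.dir value and shows which blocks would be affected if one of the three disks failed.

```python
from collections import Counter

# Hypothetical mount points, one filesystem per physical disk.
DATA_DIRS = ["/data/1/dfs/dn", "/data/2/dfs/dn", "/data/3/dfs/dn"]


def dfs_data_dir_value(dirs):
    """Build the comma-separated string used as the dfs.data.dir value."""
    return ",".join(dirs)


def place_round_robin(num_blocks, dirs):
    """Assign block ids to directories in rotation (assumed policy)."""
    return {block: dirs[block % len(dirs)] for block in range(num_blocks)}


def blocks_lost(placement, failed_dir):
    """Return the sorted block ids that were stored on the failed directory."""
    return sorted(b for b, d in placement.items() if d == failed_dir)


placement = place_round_robin(9, DATA_DIRS)
print(dfs_data_dir_value(DATA_DIRS))          # value to put in dfs.data.dir
print(Counter(placement.values()))            # blocks per directory
print(blocks_lost(placement, DATA_DIRS[1]))   # blocks on the failed disk
```

Under these assumptions, losing one disk affects only the blocks placed on that directory (about a third of them here), whereas a single filesystem spanning all three disks would put every block on the node at risk.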