From: Adam Phelps <amp@opendns.com>
Date: Tue, 08 Feb 2011 11:33:13 -0800
To: hdfs-user@hadoop.apache.org
Subject: Re: HDFS drive, partition best practice

On 2/7/11 2:06 PM, Jonathan Disher wrote:
> Currently I have a 48 node cluster using Dell R710's with 12 disks - two
> 250GB SATA drives in RAID1 for the OS, and ten 1TB SATA disks as a JBOD
> (mounted on /data/0 through /data/9) and listed separately in
> hdfs-site.xml. It works... mostly. The big issue you will encounter is
> losing a disk - the DataNode process will crash, and if you comment out
> the affected drive, when you replace it you will have 9 disks full to N%
> and one empty disk.

If the DataNode is going down after a single disk failure, then you
probably haven't set dfs.datanode.failed.volumes.tolerated in
hdfs-site.xml. You can raise that number to allow the DataNode to
tolerate dead drives.

- Adam
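
As a minimal sketch, the property goes in hdfs-site.xml like this (the
value of 2 below is just an illustration - the default is 0, meaning the
DataNode shuts down on the first volume failure; pick a value that makes
sense for how many data disks each node has):

  <!-- hdfs-site.xml: let the DataNode keep running with up to two
       failed data volumes instead of crashing on the first one.
       The value 2 is only an example. -->
  <property>
    <name>dfs.datanode.failed.volumes.tolerated</name>
    <value>2</value>
  </property>

Note that the DataNode still needs a restart to pick up a replaced disk,
but with this set it will at least stay up serving the remaining volumes
in the meantime.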