Subject: RE: Multiple HDFS paths and Multiple Masters...
Date: Thu, 13 Sep 2007 10:55:21 -0700
Reply-To: hadoop-user@lucene.apache.org
From: "Joydeep Sen Sarma" <jssarma@facebook.com>
To: hadoop-user@lucene.apache.org

I used to work in the Netapp HA group, so I can explain the single-drive-failure issue a bit (although the right forum is the toasters@mathworks.com mailing list).

The shelves are supposed to bypass failed drives (the shelf re-routes the FC loop when it detects a failed drive). However, there were rare failure modes in which a drive would malfunction in a way the shelf could not detect, causing the entire FC loop to malfunction and leading to multi-disk failure. The disks are dual-attached, but in this failure mode they would take out both loops.

That said, this is circa 2004/5. New-generation shelves fixed the problem, and Netapp was also asking shelf vendors for a software interface to power-cycle failed drives (so that Netapp software could take out bad drives with a hard power-reset instead of relying on shelf firmware). I don't know the current status. In general, one of the big value-adds of using Netapp (or EMC, for that matter) is their extensive understanding of drive/shelf failure modes and their ability to proactively predict such failures and take safeguards against them.

Regarding RAID mirroring: it actually protects against cases like this, since Netapp always puts mirrored copies on different shelves/loops, thereby protecting against shelf/loop failure. But RAID-4/5 (or Netapp dual-parity) with backups and/or replication is a good alternative, with somewhat lower availability guarantees and performance.

Hope this helps ..

-----Original Message-----
From: C G [mailto:parallelguy@yahoo.com]
Sent: Thursday, September 13, 2007 9:47 AM
To: hadoop-user@lucene.apache.org
Subject: Re: Multiple HDFS paths and Multiple Masters...

Allen, Ted:

Good stuff...thanks for the information.
Ted, a bit off-topic, but your comment about Netapp single-drive failures gave me pause, particularly since we have a large one deployed now. Would you mind saying more on that? Feel free to contact me directly since it is off-topic.

Thanks!
C G

Ted Dunning wrote:

On 9/13/07 6:00 AM, "C G" wrote:

> I'd like to run nodes with around 2T of local disk set up as JBOD. So I
> would have 4 separate file systems per machine, for example /hdfs_a, /hdfs_b,
> /hdfs_c, /hdfs_d. Is it possible to configure things so that HDFS knows
> about all 4 file systems?

Yes. This is normally done to allow heterogeneity in data/task nodes. You make a list of all of the file systems that MIGHT be available, and Hadoop figures out which are available and which have space to use.

> Since we're using HDFS replication I see no point in
> using RAID-anything...to me that's the whole point of replication. Comments?

That is the intent!

> Is it possible to set things up in Hadoop to run multiple masters?

Not yet. Doug makes very good points on this topic: a single master will be fairly reliable, and it is the cluster that will have common failures and thus must be robust to node failure.

There are lots of HA options. One that looks very nice to me (but that I haven't tried) is DRBD, which is a block-level disk replication service. See http://www.drbd.org/ for more information (and let us know how it looks). The secondary namenode may be of some help in recovery as well, but it is unlikely to be as quick as a replicated disk and a CARP-based IP address.

> If you can't run multiple namenodes, then that sort of implies the machine
> which is hosting *the* namenode needs to do all the traditional things to
> protect against data loss/corruption, including frequent backups, RAID
> mirroring, etc.

Some of these things are happening already, but the others are not a bad idea at all. Consider your hardware carefully.
RAID mirroring can *decrease* reliability: with two drives in the mirror, you are exposed to a failure of either one, and a mishandled failure can take down the whole array. That happened to me on my home machine, and I have heard of other cases as well. Even in sophisticated implementations such as Netapp's, you can have drive failures that freeze an entire shelf. My preference these days is replicated simple machines rather than fancy machines.
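For readers finding this thread in the archive: the multi-filesystem setup Ted describes goes in the cluster's hadoop-site.xml. A minimal sketch, using the property names from the Hadoop of this era; the paths here are hypothetical examples, not values from the thread:

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Datanode storage: a comma-separated list of local directories.
       Hadoop uses whichever of these actually exist on each node, so
       one list can cover heterogeneous machines. -->
  <property>
    <name>dfs.data.dir</name>
    <value>/hdfs_a/data,/hdfs_b/data,/hdfs_c/data,/hdfs_d/data</value>
  </property>

  <!-- Block replication. With copies on separate nodes, per-node RAID
       adds little protection for the data disks themselves. -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>

  <!-- Namenode metadata: listing several directories (e.g. one local,
       one NFS-mounted) makes the namenode write its image and edit log
       to all of them, a simple guard against losing the namenode disk. -->
  <property>
    <name>dfs.name.dir</name>
    <value>/local/namedir,/remote/nfs/namedir</value>
  </property>
</configuration>
```

The dfs.name.dir list is the lightweight counterpart to the DRBD approach discussed above: it replicates only the namenode's metadata writes rather than the whole block device.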