From: Aji Janis <aji1705@gmail.com>
To: user@hadoop.apache.org
Subject: Re: Hadoop hardware failure recovery
Date: Mon, 13 Aug 2012 09:57:32 -0400

Thank you everyone for all the feedback and suggestions. It's good to know these details as I move forward.

Piling on to the question, I am curious whether any of you have experience with Accumulo (a requirement for me, hence not optional). I was wondering whether data loss from a physical crash of a hard drive would, in that case, be handled by Hadoop (HDFS, I should say). Any suggestions, and/or pointers to where I could find some specs on this, would be really appreciated!

Thank you again for all the pointers.
-Aji

On Sun, Aug 12, 2012 at 3:07 PM, Arun C Murthy <acm@hortonworks.com> wrote:
> Yep, hadoop-2 is alpha but is progressing nicely...
>
> However, if you have access to some 'enterprise HA' utilities (VMware or
> Linux-HA) you can get *very decent* production-grade high availability in
> hadoop-1.x too (both the NameNode for HDFS and the JobTracker for MapReduce).
>
> Arun
>
> On Aug 10, 2012, at 12:12 PM, anil gupta wrote:
>
> Hi Aji,
>
> Adding onto what Mohammad Tariq said: if you use Hadoop 2.0.0-alpha, then
> the NameNode is not a single point of failure. However, Hadoop 2.0.0 is not
> of production quality yet (it's in alpha).
> The NameNode used to be a single point of failure in releases prior to
> Hadoop 2.0.0.
>
> HTH,
> Anil Gupta
>
> On Fri, Aug 10, 2012 at 11:55 AM, Ted Dunning <tdunning@maprtech.com> wrote:
>
>> Hadoop's file system was (mostly) copied from the concepts of Google's
>> old file system.
>>
>> The original paper is probably the best way to learn about that.
>>
>> http://research.google.com/archive/gfs.html
>>
>> On Fri, Aug 10, 2012 at 11:38 AM, Aji Janis <aji1705@gmail.com> wrote:
>>
>>> I am very new to Hadoop. I am considering setting up a Hadoop cluster
>>> consisting of 5 nodes where each node has 3 internal hard drives. I
>>> understand HDFS has a configurable redundancy feature, but what happens
>>> if an entire drive crashes (physically) for whatever reason? How does
>>> Hadoop recover, if it can, from this situation? What else should I know
>>> before setting up my cluster this way? Thanks in advance.
>
> --
> Thanks & Regards,
> Anil Gupta

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/
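[Editor's note on the drive-crash question in the thread above: by default HDFS keeps three replicas of each block on different drives/nodes; when a drive dies, the NameNode learns of the lost replicas via block reports and schedules new copies from the surviving replicas. The following is a toy Python sketch of that bookkeeping only — it is illustrative, not Hadoop code; the 5-node/3-drive layout comes from the original question, and all names are made up.]

```python
# Toy sketch of HDFS-style re-replication after a drive failure.
# NOT Hadoop code: block placement, naming, and the replication factor
# of 3 mirror HDFS defaults, but everything else is simplified.
import random

REPLICATION = 3

def place_blocks(num_blocks, drives):
    """Assign each block to REPLICATION distinct drives."""
    return {b: set(random.sample(drives, REPLICATION)) for b in range(num_blocks)}

def fail_drive(placement, drive):
    """Drop one drive from every block's replica set; return the
    blocks that are now under-replicated."""
    under = []
    for block, locations in placement.items():
        locations.discard(drive)
        if len(locations) < REPLICATION:
            under.append(block)
    return under

def re_replicate(placement, drives, dead):
    """Copy each under-replicated block to healthy drives until the
    target replica count is restored (what the NameNode schedules)."""
    alive = [d for d in drives if d not in dead]
    for locations in placement.values():
        while len(locations) < REPLICATION:
            candidates = [d for d in alive if d not in locations]
            locations.add(random.choice(candidates))

# 5 nodes x 3 internal drives, as in the original question.
drives = [f"node{n}-disk{d}" for n in range(5) for d in range(3)]
placement = place_blocks(100, drives)
under = fail_drive(placement, "node0-disk0")
re_replicate(placement, drives, {"node0-disk0"})
# Every block is back at full replication; nothing was lost, because
# each block still had two live replicas after the single-drive crash.
assert all(len(locs) == REPLICATION for locs in placement.values())
```

The point of the sketch: a single drive failure costs at most one replica per block, so with 3x replication no data is lost, and redundancy is restored automatically from the remaining copies.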
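[Editor's note: for the 5-node/3-drive setup asked about above, two hdfs-site.xml settings are directly relevant. A hedged sketch follows — the property names below are the stock HDFS ones, but defaults and exact names vary by Hadoop version (e.g. `dfs.data.dir` in 1.x vs `dfs.datanode.data.dir` in 2.x), and the paths are hypothetical; verify against your distribution's hdfs-default.xml.]

```xml
<!-- hdfs-site.xml (sketch; verify names against your Hadoop version) -->
<configuration>
  <!-- Copies kept of each block. With 3, a single drive or node
       failure still leaves at least two live replicas to rebuild from. -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- One data directory per physical drive (hypothetical paths),
       so each of the three internal drives is a separate HDFS volume. -->
  <property>
    <name>dfs.data.dir</name>
    <value>/disk1/dfs/data,/disk2/dfs/data,/disk3/dfs/data</value>
  </property>
  <!-- How many failed local volumes a DataNode tolerates before
       shutting itself down. With the default of 0, one dead drive
       takes the whole DataNode (and its two healthy drives) offline. -->
  <property>
    <name>dfs.datanode.failed.volumes.tolerated</name>
    <value>1</value>
  </property>
</configuration>
```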